database search tool: Topics by Science.gov

Sample records for database search tool

Comet: an open-source MS/MS sequence database search tool.

PubMed

Eng, Jimmy K; Jahan, Tahmina A; Hoopmann, Michael R

2013-01-01

Proteomics research routinely involves identifying peptides and proteins via MS/MS sequence database search. Thus the database search engine is an integral tool in many proteomics research groups. Here, we introduce the Comet search engine to the existing landscape of commercial and open-source database search tools. Comet is open source, freely available, and based on one of the original sequence database search tools that has been widely used for many years. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Federated Search Tools in Fusion Centers: Bridging Databases in the Information Sharing Environment

DTIC Science & Technology

2012-09-01

considerable variation in how fusion centers plan for, gather requirements, select and acquire federated search tools to bridge disparate databases...centers, when considering integrating federated search tools; by evaluating the importance of the planning, requirements gathering, selection and...acquisition processes for integrating federated search tools; by acknowledging the challenges faced by some fusion centers during these integration processes
Global search tool for the Advanced Photon Source Integrated Relational Model of Installed Systems (IRMIS) database.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Quock, D. E. R.; Cianciarulo, M. B.; APS Engineering Support Division

2007-01-01

The Integrated Relational Model of Installed Systems (IRMIS) is a relational database tool that has been implemented at the Advanced Photon Source to maintain an updated account of approximately 600 control system software applications, 400,000 process variables, and 30,000 control system hardware components. To effectively display this large amount of control system information to operators and engineers, IRMIS was initially built with nine Web-based viewers: Applications Organizing Index, IOC, PLC, Component Type, Installed Components, Network, Controls Spares, Process Variables, and Cables. However, since each viewer is designed to provide details from only one major category of the control system, themore » necessity for a one-stop global search tool for the entire database became apparent. The user requirements for extremely fast database search time and ease of navigation through search results led to the choice of Asynchronous JavaScript and XML (AJAX) technology in the implementation of the IRMIS global search tool. Unique features of the global search tool include a two-tier level of displayed search results, and a database data integrity validation and reporting mechanism.« less
Subject Specific Databases: A Powerful Research Tool

ERIC Educational Resources Information Center

Young, Terrence E., Jr.

2004-01-01

Subject specific databases, or vortals (vertical portals), are databases that provide highly detailed research information on a particular topic. They are the smallest, most focused search tools on the Internet and, in recent years, they've been on the rise. Currently, more of the so-called "mainstream" search engines, subject directories, and…
SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters.

PubMed

Wang, Chunlin; Lefkowitz, Elliot J

2004-10-28

Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. We describe the implementation of SS-Wrapper (Similarity Search Wrapper), a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search) approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST) that provides a complementary solution for BLAST searches when the database is too large to fit into the memory of a single node. Used together, QS-search and DS-BLAST provide a flexible solution to adapt sequential similarity searching applications in high performance computing environments. Their ease of use and their ability to wrap a variety of database search programs provide an analytical architecture to assist both the seasoned bioinformaticist and the wet-bench biologist.
SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters

PubMed Central

Wang, Chunlin; Lefkowitz, Elliot J

2004-01-01

Background Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. Results We describe the implementation of SS-Wrapper (Similarity Search Wrapper), a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search) approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST) that provides a complementary solution for BLAST searches when the database is too large to fit into the memory of a single node. Conclusions Used together, QS-search and DS-BLAST provide a flexible solution to adapt sequential similarity searching applications in high performance computing environments. Their ease of use and their ability to wrap a variety of database search programs provide an analytical architecture to assist both the seasoned bioinformaticist and the wet-bench biologist. PMID:15511296
Internet Databases of the Properties, Enzymatic Reactions, and Metabolism of Small Molecules—Search Options and Applications in Food Science

PubMed Central

Minkiewicz, Piotr; Darewicz, Małgorzata; Iwaniak, Anna; Bucholska, Justyna; Starowicz, Piotr; Czyrko, Emilia

2016-01-01

Internet databases of small molecules, their enzymatic reactions, and metabolism have emerged as useful tools in food science. Database searching is also introduced as part of chemistry or enzymology courses for food technology students. Such resources support the search for information about single compounds and facilitate the introduction of secondary analyses of large datasets. Information can be retrieved from databases by searching for the compound name or structure, annotating with the help of chemical codes or drawn using molecule editing software. Data mining options may be enhanced by navigating through a network of links and cross-links between databases. Exemplary databases reviewed in this article belong to two classes: tools concerning small molecules (including general and specialized databases annotating food components) and tools annotating enzymes and metabolism. Some problems associated with database application are also discussed. Data summarized in computer databases may be used for calculation of daily intake of bioactive compounds, prediction of metabolism of food components, and their biological activity as well as for prediction of interactions between food component and drugs. PMID:27929431
Internet Databases of the Properties, Enzymatic Reactions, and Metabolism of Small Molecules-Search Options and Applications in Food Science.

PubMed

Minkiewicz, Piotr; Darewicz, Małgorzata; Iwaniak, Anna; Bucholska, Justyna; Starowicz, Piotr; Czyrko, Emilia

2016-12-06

Internet databases of small molecules, their enzymatic reactions, and metabolism have emerged as useful tools in food science. Database searching is also introduced as part of chemistry or enzymology courses for food technology students. Such resources support the search for information about single compounds and facilitate the introduction of secondary analyses of large datasets. Information can be retrieved from databases by searching for the compound name or structure, annotating with the help of chemical codes or drawn using molecule editing software. Data mining options may be enhanced by navigating through a network of links and cross-links between databases. Exemplary databases reviewed in this article belong to two classes: tools concerning small molecules (including general and specialized databases annotating food components) and tools annotating enzymes and metabolism. Some problems associated with database application are also discussed. Data summarized in computer databases may be used for calculation of daily intake of bioactive compounds, prediction of metabolism of food components, and their biological activity as well as for prediction of interactions between food component and drugs.
Search Engines for Tomorrow's Scholars

ERIC Educational Resources Information Center

Fagan, Jody Condit

2011-01-01

Today's scholars face an outstanding array of choices when choosing search tools: Google Scholar, discipline-specific abstracts and index databases, library discovery tools, and more recently, Microsoft's re-launch of their academic search tool, now dubbed Microsoft Academic Search. What are these tools' strengths for the emerging needs of…
Genetic Testing Registry

MedlinePlus

... Splign Vector Alignment Search Tool (VAST) All Data & Software Resources... Domains & Structures BioSystems Cn3D Conserved Domain Database (CDD) Conserved Domain Search Service (CD Search) Structure (Molecular Modeling Database) Vector Alignment ...
National Center for Biotechnology Information

MedlinePlus

... Splign Vector Alignment Search Tool (VAST) All Data & Software Resources... Domains & Structures BioSystems Cn3D Conserved Domain Database (CDD) Conserved Domain Search Service (CD Search) Structure (Molecular Modeling Database) Vector Alignment ...
Evaluation of an open source tool for indexing and searching enterprise radiology and pathology reports

NASA Astrophysics Data System (ADS)

Kim, Woojin; Boonn, William

2010-03-01

Data mining of existing radiology and pathology reports within an enterprise health system can be used for clinical decision support, research, education, as well as operational analyses. In our health system, the database of radiology and pathology reports exceeds 13 million entries combined. We are building a web-based tool to allow search and data analysis of these combined databases using freely available and open source tools. This presentation will compare performance of an open source full-text indexing tool to MySQL's full-text indexing and searching and describe implementation procedures to incorporate these capabilities into a radiology-pathology search engine.
Criteria for Comparing Children's Web Search Tools.

ERIC Educational Resources Information Center

Kuntz, Jerry

1999-01-01

Presents criteria for evaluating and comparing Web search tools designed for children. Highlights include database size; accountability; categorization; search access methods; help files; spell check; URL searching; links to alternative search services; advertising; privacy policy; and layout and design. (LRW)
Evaluation of Federated Searching Options for the School Library

ERIC Educational Resources Information Center

Abercrombie, Sarah E.

2008-01-01

Three hosted federated search tools, Follett One Search, Gale PowerSearch Plus, and WebFeat Express, were configured and implemented in a school library. Databases from five vendors and the OPAC were systematically searched. Federated search results were compared with each other and to the results of the same searches in the database's native…
Ocean Drilling Program: Science Operator Search Engine

Science.gov Websites

and products Drilling services and tools Online Janus database Search the ODP/TAMU web site ODP's main -USIO site, plus IODP, ODP, and DSDP Publications, together or separately. ODP | Search | Database
Text mining for search term development in systematic reviewing: A discussion of some methods and challenges.

PubMed

Stansfield, Claire; O'Mara-Eves, Alison; Thomas, James

2017-09-01

Using text mining to aid the development of database search strings for topics described by diverse terminology has potential benefits for systematic reviews; however, methods and tools for accomplishing this are poorly covered in the research methods literature. We briefly review the literature on applications of text mining for search term development for systematic reviewing. We found that the tools can be used in 5 overarching ways: improving the precision of searches; identifying search terms to improve search sensitivity; aiding the translation of search strategies across databases; searching and screening within an integrated system; and developing objectively derived search strategies. Using a case study and selected examples, we then reflect on the utility of certain technologies (term frequency-inverse document frequency and Termine, term frequency, and clustering) in improving the precision and sensitivity of searches. Challenges in using these tools are discussed. The utility of these tools is influenced by the different capabilities of the tools, the way the tools are used, and the text that is analysed. Increased awareness of how the tools perform facilitates the further development of methods for their use in systematic reviews. Copyright © 2017 John Wiley & Sons, Ltd.
Large-scale feature searches of collections of medical imagery

NASA Astrophysics Data System (ADS)

Hedgcock, Marcus W.; Karshat, Walter B.; Levitt, Tod S.; Vosky, D. N.

1993-09-01

Large scale feature searches of accumulated collections of medical imagery are required for multiple purposes, including clinical studies, administrative planning, epidemiology, teaching, quality improvement, and research. To perform a feature search of large collections of medical imagery, one can either search text descriptors of the imagery in the collection (usually the interpretation), or (if the imagery is in digital format) the imagery itself. At our institution, text interpretations of medical imagery are all available in our VA Hospital Information System. These are downloaded daily into an off-line computer. The text descriptors of most medical imagery are usually formatted as free text, and so require a user friendly database search tool to make searches quick and easy for any user to design and execute. We are tailoring such a database search tool (Liveview), developed by one of the authors (Karshat). To further facilitate search construction, we are constructing (from our accumulated interpretation data) a dictionary of medical and radiological terms and synonyms. If the imagery database is digital, the imagery which the search discovers is easily retrieved from the computer archive. We describe our database search user interface, with examples, and compare the efficacy of computer assisted imagery searches from a clinical text database with manual searches. Our initial work on direct feature searches of digital medical imagery is outlined.
FlavonoidSearch: A system for comprehensive flavonoid annotation by mass spectrometry.

PubMed

Akimoto, Nayumi; Ara, Takeshi; Nakajima, Daisuke; Suda, Kunihiro; Ikeda, Chiaki; Takahashi, Shingo; Muneto, Reiko; Yamada, Manabu; Suzuki, Hideyuki; Shibata, Daisuke; Sakurai, Nozomu

2017-04-28

Currently, in mass spectrometry-based metabolomics, limited reference mass spectra are available for flavonoid identification. In the present study, a database of probable mass fragments for 6,867 known flavonoids (FsDatabase) was manually constructed based on new structure- and fragmentation-related rules using new heuristics to overcome flavonoid complexity. We developed the FlavonoidSearch system for flavonoid annotation, which consists of the FsDatabase and a computational tool (FsTool) to automatically search the FsDatabase using the mass spectra of metabolite peaks as queries. This system showed the highest identification accuracy for the flavonoid aglycone when compared to existing tools and revealed accurate discrimination between the flavonoid aglycone and other compounds. Sixteen new flavonoids were found from parsley, and the diversity of the flavonoid aglycone among different fruits and vegetables was investigated.
Ocean Drilling Program: Mirror Sites

Science.gov Websites

Publication services and products Drilling services and tools Online Janus database Search the ODP/TAMU web information, see www.iodp-usio.org. ODP | Search | Database | Drilling | Publications | Science | Cruise Info
Ocean Drilling Program: TAMU Staff Directory

Science.gov Websites

products Drilling services and tools Online Janus database Search the ODP/TAMU web site ODP's main web site Employment Opportunities ODP | Search | Database | Drilling | Publications | Science | Cruise Info | Public

Extension modules for storage, visualization and querying of genomic, genetic and breeding data in Tripal databases

PubMed Central

Lee, Taein; Cheng, Chun-Huai; Ficklin, Stephen; Yu, Jing; Humann, Jodi; Main, Dorrie

2017-01-01

Abstract Tripal is an open-source database platform primarily used for development of genomic, genetic and breeding databases. We report here on the release of the Chado Loader, Chado Data Display and Chado Search modules to extend the functionality of the core Tripal modules. These new extension modules provide additional tools for (1) data loading, (2) customized visualization and (3) advanced search functions for supported data types such as organism, marker, QTL/Mendelian Trait Loci, germplasm, map, project, phenotype, genotype and their respective metadata. The Chado Loader module provides data collection templates in Excel with defined metadata and data loaders with front end forms. The Chado Data Display module contains tools to visualize each data type and the metadata which can be used as is or customized as desired. The Chado Search module provides search and download functionality for the supported data types. Also included are the tools to visualize map and species summary. The use of materialized views in the Chado Search module enables better performance as well as flexibility of data modeling in Chado, allowing existing Tripal databases with different metadata types to utilize the module. These Tripal Extension modules are implemented in the Genome Database for Rosaceae (rosaceae.org), CottonGen (cottongen.org), Citrus Genome Database (citrusgenomedb.org), Genome Database for Vaccinium (vaccinium.org) and the Cool Season Food Legume Database (coolseasonfoodlegume.org). Database URL: https://www.citrusgenomedb.org/, https://www.coolseasonfoodlegume.org/, https://www.cottongen.org/, https://www.rosaceae.org/, https://www.vaccinium.org/
Content based information retrieval in forensic image databases.

PubMed

Geradts, Zeno; Bijhold, Jurrien

2002-03-01

This paper gives an overview of the various available image databases and ways of searching these databases on image contents. The developments in research groups of searching in image databases is evaluated and compared with the forensic databases that exist. Forensic image databases of fingerprints, faces, shoeprints, handwriting, cartridge cases, drugs tablets, and tool marks are described. The developments in these fields appear to be valuable for forensic databases, especially that of the framework in MPEG-7, where the searching in image databases is standardized. In the future, the combination of the databases (also DNA-databases) and possibilities to combine these can result in stronger forensic evidence.
Efficient bibliographic searches on allergy using ISI databases.

PubMed

Sáez Gómez, J M; Annan, J W; Negro Alvarez, J M; Guillen-Grima, F; Bozzola, C M; Ivancevich, J C; Aguinaga Ontoso, E

2008-01-01

The aim of this article is to provide an introduction to using databases from the Thomson ISI Web of Knowledge, with special reference to Citation Indexes as an analysis tool for publications, and also to explain the meaning of the well-known Impact Factor. We present the partially modified new Consultation Interface to enhance information search routines of these databases. It introduces distinctive methods in search bibliography, including the correct application of analysis tools, paying particular attention to Journal Citation Reports and Impact Factor. We finish this article with comment on the consequences of using the Impact Factor as a quality indicator for the assessment of journals and publications, and how to ensure measures for indexing in the Thomson ISI Databases.
Rapid identification of anonymous subjects in large criminal databases: problems and solutions in IAFIS III/FBI subject searches

NASA Astrophysics Data System (ADS)

Kutzleb, C. D.

1997-02-01

The high incidence of recidivism (repeat offenders) in the criminal population makes the use of the IAFIS III/FBI criminal database an important tool in law enforcement. The problems and solutions employed by IAFIS III/FBI criminal subject searches are discussed for the following topics: (1) subject search selectivity and reliability; (2) the difficulty and limitations of identifying subjects whose anonymity may be a prime objective; (3) database size, search workload, and search response time; (4) techniques and advantages of normalizing the variability in an individual's name and identifying features into identifiable and discrete categories; and (5) the use of database demographics to estimate the likelihood of a match between a search subject and database subjects.
GWFASTA: server for FASTA search in eukaryotic and microbial genomes.

PubMed

Issac, Biju; Raghava, G P S

2002-09-01

Similarity searches are a powerful method for solving important biological problems such as database scanning, evolutionary studies, gene prediction, and protein structure prediction. FASTA is a widely used sequence comparison tool for rapid database scanning. Here we describe the GWFASTA server that was developed to assist the FASTA user in similarity searches against partially and/or completely sequenced genomes. GWFASTA consists of more than 60 microbial genomes, eight eukaryote genomes, and proteomes of annotatedgenomes. Infact, it provides the maximum number of databases for similarity searching from a single platform. GWFASTA allows the submission of more than one sequence as a single query for a FASTA search. It also provides integrated post-processing of FASTA output, including compositional analysis of proteins, multiple sequences alignment, and phylogenetic analysis. Furthermore, it summarizes the search results organism-wise for prokaryotes and chromosome-wise for eukaryotes. Thus, the integration of different tools for sequence analyses makes GWFASTA a powerful toolfor biologists.
Building and evaluating an informatics tool to facilitate analysis of a biomedical literature search service in an academic medical center library.

PubMed

Hinton, Elizabeth G; Oelschlegel, Sandra; Vaughn, Cynthia J; Lindsay, J Michael; Hurst, Sachiko M; Earl, Martha

2013-01-01

This study utilizes an informatics tool to analyze a robust literature search service in an academic medical center library. Structured interviews with librarians were conducted focusing on the benefits of such a tool, expectations for performance, and visual layout preferences. The resulting application utilizes Microsoft SQL Server and .Net Framework 3.5 technologies, allowing for the use of a web interface. Customer tables and MeSH terms are included. The National Library of Medicine MeSH database and entry terms for each heading are incorporated, resulting in functionality similar to searching the MeSH database through PubMed. Data reports will facilitate analysis of the search service.
CLAST: CUDA implemented large-scale alignment search tool.

PubMed

Yano, Masahiro; Mori, Hiroshi; Akiyama, Yutaka; Yamada, Takuji; Kurokawa, Ken

2014-12-11

Metagenomics is a powerful methodology to study microbial communities, but it is highly dependent on nucleotide sequence similarity searching against sequence databases. Metagenomic analyses with next-generation sequencing technologies produce enormous numbers of reads from microbial communities, and many reads are derived from microbes whose genomes have not yet been sequenced, limiting the usefulness of existing sequence similarity search tools. Therefore, there is a clear need for a sequence similarity search tool that can rapidly detect weak similarity in large datasets. We developed a tool, which we named CLAST (CUDA implemented large-scale alignment search tool), that enables analyses of millions of reads and thousands of reference genome sequences, and runs on NVIDIA Fermi architecture graphics processing units. CLAST has four main advantages over existing alignment tools. First, CLAST was capable of identifying sequence similarities ~80.8 times faster than BLAST and 9.6 times faster than BLAT. Second, CLAST executes global alignment as the default (local alignment is also an option), enabling CLAST to assign reads to taxonomic and functional groups based on evolutionarily distant nucleotide sequences with high accuracy. Third, CLAST does not need a preprocessed sequence database like Burrows-Wheeler Transform-based tools, and this enables CLAST to incorporate large, frequently updated sequence databases. Fourth, CLAST requires <2 GB of main memory, making it possible to run CLAST on a standard desktop computer or server node. CLAST achieved very high speed (similar to the Burrows-Wheeler Transform-based Bowtie 2 for long reads) and sensitivity (equal to BLAST, BLAT, and FR-HIT) without the need for extensive database preprocessing or a specialized computing platform. Our results demonstrate that CLAST has the potential to be one of the most powerful and realistic approaches to analyze the massive amount of sequence data from next-generation sequencing technologies.
Ocean Drilling Program: Web Site Access Statistics

Science.gov Websites

and products Drilling services and tools Online Janus database Search the ODP/TAMU web site ODP's main See statistics for JOIDES members. See statistics for Janus database. 1997 October November December accessible only on www-odp.tamu.edu. ** End of ODP, start of IODP. Privacy Policy ODP | Search | Database
A literature search tool for intelligent extraction of disease-associated genes.

PubMed

Jung, Jae-Yoon; DeLuca, Todd F; Nelson, Tristan H; Wall, Dennis P

2014-01-01

To extract disorder-associated genes from the scientific literature in PubMed with greater sensitivity for literature-based support than existing methods. We developed a PubMed query to retrieve disorder-related, original research articles. Then we applied a rule-based text-mining algorithm with keyword matching to extract target disorders, genes with significant results, and the type of study described by the article. We compared our resulting candidate disorder genes and supporting references with existing databases. We demonstrated that our candidate gene set covers nearly all genes in manually curated databases, and that the references supporting the disorder-gene link are more extensive and accurate than other general purpose gene-to-disorder association databases. We implemented a novel publication search tool to find target articles, specifically focused on links between disorders and genotypes. Through comparison against gold-standard manually updated gene-disorder databases and comparison with automated databases of similar functionality we show that our tool can search through the entirety of PubMed to extract the main gene findings for human diseases rapidly and accurately.
Database systems for knowledge-based discovery.

PubMed

Jagarlapudi, Sarma A R P; Kishan, K V Radha

2009-01-01

Several database systems have been developed to provide valuable information from the bench chemist to biologist, medical practitioner to pharmaceutical scientist in a structured format. The advent of information technology and computational power enhanced the ability to access large volumes of data in the form of a database where one could do compilation, searching, archiving, analysis, and finally knowledge derivation. Although, data are of variable types the tools used for database creation, searching and retrieval are similar. GVK BIO has been developing databases from publicly available scientific literature in specific areas like medicinal chemistry, clinical research, and mechanism-based toxicity so that the structured databases containing vast data could be used in several areas of research. These databases were classified as reference centric or compound centric depending on the way the database systems were designed. Integration of these databases with knowledge derivation tools would enhance the value of these systems toward better drug design and discovery.
Tachyon search speeds up retrieval of similar sequences by several orders of magnitude.

PubMed

Tan, Joshua; Kuchibhatla, Durga; Sirota, Fernanda L; Sherman, Westley A; Gattermayer, Tobias; Kwoh, Chia Yee; Eisenhaber, Frank; Schneider, Georg; Maurer-Stroh, Sebastian

2012-06-15

The usage of current sequence search tools becomes increasingly slower as databases of protein sequences continue to grow exponentially. Tachyon, a new algorithm that identifies closely related protein sequences ~200 times faster than standard BLAST, circumvents this limitation with a reduced database and oligopeptide matching heuristic. The tool is publicly accessible as a webserver at http://tachyon.bii.a-star.edu.sg and can also be accessed programmatically through SOAP.
Web Database Development: Implications for Academic Publishing.

ERIC Educational Resources Information Center

Fernekes, Bob

This paper discusses the preliminary planning, design, and development of a pilot project to create an Internet accessible database and search tool for locating and distributing company data and scholarly work. Team members established four project objectives: (1) to develop a Web accessible database and decision tool that creates Web pages on the…
The Human Transcript Database: A Catalogue of Full Length cDNA Inserts

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bouckk John; Michael McLeod; Kim Worley

1999-09-10

The BCM Search Launcher provided improved access to web-based sequence analysis services during the granting period and beyond. The Search Launcher web site grouped analysis procedures by function and provided default parameters that provided reasonable search results for most applications. For instance, most queries were automatically masked for repeat sequences prior to sequence database searches to avoid spurious matches. In addition to the web-based access and arrangements that were made using the functions easier, the BCM Search Launcher provided unique value-added applications like the BEAUTY sequence database search tool that combined information about protein domains and sequence database search resultsmore » to give an enhanced, more complete picture of the reliability and relative value of the information reported. This enhanced search tool made evaluating search results more straight-forward and consistent. Some of the favorite features of the web site are the sequence utilities and the batch client functionality that allows processing of multiple samples from the command line interface. One measure of the success of the BCM Search Launcher is the number of sites that have adopted the models first developed on the site. The graphic display on the BLAST search from the NCBI web site is one such outgrowth, as is the display of protein domain search results within BLAST search results, and the design of the Biology Workbench application. The logs of usage and comments from users confirm the great utility of this resource.« less
Specialist Bibliographic Databases

PubMed Central

2016-01-01

Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and database vendors, such as EBSCOhost and ProQuest, facilitate advanced searches supported by specialist keyword thesauri. Searches of items through specialist databases are complementary to those through multidisciplinary research platforms, such as PubMed, Web of Science, and Google Scholar. Familiarizing with the functional characteristics of biomedical and nonbiomedical bibliographic search tools is mandatory for researchers, authors, editors, and publishers. The database users are offered updates of the indexed journal lists, abstracts, author profiles, and links to other metadata. Editors and publishers may find particularly useful source selection criteria and apply for coverage of their peer-reviewed journals and grey literature sources. These criteria are aimed at accepting relevant sources with established editorial policies and quality controls. PMID:27134485
Specialist Bibliographic Databases.

PubMed

Gasparyan, Armen Yuri; Yessirkepov, Marlen; Voronov, Alexander A; Trukhachev, Vladimir I; Kostyukova, Elena I; Gerasimov, Alexey N; Kitas, George D

2016-05-01

Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and database vendors, such as EBSCOhost and ProQuest, facilitate advanced searches supported by specialist keyword thesauri. Searches of items through specialist databases are complementary to those through multidisciplinary research platforms, such as PubMed, Web of Science, and Google Scholar. Familiarizing with the functional characteristics of biomedical and nonbiomedical bibliographic search tools is mandatory for researchers, authors, editors, and publishers. The database users are offered updates of the indexed journal lists, abstracts, author profiles, and links to other metadata. Editors and publishers may find particularly useful source selection criteria and apply for coverage of their peer-reviewed journals and grey literature sources. These criteria are aimed at accepting relevant sources with established editorial policies and quality controls.
OntoMate: a text-mining tool aiding curation at the Rat Genome Database

PubMed Central

Liu, Weisong; Laulederkind, Stanley J. F.; Hayman, G. Thomas; Wang, Shur-Jen; Nigam, Rajni; Smith, Jennifer R.; De Pons, Jeff; Dwinell, Melinda R.; Shimoyama, Mary

2015-01-01

The Rat Genome Database (RGD) is the premier repository of rat genomic, genetic and physiologic data. Converting data from free text in the scientific literature to a structured format is one of the main tasks of all model organism databases. RGD spends considerable effort manually curating gene, Quantitative Trait Locus (QTL) and strain information. The rapidly growing volume of biomedical literature and the active research in the biological natural language processing (bioNLP) community have given RGD the impetus to adopt text-mining tools to improve curation efficiency. Recently, RGD has initiated a project to use OntoMate, an ontology-driven, concept-based literature search engine developed at RGD, as a replacement for the PubMed (http://www.ncbi.nlm.nih.gov/pubmed) search engine in the gene curation workflow. OntoMate tags abstracts with gene names, gene mutations, organism name and most of the 16 ontologies/vocabularies used at RGD. All terms/ entities tagged to an abstract are listed with the abstract in the search results. All listed terms are linked both to data entry boxes and a term browser in the curation tool. OntoMate also provides user-activated filters for species, date and other parameters relevant to the literature search. Using the system for literature search and import has streamlined the process compared to using PubMed. The system was built with a scalable and open architecture, including features specifically designed to accelerate the RGD gene curation process. With the use of bioNLP tools, RGD has added more automation to its curation workflow. Database URL: http://rgd.mcw.edu PMID:25619558
Make Mine a Metasearcher, Please!

ERIC Educational Resources Information Center

Repman, Judi; Carlson, Randal D.

2000-01-01

Describes metasearch tools and explains their value in helping library media centers improve students' Web searches. Discusses Boolean queries and the emphasis on speed at the expense of comprehensiveness; and compares four metasearch tools, including the number of search engines consulted, user control, and databases included. (LRW)
BEAUTY-X: enhanced BLAST searches for DNA queries.

PubMed

Worley, K C; Culpepper, P; Wiese, B A; Smith, R F

1998-01-01

BEAUTY (BLAST Enhanced Alignment Utility) is an enhanced version of the BLAST database search tool that facilitates identification of the functions of matched sequences. Three recent improvements to the BEAUTY program described here make the enhanced output (1) available for DNA queries, (2) available for searches of any protein database, and (3) more up-to-date, with periodic updates of the domain information. BEAUTY searches of the NCBI and EMBL non-redundant protein sequence databases are available from the BCM Search Launcher Web pages (http://gc.bcm.tmc. edu:8088/search-launcher/launcher.html). BEAUTY Post-Processing of submitted search results is available using the BCM Search Launcher Batch Client (version 2.6) (ftp://gc.bcm.tmc. edu/pub/software/search-launcher/). Example figures are available at http://dot.bcm.tmc. edu:9331/papers/beautypp.html (kworley,culpep)@bcm.tmc.edu
SW#db: GPU-Accelerated Exact Sequence Similarity Database Search.

PubMed

Korpar, Matija; Šošić, Martin; Blažeka, Dino; Šikić, Mile

2015-01-01

In recent years we have witnessed a growth in sequencing yield, the number of samples sequenced, and as a result-the growth of publicly maintained sequence databases. The increase of data present all around has put high requirements on protein similarity search algorithms with two ever-opposite goals: how to keep the running times acceptable while maintaining a high-enough level of sensitivity. The most time consuming step of similarity search are the local alignments between query and database sequences. This step is usually performed using exact local alignment algorithms such as Smith-Waterman. Due to its quadratic time complexity, alignments of a query to the whole database are usually too slow. Therefore, the majority of the protein similarity search methods prior to doing the exact local alignment apply heuristics to reduce the number of possible candidate sequences in the database. However, there is still a need for the alignment of a query sequence to a reduced database. In this paper we present the SW#db tool and a library for fast exact similarity search. Although its running times, as a standalone tool, are comparable to the running times of BLAST, it is primarily intended to be used for exact local alignment phase in which the database of sequences has already been reduced. It uses both GPU and CPU parallelization and was 4-5 times faster than SSEARCH, 6-25 times faster than CUDASW++ and more than 20 times faster than SSW at the time of writing, using multiple queries on Swiss-prot and Uniref90 databases.
SalmonDB: a bioinformatics resource for Salmo salar and Oncorhynchus mykiss

PubMed Central

Di Génova, Alex; Aravena, Andrés; Zapata, Luis; González, Mauricio; Maass, Alejandro; Iturra, Patricia

2011-01-01

SalmonDB is a new multiorganism database containing EST sequences from Salmo salar, Oncorhynchus mykiss and the whole genome sequence of Danio rerio, Gasterosteus aculeatus, Tetraodon nigroviridis, Oryzias latipes and Takifugu rubripes, built with core components from GMOD project, GOPArc system and the BioMart project. The information provided by this resource includes Gene Ontology terms, metabolic pathways, SNP prediction, CDS prediction, orthologs prediction, several precalculated BLAST searches and domains. It also provides a BLAST server for matching user-provided sequences to any of the databases and an advanced query tool (BioMart) that allows easy browsing of EST databases with user-defined criteria. These tools make SalmonDB database a valuable resource for researchers searching for transcripts and genomic information regarding S. salar and other salmonid species. The database is expected to grow in the near feature, particularly with the S. salar genome sequencing project. Database URL: http://genomicasalmones.dim.uchile.cl/ PMID:22120661

SalmonDB: a bioinformatics resource for Salmo salar and Oncorhynchus mykiss.

PubMed

Di Génova, Alex; Aravena, Andrés; Zapata, Luis; González, Mauricio; Maass, Alejandro; Iturra, Patricia

2011-01-01

SalmonDB is a new multiorganism database containing EST sequences from Salmo salar, Oncorhynchus mykiss and the whole genome sequence of Danio rerio, Gasterosteus aculeatus, Tetraodon nigroviridis, Oryzias latipes and Takifugu rubripes, built with core components from GMOD project, GOPArc system and the BioMart project. The information provided by this resource includes Gene Ontology terms, metabolic pathways, SNP prediction, CDS prediction, orthologs prediction, several precalculated BLAST searches and domains. It also provides a BLAST server for matching user-provided sequences to any of the databases and an advanced query tool (BioMart) that allows easy browsing of EST databases with user-defined criteria. These tools make SalmonDB database a valuable resource for researchers searching for transcripts and genomic information regarding S. salar and other salmonid species. The database is expected to grow in the near feature, particularly with the S. salar genome sequencing project. Database URL: http://genomicasalmones.dim.uchile.cl/
Evaluating the effect of database inflation in proteogenomic search on sensitive and reliable peptide identification.

PubMed

Li, Honglan; Joh, Yoon Sung; Kim, Hyunwoo; Paek, Eunok; Lee, Sang-Won; Hwang, Kyu-Baek

2016-12-22

Proteogenomics is a promising approach for various tasks ranging from gene annotation to cancer research. Databases for proteogenomic searches are often constructed by adding peptide sequences inferred from genomic or transcriptomic evidence to reference protein sequences. Such inflation of databases has potential of identifying novel peptides. However, it also raises concerns on sensitive and reliable peptide identification. Spurious peptides included in target databases may result in underestimated false discovery rate (FDR). On the other hand, inflation of decoy databases could decrease the sensitivity of peptide identification due to the increased number of high-scoring random hits. Although several studies have addressed these issues, widely applicable guidelines for sensitive and reliable proteogenomic search have hardly been available. To systematically evaluate the effect of database inflation in proteogenomic searches, we constructed a variety of real and simulated proteogenomic databases for yeast and human tandem mass spectrometry (MS/MS) data, respectively. Against these databases, we tested two popular database search tools with various approaches to search result validation: the target-decoy search strategy (with and without a refined scoring-metric) and a mixture model-based method. The effect of separate filtering of known and novel peptides was also examined. The results from real and simulated proteogenomic searches confirmed that separate filtering increases the sensitivity and reliability in proteogenomic search. However, no one method consistently identified the largest (or the smallest) number of novel peptides from real proteogenomic searches. We propose to use a set of search result validation methods with separate filtering, for sensitive and reliable identification of peptides in proteogenomic search.
Measurement tools for the diagnosis of nasal septal deviation: a systematic review

PubMed Central

2014-01-01

Objective To perform a systematic review of measurement tools utilized for the diagnosis of nasal septal deviation (NSD). Methods Electronic database searches were performed using MEDLINE (from 1966 to second week of August 2013), EMBASE (from 1966 to second week of August 2013), Web of Science (from 1945 to second week of August 2013) and all Evidence Based Medicine Reviews Files (EBMR); Cochrane Database of Systematic Review (CDSR), Cochrane Central Register of Controlled Trials (CCTR), Cochrane Methodology Register (CMR), Database of Abstracts of Reviews of Effects (DARE), American College of Physicians Journal Club (ACP Journal Club), Health Technology Assessments (HTA), NHS Economic Evaluation Database (NHSEED) till the second quarter of 2013. The search terms used in database searches were ‘nasal septum’, ‘deviation’, ‘diagnosis’, ‘nose deformities’ and ‘nose malformation’. The studies were reviewed using the updated Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. Results Online searches resulted in 23 abstracts after removal of duplicates that resulted from overlap of studies between the electronic databases. An additional 15 abstracts were excluded due to lack of relevance. A total of 8 studies were systematically reviewed. Conclusions Diagnostic modalities such as acoustic rhinometry, rhinomanometry and nasal spectral sound analysis may be useful in identifying NSD in anterior region of the nasal cavity, but these tests in isolation are of limited utility. Compared to anterior rhinoscopy, nasal endoscopy, and imaging the above mentioned index tests lack sensitivity and specificity in identifying the presence, location, and severity of NSD. PMID:24762010
Burn Injury Assessment Tool with Morphable 3D Human Body Models

DTIC Science & Technology

2017-04-21

waist, arms and legs measurements) as stored in most anthropometry databases . To improve on bum area estimations, the bum tool will allow the user to...different algorithm for morphing that relies on searching of an extensive anthropometric database , which is created from thousands of randomly...interpolation methods are required. Develop Patient Database : Patient data entered (name, gender, age, anthropometric measurements), collected (photographic
The EMBL-EBI bioinformatics web and programmatic tools framework.

PubMed

Li, Weizhong; Cowley, Andrew; Uludag, Mahmut; Gur, Tamer; McWilliam, Hamish; Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Lopez, Rodrigo

2015-07-01

Since 2009 the EMBL-EBI Job Dispatcher framework has provided free access to a range of mainstream sequence analysis applications. These include sequence similarity search services (https://www.ebi.ac.uk/Tools/sss/) such as BLAST, FASTA and PSI-Search, multiple sequence alignment tools (https://www.ebi.ac.uk/Tools/msa/) such as Clustal Omega, MAFFT and T-Coffee, and other sequence analysis tools (https://www.ebi.ac.uk/Tools/pfa/) such as InterProScan. Through these services users can search mainstream sequence databases such as ENA, UniProt and Ensembl Genomes, utilising a uniform web interface or systematically through Web Services interfaces (https://www.ebi.ac.uk/Tools/webservices/) using common programming languages, and obtain enriched results with novel visualisations. Integration with EBI Search (https://www.ebi.ac.uk/ebisearch/) and the dbfetch retrieval service (https://www.ebi.ac.uk/Tools/dbfetch/) further expands the usefulness of the framework. New tools and updates such as NCBI BLAST+, InterProScan 5 and PfamScan, new categories such as RNA analysis tools (https://www.ebi.ac.uk/Tools/rna/), new databases such as ENA non-coding, WormBase ParaSite, Pfam and Rfam, and new workflow methods, together with the retirement of depreciated services, ensure that the framework remains relevant to today's biological community. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
PIPI: PTM-Invariant Peptide Identification Using Coding Method.

PubMed

Yu, Fengchao; Li, Ning; Yu, Weichuan

2016-12-02

In computational proteomics, the identification of peptides with an unlimited number of post-translational modification (PTM) types is a challenging task. The computational cost associated with database search increases exponentially with respect to the number of modified amino acids and linearly with respect to the number of potential PTM types at each amino acid. The problem becomes intractable very quickly if we want to enumerate all possible PTM patterns. To address this issue, one group of methods named restricted tools (including Mascot, Comet, and MS-GF+) only allow a small number of PTM types in database search process. Alternatively, the other group of methods named unrestricted tools (including MS-Alignment, ProteinProspector, and MODa) avoids enumerating PTM patterns with an alignment-based approach to localizing and characterizing modified amino acids. However, because of the large search space and PTM localization issue, the sensitivity of these unrestricted tools is low. This paper proposes a novel method named PIPI to achieve PTM-invariant peptide identification. PIPI belongs to the category of unrestricted tools. It first codes peptide sequences into Boolean vectors and codes experimental spectra into real-valued vectors. For each coded spectrum, it then searches the coded sequence database to find the top scored peptide sequences as candidates. After that, PIPI uses dynamic programming to localize and characterize modified amino acids in each candidate. We used simulation experiments and real data experiments to evaluate the performance in comparison with restricted tools (i.e., Mascot, Comet, and MS-GF+) and unrestricted tools (i.e., Mascot with error tolerant search, MS-Alignment, ProteinProspector, and MODa). Comparison with restricted tools shows that PIPI has a close sensitivity and running speed. Comparison with unrestricted tools shows that PIPI has the highest sensitivity except for Mascot with error tolerant search and ProteinProspector. These two tools simplify the task by only considering up to one modified amino acid in each peptide, which results in a higher sensitivity but has difficulty in dealing with multiple modified amino acids. The simulation experiments also show that PIPI has the lowest false discovery proportion, the highest PTM characterization accuracy, and the shortest running time among the unrestricted tools.
Protein structural similarity search by Ramachandran codes

PubMed Central

Lo, Wei-Cheng; Huang, Po-Jung; Chang, Chih-Hung; Lyu, Ping-Chiang

2007-01-01

Background Protein structural data has increased exponentially, such that fast and accurate tools are necessary to access structure similarity search. To improve the search speed, several methods have been designed to reduce three-dimensional protein structures to one-dimensional text strings that are then analyzed by traditional sequence alignment methods; however, the accuracy is usually sacrificed and the speed is still unable to match sequence similarity search tools. Here, we aimed to improve the linear encoding methodology and develop efficient search tools that can rapidly retrieve structural homologs from large protein databases. Results We propose a new linear encoding method, SARST (Structural similarity search Aided by Ramachandran Sequential Transformation). SARST transforms protein structures into text strings through a Ramachandran map organized by nearest-neighbor clustering and uses a regenerative approach to produce substitution matrices. Then, classical sequence similarity search methods can be applied to the structural similarity search. Its accuracy is similar to Combinatorial Extension (CE) and works over 243,000 times faster, searching 34,000 proteins in 0.34 sec with a 3.2-GHz CPU. SARST provides statistically meaningful expectation values to assess the retrieved information. It has been implemented into a web service and a stand-alone Java program that is able to run on many different platforms. Conclusion As a database search method, SARST can rapidly distinguish high from low similarities and efficiently retrieve homologous structures. It demonstrates that the easily accessible linear encoding methodology has the potential to serve as a foundation for efficient protein structural similarity search tools. These search tools are supposed applicable to automated and high-throughput functional annotations or predictions for the ever increasing number of published protein structures in this post-genomic era. PMID:17716377
An application of a relational database system for high-throughput prediction of elemental compositions from accurate mass values.

PubMed

Sakurai, Nozomu; Ara, Takeshi; Kanaya, Shigehiko; Nakamura, Yukiko; Iijima, Yoko; Enomoto, Mitsuo; Motegi, Takeshi; Aoki, Koh; Suzuki, Hideyuki; Shibata, Daisuke

2013-01-15

High-accuracy mass values detected by high-resolution mass spectrometry analysis enable prediction of elemental compositions, and thus are used for metabolite annotations in metabolomic studies. Here, we report an application of a relational database to significantly improve the rate of elemental composition predictions. By searching a database of pre-calculated elemental compositions with fixed kinds and numbers of atoms, the approach eliminates redundant evaluations of the same formula that occur in repeated calculations with other tools. When our approach is compared with HR2, which is one of the fastest tools available, our database search times were at least 109 times shorter than those of HR2. When a solid-state drive (SSD) was applied, the search time was 488 times shorter at 5 ppm mass tolerance and 1833 times at 0.1 ppm. Even if the search by HR2 was performed with 8 threads in a high-spec Windows 7 PC, the database search times were at least 26 and 115 times shorter without and with the SSD. These improvements were enhanced in a low spec Windows XP PC. We constructed a web service 'MFSearcher' to query the database in a RESTful manner. Available for free at http://webs2.kazusa.or.jp/mfsearcher. The web service is implemented in Java, MySQL, Apache and Tomcat, with all major browsers supported. sakurai@kazusa.or.jp Supplementary data are available at Bioinformatics online.
100 Colleges Sign Up with Google to Speed Access to Library Resources

ERIC Educational Resources Information Center

Young, Jeffrey R.

2005-01-01

More than 100 colleges and universities have arranged to give people using the Google Scholar search engine on their campuses more-direct access to library materials. Google Scholar is a free tool that searches scholarly materials on the Web and in academic databases. The new arrangements essentially let Google know which online databases the…
NALDB: nucleic acid ligand database for small molecules targeting nucleic acid

PubMed Central

Kumar Mishra, Subodh; Kumar, Amit

2016-01-01

Nucleic acid ligand database (NALDB) is a unique database that provides detailed information about the experimental data of small molecules that were reported to target several types of nucleic acid structures. NALDB is the first ligand database that contains ligand information for all type of nucleic acid. NALDB contains more than 3500 ligand entries with detailed pharmacokinetic and pharmacodynamic information such as target name, target sequence, ligand 2D/3D structure, SMILES, molecular formula, molecular weight, net-formal charge, AlogP, number of rings, number of hydrogen bond donor and acceptor, potential energy along with their Ki, Kd, IC50 values. All these details at single platform would be helpful for the development and betterment of novel ligands targeting nucleic acids that could serve as a potential target in different diseases including cancers and neurological disorders. With maximum 255 conformers for each ligand entry, our database is a multi-conformer database and can facilitate the virtual screening process. NALDB provides powerful web-based search tools that make database searching efficient and simplified using option for text as well as for structure query. NALDB also provides multi-dimensional advanced search tool which can screen the database molecules on the basis of molecular properties of ligand provided by database users. A 3D structure visualization tool has also been included for 3D structure representation of ligands. NALDB offers an inclusive pharmacological information and the structurally flexible set of small molecules with their three-dimensional conformers that can accelerate the virtual screening and other modeling processes and eventually complement the nucleic acid-based drug discovery research. NALDB can be routinely updated and freely available on bsbe.iiti.ac.in/bsbe/naldb/HOME.php. Database URL: http://bsbe.iiti.ac.in/bsbe/naldb/HOME.php PMID:26896846
Open, Cross Platform Chemistry Application Unifying Structure Manipulation, External Tools, Databases and Visualization

DTIC Science & Technology

2012-11-27

with powerful analysis tools and an informatics approach leveraging best-of-breed NoSQL databases, in order to store, search and retrieve relevant...dictionaries, and JavaScript also has good support. The MongoDB project[15] was chosen as a scalable NoSQL data store for the cheminfor- matics components
Evaluating the Effectiveness of Tools for Online Database Instruction

ERIC Educational Resources Information Center

Mery, Yvonne; DeFrain, Erica; Kline, Elizabeth; Sult, Leslie

2014-01-01

The intent of this study was to evaluate the Guide on the Side (Gots), an online learning tool developed by the University of Arizona Libraries, and a screencast tutorial for teaching information literacy and database searching skills. Ninety undergraduate students were randomly assigned into three groups: group 1 completed a GotS tutorial; group…
Sentence-Based Metadata: An Approach and Tool for Viewing Database Designs.

ERIC Educational Resources Information Center

Boyle, John M.; Gunge, Jakob; Bryden, John; Librowski, Kaz; Hanna, Hsin-Yi

2002-01-01

Describes MARS (Museum Archive Retrieval System), a research tool which enables organizations to exchange digital images and documents by means of a common thesaurus structure, and merge the descriptive data and metadata of their collections. Highlights include theoretical basis; searching the MARS database; and examples in European museums.…
search GenBank: interactive orchestration and ad-hoc choreography of Web services in the exploration of the biomedical resources of the National Center For Biotechnology Information

PubMed Central

2013-01-01

Background Due to the growing number of biomedical entries in data repositories of the National Center for Biotechnology Information (NCBI), it is difficult to collect, manage and process all of these entries in one place by third-party software developers without significant investment in hardware and software infrastructure, its maintenance and administration. Web services allow development of software applications that integrate in one place the functionality and processing logic of distributed software components, without integrating the components themselves and without integrating the resources to which they have access. This is achieved by appropriate orchestration or choreography of available Web services and their shared functions. After the successful application of Web services in the business sector, this technology can now be used to build composite software tools that are oriented towards biomedical data processing. Results We have developed a new tool for efficient and dynamic data exploration in GenBank and other NCBI databases. A dedicated search GenBank system makes use of NCBI Web services and a package of Entrez Programming Utilities (eUtils) in order to provide extended searching capabilities in NCBI data repositories. In search GenBank users can use one of the three exploration paths: simple data searching based on the specified user’s query, advanced data searching based on the specified user’s query, and advanced data exploration with the use of macros. search GenBank orchestrates calls of particular tools available through the NCBI Web service providing requested functionality, while users interactively browse selected records in search GenBank and traverse between NCBI databases using available links. On the other hand, by building macros in the advanced data exploration mode, users create choreographies of eUtils calls, which can lead to the automatic discovery of related data in the specified databases. Conclusions search GenBank extends standard capabilities of the NCBI Entrez search engine in querying biomedical databases. The possibility of creating and saving macros in the search GenBank is a unique feature and has a great potential. The potential will further grow in the future with the increasing density of networks of relationships between data stored in particular databases. search GenBank is available for public use at http://sgb.biotools.pl/. PMID:23452691
search GenBank: interactive orchestration and ad-hoc choreography of Web services in the exploration of the biomedical resources of the National Center For Biotechnology Information.

PubMed

Mrozek, Dariusz; Małysiak-Mrozek, Bożena; Siążnik, Artur

2013-03-01

Due to the growing number of biomedical entries in data repositories of the National Center for Biotechnology Information (NCBI), it is difficult to collect, manage and process all of these entries in one place by third-party software developers without significant investment in hardware and software infrastructure, its maintenance and administration. Web services allow development of software applications that integrate in one place the functionality and processing logic of distributed software components, without integrating the components themselves and without integrating the resources to which they have access. This is achieved by appropriate orchestration or choreography of available Web services and their shared functions. After the successful application of Web services in the business sector, this technology can now be used to build composite software tools that are oriented towards biomedical data processing. We have developed a new tool for efficient and dynamic data exploration in GenBank and other NCBI databases. A dedicated search GenBank system makes use of NCBI Web services and a package of Entrez Programming Utilities (eUtils) in order to provide extended searching capabilities in NCBI data repositories. In search GenBank users can use one of the three exploration paths: simple data searching based on the specified user's query, advanced data searching based on the specified user's query, and advanced data exploration with the use of macros. search GenBank orchestrates calls of particular tools available through the NCBI Web service providing requested functionality, while users interactively browse selected records in search GenBank and traverse between NCBI databases using available links. On the other hand, by building macros in the advanced data exploration mode, users create choreographies of eUtils calls, which can lead to the automatic discovery of related data in the specified databases. search GenBank extends standard capabilities of the NCBI Entrez search engine in querying biomedical databases. The possibility of creating and saving macros in the search GenBank is a unique feature and has a great potential. The potential will further grow in the future with the increasing density of networks of relationships between data stored in particular databases. search GenBank is available for public use at http://sgb.biotools.pl/.
HepSEQ: International Public Health Repository for Hepatitis B

PubMed Central

Gnaneshan, Saravanamuttu; Ijaz, Samreen; Moran, Joanne; Ramsay, Mary; Green, Jonathan

2007-01-01

HepSEQ is a repository for an extensive library of public health and molecular data relating to hepatitis B virus (HBV) infection collected from international sources. It is hosted by the Centre for Infections, Health Protection Agency (HPA), England, United Kingdom. This repository has been developed as a web-enabled, quality-controlled database to act as a tool for surveillance, HBV case management and for research. The web front-end for the database system can be accessed from . The format of the database system allows for comprehensive molecular, clinical and epidemiological data to be deposited into a functional database, to search and manipulate the stored data and to extract and visualize the information on epidemiological, virological, clinical, nucleotide sequence and mutational aspects of HBV infection through web front-end. Specific tools, built into the database, can be utilized to analyse deposited data and provide information on HBV genotype, identify mutations with known clinical significance (e.g. vaccine escape, precore and antiviral-resistant mutations) and carry out sequence homology searches against other deposited strains. Further mechanisms are also in place to allow specific tailored searches of the database to be undertaken. PMID:17130143
Custom Search Engines: Tools & Tips

ERIC Educational Resources Information Center

Notess, Greg R.

2008-01-01

Few have the resources to build a Google or Yahoo! from scratch. Yet anyone can build a search engine based on a subset of the large search engines' databases. Use Google Custom Search Engine or Yahoo! Search Builder or any of the other similar programs to create a vertical search engine targeting sites of interest to users. The basic steps to…
RNA FRABASE 2.0: an advanced web-accessible database with the capacity to search the three-dimensional fragments within RNA structures

PubMed Central

2010-01-01

Background Recent discoveries concerning novel functions of RNA, such as RNA interference, have contributed towards the growing importance of the field. In this respect, a deeper knowledge of complex three-dimensional RNA structures is essential to understand their new biological functions. A number of bioinformatic tools have been proposed to explore two major structural databases (PDB, NDB) in order to analyze various aspects of RNA tertiary structures. One of these tools is RNA FRABASE 1.0, the first web-accessible database with an engine for automatic search of 3D fragments within PDB-derived RNA structures. This search is based upon the user-defined RNA secondary structure pattern. In this paper, we present and discuss RNA FRABASE 2.0. This second version of the system represents a major extension of this tool in terms of providing new data and a wide spectrum of novel functionalities. An intuitionally operated web server platform enables very fast user-tailored search of three-dimensional RNA fragments, their multi-parameter conformational analysis and visualization. Description RNA FRABASE 2.0 has stored information on 1565 PDB-deposited RNA structures, including all NMR models. The RNA FRABASE 2.0 search engine algorithms operate on the database of the RNA sequences and the new library of RNA secondary structures, coded in the dot-bracket format extended to hold multi-stranded structures and to cover residues whose coordinates are missing in the PDB files. The library of RNA secondary structures (and their graphics) is made available. A high level of efficiency of the 3D search has been achieved by introducing novel tools to formulate advanced searching patterns and to screen highly populated tertiary structure elements. RNA FRABASE 2.0 also stores data and conformational parameters in order to provide "on the spot" structural filters to explore the three-dimensional RNA structures. An instant visualization of the 3D RNA structures is provided. RNA FRABASE 2.0 is freely available at http://rnafrabase.cs.put.poznan.pl. Conclusions RNA FRABASE 2.0 provides a novel database and powerful search engine which is equipped with new data and functionalities that are unavailable elsewhere. Our intention is that this advanced version of the RNA FRABASE will be of interest to all researchers working in the RNA field. PMID:20459631
RNA FRABASE 2.0: an advanced web-accessible database with the capacity to search the three-dimensional fragments within RNA structures.

PubMed

Popenda, Mariusz; Szachniuk, Marta; Blazewicz, Marek; Wasik, Szymon; Burke, Edmund K; Blazewicz, Jacek; Adamiak, Ryszard W

2010-05-06

Recent discoveries concerning novel functions of RNA, such as RNA interference, have contributed towards the growing importance of the field. In this respect, a deeper knowledge of complex three-dimensional RNA structures is essential to understand their new biological functions. A number of bioinformatic tools have been proposed to explore two major structural databases (PDB, NDB) in order to analyze various aspects of RNA tertiary structures. One of these tools is RNA FRABASE 1.0, the first web-accessible database with an engine for automatic search of 3D fragments within PDB-derived RNA structures. This search is based upon the user-defined RNA secondary structure pattern. In this paper, we present and discuss RNA FRABASE 2.0. This second version of the system represents a major extension of this tool in terms of providing new data and a wide spectrum of novel functionalities. An intuitionally operated web server platform enables very fast user-tailored search of three-dimensional RNA fragments, their multi-parameter conformational analysis and visualization. RNA FRABASE 2.0 has stored information on 1565 PDB-deposited RNA structures, including all NMR models. The RNA FRABASE 2.0 search engine algorithms operate on the database of the RNA sequences and the new library of RNA secondary structures, coded in the dot-bracket format extended to hold multi-stranded structures and to cover residues whose coordinates are missing in the PDB files. The library of RNA secondary structures (and their graphics) is made available. A high level of efficiency of the 3D search has been achieved by introducing novel tools to formulate advanced searching patterns and to screen highly populated tertiary structure elements. RNA FRABASE 2.0 also stores data and conformational parameters in order to provide "on the spot" structural filters to explore the three-dimensional RNA structures. An instant visualization of the 3D RNA structures is provided. RNA FRABASE 2.0 is freely available at http://rnafrabase.cs.put.poznan.pl. RNA FRABASE 2.0 provides a novel database and powerful search engine which is equipped with new data and functionalities that are unavailable elsewhere. Our intention is that this advanced version of the RNA FRABASE will be of interest to all researchers working in the RNA field.
Mining Hidden Gems Beneath the Surface: A Look At the Invisible Web.

ERIC Educational Resources Information Center

Carlson, Randal D.; Repman, Judi

2002-01-01

Describes resources for researchers called the Invisible Web that are hidden from the usual search engines and other tools and contrasts them with those resources available on the surface Web. Identifies specialized search tools, databases, and strategies that can be used to locate credible in-depth information. (Author/LRW)

Tools to Ease Your Internet Adventures: Part I.

ERIC Educational Resources Information Center

Descy, Don E.

1993-01-01

This first of a two-part series highlights three tools that improve accessibility to Internet resources: (1) Alex, a database that accesses files in FTP (file transfer protocol) sites; (2) Archie, software that searches for file names with a user's search term; and (3) Gopher, a menu-driven program to access Internet sites. (LRW)
Exploring FlyBase Data Using QuickSearch.

PubMed

Marygold, Steven J; Antonazzo, Giulia; Attrill, Helen; Costa, Marta; Crosby, Madeline A; Dos Santos, Gilberto; Goodman, Joshua L; Gramates, L Sian; Matthews, Beverley B; Rey, Alix J; Thurmond, Jim

2016-12-08

FlyBase (flybase.org) is the primary online database of genetic, genomic, and functional information about Drosophila species, with a major focus on the model organism Drosophila melanogaster. The long and rich history of Drosophila research, combined with recent surges in genomic-scale and high-throughput technologies, mean that FlyBase now houses a huge quantity of data. Researchers need to be able to rapidly and intuitively query these data, and the QuickSearch tool has been designed to meet these needs. This tool is conveniently located on the FlyBase homepage and is organized into a series of simple tabbed interfaces that cover the major data and annotation classes within the database. This unit describes the functionality of all aspects of the QuickSearch tool. With this knowledge, FlyBase users will be equipped to take full advantage of all QuickSearch features and thereby gain improved access to data relevant to their research. © 2016 by John Wiley & Sons, Inc. Copyright © 2016 John Wiley & Sons, Inc.
An Interactive Iterative Method for Electronic Searching of Large Literature Databases

ERIC Educational Resources Information Center

Hernandez, Marco A.

2013-01-01

PubMed® is an on-line literature database hosted by the U.S. National Library of Medicine. Containing over 21 million citations for biomedical literature--both abstracts and full text--in the areas of the life sciences, behavioral studies, chemistry, and bioengineering, PubMed® represents an important tool for researchers. PubMed® searches return…
Transterm—extended search facilities and improved integration with other databases

PubMed Central

Jacobs, Grant H.; Stockwell, Peter A.; Tate, Warren P.; Brown, Chris M.

2006-01-01

Transterm has now been publicly available for >10 years. Major changes have been made since its last description in this database issue in 2002. The current database provides data for key regions of mRNA sequences, a curated database of mRNA motifs and tools to allow users to investigate their own motifs or mRNA sequences. The key mRNA regions database is derived computationally from Genbank. It contains 3′ and 5′ flanking regions, the initiation and termination signal context and coding sequence for annotated CDS features from Genbank and RefSeq. The database is non-redundant, enabling summary files and statistics to be prepared for each species. Advances include providing extended search facilities, the database may now be searched by BLAST in addition to regular expressions (patterns) allowing users to search for motifs such as known miRNA sequences, and the inclusion of RefSeq data. The database contains >40 motifs or structural patterns important for translational control. In this release, patterns from UTRsite and Rfam are also incorporated with cross-referencing. Users may search their sequence data with Transterm or user-defined patterns. The system is accessible at . PMID:16381889
Peptide Identification by Database Search of Mixture Tandem Mass Spectra*

PubMed Central

Wang, Jian; Bourne, Philip E.; Bandeira, Nuno

2011-01-01

In high-throughput proteomics the development of computational methods and novel experimental strategies often rely on each other. In certain areas, mass spectrometry methods for data acquisition are ahead of computational methods to interpret the resulting tandem mass spectra. Particularly, although there are numerous situations in which a mixture tandem mass spectrum can contain fragment ions from two or more peptides, nearly all database search tools still make the assumption that each tandem mass spectrum comes from one peptide. Common examples include mixture spectra from co-eluting peptides in complex samples, spectra generated from data-independent acquisition methods, and spectra from peptides with complex post-translational modifications. We propose a new database search tool (MixDB) that is able to identify mixture tandem mass spectra from more than one peptide. We show that peptides can be reliably identified with up to 95% accuracy from mixture spectra while considering only a 0.01% of all possible peptide pairs (four orders of magnitude speedup). Comparison with current database search methods indicates that our approach has better or comparable sensitivity and precision at identifying single-peptide spectra while simultaneously being able to identify 38% more peptides from mixture spectra at significantly higher precision. PMID:21862760
The Online Bioinformatics Resources Collection at the University of Pittsburgh Health Sciences Library System--a one-stop gateway to online bioinformatics databases and software tools.

PubMed

Chen, Yi-Bu; Chattopadhyay, Ansuman; Bergen, Phillip; Gadd, Cynthia; Tannery, Nancy

2007-01-01

To bridge the gap between the rising information needs of biological and medical researchers and the rapidly growing number of online bioinformatics resources, we have created the Online Bioinformatics Resources Collection (OBRC) at the Health Sciences Library System (HSLS) at the University of Pittsburgh. The OBRC, containing 1542 major online bioinformatics databases and software tools, was constructed using the HSLS content management system built on the Zope Web application server. To enhance the output of search results, we further implemented the Vivísimo Clustering Engine, which automatically organizes the search results into categories created dynamically based on the textual information of the retrieved records. As the largest online collection of its kind and the only one with advanced search results clustering, OBRC is aimed at becoming a one-stop guided information gateway to the major bioinformatics databases and software tools on the Web. OBRC is available at the University of Pittsburgh's HSLS Web site (http://www.hsls.pitt.edu/guides/genetics/obrc).
The CTBTO Link to the database of the International Seismological Centre (ISC)

NASA Astrophysics Data System (ADS)

Bondar, I.; Storchak, D. A.; Dando, B.; Harris, J.; Di Giacomo, D.

2011-12-01

The CTBTO Link to the database of the International Seismological Centre (ISC) is a project to provide access to seismological data sets maintained by the ISC using specially designed interactive tools. The Link is open to National Data Centres and to the CTBTO. By means of graphical interfaces and database queries tailored to the needs of the monitoring community, the users are given access to a multitude of products. These include the ISC and ISS bulletins, covering the seismicity of the Earth since 1904; nuclear and chemical explosions; the EHB bulletin; the IASPEI Reference Event list (ground truth database); and the IDC Reviewed Event Bulletin. The searches are divided into three main categories: The Area Based Search (a spatio-temporal search based on the ISC Bulletin), the REB search (a spatio-temporal search based on specific events in the REB) and the IMS Station Based Search (a search for historical patterns in the reports of seismic stations close to a particular IMS seismic station). The outputs are HTML based web-pages with a simplified version of the ISC Bulletin showing the most relevant parameters with access to ISC, GT, EHB and REB Bulletins in IMS1.0 format for single or multiple events. The CTBTO Link offers a tool to view REB events in context within the historical seismicity, look at observations reported by non-IMS networks, and investigate station histories and residual patterns for stations registered in the International Seismographic Station Registry.
TogoTable: cross-database annotation system using the Resource Description Framework (RDF) data model.

PubMed

Kawano, Shin; Watanabe, Tsutomu; Mizuguchi, Sohei; Araki, Norie; Katayama, Toshiaki; Yamaguchi, Atsuko

2014-07-01

TogoTable (http://togotable.dbcls.jp/) is a web tool that adds user-specified annotations to a table that a user uploads. Annotations are drawn from several biological databases that use the Resource Description Framework (RDF) data model. TogoTable uses database identifiers (IDs) in the table as a query key for searching. RDF data, which form a network called Linked Open Data (LOD), can be searched from SPARQL endpoints using a SPARQL query language. Because TogoTable uses RDF, it can integrate annotations from not only the reference database to which the IDs originally belong, but also externally linked databases via the LOD network. For example, annotations in the Protein Data Bank can be retrieved using GeneID through links provided by the UniProt RDF. Because RDF has been standardized by the World Wide Web Consortium, any database with annotations based on the RDF data model can be easily incorporated into this tool. We believe that TogoTable is a valuable Web tool, particularly for experimental biologists who need to process huge amounts of data such as high-throughput experimental output. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
A Web-based Tool for SDSS and 2MASS Database Searches

NASA Astrophysics Data System (ADS)

Hendrickson, M. A.; Uomoto, A.; Golimowski, D. A.

We have developed a web site using HTML, Php, Python, and MySQL that extracts, processes, and displays data from the Sloan Digital Sky Survey (SDSS) and the Two-Micron All-Sky Survey (2MASS). The goal is to locate brown dwarf candidates in the SDSS database by looking at color cuts; however, this site could also be useful for targeted searches of other databases as well. MySQL databases are created from broad searches of SDSS and 2MASS data. Broad queries on the SDSS and 2MASS database servers are run weekly so that observers have the most up-to-date information from which to select candidates for observation. Observers can look at detailed information about specific objects including finding charts, images, and available spectra. In addition, updates from previous observations can be added by any collaborators; this format makes observational collaboration simple. Observers can also restrict the database search, just before or during an observing run, to select objects of special interest.
The Chinchilla Research Resource Database: resource for an otolaryngology disease model

PubMed Central

Shimoyama, Mary; Smith, Jennifer R.; De Pons, Jeff; Tutaj, Marek; Khampang, Pawjai; Hong, Wenzhou; Erbe, Christy B.; Ehrlich, Garth D.; Bakaletz, Lauren O.; Kerschner, Joseph E.

2016-01-01

The long-tailed chinchilla (Chinchilla lanigera) is an established animal model for diseases of the inner and middle ear, among others. In particular, chinchilla is commonly used to study diseases involving viral and bacterial pathogens and polymicrobial infections of the upper respiratory tract and the ear, such as otitis media. The value of the chinchilla as a model for human diseases prompted the sequencing of its genome in 2012 and the more recent development of the Chinchilla Research Resource Database (http://crrd.mcw.edu) to provide investigators with easy access to relevant datasets and software tools to enhance their research. The Chinchilla Research Resource Database contains a complete catalog of genes for chinchilla and, for comparative purposes, human. Chinchilla genes can be viewed in the context of their genomic scaffold positions using the JBrowse genome browser. In contrast to the corresponding records at NCBI, individual gene reports at CRRD include functional annotations for Disease, Gene Ontology (GO) Biological Process, GO Molecular Function, GO Cellular Component and Pathway assigned to chinchilla genes based on annotations from the corresponding human orthologs. Data can be retrieved via keyword and gene-specific searches. Lists of genes with similar functional attributes can be assembled by leveraging the hierarchical structure of the Disease, GO and Pathway vocabularies through the Ontology Search and Browser tool. Such lists can then be further analyzed for commonalities using the Gene Annotator (GA) Tool. All data in the Chinchilla Research Resource Database is freely accessible and downloadable via the CRRD FTP site or using the download functions available in the search and analysis tools. The Chinchilla Research Resource Database is a rich resource for researchers using, or considering the use of, chinchilla as a model for human disease. Database URL: http://crrd.mcw.edu PMID:27173523
Simrank: Rapid and sensitive general-purpose k-mer search tool

PubMed Central

2011-01-01

Background Terabyte-scale collections of string-encoded data are expected from consortia efforts such as the Human Microbiome Project http://nihroadmap.nih.gov/hmp. Intra- and inter-project data similarity searches are enabled by rapid k-mer matching strategies. Software applications for sequence database partitioning, guide tree estimation, molecular classification and alignment acceleration have benefited from embedded k-mer searches as sub-routines. However, a rapid, general-purpose, open-source, flexible, stand-alone k-mer tool has not been available. Results Here we present a stand-alone utility, Simrank, which allows users to rapidly identify database strings the most similar to query strings. Performance testing of Simrank and related tools against DNA, RNA, protein and human-languages found Simrank 10X to 928X faster depending on the dataset. Conclusions Simrank provides molecular ecologists with a high-throughput, open source choice for comparing large sequence sets to find similarity. PMID:21524302
Tachyon search speeds up retrieval of similar sequences by several orders of magnitude

PubMed Central

Tan, Joshua; Kuchibhatla, Durga; Sirota, Fernanda L.; Sherman, Westley A.; Gattermayer, Tobias; Kwoh, Chia Yee; Eisenhaber, Frank; Schneider, Georg; Maurer-Stroh, Sebastian

2012-01-01

Summary: The usage of current sequence search tools becomes increasingly slower as databases of protein sequences continue to grow exponentially. Tachyon, a new algorithm that identifies closely related protein sequences ~200 times faster than standard BLAST, circumvents this limitation with a reduced database and oligopeptide matching heuristic. Availability and implementation: The tool is publicly accessible as a webserver at http://tachyon.bii.a-star.edu.sg and can also be accessed programmatically through SOAP. Contact: sebastianms@bii.a-star.edu.sg Supplementary information: Supplementary data are available at the Bioinformatics online. PMID:22531216
muBLASTP: database-indexed protein sequence search on multicore CPUs.

PubMed

Zhang, Jing; Misra, Sanchit; Wang, Hao; Feng, Wu-Chun

2016-11-04

The Basic Local Alignment Search Tool (BLAST) is a fundamental program in the life sciences that searches databases for sequences that are most similar to a query sequence. Currently, the BLAST algorithm utilizes a query-indexed approach. Although many approaches suggest that sequence search with a database index can achieve much higher throughput (e.g., BLAT, SSAHA, and CAFE), they cannot deliver the same level of sensitivity as the query-indexed BLAST, i.e., NCBI BLAST, or they can only support nucleotide sequence search, e.g., MegaBLAST. Due to different challenges and characteristics between query indexing and database indexing, the existing techniques for query-indexed search cannot be used into database indexed search. muBLASTP, a novel database-indexed BLAST for protein sequence search, delivers identical hits returned to NCBI BLAST. On Intel Haswell multicore CPUs, for a single query, the single-threaded muBLASTP achieves up to a 4.41-fold speedup for alignment stages, and up to a 1.75-fold end-to-end speedup over single-threaded NCBI BLAST. For a batch of queries, the multithreaded muBLASTP achieves up to a 5.7-fold speedups for alignment stages, and up to a 4.56-fold end-to-end speedup over multithreaded NCBI BLAST. With a newly designed index structure for protein database and associated optimizations in BLASTP algorithm, we re-factored BLASTP algorithm for modern multicore processors that achieves much higher throughput with acceptable memory footprint for the database index.
DoOPSearch: a web-based tool for finding and analysing common conserved motifs in the promoter regions of different chordate and plant genes

PubMed Central

Sebestyén, Endre; Nagy, Tibor; Suhai, Sándor; Barta, Endre

2009-01-01

Background The comparative genomic analysis of a large number of orthologous promoter regions of the chordate and plant genes from the DoOP databases shows thousands of conserved motifs. Most of these motifs differ from any known transcription factor binding site (TFBS). To identify common conserved motifs, we need a specific tool to be able to search amongst them. Since conserved motifs from the DoOP databases are linked to genes, the result of such a search can give a list of genes that are potentially regulated by the same transcription factor(s). Results We have developed a new tool called DoOPSearch for the analysis of the conserved motifs in the promoter regions of chordate or plant genes. We used the orthologous promoters of the DoOP database to extract thousands of conserved motifs from different taxonomic groups. The advantage of this approach is that different sets of conserved motifs might be found depending on how broad the taxonomic coverage of the underlying orthologous promoter sequence collection is (consider e.g. primates vs. mammals or Brassicaceae vs. Viridiplantae). The DoOPSearch tool allows the users to search these motif collections or the promoter regions of DoOP with user supplied query sequences or any of the conserved motifs from the DoOP database. To find overrepresented gene ontologies, the gene lists obtained can be analysed further using a modified version of the GeneMerge program. Conclusion We present here a comparative genomics based promoter analysis tool. Our system is based on a unique collection of conserved promoter motifs characteristic of different taxonomic groups. We offer both a command line and a web-based tool for searching in these motif collections using user specified queries. These can be either short promoter sequences or consensus sequences of known transcription factor binding sites. The GeneMerge analysis of the search results allows the user to identify statistically overrepresented Gene Ontology terms that might provide a clue on the function of the motifs and genes. PMID:19534755
CERES Search and Subset Tool

Atmospheric Science Data Center

2016-06-24

... data granules using a high resolution spatial metadata database and directly accessing the archived data granules. Subset results are ... data granules using a high resolution spatial metadata database and directly accessing the archived data granules. Subset results are ...
The PathoYeastract database: an information system for the analysis of gene and genomic transcription regulation in pathogenic yeasts.

PubMed

Monteiro, Pedro Tiago; Pais, Pedro; Costa, Catarina; Manna, Sauvagya; Sá-Correia, Isabel; Teixeira, Miguel Cacho

2017-01-04

We present the PATHOgenic YEAst Search for Transcriptional Regulators And Consensus Tracking (PathoYeastract - http://pathoyeastract.org) database, a tool for the analysis and prediction of transcription regulatory associations at the gene and genomic levels in the pathogenic yeasts Candida albicans and C. glabrata Upon data retrieval from hundreds of publications, followed by curation, the database currently includes 28 000 unique documented regulatory associations between transcription factors (TF) and target genes and 107 DNA binding sites, considering 134 TFs in both species. Following the structure used for the YEASTRACT database, PathoYeastract makes available bioinformatics tools that enable the user to exploit the existing information to predict the TFs involved in the regulation of a gene or genome-wide transcriptional response, while ranking those TFs in order of their relative importance. Each search can be filtered based on the selection of specific environmental conditions, experimental evidence or positive/negative regulatory effect. Promoter analysis tools and interactive visualization tools for the representation of TF regulatory networks are also provided. The PathoYeastract database further provides simple tools for the prediction of gene and genomic regulation based on orthologous regulatory associations described for other yeast species, a comparative genomics setup for the study of cross-species evolution of regulatory networks. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
blastjs: a BLAST+ wrapper for Node.js.

PubMed

Page, Martin; MacLean, Dan; Schudoma, Christian

2016-02-27

To cope with the ever-increasing amount of sequence data generated in the field of genomics, the demand for efficient and fast database searches that drive functional and structural annotation in both large- and small-scale genome projects is on the rise. The tools of the BLAST+ suite are the most widely employed bioinformatic method for these database searches. Recent trends in bioinformatics application development show an increasing number of JavaScript apps that are based on modern frameworks such as Node.js. Until now, there is no way of using database searches with the BLAST+ suite from a Node.js codebase. We developed blastjs, a Node.js library that wraps the search tools of the BLAST+ suite and thus allows to easily add significant functionality to any Node.js-based application. blastjs is a library that allows the incorporation of BLAST+ functionality into bioinformatics applications based on JavaScript and Node.js. The library was designed to be as user-friendly as possible and therefore requires only a minimal amount of code in the client application. The library is freely available under the MIT license at https://github.com/teammaclean/blastjs.
ScienceDirect through SciVerse: a new way to approach Elsevier.

PubMed

Bengtson, Jason

2011-01-01

SciVerse is the new combined portal from Elsevier that services their ScienceDirect collection, SciTopics, and their Scopus database. Using SciVerse to access ScienceDirect is the specific focus of this review. Along with advanced keyword searching and citation searching options, SciVerse also incorporates a very useful image search feature. The aim seems to be not only to create an interface that provides broad functionality on par with other database search tools that many searchers use regularly but also to create an open platform that could be changed to respond effectively to the needs of customers.
SpolSimilaritySearch - A web tool to compare and search similarities between spoligotypes of Mycobacterium tuberculosis complex.

PubMed

Couvin, David; Zozio, Thierry; Rastogi, Nalin

2017-07-01

Spoligotyping is one of the most commonly used polymerase chain reaction (PCR)-based methods for identification and study of genetic diversity of Mycobacterium tuberculosis complex (MTBC). Despite its known limitations if used alone, the methodology is particularly useful when used in combination with other methods such as mycobacterial interspersed repetitive units - variable number of tandem DNA repeats (MIRU-VNTRs). At a worldwide scale, spoligotyping has allowed identification of information on 103,856 MTBC isolates (corresponding to 98049 clustered strains plus 5807 unique isolates from 169 countries of patient origin) contained within the SITVIT2 proprietary database of the Institut Pasteur de la Guadeloupe. The SpolSimilaritySearch web-tool described herein (available at: http://www.pasteur-guadeloupe.fr:8081/SpolSimilaritySearch) incorporates a similarity search algorithm allowing users to get a complete overview of similar spoligotype patterns (with information on presence or absence of 43 spacers) in the aforementioned worldwide database. This tool allows one to analyze spread and evolutionary patterns of MTBC by comparing similar spoligotype patterns, to distinguish between widespread, specific and/or confined patterns, as well as to pinpoint patterns with large deleted blocks, which play an intriguing role in the genetic epidemiology of M. tuberculosis. Finally, the SpolSimilaritySearch tool also provides with the country distribution patterns for each queried spoligotype. Copyright © 2017 Elsevier Ltd. All rights reserved.
The thyrotropin receptor mutation database: update 2003.

PubMed

Führer, Dagmar; Lachmund, Peter; Nebel, Istvan-Tibor; Paschke, Ralf

2003-12-01

In 1999 we have created a TSHR mutation database compiling TSHR mutations with their basic characteristics and associated clinical conditions (www.uni-leipzig.de/innere/tshr). Since then, more than 2887 users from 36 countries have logged into the TSHR mutation database and have contributed several valuable suggestions for further improvement of the database. We now present an updated and extended version of the TSHR database to which several novel features have been introduced: 1. detailed functional characteristics on all 65 mutations (43 activating and 22 inactivating mutations) reported to date, 2. 40 pedigrees with detailed information on molecular aspects, clinical courses and treatment options in patients with gain-of-function and loss-of-function germline TSHR mutations, 3. a first compilation of site-directed mutagenesis studies, 4. references with Medline links, 5. a user friendly search tool for specific database searches, user-specific database output and 6. an administrator tool for the submission of novel TSHR mutations. The TSHR mutation database is installed as one of the locus specific HUGO mutation databases. It is listed under index TSHR 603372 (http://ariel.ucs.unimelb.edu.au/~cotton/glsdbq.htm) and can be accessed via www.uni-leipzig.de/innere/tshr.

MetalS(3), a database-mining tool for the identification of structurally similar metal sites.

PubMed

Valasatava, Yana; Rosato, Antonio; Cavallaro, Gabriele; Andreini, Claudia

2014-08-01

We have developed a database search tool to identify metal sites having structural similarity to a query metal site structure within the MetalPDB database of minimal functional sites (MFSs) contained in metal-binding biological macromolecules. MFSs describe the local environment around the metal(s) independently of the larger context of the macromolecular structure. Such a local environment has a determinant role in tuning the chemical reactivity of the metal, ultimately contributing to the functional properties of the whole system. The database search tool, which we called MetalS(3) (Metal Sites Similarity Search), can be accessed through a Web interface at http://metalweb.cerm.unifi.it/tools/metals3/ . MetalS(3) uses a suitably adapted version of an algorithm that we previously developed to systematically compare the structure of the query metal site with each MFS in MetalPDB. For each MFS, the best superposition is kept. All these superpositions are then ranked according to the MetalS(3) scoring function and are presented to the user in tabular form. The user can interact with the output Web page to visualize the structural alignment or the sequence alignment derived from it. Options to filter the results are available. Test calculations show that the MetalS(3) output correlates well with expectations from protein homology considerations. Furthermore, we describe some usage scenarios that highlight the usefulness of MetalS(3) to obtain mechanistic and functional hints regardless of homology.
Explore a Career in Health Sciences Information

MedlinePlus

... tools that range from traditional print journals to electronic databases and the latest mobile devices, health sciences ... an expert search of the literature. connecting licensed electronic resources and decision tools into a patient's electronic ...
Decision making in family medicine: randomized trial of the effects of the InfoClinique and Trip database search engines.

PubMed

Labrecque, Michel; Ratté, Stéphane; Frémont, Pierre; Cauchon, Michel; Ouellet, Jérôme; Hogg, William; McGowan, Jessie; Gagnon, Marie-Pierre; Njoya, Merlin; Légaré, France

2013-10-01

To compare the ability of users of 2 medical search engines, InfoClinique and the Trip database, to provide correct answers to clinical questions and to explore the perceived effects of the tools on the clinical decision-making process. Randomized trial. Three family medicine units of the family medicine program of the Faculty of Medicine at Laval University in Quebec city, Que. Fifteen second-year family medicine residents. Residents generated 30 structured questions about therapy or preventive treatment (2 questions per resident) based on clinical encounters. Using an Internet platform designed for the trial, each resident answered 20 of these questions (their own 2, plus 18 of the questions formulated by other residents, selected randomly) before and after searching for information with 1 of the 2 search engines. For each question, 5 residents were randomly assigned to begin their search with InfoClinique and 5 with the Trip database. The ability of residents to provide correct answers to clinical questions using the search engines, as determined by third-party evaluation. After answering each question, participants completed a questionnaire to assess their perception of the engine's effect on the decision-making process in clinical practice. Of 300 possible pairs of answers (1 answer before and 1 after the initial search), 254 (85%) were produced by 14 residents. Of these, 132 (52%) and 122 (48%) pairs of answers concerned questions that had been assigned an initial search with InfoClinique and the Trip database, respectively. Both engines produced an important and similar absolute increase in the proportion of correct answers after searching (26% to 62% for InfoClinique, for an increase of 36%; 24% to 63% for the Trip database, for an increase of 39%; P = .68). For all 30 clinical questions, at least 1 resident produced the correct answer after searching with either search engine. The mean (SD) time of the initial search for each question was 23.5 (7.6) minutes with InfoClinique and 22.3 (7.8) minutes with the Trip database (P = .30). Participants' perceptions of each engine's effect on the decision-making process were very positive and similar for both search engines. Family medicine residents' ability to provide correct answers to clinical questions increased dramatically and similarly with the use of both InfoClinique and the Trip database. These tools have strong potential to increase the quality of medical care.
GMODWeb: a web framework for the generic model organism database

PubMed Central

O'Connor, Brian D; Day, Allen; Cain, Scott; Arnaiz, Olivier; Sperling, Linda; Stein, Lincoln D

2008-01-01

The Generic Model Organism Database (GMOD) initiative provides species-agnostic data models and software tools for representing curated model organism data. Here we describe GMODWeb, a GMOD project designed to speed the development of model organism database (MOD) websites. Sites created with GMODWeb provide integration with other GMOD tools and allow users to browse and search through a variety of data types. GMODWeb was built using the open source Turnkey web framework and is available from . PMID:18570664
Multimedia explorer: image database, image proxy-server and search-engine.

PubMed Central

Frankewitsch, T.; Prokosch, U.

1999-01-01

Multimedia plays a major role in medicine. Databases containing images, movies or other types of multimedia objects are increasing in number, especially on the WWW. However, no good retrieval mechanism or search engine currently exists to efficiently track down such multimedia sources in the vast of information provided by the WWW. Secondly, the tools for searching databases are usually not adapted to the properties of images. HTML pages do not allow complex searches. Therefore establishing a more comfortable retrieval involves the use of a higher programming level like JAVA. With this platform independent language it is possible to create extensions to commonly used web browsers. These applets offer a graphical user interface for high level navigation. We implemented a database using JAVA objects as the primary storage container which are then stored by a JAVA controlled ORACLE8 database. Navigation depends on a structured vocabulary enhanced by a semantic network. With this approach multimedia objects can be encapsulated within a logical module for quick data retrieval. PMID:10566463
Multimedia explorer: image database, image proxy-server and search-engine.

PubMed

Frankewitsch, T; Prokosch, U

1999-01-01

Multimedia plays a major role in medicine. Databases containing images, movies or other types of multimedia objects are increasing in number, especially on the WWW. However, no good retrieval mechanism or search engine currently exists to efficiently track down such multimedia sources in the vast of information provided by the WWW. Secondly, the tools for searching databases are usually not adapted to the properties of images. HTML pages do not allow complex searches. Therefore establishing a more comfortable retrieval involves the use of a higher programming level like JAVA. With this platform independent language it is possible to create extensions to commonly used web browsers. These applets offer a graphical user interface for high level navigation. We implemented a database using JAVA objects as the primary storage container which are then stored by a JAVA controlled ORACLE8 database. Navigation depends on a structured vocabulary enhanced by a semantic network. With this approach multimedia objects can be encapsulated within a logical module for quick data retrieval.
NALDB: nucleic acid ligand database for small molecules targeting nucleic acid.

PubMed

Kumar Mishra, Subodh; Kumar, Amit

2016-01-01

Nucleic acid ligand database (NALDB) is a unique database that provides detailed information about the experimental data of small molecules that were reported to target several types of nucleic acid structures. NALDB is the first ligand database that contains ligand information for all type of nucleic acid. NALDB contains more than 3500 ligand entries with detailed pharmacokinetic and pharmacodynamic information such as target name, target sequence, ligand 2D/3D structure, SMILES, molecular formula, molecular weight, net-formal charge, AlogP, number of rings, number of hydrogen bond donor and acceptor, potential energy along with their Ki, Kd, IC50 values. All these details at single platform would be helpful for the development and betterment of novel ligands targeting nucleic acids that could serve as a potential target in different diseases including cancers and neurological disorders. With maximum 255 conformers for each ligand entry, our database is a multi-conformer database and can facilitate the virtual screening process. NALDB provides powerful web-based search tools that make database searching efficient and simplified using option for text as well as for structure query. NALDB also provides multi-dimensional advanced search tool which can screen the database molecules on the basis of molecular properties of ligand provided by database users. A 3D structure visualization tool has also been included for 3D structure representation of ligands. NALDB offers an inclusive pharmacological information and the structurally flexible set of small molecules with their three-dimensional conformers that can accelerate the virtual screening and other modeling processes and eventually complement the nucleic acid-based drug discovery research. NALDB can be routinely updated and freely available on bsbe.iiti.ac.in/bsbe/naldb/HOME.php. Database URL: http://bsbe.iiti.ac.in/bsbe/naldb/HOME.php. © The Author(s) 2016. Published by Oxford University Press.
PIPEMicroDB: microsatellite database and primer generation tool for pigeonpea genome

PubMed Central

Sarika; Arora, Vasu; Iquebal, M. A.; Rai, Anil; Kumar, Dinesh

2013-01-01

Molecular markers play a significant role for crop improvement in desirable characteristics, such as high yield, resistance to disease and others that will benefit the crop in long term. Pigeonpea (Cajanus cajan L.) is the recently sequenced legume by global consortium led by ICRISAT (Hyderabad, India) and been analysed for gene prediction, synteny maps, markers, etc. We present PIgeonPEa Microsatellite DataBase (PIPEMicroDB) with an automated primer designing tool for pigeonpea genome, based on chromosome wise as well as location wise search of primers. Total of 123 387 Short Tandem Repeats (STRs) were extracted from pigeonpea genome, available in public domain using MIcroSAtellite tool (MISA). The database is an online relational database based on ‘three-tier architecture’ that catalogues information of microsatellites in MySQL and user-friendly interface is developed using PHP. Search for STRs may be customized by limiting their location on chromosome as well as number of markers in that range. This is a novel approach and is not been implemented in any of the existing marker database. This database has been further appended with Primer3 for primer designing of selected markers with left and right flankings of size up to 500 bp. This will enable researchers to select markers of choice at desired interval over the chromosome. Furthermore, one can use individual STRs of a targeted region over chromosome to narrow down location of gene of interest or linked Quantitative Trait Loci (QTLs). Although it is an in silico approach, markers’ search based on characteristics and location of STRs is expected to be beneficial for researchers. Database URL: http://cabindb.iasri.res.in/pigeonpea/ PMID:23396298
PIPEMicroDB: microsatellite database and primer generation tool for pigeonpea genome.

PubMed

Sarika; Arora, Vasu; Iquebal, M A; Rai, Anil; Kumar, Dinesh

2013-01-01

Molecular markers play a significant role for crop improvement in desirable characteristics, such as high yield, resistance to disease and others that will benefit the crop in long term. Pigeonpea (Cajanus cajan L.) is the recently sequenced legume by global consortium led by ICRISAT (Hyderabad, India) and been analysed for gene prediction, synteny maps, markers, etc. We present PIgeonPEa Microsatellite DataBase (PIPEMicroDB) with an automated primer designing tool for pigeonpea genome, based on chromosome wise as well as location wise search of primers. Total of 123 387 Short Tandem Repeats (STRs) were extracted from pigeonpea genome, available in public domain using MIcroSAtellite tool (MISA). The database is an online relational database based on 'three-tier architecture' that catalogues information of microsatellites in MySQL and user-friendly interface is developed using PHP. Search for STRs may be customized by limiting their location on chromosome as well as number of markers in that range. This is a novel approach and is not been implemented in any of the existing marker database. This database has been further appended with Primer3 for primer designing of selected markers with left and right flankings of size up to 500 bp. This will enable researchers to select markers of choice at desired interval over the chromosome. Furthermore, one can use individual STRs of a targeted region over chromosome to narrow down location of gene of interest or linked Quantitative Trait Loci (QTLs). Although it is an in silico approach, markers' search based on characteristics and location of STRs is expected to be beneficial for researchers. Database URL: http://cabindb.iasri.res.in/pigeonpea/
Tandem Mass Spectrum Sequencing: An Alternative to Database Search Engines in Shotgun Proteomics.

PubMed

Muth, Thilo; Rapp, Erdmann; Berven, Frode S; Barsnes, Harald; Vaudel, Marc

2016-01-01

Protein identification via database searches has become the gold standard in mass spectrometry based shotgun proteomics. However, as the quality of tandem mass spectra improves, direct mass spectrum sequencing gains interest as a database-independent alternative. In this chapter, the general principle of this so-called de novo sequencing is introduced along with pitfalls and challenges of the technique. The main tools available are presented with a focus on user friendly open source software which can be directly applied in everyday proteomic workflows.
RiceMetaSys for salt and drought stress responsive genes in rice: a web interface for crop improvement.

PubMed

Sandhu, Maninder; Sureshkumar, V; Prakash, Chandra; Dixit, Rekha; Solanke, Amolkumar U; Sharma, Tilak Raj; Mohapatra, Trilochan; S V, Amitha Mithra

2017-09-30

Genome-wide microarray has enabled development of robust databases for functional genomics studies in rice. However, such databases do not directly cater to the needs of breeders. Here, we have attempted to develop a web interface which combines the information from functional genomic studies across different genetic backgrounds with DNA markers so that they can be readily deployed in crop improvement. In the current version of the database, we have included drought and salinity stress studies since these two are the major abiotic stresses in rice. RiceMetaSys, a user-friendly and freely available web interface provides comprehensive information on salt responsive genes (SRGs) and drought responsive genes (DRGs) across genotypes, crop development stages and tissues, identified from multiple microarray datasets. 'Physical position search' is an attractive tool for those using QTL based approach for dissecting tolerance to salt and drought stress since it can provide the list of SRGs and DRGs in any physical interval. To identify robust candidate genes for use in crop improvement, the 'common genes across varieties' search tool is useful. Graphical visualization of expression profiles across genes and rice genotypes has been enabled to facilitate the user and to make the comparisons more impactful. Simple Sequence Repeat (SSR) search in the SRGs and DRGs is a valuable tool for fine mapping and marker assisted selection since it provides primers for survey of polymorphism. An external link to intron specific markers is also provided for this purpose. Bulk retrieval of data without any limit has been enabled in case of locus and SSR search. The aim of this database is to facilitate users with a simple and straight-forward search options for identification of robust candidate genes from among thousands of SRGs and DRGs so as to facilitate linking variation in expression profiles to variation in phenotype. Database URL: http://14.139.229.201.
Design and implementation of the NPOI database and website

NASA Astrophysics Data System (ADS)

Newman, K.; Jorgensen, A. M.; Landavazo, M.; Sun, B.; Hutter, D. J.; Armstrong, J. T.; Mozurkewich, David; Elias, N.; van Belle, G. T.; Schmitt, H. R.; Baines, E. K.

2014-07-01

The Navy Precision Optical Interferometer (NPOI) has been recording astronomical observations for nearly two decades, at this point with hundreds of thousands of individual observations recorded to date for a total data volume of many terabytes. To make maximum use of the NPOI data it is necessary to organize them in an easily searchable manner and be able to extract essential diagnostic information from the data to allow users to quickly gauge data quality and suitability for a specific science investigation. This sets the motivation for creating a comprehensive database of observation metadata as well as, at least, reduced data products. The NPOI database is implemented in MySQL using standard database tools and interfaces. The use of standard database tools allows us to focus on top-level database and interface implementation and take advantage of standard features such as backup, remote access, mirroring, and complex queries which would otherwise be time-consuming to implement. A website was created in order to give scientists a user friendly interface for searching the database. It allows the user to select various metadata to search for and also allows them to decide how and what results are displayed. This streamlines the searches, making it easier and quicker for scientists to find the information they are looking for. The website has multiple browser and device support. In this paper we present the design of the NPOI database and website, and give examples of its use.
Analysis Tool Web Services from the EMBL-EBI.

PubMed

McWilliam, Hamish; Li, Weizhong; Uludag, Mahmut; Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Cowley, Andrew Peter; Lopez, Rodrigo

2013-07-01

Since 2004 the European Bioinformatics Institute (EMBL-EBI) has provided access to a wide range of databases and analysis tools via Web Services interfaces. This comprises services to search across the databases available from the EMBL-EBI and to explore the network of cross-references present in the data (e.g. EB-eye), services to retrieve entry data in various data formats and to access the data in specific fields (e.g. dbfetch), and analysis tool services, for example, sequence similarity search (e.g. FASTA and NCBI BLAST), multiple sequence alignment (e.g. Clustal Omega and MUSCLE), pairwise sequence alignment and protein functional analysis (e.g. InterProScan and Phobius). The REST/SOAP Web Services (http://www.ebi.ac.uk/Tools/webservices/) interfaces to these databases and tools allow their integration into other tools, applications, web sites, pipeline processes and analytical workflows. To get users started using the Web Services, sample clients are provided covering a range of programming languages and popular Web Service tool kits, and a brief guide to Web Services technologies, including a set of tutorials, is available for those wishing to learn more and develop their own clients. Users of the Web Services are informed of improvements and updates via a range of methods.
Analysis Tool Web Services from the EMBL-EBI

PubMed Central

McWilliam, Hamish; Li, Weizhong; Uludag, Mahmut; Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Cowley, Andrew Peter; Lopez, Rodrigo

2013-01-01

Since 2004 the European Bioinformatics Institute (EMBL-EBI) has provided access to a wide range of databases and analysis tools via Web Services interfaces. This comprises services to search across the databases available from the EMBL-EBI and to explore the network of cross-references present in the data (e.g. EB-eye), services to retrieve entry data in various data formats and to access the data in specific fields (e.g. dbfetch), and analysis tool services, for example, sequence similarity search (e.g. FASTA and NCBI BLAST), multiple sequence alignment (e.g. Clustal Omega and MUSCLE), pairwise sequence alignment and protein functional analysis (e.g. InterProScan and Phobius). The REST/SOAP Web Services (http://www.ebi.ac.uk/Tools/webservices/) interfaces to these databases and tools allow their integration into other tools, applications, web sites, pipeline processes and analytical workflows. To get users started using the Web Services, sample clients are provided covering a range of programming languages and popular Web Service tool kits, and a brief guide to Web Services technologies, including a set of tutorials, is available for those wishing to learn more and develop their own clients. Users of the Web Services are informed of improvements and updates via a range of methods. PMID:23671338
A practical approach for inexpensive searches of radiology report databases.

PubMed

Desjardins, Benoit; Hamilton, R Curtis

2007-06-01

We present a method to perform full text searches of radiology reports for the large number of departments that do not have this ability as part of their radiology or hospital information system. A tool written in Microsoft Access (front-end) has been designed to search a server (back-end) containing the indexed backup weekly copy of the full relational database extracted from a radiology information system (RIS). This front end-/back-end approach has been implemented in a large academic radiology department, and is used for teaching, research and administrative purposes. The weekly second backup of the 80 GB, 4 million record RIS database takes 2 hours. Further indexing of the exported radiology reports takes 6 hours. Individual searches of the indexed database typically take less than 1 minute on the indexed database and 30-60 minutes on the nonindexed database. Guidelines to properly address privacy and institutional review board issues are closely followed by all users. This method has potential to improve teaching, research, and administrative programs within radiology departments that cannot afford more expensive technology.
Mi-DISCOVERER: A bioinformatics tool for the detection of mi-RNA in human genome.

PubMed

Arshad, Saadia; Mumtaz, Asia; Ahmad, Freed; Liaquat, Sadia; Nadeem, Shahid; Mehboob, Shahid; Afzal, Muhammad

2010-11-27

MicroRNAs (miRNAs) are 22 nucleotides non-coding RNAs that play pivotal regulatory roles in diverse organisms including the humans and are difficult to be identified due to lack of either sequence features or robust algorithms to efficiently identify. Therefore, we made a tool that is Mi-Discoverer for the detection of miRNAs in human genome. The tools used for the development of software are Microsoft Office Access 2003, the JDK version 1.6.0, BioJava version 1.0, and the NetBeans IDE version 6.0. All already made miRNAs softwares were web based; so the advantage of our project was to make a desktop facility to the user for sequence alignment search with already identified miRNAs of human genome present in the database. The user can also insert and update the newly discovered human miRNA in the database. Mi-Discoverer, a bioinformatics tool successfully identifies human miRNAs based on multiple sequence alignment searches. It's a non redundant database containing a large collection of publicly available human miRNAs.
Mi-DISCOVERER: A bioinformatics tool for the detection of mi-RNA in human genome

PubMed Central

Arshad, Saadia; Mumtaz, Asia; Ahmad, Freed; Liaquat, Sadia; Nadeem, Shahid; Mehboob, Shahid; Afzal, Muhammad

2010-01-01

MicroRNAs (miRNAs) are 22 nucleotides non-coding RNAs that play pivotal regulatory roles in diverse organisms including the humans and are difficult to be identified due to lack of either sequence features or robust algorithms to efficiently identify. Therefore, we made a tool that is Mi-Discoverer for the detection of miRNAs in human genome. The tools used for the development of software are Microsoft Office Access 2003, the JDK version 1.6.0, BioJava version 1.0, and the NetBeans IDE version 6.0. All already made miRNAs softwares were web based; so the advantage of our project was to make a desktop facility to the user for sequence alignment search with already identified miRNAs of human genome present in the database. The user can also insert and update the newly discovered human miRNA in the database. Mi-Discoverer, a bioinformatics tool successfully identifies human miRNAs based on multiple sequence alignment searches. It's a non redundant database containing a large collection of publicly available human miRNAs. PMID:21364831
Identifying contributors of two-person DNA mixtures by familial database search.

PubMed

Chung, Yuk-Ka; Fung, Wing K

2013-01-01

The role of familial database search as a crime-solving tool has been increasingly recognized by forensic scientists. As an enhancement to the existing familial search approach on single source cases, this article presents our current progress in exploring the potential use of familial search to mixture cases. A novel method was established to predict the outcome of the search, from which a simple strategy for determining an appropriate scale of investigation by the police force is developed. Illustrated by an example using Swedish data, our approach is shown to have the potential for assisting the police force to decide on the scale of investigation, thereby achieving desirable crime-solving rate with reasonable cost.
Knowledge, skills and attitudes of hospital pharmacists in the use of information technology and electronic tools to support clinical practice: A Brazilian survey

PubMed Central

Vasconcelos, Hemerson Bruno da Silva; Woods, David John

2017-01-01

This study aimed to identify the knowledge, skills and attitudes of Brazilian hospital pharmacists in the use of information technology and electronic tools to support clinical practice. Methods: A questionnaire was sent by email to clinical pharmacists working public and private hospitals in Brazil. The instrument was validated using the method of Polit and Beck to determine the content validity index. Data (n = 348) were analyzed using descriptive statistics, Pearson's Chi-square test and Gamma correlation tests. Results: Pharmacists had 1–4 electronic devices for personal use, mainly smartphones (84.8%; n = 295) and laptops (81.6%; n = 284). At work, pharmacists had access to a computer (89.4%; n = 311), mostly connected to the internet (83.9%; n = 292). They felt competent (very capable/capable) searching for a web page/web site on a specific subject (100%; n = 348), downloading files (99.7%; n = 347), using spreadsheets (90.2%; n = 314), searching using MeSH terms in PubMed (97.4%; n = 339) and general searching for articles in bibliographic databases (such as Medline/PubMed: 93.4%; n = 325). Pharmacists did not feel competent in using statistical analysis software (somewhat capable/incapable: 78.4%; n = 273). Most pharmacists reported that they had not received formal education to perform most of these actions except searching using MeSH terms. Access to bibliographic databases was available in Brazilian hospitals, however, most pharmacists (78.7%; n = 274) reported daily use of a non-specific search engine such as Google. This result may reflect the lack of formal knowledge and training in the use of bibliographic databases and difficulty with the English language. The need to expand knowledge about information search tools was recognized by most pharmacists in clinical practice in Brazil, especially those with less time dedicated exclusively to clinical activity (Chi-square, p = 0.006). Conclusion: These results will assist in defining minimal competencies for the training of pharmacists in the field of information technology to support clinical practice. Knowledge and skill gaps are evident in the use of bibliographic databases, spreadsheets and statistical tools. PMID:29272292
Knowledge, skills and attitudes of hospital pharmacists in the use of information technology and electronic tools to support clinical practice: A Brazilian survey.

PubMed

Néri, Eugenie Desirèe Rabelo; Meira, Assuero Silva; Vasconcelos, Hemerson Bruno da Silva; Woods, David John; Fonteles, Marta Maria de França

2017-01-01

This study aimed to identify the knowledge, skills and attitudes of Brazilian hospital pharmacists in the use of information technology and electronic tools to support clinical practice. A questionnaire was sent by email to clinical pharmacists working public and private hospitals in Brazil. The instrument was validated using the method of Polit and Beck to determine the content validity index. Data (n = 348) were analyzed using descriptive statistics, Pearson's Chi-square test and Gamma correlation tests. Pharmacists had 1-4 electronic devices for personal use, mainly smartphones (84.8%; n = 295) and laptops (81.6%; n = 284). At work, pharmacists had access to a computer (89.4%; n = 311), mostly connected to the internet (83.9%; n = 292). They felt competent (very capable/capable) searching for a web page/web site on a specific subject (100%; n = 348), downloading files (99.7%; n = 347), using spreadsheets (90.2%; n = 314), searching using MeSH terms in PubMed (97.4%; n = 339) and general searching for articles in bibliographic databases (such as Medline/PubMed: 93.4%; n = 325). Pharmacists did not feel competent in using statistical analysis software (somewhat capable/incapable: 78.4%; n = 273). Most pharmacists reported that they had not received formal education to perform most of these actions except searching using MeSH terms. Access to bibliographic databases was available in Brazilian hospitals, however, most pharmacists (78.7%; n = 274) reported daily use of a non-specific search engine such as Google. This result may reflect the lack of formal knowledge and training in the use of bibliographic databases and difficulty with the English language. The need to expand knowledge about information search tools was recognized by most pharmacists in clinical practice in Brazil, especially those with less time dedicated exclusively to clinical activity (Chi-square, p = 0.006). These results will assist in defining minimal competencies for the training of pharmacists in the field of information technology to support clinical practice. Knowledge and skill gaps are evident in the use of bibliographic databases, spreadsheets and statistical tools.

LSE-Sign: A lexical database for Spanish Sign Language.

PubMed

Gutierrez-Sigut, Eva; Costello, Brendan; Baus, Cristina; Carreiras, Manuel

2016-03-01

The LSE-Sign database is a free online tool for selecting Spanish Sign Language stimulus materials to be used in experiments. It contains 2,400 individual signs taken from a recent standardized LSE dictionary, and a further 2,700 related nonsigns. Each entry is coded for a wide range of grammatical, phonological, and articulatory information, including handshape, location, movement, and non-manual elements. The database is accessible via a graphically based search facility which is highly flexible both in terms of the search options available and the way the results are displayed. LSE-Sign is available at the following website: http://www.bcbl.eu/databases/lse/.
LFQuant: a label-free fast quantitative analysis tool for high-resolution LC-MS/MS proteomics data.

PubMed

Zhang, Wei; Zhang, Jiyang; Xu, Changming; Li, Ning; Liu, Hui; Ma, Jie; Zhu, Yunping; Xie, Hongwei

2012-12-01

Database searching based methods for label-free quantification aim to reconstruct the peptide extracted ion chromatogram based on the identification information, which can limit the search space and thus make the data processing much faster. The random effect of the MS/MS sampling can be remedied by cross-assignment among different runs. Here, we present a new label-free fast quantitative analysis tool, LFQuant, for high-resolution LC-MS/MS proteomics data based on database searching. It is designed to accept raw data in two common formats (mzXML and Thermo RAW), and database search results from mainstream tools (MASCOT, SEQUEST, and X!Tandem), as input data. LFQuant can handle large-scale label-free data with fractionation such as SDS-PAGE and 2D LC. It is easy to use and provides handy user interfaces for data loading, parameter setting, quantitative analysis, and quantitative data visualization. LFQuant was compared with two common quantification software packages, MaxQuant and IDEAL-Q, on the replication data set and the UPS1 standard data set. The results show that LFQuant performs better than them in terms of both precision and accuracy, and consumes significantly less processing time. LFQuant is freely available under the GNU General Public License v3.0 at http://sourceforge.net/projects/lfquant/. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
RAG-3D: A search tool for RNA 3D substructures

DOE PAGES

Zahran, Mai; Sevim Bayrak, Cigdem; Elmetwaly, Shereef; ...

2015-08-24

In this study, to address many challenges in RNA structure/function prediction, the characterization of RNA's modular architectural units is required. Using the RNA-As-Graphs (RAG) database, we have previously explored the existence of secondary structure (2D) submotifs within larger RNA structures. Here we present RAG-3D—a dataset of RNA tertiary (3D) structures and substructures plus a web-based search tool—designed to exploit graph representations of RNAs for the goal of searching for similar 3D structural fragments. The objects in RAG-3D consist of 3D structures translated into 3D graphs, cataloged based on the connectivity between their secondary structure elements. Each graph is additionally describedmore » in terms of its subgraph building blocks. The RAG-3D search tool then compares a query RNA 3D structure to those in the database to obtain structurally similar structures and substructures. This comparison reveals conserved 3D RNA features and thus may suggest functional connections. Though RNA search programs based on similarity in sequence, 2D, and/or 3D structural elements are available, our graph-based search tool may be advantageous for illuminating similarities that are not obvious; using motifs rather than sequence space also reduces search times considerably. Ultimately, such substructuring could be useful for RNA 3D structure prediction, structure/function inference and inverse folding.« less
RAG-3D: a search tool for RNA 3D substructures

PubMed Central

Zahran, Mai; Sevim Bayrak, Cigdem; Elmetwaly, Shereef; Schlick, Tamar

2015-01-01

To address many challenges in RNA structure/function prediction, the characterization of RNA's modular architectural units is required. Using the RNA-As-Graphs (RAG) database, we have previously explored the existence of secondary structure (2D) submotifs within larger RNA structures. Here we present RAG-3D—a dataset of RNA tertiary (3D) structures and substructures plus a web-based search tool—designed to exploit graph representations of RNAs for the goal of searching for similar 3D structural fragments. The objects in RAG-3D consist of 3D structures translated into 3D graphs, cataloged based on the connectivity between their secondary structure elements. Each graph is additionally described in terms of its subgraph building blocks. The RAG-3D search tool then compares a query RNA 3D structure to those in the database to obtain structurally similar structures and substructures. This comparison reveals conserved 3D RNA features and thus may suggest functional connections. Though RNA search programs based on similarity in sequence, 2D, and/or 3D structural elements are available, our graph-based search tool may be advantageous for illuminating similarities that are not obvious; using motifs rather than sequence space also reduces search times considerably. Ultimately, such substructuring could be useful for RNA 3D structure prediction, structure/function inference and inverse folding. PMID:26304547
RAG-3D: A search tool for RNA 3D substructures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zahran, Mai; Sevim Bayrak, Cigdem; Elmetwaly, Shereef

In this study, to address many challenges in RNA structure/function prediction, the characterization of RNA's modular architectural units is required. Using the RNA-As-Graphs (RAG) database, we have previously explored the existence of secondary structure (2D) submotifs within larger RNA structures. Here we present RAG-3D—a dataset of RNA tertiary (3D) structures and substructures plus a web-based search tool—designed to exploit graph representations of RNAs for the goal of searching for similar 3D structural fragments. The objects in RAG-3D consist of 3D structures translated into 3D graphs, cataloged based on the connectivity between their secondary structure elements. Each graph is additionally describedmore » in terms of its subgraph building blocks. The RAG-3D search tool then compares a query RNA 3D structure to those in the database to obtain structurally similar structures and substructures. This comparison reveals conserved 3D RNA features and thus may suggest functional connections. Though RNA search programs based on similarity in sequence, 2D, and/or 3D structural elements are available, our graph-based search tool may be advantageous for illuminating similarities that are not obvious; using motifs rather than sequence space also reduces search times considerably. Ultimately, such substructuring could be useful for RNA 3D structure prediction, structure/function inference and inverse folding.« less
Molecule database framework: a framework for creating database applications with chemical structure search capability

PubMed Central

2013-01-01

Background Research in organic chemistry generates samples of novel chemicals together with their properties and other related data. The involved scientists must be able to store this data and search it by chemical structure. There are commercial solutions for common needs like chemical registration systems or electronic lab notebooks. However for specific requirements of in-house databases and processes no such solutions exist. Another issue is that commercial solutions have the risk of vendor lock-in and may require an expensive license of a proprietary relational database management system. To speed up and simplify the development for applications that require chemical structure search capabilities, I have developed Molecule Database Framework. The framework abstracts the storing and searching of chemical structures into method calls. Therefore software developers do not require extensive knowledge about chemistry and the underlying database cartridge. This decreases application development time. Results Molecule Database Framework is written in Java and I created it by integrating existing free and open-source tools and frameworks. The core functionality includes: • Support for multi-component compounds (mixtures) • Import and export of SD-files • Optional security (authorization) For chemical structure searching Molecule Database Framework leverages the capabilities of the Bingo Cartridge for PostgreSQL and provides type-safe searching, caching, transactions and optional method level security. Molecule Database Framework supports multi-component chemical compounds (mixtures). Furthermore the design of entity classes and the reasoning behind it are explained. By means of a simple web application I describe how the framework could be used. I then benchmarked this example application to create some basic performance expectations for chemical structure searches and import and export of SD-files. Conclusions By using a simple web application it was shown that Molecule Database Framework successfully abstracts chemical structure searches and SD-File import and export to simple method calls. The framework offers good search performance on a standard laptop without any database tuning. This is also due to the fact that chemical structure searches are paged and cached. Molecule Database Framework is available for download on the projects web page on bitbucket: https://bitbucket.org/kienerj/moleculedatabaseframework. PMID:24325762
Molecule database framework: a framework for creating database applications with chemical structure search capability.

PubMed

Kiener, Joos

2013-12-11

Research in organic chemistry generates samples of novel chemicals together with their properties and other related data. The involved scientists must be able to store this data and search it by chemical structure. There are commercial solutions for common needs like chemical registration systems or electronic lab notebooks. However for specific requirements of in-house databases and processes no such solutions exist. Another issue is that commercial solutions have the risk of vendor lock-in and may require an expensive license of a proprietary relational database management system. To speed up and simplify the development for applications that require chemical structure search capabilities, I have developed Molecule Database Framework. The framework abstracts the storing and searching of chemical structures into method calls. Therefore software developers do not require extensive knowledge about chemistry and the underlying database cartridge. This decreases application development time. Molecule Database Framework is written in Java and I created it by integrating existing free and open-source tools and frameworks. The core functionality includes:•Support for multi-component compounds (mixtures)•Import and export of SD-files•Optional security (authorization)For chemical structure searching Molecule Database Framework leverages the capabilities of the Bingo Cartridge for PostgreSQL and provides type-safe searching, caching, transactions and optional method level security. Molecule Database Framework supports multi-component chemical compounds (mixtures).Furthermore the design of entity classes and the reasoning behind it are explained. By means of a simple web application I describe how the framework could be used. I then benchmarked this example application to create some basic performance expectations for chemical structure searches and import and export of SD-files. By using a simple web application it was shown that Molecule Database Framework successfully abstracts chemical structure searches and SD-File import and export to simple method calls. The framework offers good search performance on a standard laptop without any database tuning. This is also due to the fact that chemical structure searches are paged and cached. Molecule Database Framework is available for download on the projects web page on bitbucket: https://bitbucket.org/kienerj/moleculedatabaseframework.
Pathway Tools version 19.0 update: software for pathway/genome informatics and systems biology

PubMed Central

Latendresse, Mario; Paley, Suzanne M.; Krummenacker, Markus; Ong, Quang D.; Billington, Richard; Kothari, Anamika; Weaver, Daniel; Lee, Thomas; Subhraveti, Pallavi; Spaulding, Aaron; Fulcher, Carol; Keseler, Ingrid M.; Caspi, Ron

2016-01-01

Pathway Tools is a bioinformatics software environment with a broad set of capabilities. The software provides genome-informatics tools such as a genome browser, sequence alignments, a genome-variant analyzer and comparative-genomics operations. It offers metabolic-informatics tools, such as metabolic reconstruction, quantitative metabolic modeling, prediction of reaction atom mappings and metabolic route search. Pathway Tools also provides regulatory-informatics tools, such as the ability to represent and visualize a wide range of regulatory interactions. This article outlines the advances in Pathway Tools in the past 5 years. Major additions include components for metabolic modeling, metabolic route search, computation of atom mappings and estimation of compound Gibbs free energies of formation; addition of editors for signaling pathways, for genome sequences and for cellular architecture; storage of gene essentiality data and phenotype data; display of multiple alignments, and of signaling and electron-transport pathways; and development of Python and web-services application programming interfaces. Scientists around the world have created more than 9800 Pathway/Genome Databases by using Pathway Tools, many of which are curated databases for important model organisms. PMID:26454094
In Silico PCR Tools for a Fast Primer, Probe, and Advanced Searching.

PubMed

Kalendar, Ruslan; Muterko, Alexandr; Shamekova, Malika; Zhambakin, Kabyl

2017-01-01

The polymerase chain reaction (PCR) is fundamental to molecular biology and is the most important practical molecular technique for the research laboratory. The principle of this technique has been further used and applied in plenty of other simple or complex nucleic acid amplification technologies (NAAT). In parallel to laboratory "wet bench" experiments for nucleic acid amplification technologies, in silico or virtual (bioinformatics) approaches have been developed, among which in silico PCR analysis. In silico NAAT analysis is a useful and efficient complementary method to ensure the specificity of primers or probes for an extensive range of PCR applications from homology gene discovery, molecular diagnosis, DNA fingerprinting, and repeat searching. Predicting sensitivity and specificity of primers and probes requires a search to determine whether they match a database with an optimal number of mismatches, similarity, and stability. In the development of in silico bioinformatics tools for nucleic acid amplification technologies, the prospects for the development of new NAAT or similar approaches should be taken into account, including forward-looking and comprehensive analysis that is not limited to only one PCR technique variant. The software FastPCR and the online Java web tool are integrated tools for in silico PCR of linear and circular DNA, multiple primer or probe searches in large or small databases and for advanced search. These tools are suitable for processing of batch files that are essential for automation when working with large amounts of data. The FastPCR software is available for download at http://primerdigital.com/fastpcr.html and the online Java version at http://primerdigital.com/tools/pcr.html .
ECOTOX (ECOTOXICOLOGY DATABASE): AN ASSESSMENT TOOL FOR THE 21ST CENTURY

EPA Science Inventory

The ECOTOX (ECOTOXicology Database) system developed by the U.S. EPA, National Health and Environmental Effecrs Research Laboratory (NHEERL), Mid-Continent Ecology Division in Duluth, MN, (MED-Duluth), provides a web browser search interface for locating aquatic and terrestrial t...
A collection of open source applications for mass spectrometry data mining.

PubMed

Gallardo, Óscar; Ovelleiro, David; Gay, Marina; Carrascal, Montserrat; Abian, Joaquin

2014-10-01

We present several bioinformatics applications for the identification and quantification of phosphoproteome components by MS. These applications include a front-end graphical user interface that combines several Thermo RAW formats to MASCOT™ Generic Format extractors (EasierMgf), two graphical user interfaces for search engines OMSSA and SEQUEST (OmssaGui and SequestGui), and three applications, one for the management of databases in FASTA format (FastaTools), another for the integration of search results from up to three search engines (Integrator), and another one for the visualization of mass spectra and their corresponding database search results (JsonVisor). These applications were developed to solve some of the common problems found in proteomic and phosphoproteomic data analysis and were integrated in the workflow for data processing and feeding on our LymPHOS database. Applications were designed modularly and can be used standalone. These tools are written in Perl and Python programming languages and are supported on Windows platforms. They are all released under an Open Source Software license and can be freely downloaded from our software repository hosted at GoogleCode. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Gaining knowledge from previously unexplained spectra-application of the PTM-Explorer software to detect PTM in HUPO BPP MS/MS data.

PubMed

Chamrad, Daniel C; Körting, Gerhard; Schäfer, Heike; Stephan, Christian; Thiele, Herbert; Apweiler, Rolf; Meyer, Helmut E; Marcus, Katrin; Blüggel, Martin

2006-09-01

A novel software tool named PTM-Explorer has been applied to LC-MS/MS datasets acquired within the Human Proteome Organisation (HUPO) Brain Proteome Project (BPP). PTM-Explorer enables automatic identification of peptide MS/MS spectra that were not explained in typical sequence database searches. The main focus was detection of PTMs, but PTM-Explorer detects also unspecific peptide cleavage, mass measurement errors, experimental modifications, amino acid substitutions, transpeptidation products and unknown mass shifts. To avoid a combinatorial problem the search is restricted to a set of selected protein sequences, which stem from previous protein identifications using a common sequence database search. Prior to application to the HUPO BPP data, PTM-Explorer was evaluated on excellently manually characterized and evaluated LC-MS/MS data sets from Alpha-A-Crystallin gel spots obtained from mouse eye lens. Besides various PTMs including phosphorylation, a wealth of experimental modifications and unspecific cleavage products were successfully detected, completing the primary structure information of the measured proteins. Our results indicate that a large amount of MS/MS spectra that currently remain unidentified in standard database searches contain valuable information that can only be elucidated using suitable software tools.
searchSCF: Using MongoDB to Enable Richer Searches of Locally Hosted Science Data Repositories

NASA Astrophysics Data System (ADS)

Knosp, B.

2016-12-01

Science teams today are in the unusual position of almost having too much data available to them. Modern sensors and models are capable of outputting terabytes of data per day, which can make it difficult to find specific subsets of data. The sheer size of files can also make it time consuming to retrieve this big data from national data archive centers. Thus, many science teams choose to store what data they can on their local systems, but they are not always equipped with tools to help them intelligently organize and search their data. In its local data repository, the Aura Microwave Limb Sounder (MLS) science team at NASA's Jet Propulsion Laboratory has collected over 300TB of atmospheric science data from 71 missions/models that aid in validation, algorithm development, and research activities. When the project began, the team developed a MySQL database to aid in data queries, but this database was only designed to keep track of MLS and a few ancillary data sets, leving much of the data uncatalogued. The team has also seen database query time rise over the life of the mission. Even though the MLS science team's data holdings are not the size of a national data center's, team members still need tools to help them discover and utilize the data that they have on-hand. Over the past year, members of the science team have been looking for solutions to (1) store information on all the data sets they have collected in a single database, (2) store more metadata about each data file, (3) develop queries that can find relationships among these disparate data types, and (4) plug any new functions developed around this database into existing analysis, visualization, and web tools, transparently to users. In this presentation, I will discuss the searchSCF package that is currently under development. This package includes a NoSQL database management system (MongoDB) and a set of Python tools that both ingests data into the database and supports user queries. I will also highlight case studies of how this system could be used by the MLS science team, and how it could be implemented by other science teams with local data repositories.
The Research Potential of the Electronic OED Database at the University of Waterloo: A Case Study.

ERIC Educational Resources Information Center

Berg, Donna Lee

1991-01-01

Discusses the history and structure of the online database of the second edition of the Oxford English Dictionary (OED) and the software tools developed at the University of Waterloo to manipulate the unusually complex database. Four sample searches that indicate some types of problems that might be encountered are appended. (DB)
Combined experiment Phase 2 data characterization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Miller, M.S.; Shipley, D.E.; Young, T.S.

1995-11-01

The National Renewable Energy Laboratory`s ``Combined Experiment`` has yielded a large quantity of experimental data on the operation of a downwind horizontal axis wind turbine under field conditions. To fully utilize this valuable resource and identify particular episodes of interest, a number of databases were created that characterize individual data events and rotational cycles over a wide range of parameters. Each of the 59 five-minute data episodes collected during Phase 11 of the Combined Experiment have been characterized by the mean, minimum, maximum, and standard deviation of all data channels, except the blade surface pressures. Inflow condition, aerodynamic force coefficient,more » and minimum leading edge pressure coefficient databases have also been established, characterizing each of nearly 21,000 blade rotational cycles. In addition, a number of tools have been developed for searching these databases for particular episodes of interest. Due to their extensive size, only a portion of the episode characterization databases are included in an appendix, and examples of the cycle characterization databases are given. The search tools are discussed and the FORTRAN or C code for each is included in appendices.« less
Overview of Nuclear Physics Data: Databases, Web Applications and Teaching Tools

NASA Astrophysics Data System (ADS)

McCutchan, Elizabeth

2017-01-01

The mission of the United States Nuclear Data Program (USNDP) is to provide current, accurate, and authoritative data for use in pure and applied areas of nuclear science and engineering. This is accomplished by compiling, evaluating, and disseminating extensive datasets. Our main products include the Evaluated Nuclear Structure File (ENSDF) containing information on nuclear structure and decay properties and the Evaluated Nuclear Data File (ENDF) containing information on neutron-induced reactions. The National Nuclear Data Center (NNDC), through the website www.nndc.bnl.gov, provides web-based retrieval systems for these and many other databases. In addition, the NNDC hosts several on-line physics tools, useful for calculating various quantities relating to basic nuclear physics. In this talk, I will first introduce the quantities which are evaluated and recommended in our databases. I will then outline the searching capabilities which allow one to quickly and efficiently retrieve data. Finally, I will demonstrate how the database searches and web applications can provide effective teaching tools concerning the structure of nuclei and how they interact. Work supported by the Office of Nuclear Physics, Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-98CH10886.
Human Disease Insight: An integrated knowledge-based platform for disease-gene-drug information.

PubMed

Tasleem, Munazzah; Ishrat, Romana; Islam, Asimul; Ahmad, Faizan; Hassan, Md Imtaiyaz

2016-01-01

The scope of the Human Disease Insight (HDI) database is not limited to researchers or physicians as it also provides basic information to non-professionals and creates disease awareness, thereby reducing the chances of patient suffering due to ignorance. HDI is a knowledge-based resource providing information on human diseases to both scientists and the general public. Here, our mission is to provide a comprehensive human disease database containing most of the available useful information, with extensive cross-referencing. HDI is a knowledge management system that acts as a central hub to access information about human diseases and associated drugs and genes. In addition, HDI contains well-classified bioinformatics tools with helpful descriptions. These integrated bioinformatics tools enable researchers to annotate disease-specific genes and perform protein analysis, search for biomarkers and identify potential vaccine candidates. Eventually, these tools will facilitate the analysis of disease-associated data. The HDI provides two types of search capabilities and includes provisions for downloading, uploading and searching disease/gene/drug-related information. The logistical design of the HDI allows for regular updating. The database is designed to work best with Mozilla Firefox and Google Chrome and is freely accessible at http://humandiseaseinsight.com. Copyright © 2015 King Saud Bin Abdulaziz University for Health Sciences. Published by Elsevier Ltd. All rights reserved.
Decision making in family medicine

PubMed Central

Labrecque, Michel; Ratté, Stéphane; Frémont, Pierre; Cauchon, Michel; Ouellet, Jérôme; Hogg, William; McGowan, Jessie; Gagnon, Marie-Pierre; Njoya, Merlin; Légaré, France

2013-01-01

Abstract Objective To compare the ability of users of 2 medical search engines, InfoClinique and the Trip database, to provide correct answers to clinical questions and to explore the perceived effects of the tools on the clinical decision-making process. Design Randomized trial. Setting Three family medicine units of the family medicine program of the Faculty of Medicine at Laval University in Quebec city, Que. Participants Fifteen second-year family medicine residents. Intervention Residents generated 30 structured questions about therapy or preventive treatment (2 questions per resident) based on clinical encounters. Using an Internet platform designed for the trial, each resident answered 20 of these questions (their own 2, plus 18 of the questions formulated by other residents, selected randomly) before and after searching for information with 1 of the 2 search engines. For each question, 5 residents were randomly assigned to begin their search with InfoClinique and 5 with the Trip database. Main outcome measures The ability of residents to provide correct answers to clinical questions using the search engines, as determined by third-party evaluation. After answering each question, participants completed a questionnaire to assess their perception of the engine’s effect on the decision-making process in clinical practice. Results Of 300 possible pairs of answers (1 answer before and 1 after the initial search), 254 (85%) were produced by 14 residents. Of these, 132 (52%) and 122 (48%) pairs of answers concerned questions that had been assigned an initial search with InfoClinique and the Trip database, respectively. Both engines produced an important and similar absolute increase in the proportion of correct answers after searching (26% to 62% for InfoClinique, for an increase of 36%; 24% to 63% for the Trip database, for an increase of 39%; P = .68). For all 30 clinical questions, at least 1 resident produced the correct answer after searching with either search engine. The mean (SD) time of the initial search for each question was 23.5 (7.6) minutes with InfoClinique and 22.3 (7.8) minutes with the Trip database (P = .30). Participants’ perceptions of each engine’s effect on the decision-making process were very positive and similar for both search engines. Conclusion Family medicine residents’ ability to provide correct answers to clinical questions increased dramatically and similarly with the use of both InfoClinique and the Trip database. These tools have strong potential to increase the quality of medical care. PMID:24130286
Databases and Associated Tools for Glycomics and Glycoproteomics.

PubMed

Lisacek, Frederique; Mariethoz, Julien; Alocci, Davide; Rudd, Pauline M; Abrahams, Jodie L; Campbell, Matthew P; Packer, Nicolle H; Ståhle, Jonas; Widmalm, Göran; Mullen, Elaine; Adamczyk, Barbara; Rojas-Macias, Miguel A; Jin, Chunsheng; Karlsson, Niclas G

2017-01-01

The access to biodatabases for glycomics and glycoproteomics has proven to be essential for current glycobiological research. This chapter presents available databases that are devoted to different aspects of glycobioinformatics. This includes oligosaccharide sequence databases, experimental databases, 3D structure databases (of both glycans and glycorelated proteins) and association of glycans with tissue, disease, and proteins. Specific search protocols are also provided using tools associated with experimental databases for converting primary glycoanalytical data to glycan structural information. In particular, researchers using glycoanalysis methods by U/HPLC (GlycoBase), MS (GlycoWorkbench, UniCarb-DB, GlycoDigest), and NMR (CASPER) will benefit from this chapter. In addition we also include information on how to utilize glycan structural information to query databases that associate glycans with proteins (UniCarbKB) and with interactions with pathogens (SugarBind).
DRUMS: a human disease related unique gene mutation search engine.

PubMed

Li, Zuofeng; Liu, Xingnan; Wen, Jingran; Xu, Ye; Zhao, Xin; Li, Xuan; Liu, Lei; Zhang, Xiaoyan

2011-10-01

With the completion of the human genome project and the development of new methods for gene variant detection, the integration of mutation data and its phenotypic consequences has become more important than ever. Among all available resources, locus-specific databases (LSDBs) curate one or more specific genes' mutation data along with high-quality phenotypes. Although some genotype-phenotype data from LSDB have been integrated into central databases little effort has been made to integrate all these data by a search engine approach. In this work, we have developed disease related unique gene mutation search engine (DRUMS), a search engine for human disease related unique gene mutation as a convenient tool for biologists or physicians to retrieve gene variant and related phenotype information. Gene variant and phenotype information were stored in a gene-centred relational database. Moreover, the relationships between mutations and diseases were indexed by the uniform resource identifier from LSDB, or another central database. By querying DRUMS, users can access the most popular mutation databases under one interface. DRUMS could be treated as a domain specific search engine. By using web crawling, indexing, and searching technologies, it provides a competitively efficient interface for searching and retrieving mutation data and their relationships to diseases. The present system is freely accessible at http://www.scbit.org/glif/new/drums/index.html. © 2011 Wiley-Liss, Inc.

Artificial Intelligence Based Selection of Optimal Cutting Tool and Process Parameters for Effective Turning and Milling Operations

NASA Astrophysics Data System (ADS)

Saranya, Kunaparaju; John Rozario Jegaraj, J.; Ramesh Kumar, Katta; Venkateshwara Rao, Ghanta

2016-06-01

With the increased trend in automation of modern manufacturing industry, the human intervention in routine, repetitive and data specific activities of manufacturing is greatly reduced. In this paper, an attempt has been made to reduce the human intervention in selection of optimal cutting tool and process parameters for metal cutting applications, using Artificial Intelligence techniques. Generally, the selection of appropriate cutting tool and parameters in metal cutting is carried out by experienced technician/cutting tool expert based on his knowledge base or extensive search from huge cutting tool database. The present proposed approach replaces the existing practice of physical search for tools from the databooks/tool catalogues with intelligent knowledge-based selection system. This system employs artificial intelligence based techniques such as artificial neural networks, fuzzy logic and genetic algorithm for decision making and optimization. This intelligence based optimal tool selection strategy is developed using Mathworks Matlab Version 7.11.0 and implemented. The cutting tool database was obtained from the tool catalogues of different tool manufacturers. This paper discusses in detail, the methodology and strategies employed for selection of appropriate cutting tool and optimization of process parameters based on multi-objective optimization criteria considering material removal rate, tool life and tool cost.
Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and Censor.

PubMed

Kohany, Oleksiy; Gentles, Andrew J; Hankus, Lukasz; Jurka, Jerzy

2006-10-25

Repbase is a reference database of eukaryotic repetitive DNA, which includes prototypic sequences of repeats and basic information described in annotations. Updating and maintenance of the database requires specialized tools, which we have created and made available for use with Repbase, and which may be useful as a template for other curated databases. We describe the software tools RepbaseSubmitter and Censor, which are designed to facilitate updating and screening the content of Repbase. RepbaseSubmitter is a java-based interface for formatting and annotating Repbase entries. It eliminates many common formatting errors, and automates actions such as calculation of sequence lengths and composition, thus facilitating curation of Repbase sequences. In addition, it has several features for predicting protein coding regions in sequences; searching and including Pubmed references in Repbase entries; and searching the NCBI taxonomy database for correct inclusion of species information and taxonomic position. Censor is a tool to rapidly identify repetitive elements by comparison to known repeats. It uses WU-BLAST for speed and sensitivity, and can conduct DNA-DNA, DNA-protein, or translated DNA-translated DNA searches of genomic sequence. Defragmented output includes a map of repeats present in the query sequence, with the options to report masked query sequence(s), repeat sequences found in the query, and alignments. Censor and RepbaseSubmitter are available as both web-based services and downloadable versions. They can be found at http://www.girinst.org/repbase/submission.html (RepbaseSubmitter) and http://www.girinst.org/censor/index.php (Censor).
Development of a PubMed Based Search Tool for Identifying Sex and Gender Specific Health Literature.

PubMed

Song, Michael M; Simonsen, Cheryl K; Wilson, Joanna D; Jenkins, Marjorie R

2016-02-01

An effective literature search strategy is critical to achieving the aims of Sex and Gender Specific Health (SGSH): to understand sex and gender differences through research and to effectively incorporate the new knowledge into the clinical decision making process to benefit both male and female patients. The goal of this project was to develop and validate an SGSH literature search tool that is readily and freely available to clinical researchers and practitioners. PubMed, a freely available search engine for the Medline database, was selected as the platform to build the SGSH literature search tool. Combinations of Medical Subject Heading terms, text words, and title words were evaluated for optimal specificity and sensitivity. The search tool was then validated against reference bases compiled for two disease states, diabetes and stroke. Key sex and gender terms and limits were bundled to create a search tool to facilitate PubMed SGSH literature searches. During validation, the search tool retrieved 50 of 94 (53.2%) stroke and 62 of 95 (65.3%) diabetes reference articles selected for validation. A general keyword search of stroke or diabetes combined with sex difference retrieved 33 of 94 (35.1%) stroke and 22 of 95 (23.2%) diabetes reference base articles, with lower sensitivity and specificity for SGSH content. The Texas Tech University Health Sciences Center SGSH PubMed Search Tool provides higher sensitivity and specificity to sex and gender specific health literature. The tool will facilitate research, clinical decision-making, and guideline development relevant to SGSH.
Development of a PubMed Based Search Tool for Identifying Sex and Gender Specific Health Literature

PubMed Central

Song, Michael M.; Simonsen, Cheryl K.; Wilson, Joanna D.

2016-01-01

Abstract Background: An effective literature search strategy is critical to achieving the aims of Sex and Gender Specific Health (SGSH): to understand sex and gender differences through research and to effectively incorporate the new knowledge into the clinical decision making process to benefit both male and female patients. The goal of this project was to develop and validate an SGSH literature search tool that is readily and freely available to clinical researchers and practitioners. Methods: PubMed, a freely available search engine for the Medline database, was selected as the platform to build the SGSH literature search tool. Combinations of Medical Subject Heading terms, text words, and title words were evaluated for optimal specificity and sensitivity. The search tool was then validated against reference bases compiled for two disease states, diabetes and stroke. Results: Key sex and gender terms and limits were bundled to create a search tool to facilitate PubMed SGSH literature searches. During validation, the search tool retrieved 50 of 94 (53.2%) stroke and 62 of 95 (65.3%) diabetes reference articles selected for validation. A general keyword search of stroke or diabetes combined with sex difference retrieved 33 of 94 (35.1%) stroke and 22 of 95 (23.2%) diabetes reference base articles, with lower sensitivity and specificity for SGSH content. Conclusions: The Texas Tech University Health Sciences Center SGSH PubMed Search Tool provides higher sensitivity and specificity to sex and gender specific health literature. The tool will facilitate research, clinical decision-making, and guideline development relevant to SGSH. PMID:26555409
Measuring Person-Centered Care: A Critical Comparative Review of Published Tools

ERIC Educational Resources Information Center

Edvardsson, David; Innes, Anthea

2010-01-01

Purpose of the study: To present a critical comparative review of published tools measuring the person-centeredness of care for older people and people with dementia. Design and Methods: Included tools were identified by searches of PubMed, Cinahl, the Bradford Dementia Group database, and authors' files. The terms "Person-centered,"…
Searching mixed DNA profiles directly against profile databases.

PubMed

Bright, Jo-Anne; Taylor, Duncan; Curran, James; Buckleton, John

2014-03-01

DNA databases have revolutionised forensic science. They are a powerful investigative tool as they have the potential to identify persons of interest in criminal investigations. Routinely, a DNA profile generated from a crime sample could only be searched for in a database of individuals if the stain was from single contributor (single source) or if a contributor could unambiguously be determined from a mixed DNA profile. This meant that a significant number of samples were unsuitable for database searching. The advent of continuous methods for the interpretation of DNA profiles offers an advanced way to draw inferential power from the considerable investment made in DNA databases. Using these methods, each profile on the database may be considered a possible contributor to a mixture and a likelihood ratio (LR) can be formed. Those profiles which produce a sufficiently large LR can serve as an investigative lead. In this paper empirical studies are described to determine what constitutes a large LR. We investigate the effect on a database search of complex mixed DNA profiles with contributors in equal proportions with dropout as a consideration, and also the effect of an incorrect assignment of the number of contributors to a profile. In addition, we give, as a demonstration of the method, the results using two crime samples that were previously unsuitable for database comparison. We show that effective management of the selection of samples for searching and the interpretation of the output can be highly informative. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Mass Spectral Library with Search Program, Data Version: NIST v17

National Institute of Standards and Technology Data Gateway

SRD 1A NIST/EPA/NIH Mass Spectral Library with Search Program, Data Version: NIST v17 (PC database for purchase) Available with full-featured NIST MS Search Program for Windows integrated tools, the NIST '98 is a fully evaluated collection of electron-ionization mass spectra. (147,198 Compounds with Spectra; 147,194 Chemical Structures; 174,948 Spectra )
PubstractHelper: A Web-based Text-Mining Tool for Marking Sentences in Abstracts from PubMed Using Multiple User-Defined Keywords.

PubMed

Chen, Chou-Cheng; Ho, Chung-Liang

2014-01-01

While a huge amount of information about biological literature can be obtained by searching the PubMed database, reading through all the titles and abstracts resulting from such a search for useful information is inefficient. Text mining makes it possible to increase this efficiency. Some websites use text mining to gather information from the PubMed database; however, they are database-oriented, using pre-defined search keywords while lacking a query interface for user-defined search inputs. We present the PubMed Abstract Reading Helper (PubstractHelper) website which combines text mining and reading assistance for an efficient PubMed search. PubstractHelper can accept a maximum of ten groups of keywords, within each group containing up to ten keywords. The principle behind the text-mining function of PubstractHelper is that keywords contained in the same sentence are likely to be related. PubstractHelper highlights sentences with co-occurring keywords in different colors. The user can download the PMID and the abstracts with color markings to be reviewed later. The PubstractHelper website can help users to identify relevant publications based on the presence of related keywords, which should be a handy tool for their research. http://bio.yungyun.com.tw/ATM/PubstractHelper.aspx and http://holab.med.ncku.edu.tw/ATM/PubstractHelper.aspx.
Invisible Authorship: Women's Names, Databases, and Technology.

ERIC Educational Resources Information Center

Tescione, Susan M.

Bibliographic databases act as search tools to locate relevant literature and information, but they also disseminate information about the works indexed in the records. Articles and authors that cannot be found cannot be cited, and the ability to disseminate data for that particular work is diminished. This study found no significant differences…
Pathway Tools version 19.0 update: software for pathway/genome informatics and systems biology.

PubMed

Karp, Peter D; Latendresse, Mario; Paley, Suzanne M; Krummenacker, Markus; Ong, Quang D; Billington, Richard; Kothari, Anamika; Weaver, Daniel; Lee, Thomas; Subhraveti, Pallavi; Spaulding, Aaron; Fulcher, Carol; Keseler, Ingrid M; Caspi, Ron

2016-09-01

Pathway Tools is a bioinformatics software environment with a broad set of capabilities. The software provides genome-informatics tools such as a genome browser, sequence alignments, a genome-variant analyzer and comparative-genomics operations. It offers metabolic-informatics tools, such as metabolic reconstruction, quantitative metabolic modeling, prediction of reaction atom mappings and metabolic route search. Pathway Tools also provides regulatory-informatics tools, such as the ability to represent and visualize a wide range of regulatory interactions. This article outlines the advances in Pathway Tools in the past 5 years. Major additions include components for metabolic modeling, metabolic route search, computation of atom mappings and estimation of compound Gibbs free energies of formation; addition of editors for signaling pathways, for genome sequences and for cellular architecture; storage of gene essentiality data and phenotype data; display of multiple alignments, and of signaling and electron-transport pathways; and development of Python and web-services application programming interfaces. Scientists around the world have created more than 9800 Pathway/Genome Databases by using Pathway Tools, many of which are curated databases for important model organisms. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
XTCE GOVSAT Tool Suite 1.0

NASA Technical Reports Server (NTRS)

Rice, J. Kevin

2013-01-01

The XTCE GOVSAT software suite contains three tools: validation, search, and reporting. The Extensible Markup Language (XML) Telemetric and Command Exchange (XTCE) GOVSAT Tool Suite is written in Java for manipulating XTCE XML files. XTCE is a Consultative Committee for Space Data Systems (CCSDS) and Object Management Group (OMG) specification for describing the format and information in telemetry and command packet streams. These descriptions are files that are used to configure real-time telemetry and command systems for mission operations. XTCE s purpose is to exchange database information between different systems. XTCE GOVSAT consists of rules for narrowing the use of XTCE for missions. The Validation Tool is used to syntax check GOVSAT XML files. The Search Tool is used to search (i.e. command and telemetry mnemonics) the GOVSAT XML files and view the results. Finally, the Reporting Tool is used to create command and telemetry reports. These reports can be displayed or printed for use by the operations team.
Nucleic Acid Database (NDB)

Science.gov Websites

the NDB archive or in the Non-Redundant list Advanced Search Search for structures based on structural features, chemical features, binding modes, citation and experimental information Featured Tools RNA 3D Motif Atlas, a representative collection of RNA 3D internal and hairpin loop motifs Non-redundant Lists
Serials Information on CD-ROM: A Reference Perspective.

ERIC Educational Resources Information Center

Karch, Linda S.

1990-01-01

Describes Ulrich's PLUS (a CD-ROM version of Ulrich's serials directories) and EBSCO's CD-ROM version of "The Serials Directory," and compares the two in terms of their use as reference tools. Areas discussed include database content, user aids, system features, search features, and a comparison of search results. Equipment requirements…
The LANL hemorrhagic fever virus database, a new platform for analyzing biothreat viruses.

PubMed

Kuiken, Carla; Thurmond, Jim; Dimitrijevic, Mira; Yoon, Hyejin

2012-01-01

Hemorrhagic fever viruses (HFVs) are a diverse set of over 80 viral species, found in 10 different genera comprising five different families: arena-, bunya-, flavi-, filo- and togaviridae. All these viruses are highly variable and evolve rapidly, making them elusive targets for the immune system and for vaccine and drug design. About 55,000 HFV sequences exist in the public domain today. A central website that provides annotated sequences and analysis tools will be helpful to HFV researchers worldwide. The HFV sequence database collects and stores sequence data and provides a user-friendly search interface and a large number of sequence analysis tools, following the model of the highly regarded and widely used Los Alamos HIV database [Kuiken, C., B. Korber, and R.W. Shafer, HIV sequence databases. AIDS Rev, 2003. 5: p. 52-61]. The database uses an algorithm that aligns each sequence to a species-wide reference sequence. The NCBI RefSeq database [Sayers et al. (2011) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res., 39, D38-D51.] is used for this; if a reference sequence is not available, a Blast search finds the best candidate. Using this method, sequences in each genus can be retrieved pre-aligned. The HFV website can be accessed via http://hfv.lanl.gov.
LoopX: A Graphical User Interface-Based Database for Comprehensive Analysis and Comparative Evaluation of Loops from Protein Structures.

PubMed

Kadumuri, Rajashekar Varma; Vadrevu, Ramakrishna

2017-10-01

Due to their crucial role in function, folding, and stability, protein loops are being targeted for grafting/designing to create novel or alter existing functionality and improve stability and foldability. With a view to facilitate a thorough analysis and effectual search options for extracting and comparing loops for sequence and structural compatibility, we developed, LoopX a comprehensively compiled library of sequence and conformational features of ∼700,000 loops from protein structures. The database equipped with a graphical user interface is empowered with diverse query tools and search algorithms, with various rendering options to visualize the sequence- and structural-level information along with hydrogen bonding patterns, backbone φ, ψ dihedral angles of both the target and candidate loops. Two new features (i) conservation of the polar/nonpolar environment and (ii) conservation of sequence and conformation of specific residues within the loops have also been incorporated in the search and retrieval of compatible loops for a chosen target loop. Thus, the LoopX server not only serves as a database and visualization tool for sequence and structural analysis of protein loops but also aids in extracting and comparing candidate loops for a given target loop based on user-defined search options.
User Guidelines for the Brassica Database: BRAD.

PubMed

Wang, Xiaobo; Cheng, Feng; Wang, Xiaowu

2016-01-01

The genome sequence of Brassica rapa was first released in 2011. Since then, further Brassica genomes have been sequenced or are undergoing sequencing. It is therefore necessary to develop tools that help users to mine information from genomic data efficiently. This will greatly aid scientific exploration and breeding application, especially for those with low levels of bioinformatic training. Therefore, the Brassica database (BRAD) was built to collect, integrate, illustrate, and visualize Brassica genomic datasets. BRAD provides useful searching and data mining tools, and facilitates the search of gene annotation datasets, syntenic or non-syntenic orthologs, and flanking regions of functional genomic elements. It also includes genome-analysis tools such as BLAST and GBrowse. One of the important aims of BRAD is to build a bridge between Brassica crop genomes with the genome of the model species Arabidopsis thaliana, thus transferring the bulk of A. thaliana gene study information for use with newly sequenced Brassica crops.
Mass measurement errors of Fourier-transform mass spectrometry (FTMS): distribution, recalibration, and application.

PubMed

Zhang, Jiyang; Ma, Jie; Dou, Lei; Wu, Songfeng; Qian, Xiaohong; Xie, Hongwei; Zhu, Yunping; He, Fuchu

2009-02-01

The hybrid linear trap quadrupole Fourier-transform (LTQ-FT) ion cyclotron resonance mass spectrometer, an instrument with high accuracy and resolution, is widely used in the identification and quantification of peptides and proteins. However, time-dependent errors in the system may lead to deterioration of the accuracy of these instruments, negatively influencing the determination of the mass error tolerance (MET) in database searches. Here, a comprehensive discussion of LTQ/FT precursor ion mass error is provided. On the basis of an investigation of the mass error distribution, we propose an improved recalibration formula and introduce a new tool, FTDR (Fourier-transform data recalibration), that employs a graphic user interface (GUI) for automatic calibration. It was found that the calibration could adjust the mass error distribution to more closely approximate a normal distribution and reduce the standard deviation (SD). Consequently, we present a new strategy, LDSF (Large MET database search and small MET filtration), for database search MET specification and validation of database search results. As the name implies, a large-MET database search is conducted and the search results are then filtered using the statistical MET estimated from high-confidence results. By applying this strategy to a standard protein data set and a complex data set, we demonstrate the LDSF can significantly improve the sensitivity of the result validation procedure.
Citation searching: a systematic review case study of multiple risk behaviour interventions

PubMed Central

2014-01-01

Background The value of citation searches as part of the systematic review process is currently unknown. While the major guides to conducting systematic reviews state that citation searching should be carried out in addition to searching bibliographic databases there are still few studies in the literature that support this view. Rather than using a predefined search strategy to retrieve studies, citation searching uses known relevant papers to identify further papers. Methods We describe a case study about the effectiveness of using the citation sources Google Scholar, Scopus, Web of Science and OVIDSP MEDLINE to identify records for inclusion in a systematic review. We used the 40 included studies identified by traditional database searches from one systematic review of interventions for multiple risk behaviours. We searched for each of the included studies in the four citation sources to retrieve the details of all papers that have cited these studies. We carried out two analyses; the first was to examine the overlap between the four citation sources to identify which citation tool was the most useful; the second was to investigate whether the citation searches identified any relevant records in addition to those retrieved by the original database searches. Results The highest number of citations was retrieved from Google Scholar (1680), followed by Scopus (1173), then Web of Science (1095) and lastly OVIDSP (213). To retrieve all the records identified by the citation tracking searching all four resources was required. Google Scholar identified the highest number of unique citations. The citation tracking identified 9 studies that met the review’s inclusion criteria. Eight of these had already been identified by the traditional databases searches and identified in the screening process while the ninth was not available in any of the databases when the original searches were carried out. It would, however, have been identified by two of the database search strategies if searches had been carried out later. Conclusions Based on the results from this investigation, citation searching as a supplementary search method for systematic reviews may not be the best use of valuable time and resources. It would be useful to verify these findings in other reviews. PMID:24893958
SIRW: A web server for the Simple Indexing and Retrieval System that combines sequence motif searches with keyword searches.

PubMed

Ramu, Chenna

2003-07-01

SIRW (http://sirw.embl.de/) is a World Wide Web interface to the Simple Indexing and Retrieval System (SIR) that is capable of parsing and indexing various flat file databases. In addition it provides a framework for doing sequence analysis (e.g. motif pattern searches) for selected biological sequences through keyword search. SIRW is an ideal tool for the bioinformatics community for searching as well as analyzing biological sequences of interest.
The Hawaiian Algal Database: a laboratory LIMS and online resource for biodiversity data

PubMed Central

Wang, Norman; Sherwood, Alison R; Kurihara, Akira; Conklin, Kimberly Y; Sauvage, Thomas; Presting, Gernot G

2009-01-01

Background Organization and presentation of biodiversity data is greatly facilitated by databases that are specially designed to allow easy data entry and organized data display. Such databases also have the capacity to serve as Laboratory Information Management Systems (LIMS). The Hawaiian Algal Database was designed to showcase specimens collected from the Hawaiian Archipelago, enabling users around the world to compare their specimens with our photographs and DNA sequence data, and to provide lab personnel with an organizational tool for storing various biodiversity data types. Description We describe the Hawaiian Algal Database, a comprehensive and searchable database containing photographs and micrographs, geo-referenced collecting information, taxonomic checklists and standardized DNA sequence data. All data for individual samples are linked through unique accession numbers. Users can search online for sample information by accession number, numerous levels of taxonomy, or collection site. At the present time the database contains data representing over 2,000 samples of marine, freshwater and terrestrial algae from the Hawaiian Archipelago. These samples are primarily red algae, although other taxa are being added. Conclusion The Hawaiian Algal Database is a digital repository for Hawaiian algal samples and acts as a LIMS for the laboratory. Users can make use of the online search tool to view and download specimen photographs and micrographs, DNA sequences and relevant habitat data, including georeferenced collecting locations. It is publicly available at . PMID:19728892

EuPathDB: the eukaryotic pathogen genomics database resource

PubMed Central

Aurrecoechea, Cristina; Barreto, Ana; Basenko, Evelina Y.; Brestelli, John; Brunk, Brian P.; Cade, Shon; Crouch, Kathryn; Doherty, Ryan; Falke, Dave; Fischer, Steve; Gajria, Bindu; Harb, Omar S.; Heiges, Mark; Hertz-Fowler, Christiane; Hu, Sufen; Iodice, John; Kissinger, Jessica C.; Lawrence, Cris; Li, Wei; Pinney, Deborah F.; Pulman, Jane A.; Roos, David S.; Shanmugasundram, Achchuthan; Silva-Franco, Fatima; Steinbiss, Sascha; Stoeckert, Christian J.; Spruill, Drew; Wang, Haiming; Warrenfeltz, Susanne; Zheng, Jie

2017-01-01

The Eukaryotic Pathogen Genomics Database Resource (EuPathDB, http://eupathdb.org) is a collection of databases covering 170+ eukaryotic pathogens (protists & fungi), along with relevant free-living and non-pathogenic species, and select pathogen hosts. To facilitate the discovery of meaningful biological relationships, the databases couple preconfigured searches with visualization and analysis tools for comprehensive data mining via intuitive graphical interfaces and APIs. All data are analyzed with the same workflows, including creation of gene orthology profiles, so data are easily compared across data sets, data types and organisms. EuPathDB is updated with numerous new analysis tools, features, data sets and data types. New tools include GO, metabolic pathway and word enrichment analyses plus an online workspace for analysis of personal, non-public, large-scale data. Expanded data content is mostly genomic and functional genomic data while new data types include protein microarray, metabolic pathways, compounds, quantitative proteomics, copy number variation, and polysomal transcriptomics. New features include consistent categorization of searches, data sets and genome browser tracks; redesigned gene pages; effective integration of alternative transcripts; and a EuPathDB Galaxy instance for private analyses of a user's data. Forthcoming upgrades include user workspaces for private integration of data with existing EuPathDB data and improved integration and presentation of host–pathogen interactions. PMID:27903906
Bioinformatics Approaches to Classifying Allergens and Predicting Cross-Reactivity

PubMed Central

Schein, Catherine H.; Ivanciuc, Ovidiu; Braun, Werner

2007-01-01

The major advances in understanding why patients respond to several seemingly different stimuli have been through the isolation, sequencing and structural analysis of proteins that induce an IgE response. The most significant finding is that allergenic proteins from very different sources can have nearly identical sequences and structures, and that this similarity can account for clinically observed cross-reactivity. The increasing amount of information on the sequence, structure and IgE epitopes of allergens is now available in several databases and powerful bioinformatics search tools allow user access to relevant information. Here, we provide an overview of these databases and describe state-of-the art bioinformatics tools to identify the common proteins that may be at the root of multiple allergy syndromes. Progress has also been made in quantitatively defining characteristics that discriminate allergens from non-allergens. Search and software tools for this purpose have been developed and implemented in the Structural Database of Allergenic Proteins (SDAP, http://fermi.utmb.edu/SDAP/). SDAP contains information for over 800 allergens and extensive bibliographic references in a relational database with links to other publicly available databases. SDAP is freely available on the Web to clinicians and patients, and can be used to find structural and functional relations among known allergens and to identify potentially cross-reacting antigens. Here we illustrate how these bioinformatics tools can be used to group allergens, and to detect areas that may account for common patterns of IgE binding and cross-reactivity. Such results can be used to guide treatment regimens for allergy sufferers. PMID:17276876
Supervised learning of tools for content-based search of image databases

NASA Astrophysics Data System (ADS)

Delanoy, Richard L.

1996-03-01

A computer environment, called the Toolkit for Image Mining (TIM), is being developed with the goal of enabling users with diverse interests and varied computer skills to create search tools for content-based image retrieval and other pattern matching tasks. Search tools are generated using a simple paradigm of supervised learning that is based on the user pointing at mistakes of classification made by the current search tool. As mistakes are identified, a learning algorithm uses the identified mistakes to build up a model of the user's intentions, construct a new search tool, apply the search tool to a test image, display the match results as feedback to the user, and accept new inputs from the user. Search tools are constructed in the form of functional templates, which are generalized matched filters capable of knowledge- based image processing. The ability of this system to learn the user's intentions from experience contrasts with other existing approaches to content-based image retrieval that base searches on the characteristics of a single input example or on a predefined and semantically- constrained textual query. Currently, TIM is capable of learning spectral and textural patterns, but should be adaptable to the learning of shapes, as well. Possible applications of TIM include not only content-based image retrieval, but also quantitative image analysis, the generation of metadata for annotating images, data prioritization or data reduction in bandwidth-limited situations, and the construction of components for larger, more complex computer vision algorithms.
Ocean Drilling Program: Publication Services: Online Manuscript Submission

Science.gov Websites

products Drilling services and tools Online Janus database Search the ODP/TAMU web site ODP/TAMU Science Operator Home ODP's main web site Publications Policy Author Instructions Scientific Results Manuscript use the submission and review forms available on the IODP-USIO publications web site. ODP | Search
A knowledge based search tool for performance measures in health care systems.

PubMed

Beyan, Oya D; Baykal, Nazife

2012-02-01

Performance measurement is vital for improving the health care systems. However, we are still far from having accepted performance measurement models. Researchers and developers are seeking comparable performance indicators. We developed an intelligent search tool to identify appropriate measures for specific requirements by matching diverse care settings. We reviewed the literature and analyzed 229 performance measurement studies published after 2000. These studies are evaluated with an original theoretical framework and stored in the database. A semantic network is designed for representing domain knowledge and supporting reasoning. We have applied knowledge based decision support techniques to cope with uncertainty problems. As a result we designed a tool which simplifies the performance indicator search process and provides most relevant indicators by employing knowledge based systems.
Comparative homology agreement search: An effective combination of homology-search methods

PubMed Central

Alam, Intikhab; Dress, Andreas; Rehmsmeier, Marc; Fuellen, Georg

2004-01-01

Many methods have been developed to search for homologous members of a protein family in databases, and the reliability of results and conclusions may be compromised if only one method is used, neglecting the others. Here we introduce a general scheme for combining such methods. Based on this scheme, we implemented a tool called comparative homology agreement search (chase) that integrates different search strategies to obtain a combined “E value.” Our results show that a consensus method integrating distinct strategies easily outperforms any of its component algorithms. More specifically, an evaluation based on the Structural Classification of Proteins database reveals that, on average, a coverage of 47% can be obtained in searches for distantly related homologues (i.e., members of the same superfamily but not the same family, which is a very difficult task), accepting only 10 false positives, whereas the individual methods obtain a coverage of 28–38%. PMID:15367730
A Fast, Minimalist Search Tool for Remote Sensing Data

NASA Astrophysics Data System (ADS)

Lynnes, C. S.; Macharrie, P. G.; Elkins, M.; Joshi, T.; Fenichel, L. H.

2005-12-01

We present a tool that emphasizes speed and simplicity in searching remotely sensed Earth Science data. The tool, nicknamed "Mirador" (Spanish for a scenic overlook), provides only four freetext search form fields, for Keywords, Location, Data Start and Data Stop. This contrasts with many current Earth Science search tools that offer highly structured interfaces in order to ensure precise, non-zero results. The disadvantages of the structured approach lie in its complexity and resultant learning curve, as well as the time it takes to formulate and execute the search, thus discouraging iterative discovery. On the other hand, the success of the basic Google search interface shows that many users are willing to forgo high search precision if the search process is fast enough to enable rapid iteration. Therefore, we employ several methods to increase the speed of search formulation and execution. Search formulation is expedited by the minimalist search form, with only one required field. Also, a gazetteer enables the use of geographic terms as shorthand for latitude/longitude coordinates. The search execution is accelerated by initially presenting dataset results (returned from a Google Mini appliance) with an estimated number of "hits" for each dataset based on the user's space-time constraints. The more costly file-level search is executed against a PostGres database only when the user "drills down", and then covering only the fraction of the time period needed to return the next page of results. The simplicity of the search form makes the tool easy to learn and use, and the speed of the searches enables an iterative form of data discovery.
CADDIS Volume 5. Causal Databases: CADLink

EPA Pesticide Factsheets

CADLink, an improved tool for searching and organizing literature-based evidence, will be released in Fall 2016. The original CADDIS literature resource, CADLit, is unavailable as we make these improvements.
The LANL hemorrhagic fever virus database, a new platform for analyzing biothreat viruses

PubMed Central

Kuiken, Carla; Thurmond, Jim; Dimitrijevic, Mira; Yoon, Hyejin

2012-01-01

Hemorrhagic fever viruses (HFVs) are a diverse set of over 80 viral species, found in 10 different genera comprising five different families: arena-, bunya-, flavi-, filo- and togaviridae. All these viruses are highly variable and evolve rapidly, making them elusive targets for the immune system and for vaccine and drug design. About 55 000 HFV sequences exist in the public domain today. A central website that provides annotated sequences and analysis tools will be helpful to HFV researchers worldwide. The HFV sequence database collects and stores sequence data and provides a user-friendly search interface and a large number of sequence analysis tools, following the model of the highly regarded and widely used Los Alamos HIV database [Kuiken, C., B. Korber, and R.W. Shafer, HIV sequence databases. AIDS Rev, 2003. 5: p. 52–61]. The database uses an algorithm that aligns each sequence to a species-wide reference sequence. The NCBI RefSeq database [Sayers et al. (2011) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res., 39, D38–D51.] is used for this; if a reference sequence is not available, a Blast search finds the best candidate. Using this method, sequences in each genus can be retrieved pre-aligned. The HFV website can be accessed via http://hfv.lanl.gov. PMID:22064861
The Auroral Planetary Imaging and Spectroscopy (APIS) service

NASA Astrophysics Data System (ADS)

Lamy, L.; Prangé, R.; Henry, F.; Le Sidaner, P.

2015-06-01

The Auroral Planetary Imaging and Spectroscopy (APIS) service, accessible online, provides an open and interactive access to processed auroral observations of the outer planets and their satellites. Such observations are of interest for a wide community at the interface between planetology, magnetospheric and heliospheric physics. APIS consists of (i) a high level database, built from planetary auroral observations acquired by the Hubble Space Telescope (HST) since 1997 with its mostly used Far-Ultraviolet spectro-imagers, (ii) a dedicated search interface aimed at browsing efficiently this database through relevant conditional search criteria and (iii) the ability to interactively work with the data online through plotting tools developed by the Virtual Observatory (VO) community, such as Aladin and Specview. This service is VO compliant and can therefore also been queried by external search tools of the VO community. The diversity of available data and the capability to sort them out by relevant physical criteria shall in particular facilitate statistical studies, on long-term scales and/or multi-instrumental multi-spectral combined analysis.
Ocean Drilling Program: Privacy Policy

Science.gov Websites

and products Drilling services and tools Online Janus database Search the ODP/TAMU web site ODP's main web site ODP/TAMU Science Operator Home Ocean Drilling Program Privacy Policy The following is the privacy policy for the www-odp.tamu.edu web site. 1. Cookies are used in the Database portion of the web
Identifying Gel-Separated Proteins Using In-Gel Digestion, Mass Spectrometry, and Database Searching: Consider the Chemistry

ERIC Educational Resources Information Center

Albright, Jessica C.; Dassenko, David J.; Mohamed, Essa A.; Beussman, Douglas J.

2009-01-01

Matrix-assisted laser desorption/ionization (MALDI) mass spectrometry is an important bioanalytical technique in drug discovery, proteomics, and research at the biology-chemistry interface. This is an especially powerful tool when combined with gel separation of proteins and database mining using the mass spectral data. Currently, few hands-on…
Find a Bed Bug Pesticide Product

EPA Pesticide Factsheets

Introduces the Bed Bug Product Search Tool, to help consumers find EPA-registered pesticides for bed bug infestation control. Inclusion in this database is not an endorsement. Always follow label directions carefully.
Information extraction for enhanced access to disease outbreak reports.

PubMed

Grishman, Ralph; Huttunen, Silja; Yangarber, Roman

2002-08-01

Document search is generally based on individual terms in the document. However, for collections within limited domains it is possible to provide more powerful access tools. This paper describes a system designed for collections of reports of infectious disease outbreaks. The system, Proteus-BIO, automatically creates a table of outbreaks, with each table entry linked to the document describing that outbreak; this makes it possible to use database operations such as selection and sorting to find relevant documents. Proteus-BIO consists of a Web crawler which gathers relevant documents; an information extraction engine which converts the individual outbreak events to a tabular database; and a database browser which provides access to the events and, through them, to the documents. The information extraction engine uses sets of patterns and word classes to extract the information about each event. Preparing these patterns and word classes has been a time-consuming manual operation in the past, but automated discovery tools now make this task significantly easier. A small study comparing the effectiveness of the tabular index with conventional Web search tools demonstrated that users can find substantially more documents in a given time period with Proteus-BIO.
The APIS service : a tool for accessing value-added HST planetary auroral observations over 1997-2015

NASA Astrophysics Data System (ADS)

Lamy, L.; Henry, F.; Prangé, R.; Le Sidaner, P.

2015-10-01

The Auroral Planetary Imaging and Spectroscopy (APIS) service http://obspm.fr/apis/ provides an open and interactive access to processed auroral observations of the outer planets and their satellites. Such observations are of interest for a wide community at the interface between planetology, magnetospheric and heliospheric physics. APIS consists of (i) a high level database, built from planetary auroral observations acquired by the Hubble Space Telescope (HST) since 1997 with its mostly used Far-Ultraviolet spectro- imagers, (ii) a dedicated search interface aimed at browsing efficiently this database through relevant conditional search criteria (Figure 1) and (iii) the ability to interactively work with the data online through plotting tools developed by the Virtual Observatory (VO) community, such as Aladin and Specview. This service is VO compliant and can therefore also been queried by external search tools of the VO community. The diversity of available data and the capability to sort them out by relevant physical criteria shall in particular facilitate statistical studies, on long-term scales and/or multi-instrumental multispectral combined analysis [1,2]. We will present the updated capabilities of APIS with several examples. Several tutorials are available online.
ClinicalKey: a point-of-care search engine.

PubMed

Vardell, Emily

2013-01-01

ClinicalKey is a new point-of-care resource for health care professionals. Through controlled vocabulary, ClinicalKey offers a cross section of resources on diseases and procedures, from journals to e-books and practice guidelines to patient education. A sample search was conducted to demonstrate the features of the database, and a comparison with similar tools is presented.
SCOOP: A Measurement and Database of Student Online Search Behavior and Performance

ERIC Educational Resources Information Center

Zhou, Mingming

2015-01-01

The ability to access and process massive amounts of online information is required in many learning situations. In order to develop a better understanding of student online search process especially in academic contexts, an online tool (SCOOP) is developed for tracking mouse behavior on the web to build a more extensive account of student web…
Paths of Discovery: Comparing the Search Effectiveness of EBSCO Discovery Service, Summon, Google Scholar, and Conventional Library Resources

ERIC Educational Resources Information Center

Asher, Andrew D.; Duke, Lynda M.; Wilson, Suzanne

2013-01-01

In 2011, researchers at Bucknell University and Illinois Wesleyan University compared the search efficacy of Serial Solutions Summon, EBSCO Discovery Service, Google Scholar, and conventional library databases. Using a mixed-methods approach, qualitative and quantitative data were gathered on students' usage of these tools. Regardless of the…
The NIDDK Information Network: A Community Portal for Finding Data, Materials, and Tools for Researchers Studying Diabetes, Digestive, and Kidney Diseases

PubMed Central

Whetzel, Patricia L.; Grethe, Jeffrey S.; Banks, Davis E.; Martone, Maryann E.

2015-01-01

The NIDDK Information Network (dkNET; http://dknet.org) was launched to serve the needs of basic and clinical investigators in metabolic, digestive and kidney disease by facilitating access to research resources that advance the mission of the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK). By research resources, we mean the multitude of data, software tools, materials, services, projects and organizations available to researchers in the public domain. Most of these are accessed via web-accessible databases or web portals, each developed, designed and maintained by numerous different projects, organizations and individuals. While many of the large government funded databases, maintained by agencies such as European Bioinformatics Institute and the National Center for Biotechnology Information, are well known to researchers, many more that have been developed by and for the biomedical research community are unknown or underutilized. At least part of the problem is the nature of dynamic databases, which are considered part of the “hidden” web, that is, content that is not easily accessed by search engines. dkNET was created specifically to address the challenge of connecting researchers to research resources via these types of community databases and web portals. dkNET functions as a “search engine for data”, searching across millions of database records contained in hundreds of biomedical databases developed and maintained by independent projects around the world. A primary focus of dkNET are centers and projects specifically created to provide high quality data and resources to NIDDK researchers. Through the novel data ingest process used in dkNET, additional data sources can easily be incorporated, allowing it to scale with the growth of digital data and the needs of the dkNET community. Here, we provide an overview of the dkNET portal and its functions. We show how dkNET can be used to address a variety of use cases that involve searching for research resources. PMID:26393351
Online tools for individuals with depression and neurologic conditions: A scoping review.

PubMed

Lukmanji, Sara; Pham, Tram; Blaikie, Laura; Clark, Callie; Jetté, Nathalie; Wiebe, Samuel; Bulloch, Andrew; Holroyd-Leduc, Jayna; Macrodimitris, Sophia; Mackie, Aaron; Patten, Scott B

2017-08-01

Patients with neurologic conditions commonly have depression. Online tools have the potential to improve outcomes in these patients in an efficient and accessible manner. We aimed to identify evidence-informed online tools for patients with comorbid neurologic conditions and depression. A scoping review of online tools (free, publicly available, and not requiring a facilitator) for patients with depression and epilepsy, Parkinson disease (PD), multiple sclerosis (MS), traumatic brain injury (TBI), or migraine was conducted. MEDLINE, EMBASE, PsycINFO, Cochrane Database of Systematic Reviews, and Cochrane CENTRAL Register of Controlled Trials were searched from database inception to January 2017 for all 5 neurologic conditions. Gray literature using Google and Google Scholar as well as app stores for both Android and Apple devices were searched. Self-management or self-efficacy online tools were not included unless they were specifically targeted at depression and one of the neurologic conditions and met the other eligibility criteria. Only 4 online tools were identified. Of these 4 tools, 2 were web-based self-management programs for patients with migraine or MS and depression. The other 2 were mobile apps for patients with PD or TBI and depression. No online tools were found for epilepsy. There are limited depression tools for people with neurologic conditions that are evidence-informed, publicly available, and free. Future research should focus on the development of high-quality, evidence-based online tools targeted at neurologic patients.

Exploring Cystic Fibrosis Using Bioinformatics Tools: A Module Designed for the Freshman Biology Course

ERIC Educational Resources Information Center

Zhang, Xiaorong

2011-01-01

We incorporated a bioinformatics component into the freshman biology course that allows students to explore cystic fibrosis (CF), a common genetic disorder, using bioinformatics tools and skills. Students learn about CF through searching genetic databases, analyzing genetic sequences, and observing the three-dimensional structures of proteins…
GenderMedDB: an interactive database of sex and gender-specific medical literature.

PubMed

Oertelt-Prigione, Sabine; Gohlke, Björn-Oliver; Dunkel, Mathias; Preissner, Robert; Regitz-Zagrosek, Vera

2014-01-01

Searches for sex and gender-specific publications are complicated by the absence of a specific algorithm within search engines and by the lack of adequate archives to collect the retrieved results. We previously addressed this issue by initiating the first systematic archive of medical literature containing sex and/or gender-specific analyses. This initial collection has now been greatly enlarged and re-organized as a free user-friendly database with multiple functions: GenderMedDB (http://gendermeddb.charite.de). GenderMedDB retrieves the included publications from the PubMed database. Manuscripts containing sex and/or gender-specific analysis are continuously screened and the relevant findings organized systematically into disciplines and diseases. Publications are furthermore classified by research type, subject and participant numbers. More than 11,000 abstracts are currently included in the database, after screening more than 40,000 publications. The main functions of the database include searches by publication data or content analysis based on pre-defined classifications. In addition, registrants are enabled to upload relevant publications, access descriptive publication statistics and interact in an open user forum. Overall, GenderMedDB offers the advantages of a discipline-specific search engine as well as the functions of a participative tool for the gender medicine community.
UniPROBE, update 2011: expanded content and search tools in the online database of protein-binding microarray data on protein-DNA interactions.

PubMed

Robasky, Kimberly; Bulyk, Martha L

2011-01-01

The Universal PBM Resource for Oligonucleotide-Binding Evaluation (UniPROBE) database is a centralized repository of information on the DNA-binding preferences of proteins as determined by universal protein-binding microarray (PBM) technology. Each entry for a protein (or protein complex) in UniPROBE provides the quantitative preferences for all possible nucleotide sequence variants ('words') of length k ('k-mers'), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. In this update, we describe >130% expansion of the database content, incorporation of a protein BLAST (blastp) tool for finding protein sequence matches in UniPROBE, the introduction of UniPROBE accession numbers and additional database enhancements. The UniPROBE database is available at http://uniprobe.org.
Visibiome: an efficient microbiome search engine based on a scalable, distributed architecture.

PubMed

Azman, Syafiq Kamarul; Anwar, Muhammad Zohaib; Henschel, Andreas

2017-07-24

Given the current influx of 16S rRNA profiles of microbiota samples, it is conceivable that large amounts of them eventually are available for search, comparison and contextualization with respect to novel samples. This process facilitates the identification of similar compositional features in microbiota elsewhere and therefore can help to understand driving factors for microbial community assembly. We present Visibiome, a microbiome search engine that can perform exhaustive, phylogeny based similarity search and contextualization of user-provided samples against a comprehensive dataset of 16S rRNA profiles environments, while tackling several computational challenges. In order to scale to high demands, we developed a distributed system that combines web framework technology, task queueing and scheduling, cloud computing and a dedicated database server. To further ensure speed and efficiency, we have deployed Nearest Neighbor search algorithms, capable of sublinear searches in high-dimensional metric spaces in combination with an optimized Earth Mover Distance based implementation of weighted UniFrac. The search also incorporates pairwise (adaptive) rarefaction and optionally, 16S rRNA copy number correction. The result of a query microbiome sample is the contextualization against a comprehensive database of microbiome samples from a diverse range of environments, visualized through a rich set of interactive figures and diagrams, including barchart-based compositional comparisons and ranking of the closest matches in the database. Visibiome is a convenient, scalable and efficient framework to search microbiomes against a comprehensive database of environmental samples. The search engine leverages a popular but computationally expensive, phylogeny based distance metric, while providing numerous advantages over the current state of the art tool.
Database resources of the National Center for Biotechnology Information

PubMed Central

Wheeler, David L.; Barrett, Tanya; Benson, Dennis A.; Bryant, Stephen H.; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M.; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Geer, Lewis Y.; Helmberg, Wolfgang; Kapustin, Yuri; Kenton, David L.; Khovayko, Oleg; Lipman, David J.; Madden, Thomas L.; Maglott, Donna R.; Ostell, James; Pruitt, Kim D.; Schuler, Gregory D.; Schriml, Lynn M.; Sequeira, Edwin; Sherry, Stephen T.; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Suzek, Tugba O.; Tatusov, Roman; Tatusova, Tatiana A.; Wagner, Lukas; Yaschenko, Eugene

2006-01-01

In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups, Retroviral Genotyping Tools, HIV-1, Human Protein Interaction Database, SAGEmap, Gene Expression Omnibus, Entrez Probe, GENSAT, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of the resources can be accessed through the NCBI home page at: . PMID:16381840
Database resources of the National Center for Biotechnology Information.

PubMed

Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bolton, Evan; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; Dicuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Krasnov, Sergey; Landsman, David; Lipman, David J; Lu, Zhiyong; Madden, Thomas L; Madej, Tom; Maglott, Donna R; Marchler-Bauer, Aron; Miller, Vadim; Karsch-Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Wang, Yanli; Wilbur, W John; Yaschenko, Eugene; Ye, Jian

2012-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
Database resources of the National Center for Biotechnology

PubMed Central

Wheeler, David L.; Church, Deanna M.; Federhen, Scott; Lash, Alex E.; Madden, Thomas L.; Pontius, Joan U.; Schuler, Gregory D.; Schriml, Lynn M.; Sequeira, Edwin; Tatusova, Tatiana A.; Wagner, Lukas

2003-01-01

In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, PubMed, PubMed Central (PMC), LocusLink, the NCBITaxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR (e-PCR), Open Reading Frame (ORF) Finder, References Sequence (RefSeq), UniGene, HomoloGene, ProtEST, Database of Single Nucleotide Polymorphisms (dbSNP), Human/Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes and related tools, the Map Viewer, Model Maker (MM), Evidence Viewer (EV), Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), and the Conserved Domain Architecture Retrieval Tool (CDART). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov. PMID:12519941
The Magnetics Information Consortium (MagIC) Online Database: Uploading, Searching and Visualizing Paleomagnetic and Rock Magnetic Data

NASA Astrophysics Data System (ADS)

Koppers, A.; Tauxe, L.; Constable, C.; Pisarevsky, S.; Jackson, M.; Solheid, P.; Banerjee, S.; Johnson, C.; Genevey, A.; Delaney, R.; Baker, P.; Sbarbori, E.

2005-12-01

The Magnetics Information Consortium (MagIC) operates an online relational database including both rock and paleomagnetic data. The goal of MagIC is to store all measurements and their derived properties for studies of paleomagnetic directions (inclination, declination) and their intensities, and for rock magnetic experiments (hysteresis, remanence, susceptibility, anisotropy). MagIC is hosted under EarthRef.org at http://earthref.org/MAGIC/ and has two search nodes, one for paleomagnetism and one for rock magnetism. These nodes provide basic search capabilities based on location, reference, methods applied, material type and geological age, while allowing the user to drill down from sites all the way to the measurements. At each stage, the data can be saved and, if the available data supports it, the data can be visualized by plotting equal area plots, VGP location maps or typical Zijderveld, hysteresis, FORC, and various magnetization and remanence diagrams. All plots are made in SVG (scalable vector graphics) and thus can be saved and easily read into the user's favorite graphics programs without loss of resolution. User contributions to the MagIC database are critical to achieve a useful research tool. We have developed a standard data and metadata template (version 1.6) that can be used to format and upload all data at the time of publication in Earth Science journals. Software tools are provided to facilitate easy population of these templates within Microsoft Excel. These tools allow for the import/export of text files and they provide advanced functionality to manage/edit the data, and to perform various internal checks to high grade the data and to make them ready for uploading. The uploading is all done online by using the MagIC Contribution Wizard at http://earthref.org/MAGIC/upload.htm that takes only a few minutes to process a contribution of approximately 5,000 data records. After uploading these standardized MagIC template files will be stored in the digital archives of EarthRef.org from where they can be downloaded at all times. Finally, the contents of these template files will be automatically parsed into the online relational database, making the data available for online searches in the paleomagnetic and rock magnetic search nodes. The MagIC database contains all data transferred from the IAGA paleomagnetic poles database (GPMDB), the lava flow paleosecular variation database (PSVRL), lake sediment database (SECVR) and the PINT database. In addition to that a substantial number of data compiled under the Time Averaged Field Investigations project is now included plus a significant fraction of the data collected at SIO and the IRM. Ongoing additions of legacy data include ~40 papers from studies on the Hawaiian Islands, data compilations from archeomagnetic studies and updates to the lake sediment dataset.
SATRAT: Staphylococcus aureus transcript regulatory network analysis tool.

PubMed

Gopal, Tamilselvi; Nagarajan, Vijayaraj; Elasri, Mohamed O

2015-01-01

Staphylococcus aureus is a commensal organism that primarily colonizes the nose of healthy individuals. S. aureus causes a spectrum of infections that range from skin and soft-tissue infections to fatal invasive diseases. S. aureus uses a large number of virulence factors that are regulated in a coordinated fashion. The complex regulatory mechanisms have been investigated in numerous high-throughput experiments. Access to this data is critical to studying this pathogen. Previously, we developed a compilation of microarray experimental data to enable researchers to search, browse, compare, and contrast transcript profiles. We have substantially updated this database and have built a novel exploratory tool-SATRAT-the S. aureus transcript regulatory network analysis tool, based on the updated database. This tool is capable of performing deep searches using a query and generating an interactive regulatory network based on associations among the regulators of any query gene. We believe this integrated regulatory network analysis tool would help researchers explore the missing links and identify novel pathways that regulate virulence in S. aureus. Also, the data model and the network generation code used to build this resource is open sourced, enabling researchers to build similar resources for other bacterial systems.
NASA Technology Takes Center Stage

NASA Technical Reports Server (NTRS)

2004-01-01

In today's fast-paced business world, there is often more information available to researchers than there is time to search through it. Data mining has become the answer to finding the proverbial "needle in a haystack," as companies must be able to quickly locate specific pieces of information from large collections of data. Perilog, a suite of data-mining tools, searches for hidden patterns in large databases to determine previously unrecognized relationships. By retrieving and organizing contextually relevant data from any sequence of terms - from genetic data to musical notes - the software can intelligently compile information about desired topics from databases.
VaProS: a database-integration approach for protein/genome information retrieval.

PubMed

Gojobori, Takashi; Ikeo, Kazuho; Katayama, Yukie; Kawabata, Takeshi; Kinjo, Akira R; Kinoshita, Kengo; Kwon, Yeondae; Migita, Ohsuke; Mizutani, Hisashi; Muraoka, Masafumi; Nagata, Koji; Omori, Satoshi; Sugawara, Hideaki; Yamada, Daichi; Yura, Kei

2016-12-01

Life science research now heavily relies on all sorts of databases for genome sequences, transcription, protein three-dimensional (3D) structures, protein-protein interactions, phenotypes and so forth. The knowledge accumulated by all the omics research is so vast that a computer-aided search of data is now a prerequisite for starting a new study. In addition, a combinatory search throughout these databases has a chance to extract new ideas and new hypotheses that can be examined by wet-lab experiments. By virtually integrating the related databases on the Internet, we have built a new web application that facilitates life science researchers for retrieving experts' knowledge stored in the databases and for building a new hypothesis of the research target. This web application, named VaProS, puts stress on the interconnection between the functional information of genome sequences and protein 3D structures, such as structural effect of the gene mutation. In this manuscript, we present the notion of VaProS, the databases and tools that can be accessed without any knowledge of database locations and data formats, and the power of search exemplified in quest of the molecular mechanisms of lysosomal storage disease. VaProS can be freely accessed at http://p4d-info.nig.ac.jp/vapros/ .
Glycan fragment database: a database of PDB-based glycan 3D structures.

PubMed

Jo, Sunhwan; Im, Wonpil

2013-01-01

The glycan fragment database (GFDB), freely available at http://www.glycanstructure.org, is a database of the glycosidic torsion angles derived from the glycan structures in the Protein Data Bank (PDB). Analogous to protein structure, the structure of an oligosaccharide chain in a glycoprotein, referred to as a glycan, can be characterized by the torsion angles of glycosidic linkages between relatively rigid carbohydrate monomeric units. Knowledge of accessible conformations of biologically relevant glycans is essential in understanding their biological roles. The GFDB provides an intuitive glycan sequence search tool that allows the user to search complex glycan structures. After a glycan search is complete, each glycosidic torsion angle distribution is displayed in terms of the exact match and the fragment match. The exact match results are from the PDB entries that contain the glycan sequence identical to the query sequence. The fragment match results are from the entries with the glycan sequence whose substructure (fragment) or entire sequence is matched to the query sequence, such that the fragment results implicitly include the influences from the nearby carbohydrate residues. In addition, clustering analysis based on the torsion angle distribution can be performed to obtain the representative structures among the searched glycan structures.
Kangaroo – A pattern-matching program for biological sequences

PubMed Central

2002-01-01

Background Biologists are often interested in performing a simple database search to identify proteins or genes that contain a well-defined sequence pattern. Many databases do not provide straightforward or readily available query tools to perform simple searches, such as identifying transcription binding sites, protein motifs, or repetitive DNA sequences. However, in many cases simple pattern-matching searches can reveal a wealth of information. We present in this paper a regular expression pattern-matching tool that was used to identify short repetitive DNA sequences in human coding regions for the purpose of identifying potential mutation sites in mismatch repair deficient cells. Results Kangaroo is a web-based regular expression pattern-matching program that can search for patterns in DNA, protein, or coding region sequences in ten different organisms. The program is implemented to facilitate a wide range of queries with no restriction on the length or complexity of the query expression. The program is accessible on the web at http://bioinfo.mshri.on.ca/kangaroo/ and the source code is freely distributed at http://sourceforge.net/projects/slritools/. Conclusion A low-level simple pattern-matching application can prove to be a useful tool in many research settings. For example, Kangaroo was used to identify potential genetic targets in a human colorectal cancer variant that is characterized by a high frequency of mutations in coding regions containing mononucleotide repeats. PMID:12150718
UnCover on the Web: search hints and applications in library environments.

PubMed

Galpern, N F; Albert, K M

1997-01-01

Among the huge maze of resources available on the Internet, UnCoverWeb stands out as a valuable tool for medical libraries. This up-to-date, free-access, multidisciplinary database of periodical references is searched through an easy-to-learn graphical user interface that is a welcome improvement over the telnet version. This article reviews the basic and advanced search techniques for UnCoverWeb, as well as providing information on the document delivery functions and table of contents alerting service called Reveal. UnCover's currency is evaluated and compared with other current awareness resources. System deficiencies are discussed, with the conclusion that although UnCoverWeb lacks the sophisticated features of many commercial database search services, it is nonetheless a useful addition to the repertoire of information sources available in a library.
Introducing a New Interface for the Online MagIC Database by Integrating Data Uploading, Searching, and Visualization

NASA Astrophysics Data System (ADS)

Jarboe, N.; Minnett, R.; Constable, C.; Koppers, A. A.; Tauxe, L.

2013-12-01

The Magnetics Information Consortium (MagIC) is dedicated to supporting the paleomagnetic, geomagnetic, and rock magnetic communities through the development and maintenance of an online database (http://earthref.org/MAGIC/), data upload and quality control, searches, data downloads, and visualization tools. While MagIC has completed importing some of the IAGA paleomagnetic databases (TRANS, PINT, PSVRL, GPMDB) and continues to import others (ARCHEO, MAGST and SECVR), further individual data uploading from the community contributes a wealth of easily-accessible rich datasets. Previously uploading of data to the MagIC database required the use of an Excel spreadsheet using either a Mac or PC. The new method of uploading data utilizes an HTML 5 web interface where the only computer requirement is a modern browser. This web interface will highlight all errors discovered in the dataset at once instead of the iterative error checking process found in the previous Excel spreadsheet data checker. As a web service, the community will always have easy access to the most up-to-date and bug free version of the data upload software. The filtering search mechanism of the MagIC database has been changed to a more intuitive system where the data from each contribution is displayed in tables similar to how the data is uploaded (http://earthref.org/MAGIC/search/). Searches themselves can be saved as a permanent URL, if desired. The saved search URL could then be used as a citation in a publication. When appropriate, plots (equal area, Zijderveld, ARAI, demagnetization, etc.) are associated with the data to give the user a quicker understanding of the underlying dataset. The MagIC database will continue to evolve to meet the needs of the paleomagnetic, geomagnetic, and rock magnetic communities.
SEQATOMS: a web tool for identifying missing regions in PDB in sequence context.

PubMed

Brandt, Bernd W; Heringa, Jaap; Leunissen, Jack A M

2008-07-01

With over 46 000 proteins, the Protein Data Bank (PDB) is the most important database with structural information of biological macromolecules. PDB files contain sequence and coordinate information. Residues present in the sequence can be absent from the coordinate section, which means their position in space is unknown. Similarity searches are routinely carried out against sequences taken from PDB SEQRES. However, there no distinction is made between residues that have a known or unknown position in the 3D protein structure. We present a FASTA sequence database that is produced by combining the sequence and coordinate information. All residues absent from the PDB coordinate section are masked with lower-case letters, thereby providing a view of these residues in the context of the entire protein sequence, which facilitates inspecting 'missing' regions. We also provide a masked version of the CATH domain database. A user-friendly BLAST interface is available for similarity searching. In contrast to standard (stand-alone) BLAST output, which only contains upper-case letters, our output retains the lower-case letters of the masked regions. Thus, our server can be used to perform BLAST searching case-sensitively. Here, we have applied it to the study of missing regions in their sequence context. SEQATOMS is available at http://www.bioinformatics.nl/tools/seqatoms/.
Quality tools and resources to support organisational improvement integral to high-quality primary care: a systematic review of published and grey literature.

PubMed

Janamian, Tina; Upham, Susan J; Crossland, Lisa; Jackson, Claire L

2016-04-18

To conduct a systematic review of the literature to identify existing online primary care quality improvement tools and resources to support organisational improvement related to the seven elements in the Primary Care Practice Improvement Tool (PC-PIT), with the identified tools and resources to progress to a Delphi study for further assessment of relevance and utility. Systematic review of the international published and grey literature. CINAHL, Embase and PubMed databases were searched in March 2014 for articles published between January 2004 and December 2013. GreyNet International and other relevant websites and repositories were also searched in March-April 2014 for documents dated between 1992 and 2012. All citations were imported into a bibliographic database. Published and unpublished tools and resources were included in the review if they were in English, related to primary care quality improvement and addressed any of the seven PC-PIT elements of a high-performing practice. Tools and resources that met the eligibility criteria were then evaluated for their accessibility, relevance, utility and comprehensiveness using a four-criteria appraisal framework. We used a data extraction template to systematically extract information from eligible tools and resources. A content analysis approach was used to explore the tools and resources and collate relevant information: name of the tool or resource, year and country of development, author, name of the organisation that provided access and its URL, accessibility information or problems, overview of each tool or resource and the quality improvement element(s) it addresses. If available, a copy of the tool or resource was downloaded into the bibliographic database, along with supporting evidence (published or unpublished) on its use in primary care. This systematic review identified 53 tools and resources that can potentially be provided as part of a suite of tools and resources to support primary care practices in improving the quality of their practice, to achieve improved health outcomes.
DOE Office of Scientific and Technical Information (OSTI.GOV)

IRIS is a search tool plug-in that is used to implement latent topic feedback for enhancing text navigation. It accepts a list of returned documents from an information retrieval wywtem that is generated from keyword search queries. Data is pulled directly from a topic information database and processed by IRIS to determine the most prominent and relevant topics, along with topic-ngrams, associated with the list of returned documents. User selected topics are then used to expand the query and presumabley refine the search results.
Dfam: a database of repetitive DNA based on profile hidden Markov models.

PubMed

Wheeler, Travis J; Clements, Jody; Eddy, Sean R; Hubley, Robert; Jones, Thomas A; Jurka, Jerzy; Smit, Arian F A; Finn, Robert D

2013-01-01

We present a database of repetitive DNA elements, called Dfam (http://dfam.janelia.org). Many genomes contain a large fraction of repetitive DNA, much of which is made up of remnants of transposable elements (TEs). Accurate annotation of TEs enables research into their biology and can shed light on the evolutionary processes that shape genomes. Identification and masking of TEs can also greatly simplify many downstream genome annotation and sequence analysis tasks. The commonly used TE annotation tools RepeatMasker and Censor depend on sequence homology search tools such as cross_match and BLAST variants, as well as Repbase, a collection of known TE families each represented by a single consensus sequence. Dfam contains entries corresponding to all Repbase TE entries for which instances have been found in the human genome. Each Dfam entry is represented by a profile hidden Markov model, built from alignments generated using RepeatMasker and Repbase. When used in conjunction with the hidden Markov model search tool nhmmer, Dfam produces a 2.9% increase in coverage over consensus sequence search methods on a large human benchmark, while maintaining low false discovery rates, and coverage of the full human genome is 54.5%. The website provides a collection of tools and data views to support improved TE curation and annotation efforts. Dfam is also available for download in flat file format or in the form of MySQL table dumps.
Comparison of PubMed, Scopus, Web of Science, and Google Scholar: strengths and weaknesses.

PubMed

Falagas, Matthew E; Pitsouni, Eleni I; Malietzis, George A; Pappas, Georgios

2008-02-01

The evolution of the electronic age has led to the development of numerous medical databases on the World Wide Web, offering search facilities on a particular subject and the ability to perform citation analysis. We compared the content coverage and practical utility of PubMed, Scopus, Web of Science, and Google Scholar. The official Web pages of the databases were used to extract information on the range of journals covered, search facilities and restrictions, and update frequency. We used the example of a keyword search to evaluate the usefulness of these databases in biomedical information retrieval and a specific published article to evaluate their utility in performing citation analysis. All databases were practical in use and offered numerous search facilities. PubMed and Google Scholar are accessed for free. The keyword search with PubMed offers optimal update frequency and includes online early articles; other databases can rate articles by number of citations, as an index of importance. For citation analysis, Scopus offers about 20% more coverage than Web of Science, whereas Google Scholar offers results of inconsistent accuracy. PubMed remains an optimal tool in biomedical electronic research. Scopus covers a wider journal range, of help both in keyword searching and citation analysis, but it is currently limited to recent articles (published after 1995) compared with Web of Science. Google Scholar, as for the Web in general, can help in the retrieval of even the most obscure information but its use is marred by inadequate, less often updated, citation information.

Mind Maps: Hot New Tools Proposed for Cyberspace Librarians.

ERIC Educational Resources Information Center

Humphreys, Nancy K.

1999-01-01

Describes how online searchers can use a software tool based on back-of-the-book indexes to assist in dealing with search engine databases compiled by spiders that crawl across the entire Internet or through large Web sites. Discusses human versus machine knowledge, conversion of indexes to mind maps or mini-thesauri, middleware, eXtensible Markup…
The Magnetics Information Consortium (MagIC) Online Database: Uploading, Searching and Visualizing Paleomagnetic and Rock Magnetic Data

NASA Astrophysics Data System (ADS)

Minnett, R.; Koppers, A.; Tauxe, L.; Constable, C.; Pisarevsky, S. A.; Jackson, M.; Solheid, P.; Banerjee, S.; Johnson, C.

2006-12-01

The Magnetics Information Consortium (MagIC) is commissioned to implement and maintain an online portal to a relational database populated by both rock and paleomagnetic data. The goal of MagIC is to archive all measurements and the derived properties for studies of paleomagnetic directions (inclination, declination) and intensities, and for rock magnetic experiments (hysteresis, remanence, susceptibility, anisotropy). MagIC is hosted under EarthRef.org at http://earthref.org/MAGIC/ and has two search nodes, one for paleomagnetism and one for rock magnetism. Both nodes provide query building based on location, reference, methods applied, material type and geological age, as well as a visual map interface to browse and select locations. The query result set is displayed in a digestible tabular format allowing the user to descend through hierarchical levels such as from locations to sites, samples, specimens, and measurements. At each stage, the result set can be saved and, if supported by the data, can be visualized by plotting global location maps, equal area plots, or typical Zijderveld, hysteresis, and various magnetization and remanence diagrams. User contributions to the MagIC database are critical to achieving a useful research tool. We have developed a standard data and metadata template (Version 2.1) that can be used to format and upload all data at the time of publication in Earth Science journals. Software tools are provided to facilitate population of these templates within Microsoft Excel. These tools allow for the import/export of text files and provide advanced functionality to manage and edit the data, and to perform various internal checks to maintain data integrity and prepare for uploading. The MagIC Contribution Wizard at http://earthref.org/MAGIC/upload.htm executes the upload and takes only a few minutes to process several thousand data records. The standardized MagIC template files are stored in the digital archives of EarthRef.org where they remain available for download by the public (in both text and Excel format). Finally, the contents of these template files are automatically parsed into the online relational database, making the data available for online searches in the paleomagnetic and rock magnetic search nodes. The MagIC database contains all data transferred from the IAGA paleomagnetic poles database (GPMDB), the lava flow paleosecular variation database (PSVRL), lake sediment database (SECVR) and the PINT database. Additionally, a substantial number of data compiled under the Time Averaged Field Investigations project is now included plus a significant fraction of the data collected at SIO and the IRM. Ongoing additions of legacy data include over 40 papers from studies on the Hawaiian Islands and Mexico, data compilations from archeomagnetic studies and updates to the lake sediment dataset.
Techniques for searching the CINAHL database using the EBSCO interface.

PubMed

Lawrence, Janna C

2007-04-01

The cumulative index to Nursing and Allied Health Literature (CINAHL) is a useful research tool for accessing articles of interest to nurses and health care professionals. More than 2,800 journals are indexed by CINAHL and can be searched easily using assigned subject headings. Detailed instructions about conducting, combining, and saving searches in CINAHL are provided in this article. Establishing an account at EBSCO further allows a nurse to save references and searches and to receive e-mail alerts when new articles on a topic of interest are published.
DynGO: a tool for visualizing and mining of Gene Ontology and its associations

PubMed Central

Liu, Hongfang; Hu, Zhang-Zhi; Wu, Cathy H

2005-01-01

Background A large volume of data and information about genes and gene products has been stored in various molecular biology databases. A major challenge for knowledge discovery using these databases is to identify related genes and gene products in disparate databases. The development of Gene Ontology (GO) as a common vocabulary for annotation allows integrated queries across multiple databases and identification of semantically related genes and gene products (i.e., genes and gene products that have similar GO annotations). Meanwhile, dozens of tools have been developed for browsing, mining or editing GO terms, their hierarchical relationships, or their "associated" genes and gene products (i.e., genes and gene products annotated with GO terms). Tools that allow users to directly search and inspect relations among all GO terms and their associated genes and gene products from multiple databases are needed. Results We present a standalone package called DynGO, which provides several advanced functionalities in addition to the standard browsing capability of the official GO browsing tool (AmiGO). DynGO allows users to conduct batch retrieval of GO annotations for a list of genes and gene products, and semantic retrieval of genes and gene products sharing similar GO annotations. The result are shown in an association tree organized according to GO hierarchies and supported with many dynamic display options such as sorting tree nodes or changing orientation of the tree. For GO curators and frequent GO users, DynGO provides fast and convenient access to GO annotation data. DynGO is generally applicable to any data set where the records are annotated with GO terms, as illustrated by two examples. Conclusion We have presented a standalone package DynGO that provides functionalities to search and browse GO and its association databases as well as several additional functions such as batch retrieval and semantic retrieval. The complete documentation and software are freely available for download from the website . PMID:16091147
ASGARD: an open-access database of annotated transcriptomes for emerging model arthropod species.

PubMed

Zeng, Victor; Extavour, Cassandra G

2012-01-01

The increased throughput and decreased cost of next-generation sequencing (NGS) have shifted the bottleneck genomic research from sequencing to annotation, analysis and accessibility. This is particularly challenging for research communities working on organisms that lack the basic infrastructure of a sequenced genome, or an efficient way to utilize whatever sequence data may be available. Here we present a new database, the Assembled Searchable Giant Arthropod Read Database (ASGARD). This database is a repository and search engine for transcriptomic data from arthropods that are of high interest to multiple research communities but currently lack sequenced genomes. We demonstrate the functionality and utility of ASGARD using de novo assembled transcriptomes from the milkweed bug Oncopeltus fasciatus, the cricket Gryllus bimaculatus and the amphipod crustacean Parhyale hawaiensis. We have annotated these transcriptomes to assign putative orthology, coding region determination, protein domain identification and Gene Ontology (GO) term annotation to all possible assembly products. ASGARD allows users to search all assemblies by orthology annotation, GO term annotation or Basic Local Alignment Search Tool. User-friendly features of ASGARD include search term auto-completion suggestions based on database content, the ability to download assembly product sequences in FASTA format, direct links to NCBI data for predicted orthologs and graphical representation of the location of protein domains and matches to similar sequences from the NCBI non-redundant database. ASGARD will be a useful repository for transcriptome data from future NGS studies on these and other emerging model arthropods, regardless of sequencing platform, assembly or annotation status. This database thus provides easy, one-stop access to multi-species annotated transcriptome information. We anticipate that this database will be useful for members of multiple research communities, including developmental biology, physiology, evolutionary biology, ecology, comparative genomics and phylogenomics. Database URL: asgard.rc.fas.harvard.edu.
RTECS database (on the internet). Online data

DOE Office of Scientific and Technical Information (OSTI.GOV)

NONE

The Registry of Toxic Effects of Chemical Substances (RTECS (trademark)) is a database of toxicological information compiled, maintained, and updated by the National Institute for Occupational Safety and Health. The program is mandated by the Occupational Safety and Health Act of 1970. The original edition, known as the `Toxic Substances List,` was published on June 28, 1971, and included toxicologic data for approximately 5,000 chemicals. Since that time, the list has continuously grown and been updated, and its name changed to the current title, `Registry of Toxic Effects of Chemical Substances.` RTECS (trademark) now contains over 133,000 chemicals as NIOSHmore » strives to fulfill the mandate to list `all known toxic substances...and the concentrations at which...toxicity is known to occur.` This database is now available for searching through the Gov. Research-Center (GRC) service. GRC is a single online web-based search service to well known Government databases. Featuring powerful search and retrieval software, GRC is an important research tool. The GRC web site is at http://grc.ntis.gov.« less
New Tools to Document and Manage Data/Metadata: Example NGEE Arctic and ARM

NASA Astrophysics Data System (ADS)

Crow, M. C.; Devarakonda, R.; Killeffer, T.; Hook, L.; Boden, T.; Wullschleger, S.

2017-12-01

Tools used for documenting, archiving, cataloging, and searching data are critical pieces of informatics. This poster describes tools being used in several projects at Oak Ridge National Laboratory (ORNL), with a focus on the U.S. Department of Energy's Next Generation Ecosystem Experiment in the Arctic (NGEE Arctic) and Atmospheric Radiation Measurements (ARM) project, and their usage at different stages of the data lifecycle. The Online Metadata Editor (OME) is used for the documentation and archival stages while a Data Search tool supports indexing, cataloging, and searching. The NGEE Arctic OME Tool [1] provides a method by which researchers can upload their data and provide original metadata with each upload while adhering to standard metadata formats. The tool is built upon a Java SPRING framework to parse user input into, and from, XML output. Many aspects of the tool require use of a relational database including encrypted user-login, auto-fill functionality for predefined sites and plots, and file reference storage and sorting. The Data Search Tool conveniently displays each data record in a thumbnail containing the title, source, and date range, and features a quick view of the metadata associated with that record, as well as a direct link to the data. The search box incorporates autocomplete capabilities for search terms and sorted keyword filters are available on the side of the page, including a map for geo-searching. These tools are supported by the Mercury [2] consortium (funded by DOE, NASA, USGS, and ARM) and developed and managed at Oak Ridge National Laboratory. Mercury is a set of tools for collecting, searching, and retrieving metadata and data. Mercury collects metadata from contributing project servers, then indexes the metadata to make it searchable using Apache Solr, and provides access to retrieve it from the web page. Metadata standards that Mercury supports include: XML, Z39.50, FGDC, Dublin-Core, Darwin-Core, EML, and ISO-19115.
Finding Protein and Nucleotide Similarities with FASTA

PubMed Central

Pearson, William R.

2016-01-01

The FASTA programs provide a comprehensive set of rapid similarity searching tools ( fasta36, fastx36, tfastx36, fasty36, tfasty36), similar to those provided by the BLAST package, as well as programs for slower, optimal, local and global similarity searches ( ssearch36, ggsearch36) and for searching with short peptides and oligonucleotides ( fasts36, fastm36). The FASTA programs use an empirical strategy for estimating statistical significance that accommodates a range of similarity scoring matrices and gap penalties, improving alignment boundary accuracy and search sensitivity (Unit 3.5). The FASTA programs can produce “BLAST-like” alignment and tabular output, for ease of integration into existing analysis pipelines, and can search small, representative databases, and then report results for a larger set of sequences, using links from the smaller dataset. The FASTA programs work with a wide variety of database formats, including mySQL and postgreSQL databases (Unit 9.4). The programs also provide a strategy for integrating domain and active site annotations into alignments and highlighting the mutational state of functionally critical residues. These protocols describe how to use the FASTA programs to characterize protein and DNA sequences, using protein:protein, protein:DNA, and DNA:DNA comparisons. PMID:27010337
Finding Protein and Nucleotide Similarities with FASTA.

PubMed

Pearson, William R

2016-03-24

The FASTA programs provide a comprehensive set of rapid similarity searching tools (fasta36, fastx36, tfastx36, fasty36, tfasty36), similar to those provided by the BLAST package, as well as programs for slower, optimal, local, and global similarity searches (ssearch36, ggsearch36), and for searching with short peptides and oligonucleotides (fasts36, fastm36). The FASTA programs use an empirical strategy for estimating statistical significance that accommodates a range of similarity scoring matrices and gap penalties, improving alignment boundary accuracy and search sensitivity. The FASTA programs can produce "BLAST-like" alignment and tabular output, for ease of integration into existing analysis pipelines, and can search small, representative databases, and then report results for a larger set of sequences, using links from the smaller dataset. The FASTA programs work with a wide variety of database formats, including mySQL and postgreSQL databases. The programs also provide a strategy for integrating domain and active site annotations into alignments and highlighting the mutational state of functionally critical residues. These protocols describe how to use the FASTA programs to characterize protein and DNA sequences, using protein:protein, protein:DNA, and DNA:DNA comparisons. Copyright © 2016 John Wiley & Sons, Inc.
HMMER web server: 2018 update.

PubMed

Potter, Simon C; Luciani, Aurélien; Eddy, Sean R; Park, Youngmi; Lopez, Rodrigo; Finn, Robert D

2018-06-14

The HMMER webserver [http://www.ebi.ac.uk/Tools/hmmer] is a free-to-use service which provides fast searches against widely used sequence databases and profile hidden Markov model (HMM) libraries using the HMMER software suite (http://hmmer.org). The results of a sequence search may be summarized in a number of ways, allowing users to view and filter the significant hits by domain architecture or taxonomy. For large scale usage, we provide an application programmatic interface (API) which has been expanded in scope, such that all result presentations are available via both HTML and API. Furthermore, we have refactored our JavaScript visualization library to provide standalone components for different result representations. These consume the aforementioned API and can be integrated into third-party websites. The range of databases that can be searched against has been expanded, adding four sequence datasets (12 in total) and one profile HMM library (6 in total). To help users explore the biological context of their results, and to discover new data resources, search results are now supplemented with cross references to other EMBL-EBI databases.
Update on Genomic Databases and Resources at the National Center for Biotechnology Information.

PubMed

Tatusova, Tatiana

2016-01-01

The National Center for Biotechnology Information (NCBI), as a primary public repository of genomic sequence data, collects and maintains enormous amounts of heterogeneous data. Data for genomes, genes, gene expressions, gene variation, gene families, proteins, and protein domains are integrated with the analytical, search, and retrieval resources through the NCBI website, text-based search and retrieval system, provides a fast and easy way to navigate across diverse biological databases.Comparative genome analysis tools lead to further understanding of evolution processes quickening the pace of discovery. Recent technological innovations have ignited an explosion in genome sequencing that has fundamentally changed our understanding of the biology of living organisms. This huge increase in DNA sequence data presents new challenges for the information management system and the visualization tools. New strategies have been designed to bring an order to this genome sequence shockwave and improve the usability of associated data.
A systematic review on critical thinking in medical education.

PubMed

Chan, Zenobia C Y

2016-04-18

Critical thinking is the ability to raise discriminating questions in an attempt to search for better ideas, a deeper understanding and better solutions relating to a given issue. This systematic review provides a summary of efforts that have been made to enhance and assess critical thinking in medical education. Nine databases [Ovid MEDLINE(R), AMED, Academic Search Premier, ERIC, CINAHL, Web of Science, JSTOR, SCOPUS and PsycINFO] were searched to identify journal articles published from the start of each database to October 2012. A total of 41 articles published from 1981 to 2012 were categorised into two main themes: (i) evaluation of current education on critical thinking and (ii) development of new strategies about critical thinking. Under each theme, the teaching strategies, assessment tools, uses of multimedia and stakeholders were analysed. While a majority of studies developed teaching strategies and multimedia tools, a further examination of their quality and variety could yield some insights. The articles on assessment placed a greater focus on learning outcomes than on learning processes. It is expected that more research will be conducted on teacher development and students' voices.
GEOmetadb: powerful alternative search engine for the Gene Expression Omnibus

PubMed Central

Zhu, Yuelin; Davis, Sean; Stephens, Robert; Meltzer, Paul S.; Chen, Yidong

2008-01-01

The NCBI Gene Expression Omnibus (GEO) represents the largest public repository of microarray data. However, finding data in GEO can be challenging. We have developed GEOmetadb in an attempt to make querying the GEO metadata both easier and more powerful. All GEO metadata records as well as the relationships between them are parsed and stored in a local MySQL database. A powerful, flexible web search interface with several convenient utilities provides query capabilities not available via NCBI tools. In addition, a Bioconductor package, GEOmetadb that utilizes a SQLite export of the entire GEOmetadb database is also available, rendering the entire GEO database accessible with full power of SQL-based queries from within R. Availability: The web interface and SQLite databases available at http://gbnci.abcc.ncifcrf.gov/geo/. The Bioconductor package is available via the Bioconductor project. The corresponding MATLAB implementation is also available at the same website. Contact: yidong@mail.nih.gov PMID:18842599
An Automated Ab Initio Framework for Identifying New Ferroelectrics

NASA Astrophysics Data System (ADS)

Smidt, Tess; Reyes-Lillo, Sebastian E.; Jain, Anubhav; Neaton, Jeffrey B.

Ferroelectric materials have a wide-range of technological applications including non-volatile RAM and optoelectronics. In this work, we present an automated first-principles search for ferroelectrics. We integrate density functional theory, crystal structure databases, symmetry tools, workflow software, and a custom analysis toolkit to build a library of known and proposed ferroelectrics. We screen thousands of candidates using symmetry relations between nonpolar and polar structure pairs. We use two search strategies 1) polar-nonpolar pairs with the same composition and 2) polar-nonpolar structure type pairs. Results are automatically parsed, stored in a database, and accessible via a web interface showing distortion animations and plots of polarization and total energy as a function of distortion. We benchmark our results against experimental data, present new ferroelectric candidates found through our search, and discuss future work on expanding this search methodology to other material classes such as anti-ferroelectrics and multiferroics.
The MAJORANA Parts Tracking Database

NASA Astrophysics Data System (ADS)

Abgrall, N.; Aguayo, E.; Avignone, F. T.; Barabash, A. S.; Bertrand, F. E.; Brudanin, V.; Busch, M.; Byram, D.; Caldwell, A. S.; Chan, Y.-D.; Christofferson, C. D.; Combs, D. C.; Cuesta, C.; Detwiler, J. A.; Doe, P. J.; Efremenko, Yu.; Egorov, V.; Ejiri, H.; Elliott, S. R.; Esterline, J.; Fast, J. E.; Finnerty, P.; Fraenkle, F. M.; Galindo-Uribarri, A.; Giovanetti, G. K.; Goett, J.; Green, M. P.; Gruszko, J.; Guiseppe, V. E.; Gusev, K.; Hallin, A. L.; Hazama, R.; Hegai, A.; Henning, R.; Hoppe, E. W.; Howard, S.; Howe, M. A.; Keeter, K. J.; Kidd, M. F.; Kochetov, O.; Konovalov, S. I.; Kouzes, R. T.; LaFerriere, B. D.; Leon, J. Diaz; Leviner, L. E.; Loach, J. C.; MacMullin, J.; Martin, R. D.; Meijer, S. J.; Mertens, S.; Miller, M. L.; Mizouni, L.; Nomachi, M.; Orrell, J. L.; O`Shaughnessy, C.; Overman, N. R.; Petersburg, R.; Phillips, D. G.; Poon, A. W. P.; Pushkin, K.; Radford, D. C.; Rager, J.; Rielage, K.; Robertson, R. G. H.; Romero-Romero, E.; Ronquest, M. C.; Shanks, B.; Shima, T.; Shirchenko, M.; Snavely, K. J.; Snyder, N.; Soin, A.; Suriano, A. M.; Tedeschi, D.; Thompson, J.; Timkin, V.; Tornow, W.; Trimble, J. E.; Varner, R. L.; Vasilyev, S.; Vetter, K.; Vorren, K.; White, B. R.; Wilkerson, J. F.; Wiseman, C.; Xu, W.; Yakushev, E.; Young, A. R.; Yu, C.-H.; Yumatov, V.; Zhitnikov, I.

2015-04-01

The MAJORANA DEMONSTRATOR is an ultra-low background physics experiment searching for the neutrinoless double beta decay of 76Ge. The MAJORANA Parts Tracking Database is used to record the history of components used in the construction of the DEMONSTRATOR. The tracking implementation takes a novel approach based on the schema-free database technology CouchDB. Transportation, storage, and processes undergone by parts such as machining or cleaning are linked to part records. Tracking parts provide a great logistics benefit and an important quality assurance reference during construction. In addition, the location history of parts provides an estimate of their exposure to cosmic radiation. A web application for data entry and a radiation exposure calculator have been developed as tools for achieving the extreme radio-purity required for this rare decay search.
Addition of a breeding database in the Genome Database for Rosaceae

PubMed Central

Evans, Kate; Jung, Sook; Lee, Taein; Brutcher, Lisa; Cho, Ilhyung; Peace, Cameron; Main, Dorrie

2013-01-01

Breeding programs produce large datasets that require efficient management systems to keep track of performance, pedigree, geographical and image-based data. With the development of DNA-based screening technologies, more breeding programs perform genotyping in addition to phenotyping for performance evaluation. The integration of breeding data with other genomic and genetic data is instrumental for the refinement of marker-assisted breeding tools, enhances genetic understanding of important crop traits and maximizes access and utility by crop breeders and allied scientists. Development of new infrastructure in the Genome Database for Rosaceae (GDR) was designed and implemented to enable secure and efficient storage, management and analysis of large datasets from the Washington State University apple breeding program and subsequently expanded to fit datasets from other Rosaceae breeders. The infrastructure was built using the software Chado and Drupal, making use of the Natural Diversity module to accommodate large-scale phenotypic and genotypic data. Breeders can search accessions within the GDR to identify individuals with specific trait combinations. Results from Search by Parentage lists individuals with parents in common and results from Individual Variety pages link to all data available on each chosen individual including pedigree, phenotypic and genotypic information. Genotypic data are searchable by markers and alleles; results are linked to other pages in the GDR to enable the user to access tools such as GBrowse and CMap. This breeding database provides users with the opportunity to search datasets in a fully targeted manner and retrieve and compare performance data from multiple selections, years and sites, and to output the data needed for variety release publications and patent applications. The breeding database facilitates efficient program management. Storing publicly available breeding data in a database together with genomic and genetic data will further accelerate the cross-utilization of diverse data types by researchers from various disciplines. Database URL: http://www.rosaceae.org/breeders_toolbox PMID:24247530
Addition of a breeding database in the Genome Database for Rosaceae.

PubMed

Evans, Kate; Jung, Sook; Lee, Taein; Brutcher, Lisa; Cho, Ilhyung; Peace, Cameron; Main, Dorrie

2013-01-01

Breeding programs produce large datasets that require efficient management systems to keep track of performance, pedigree, geographical and image-based data. With the development of DNA-based screening technologies, more breeding programs perform genotyping in addition to phenotyping for performance evaluation. The integration of breeding data with other genomic and genetic data is instrumental for the refinement of marker-assisted breeding tools, enhances genetic understanding of important crop traits and maximizes access and utility by crop breeders and allied scientists. Development of new infrastructure in the Genome Database for Rosaceae (GDR) was designed and implemented to enable secure and efficient storage, management and analysis of large datasets from the Washington State University apple breeding program and subsequently expanded to fit datasets from other Rosaceae breeders. The infrastructure was built using the software Chado and Drupal, making use of the Natural Diversity module to accommodate large-scale phenotypic and genotypic data. Breeders can search accessions within the GDR to identify individuals with specific trait combinations. Results from Search by Parentage lists individuals with parents in common and results from Individual Variety pages link to all data available on each chosen individual including pedigree, phenotypic and genotypic information. Genotypic data are searchable by markers and alleles; results are linked to other pages in the GDR to enable the user to access tools such as GBrowse and CMap. This breeding database provides users with the opportunity to search datasets in a fully targeted manner and retrieve and compare performance data from multiple selections, years and sites, and to output the data needed for variety release publications and patent applications. The breeding database facilitates efficient program management. Storing publicly available breeding data in a database together with genomic and genetic data will further accelerate the cross-utilization of diverse data types by researchers from various disciplines. Database URL: http://www.rosaceae.org/breeders_toolbox.
Protein Information Resource: a community resource for expert annotation of protein data

PubMed Central

Barker, Winona C.; Garavelli, John S.; Hou, Zhenglin; Huang, Hongzhan; Ledley, Robert S.; McGarvey, Peter B.; Mewes, Hans-Werner; Orcutt, Bruce C.; Pfeiffer, Friedhelm; Tsugita, Akira; Vinayaka, C. R.; Xiao, Chunlin; Yeh, Lai-Su L.; Wu, Cathy

2001-01-01

The Protein Information Resource, in collaboration with the Munich Information Center for Protein Sequences (MIPS) and the Japan International Protein Information Database (JIPID), produces the most comprehensive and expertly annotated protein sequence database in the public domain, the PIR-International Protein Sequence Database. To provide timely and high quality annotation and promote database interoperability, the PIR-International employs rule-based and classification-driven procedures based on controlled vocabulary and standard nomenclature and includes status tags to distinguish experimentally determined from predicted protein features. The database contains about 200 000 non-redundant protein sequences, which are classified into families and superfamilies and their domains and motifs identified. Entries are extensively cross-referenced to other sequence, classification, genome, structure and activity databases. The PIR web site features search engines that use sequence similarity and database annotation to facilitate the analysis and functional identification of proteins. The PIR-International databases and search tools are accessible on the PIR web site at http://pir.georgetown.edu/ and at the MIPS web site at http://www.mips.biochem.mpg.de. The PIR-International Protein Sequence Database and other files are also available by FTP. PMID:11125041
PARPs database: A LIMS systems for protein-protein interaction data mining or laboratory information management system

PubMed Central

Droit, Arnaud; Hunter, Joanna M; Rouleau, Michèle; Ethier, Chantal; Picard-Cloutier, Aude; Bourgais, David; Poirier, Guy G

2007-01-01

Background In the "post-genome" era, mass spectrometry (MS) has become an important method for the analysis of proteins and the rapid advancement of this technique, in combination with other proteomics methods, results in an increasing amount of proteome data. This data must be archived and analysed using specialized bioinformatics tools. Description We herein describe "PARPs database," a data analysis and management pipeline for liquid chromatography tandem mass spectrometry (LC-MS/MS) proteomics. PARPs database is a web-based tool whose features include experiment annotation, protein database searching, protein sequence management, as well as data-mining of the peptides and proteins identified. Conclusion Using this pipeline, we have successfully identified several interactions of biological significance between PARP-1 and other proteins, namely RFC-1, 2, 3, 4 and 5. PMID:18093328
Software Tools Streamline Project Management

NASA Technical Reports Server (NTRS)

2009-01-01

Three innovative software inventions from Ames Research Center (NETMARK, Program Management Tool, and Query-Based Document Management) are finding their way into NASA missions as well as industry applications. The first, NETMARK, is a program that enables integrated searching of data stored in a variety of databases and documents, meaning that users no longer have to look in several places for related information. NETMARK allows users to search and query information across all of these sources in one step. This cross-cutting capability in information analysis has exponentially reduced the amount of time needed to mine data from days or weeks to mere seconds. NETMARK has been used widely throughout NASA, enabling this automatic integration of information across many documents and databases. NASA projects that use NETMARK include the internal reporting system and project performance dashboard, Erasmus, NASA s enterprise management tool, which enhances organizational collaboration and information sharing through document routing and review; the Integrated Financial Management Program; International Space Station Knowledge Management; Mishap and Anomaly Information Reporting System; and management of the Mars Exploration Rovers. Approximately $1 billion worth of NASA s projects are currently managed using Program Management Tool (PMT), which is based on NETMARK. PMT is a comprehensive, Web-enabled application tool used to assist program and project managers within NASA enterprises in monitoring, disseminating, and tracking the progress of program and project milestones and other relevant resources. The PMT consists of an integrated knowledge repository built upon advanced enterprise-wide database integration techniques and the latest Web-enabled technologies. The current system is in a pilot operational mode allowing users to automatically manage, track, define, update, and view customizable milestone objectives and goals. The third software invention, Query-Based Document Management (QBDM) is a tool that enables content or context searches, either simple or hierarchical, across a variety of databases. The system enables users to specify notification subscriptions where they associate "contexts of interest" and "events of interest" to one or more documents or collection(s) of documents. Based on these subscriptions, users receive notification when the events of interest occur within the contexts of interest for associated document or collection(s) of documents. Users can also associate at least one notification time as part of the notification subscription, with at least one option for the time period of notifications.

CHRONIS: an animal chromosome image database.

PubMed

Toyabe, Shin-Ichi; Akazawa, Kouhei; Fukushi, Daisuke; Fukui, Kiichi; Ushiki, Tatsuo

2005-01-01

We have constructed a database system named CHRONIS (CHROmosome and Nano-Information System) to collect images of animal chromosomes and related nanotechnological information. CHRONIS enables rapid sharing of information on chromosome research among cell biologists and researchers in other fields via the Internet. CHRONIS is also intended to serve as a liaison tool for researchers who work in different centers. The image database contains more than 3,000 color microscopic images, including karyotypic images obtained from more than 1,000 species of animals. Researchers can browse the contents of the database using a usual World Wide Web interface in the following URL: http://chromosome.med.niigata-u.ac.jp/chronis/servlet/chronisservlet. The system enables users to input new images into the database, to locate images of interest by keyword searches, and to display the images with detailed information. CHRONIS has a wide range of applications, such as searching for appropriate probes for fluorescent in situ hybridization, comparing various kinds of microscopic images of a single species, and finding researchers working in the same field of interest.
Online chemical modeling environment (OCHEM): web platform for data storage, model development and publishing of chemical information

NASA Astrophysics Data System (ADS)

Sushko, Iurii; Novotarskyi, Sergii; Körner, Robert; Pandey, Anil Kumar; Rupp, Matthias; Teetz, Wolfram; Brandmaier, Stefan; Abdelaziz, Ahmed; Prokopenko, Volodymyr V.; Tanchuk, Vsevolod Y.; Todeschini, Roberto; Varnek, Alexandre; Marcou, Gilles; Ertl, Peter; Potemkin, Vladimir; Grishina, Maria; Gasteiger, Johann; Schwab, Christof; Baskin, Igor I.; Palyulin, Vladimir A.; Radchenko, Eugene V.; Welsh, William J.; Kholodovych, Vladyslav; Chekmarev, Dmitriy; Cherkasov, Artem; Aires-de-Sousa, Joao; Zhang, Qing-You; Bender, Andreas; Nigsch, Florian; Patiny, Luc; Williams, Antony; Tkachenko, Valery; Tetko, Igor V.

2011-06-01

The Online Chemical Modeling Environment is a web-based platform that aims to automate and simplify the typical steps required for QSAR modeling. The platform consists of two major subsystems: the database of experimental measurements and the modeling framework. A user-contributed database contains a set of tools for easy input, search and modification of thousands of records. The OCHEM database is based on the wiki principle and focuses primarily on the quality and verifiability of the data. The database is tightly integrated with the modeling framework, which supports all the steps required to create a predictive model: data search, calculation and selection of a vast variety of molecular descriptors, application of machine learning methods, validation, analysis of the model and assessment of the applicability domain. As compared to other similar systems, OCHEM is not intended to re-implement the existing tools or models but rather to invite the original authors to contribute their results, make them publicly available, share them with other users and to become members of the growing research community. Our intention is to make OCHEM a widely used platform to perform the QSPR/QSAR studies online and share it with other users on the Web. The ultimate goal of OCHEM is collecting all possible chemoinformatics tools within one simple, reliable and user-friendly resource. The OCHEM is free for web users and it is available online at http://www.ochem.eu.
Classroom Drama as an Instructional Tool. Learning Package No. 42.

ERIC Educational Resources Information Center

Simic, Marge, Comp.; Smith, Carl, Ed.

Originally developed as part of a project for the Department of Defense Schools (DoDDS) system, this learning package on classroom drama as an instructional tool is designed for teachers who wish to upgrade or expand their teaching skills on their own. The package includes an overview of the project; a comprehensive search of the ERIC database; a…
Mobile object retrieval in server-based image databases

NASA Astrophysics Data System (ADS)

Manger, D.; Pagel, F.; Widak, H.

2013-05-01

The increasing number of mobile phones equipped with powerful cameras leads to huge collections of user-generated images. To utilize the information of the images on site, image retrieval systems are becoming more and more popular to search for similar objects in an own image database. As the computational performance and the memory capacity of mobile devices are constantly increasing, this search can often be performed on the device itself. This is feasible, for example, if the images are represented with global image features or if the search is done using EXIF or textual metadata. However, for larger image databases, if multiple users are meant to contribute to a growing image database or if powerful content-based image retrieval methods with local features are required, a server-based image retrieval backend is needed. In this work, we present a content-based image retrieval system with a client server architecture working with local features. On the server side, the scalability to large image databases is addressed with the popular bag-of-word model with state-of-the-art extensions. The client end of the system focuses on a lightweight user interface presenting the most similar images of the database highlighting the visual information which is common with the query image. Additionally, new images can be added to the database making it a powerful and interactive tool for mobile contentbased image retrieval.
PlantRGDB: A Database of Plant Retrocopied Genes.

PubMed

Wang, Yi

2017-01-01

RNA-based gene duplication, known as retrocopy, plays important roles in gene origination and genome evolution. The genomes of many plants have been sequenced, offering an opportunity to annotate and mine the retrocopies in plant genomes. However, comprehensive and unified annotation of retrocopies in these plants is still lacking. In this study I constructed the PlantRGDB (Plant Retrocopied Gene DataBase), the first database of plant retrocopies, to provide a putatively complete centralized list of retrocopies in plant genomes. The database is freely accessible at http://probes.pw.usda.gov/plantrgdb or http://aegilops.wheat.ucdavis.edu/plantrgdb. It currently integrates 49 plant species and 38,997 retrocopies along with characterization information. PlantRGDB provides a user-friendly web interface for searching, browsing and downloading the retrocopies in the database. PlantRGDB also offers graphical viewer-integrated sequence information for displaying the structure of each retrocopy. The attributes of the retrocopies of each species are reported using a browse function. In addition, useful tools, such as an advanced search and BLAST, are available to search the database more conveniently. In conclusion, the database will provide a web platform for obtaining valuable insight into the generation of retrocopies and will supplement research on gene duplication and genome evolution in plants. © The Author 2017. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Database resources of the National Center for Biotechnology Information

PubMed Central

Sayers, Eric W.; Barrett, Tanya; Benson, Dennis A.; Bolton, Evan; Bryant, Stephen H.; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M.; DiCuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M.; Geer, Lewis Y.; Helmberg, Wolfgang; Kapustin, Yuri; Krasnov, Sergey; Landsman, David; Lipman, David J.; Lu, Zhiyong; Madden, Thomas L.; Madej, Tom; Maglott, Donna R.; Marchler-Bauer, Aron; Miller, Vadim; Karsch-Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D.; Schuler, Gregory D.; Sequeira, Edwin; Sherry, Stephen T.; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A.; Wagner, Lukas; Wang, Yanli; Wilbur, W. John; Yaschenko, Eugene; Ye, Jian

2012-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Website. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Probe, Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:22140104
Database resources of the National Center for Biotechnology Information

PubMed Central

2013-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, the Genetic Testing Registry, Genome and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Probe, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page. PMID:23193264
Database resources of the National Center for Biotechnology Information.

PubMed

Wheeler, David L; Barrett, Tanya; Benson, Dennis A; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Geer, Lewis Y; Kapustin, Yuri; Khovayko, Oleg; Landsman, David; Lipman, David J; Madden, Thomas L; Maglott, Donna R; Ostell, James; Miller, Vadim; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Steven T; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Tatusov, Roman L; Tatusova, Tatiana A; Wagner, Lukas; Yaschenko, Eugene

2007-01-01

In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, the Entrez Programming Utilities, My NCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link(BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genome, Genome Project and related tools, the Trace and Assembly Archives, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Viral Genotyping Tools, Influenza Viral Resources, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART) and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. These resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
Database resources of the National Center for Biotechnology Information.

PubMed

Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Feolo, Michael; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Landsman, David; Lipman, David J; Madden, Thomas L; Maglott, Donna R; Miller, Vadim; Mizrachi, Ilene; Ostell, James; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Yaschenko, Eugene; Ye, Jian

2009-01-01

In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs), Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART) and the PubChem suite of small molecule databases. Augmenting many of the web applications is custom implementation of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
PGMapper: a web-based tool linking phenotype to genes.

PubMed

Xiong, Qing; Qiu, Yuhui; Gu, Weikuan

2008-04-01

With the availability of whole genome sequence in many species, linkage analysis, positional cloning and microarray are gradually becoming powerful tools for investigating the links between phenotype and genotype or genes. However, in these methods, causative genes underlying a quantitative trait locus, or a disease, are usually located within a large genomic region or a large set of genes. Examining the function of every gene is very time consuming and needs to retrieve and integrate the information from multiple databases or genome resources. PGMapper is a software tool for automatically matching phenotype to genes from a defined genome region or a group of given genes by combining the mapping information from the Ensembl database and gene function information from the OMIM and PubMed databases. PGMapper is currently available for candidate gene search of human, mouse, rat, zebrafish and 12 other species. Available online at http://www.genediscovery.org/pgmapper/index.jsp.
Aerodynamic Optimization of Rocket Control Surface Geometry Using Cartesian Methods and CAD Geometry

NASA Technical Reports Server (NTRS)

Nelson, Andrea; Aftosmis, Michael J.; Nemec, Marian; Pulliam, Thomas H.

2004-01-01

Aerodynamic design is an iterative process involving geometry manipulation and complex computational analysis subject to physical constraints and aerodynamic objectives. A design cycle consists of first establishing the performance of a baseline design, which is usually created with low-fidelity engineering tools, and then progressively optimizing the design to maximize its performance. Optimization techniques have evolved from relying exclusively on designer intuition and insight in traditional trial and error methods, to sophisticated local and global search methods. Recent attempts at automating the search through a large design space with formal optimization methods include both database driven and direct evaluation schemes. Databases are being used in conjunction with surrogate and neural network models as a basis on which to run optimization algorithms. Optimization algorithms are also being driven by the direct evaluation of objectives and constraints using high-fidelity simulations. Surrogate methods use data points obtained from simulations, and possibly gradients evaluated at the data points, to create mathematical approximations of a database. Neural network models work in a similar fashion, using a number of high-fidelity database calculations as training iterations to create a database model. Optimal designs are obtained by coupling an optimization algorithm to the database model. Evaluation of the current best design then gives either a new local optima and/or increases the fidelity of the approximation model for the next iteration. Surrogate methods have also been developed that iterate on the selection of data points to decrease the uncertainty of the approximation model prior to searching for an optimal design. The database approximation models for each of these cases, however, become computationally expensive with increase in dimensionality. Thus the method of using optimization algorithms to search a database model becomes problematic as the number of design variables is increased.
Development of a Publicly Available, Comprehensive Database of Fiber and Health Outcomes: Rationale and Methods

PubMed Central

Livingston, Kara A.; Chung, Mei; Sawicki, Caleigh M.; Lyle, Barbara J.; Wang, Ding Ding; Roberts, Susan B.; McKeown, Nicola M.

2016-01-01

Background Dietary fiber is a broad category of compounds historically defined as partially or completely indigestible plant-based carbohydrates and lignin with, more recently, the additional criteria that fibers incorporated into foods as additives should demonstrate functional human health outcomes to receive a fiber classification. Thousands of research studies have been published examining fibers and health outcomes. Objectives (1) Develop a database listing studies testing fiber and physiological health outcomes identified by experts at the Ninth Vahouny Conference; (2) Use evidence mapping methodology to summarize this body of literature. This paper summarizes the rationale, methodology, and resulting database. The database will help both scientists and policy-makers to evaluate evidence linking specific fibers with physiological health outcomes, and identify missing information. Methods To build this database, we conducted a systematic literature search for human intervention studies published in English from 1946 to May 2015. Our search strategy included a broad definition of fiber search terms, as well as search terms for nine physiological health outcomes identified at the Ninth Vahouny Fiber Symposium. Abstracts were screened using a priori defined eligibility criteria and a low threshold for inclusion to minimize the likelihood of rejecting articles of interest. Publications then were reviewed in full text, applying additional a priori defined exclusion criteria. The database was built and published on the Systematic Review Data Repository (SRDR™), a web-based, publicly available application. Conclusions A fiber database was created. This resource will reduce the unnecessary replication of effort in conducting systematic reviews by serving as both a central database archiving PICO (population, intervention, comparator, outcome) data on published studies and as a searchable tool through which this data can be extracted and updated. PMID:27348733
The VIMS Data Explorer: A tool for locating and visualizing hyperspectral data

NASA Astrophysics Data System (ADS)

Pasek, V. D.; Lytle, D. M.; Brown, R. H.

2016-12-01

Since successfully entering Saturn's orbit during Summer 2004 there have been over 300,000 hyperspectral data cubes returned from the visible and infrared mapping spectrometer (VIMS) instrument onboard the Cassini spacecraft. The VIMS Science Investigation is a multidisciplinary effort that uses these hyperspectral data to study a variety of scientific problems, including surface characterizations of the icy satellites and atmospheric analyses of Titan and Saturn. Such investigations may need to identify thousands of exemplary data cubes for analysis and can span many years in scope. Here we describe the VIMS data explorer (VDE) application, currently employed by the VIMS Investigation to search for and visualize data. The VDE application facilitates real-time inspection of the entire VIMS hyperspectral dataset, the construction of in situ maps, and markers to save and recall work. The application relies on two databases to provide comprehensive search capabilities. The first database contains metadata for every cube. These metadata searches are used to identify records based on parameters such as target, observation name, or date taken; they fall short in utility for some investigations. The cube metadata contains no target geometry information. Through the introduction of a post-calibration pixel database, the VDE tool enables users to greatly expand their searching capabilities. Users can select favorable cubes for further processing into 2-D and 3-D interactive maps, aiding in the data interpretation and selection process. The VDE application enables efficient search, visualization, and access to VIMS hyperspectral data. It is simple to use, requiring nothing more than a browser for access. Hyperspectral bands can be individually selected or combined to create real-time color images, a technique commonly employed by hyperspectral researchers to highlight compositional differences.
Generating "fragment-based virtual library" using pocket similarity search of ligand-receptor complexes.

PubMed

Khashan, Raed S

2015-01-01

As the number of available ligand-receptor complexes is increasing, researchers are becoming more dedicated to mine these complexes to aid in the drug design and development process. We present free software which is developed as a tool for performing similarity search across ligand-receptor complexes for identifying binding pockets which are similar to that of a target receptor. The search is based on 3D-geometric and chemical similarity of the atoms forming the binding pocket. For each match identified, the ligand's fragment(s) corresponding to that binding pocket are extracted, thus forming a virtual library of fragments (FragVLib) that is useful for structure-based drug design. The program provides a very useful tool to explore available databases.
Reducing Information Overload in Large Seismic Data Sets

DOE Office of Scientific and Technical Information (OSTI.GOV)

HAMPTON,JEFFERY W.; YOUNG,CHRISTOPHER J.; MERCHANT,BION J.

2000-08-02

Event catalogs for seismic data can become very large. Furthermore, as researchers collect multiple catalogs and reconcile them into a single catalog that is stored in a relational database, the reconciled set becomes even larger. The sheer number of these events makes searching for relevant events to compare with events of interest problematic. Information overload in this form can lead to the data sets being under-utilized and/or used incorrectly or inconsistently. Thus, efforts have been initiated to research techniques and strategies for helping researchers to make better use of large data sets. In this paper, the authors present their effortsmore » to do so in two ways: (1) the Event Search Engine, which is a waveform correlation tool and (2) some content analysis tools, which area combination of custom-built and commercial off-the-shelf tools for accessing, managing, and querying seismic data stored in a relational database. The current Event Search Engine is based on a hierarchical clustering tool known as the dendrogram tool, which is written as a MatSeis graphical user interface. The dendrogram tool allows the user to build dendrogram diagrams for a set of waveforms by controlling phase windowing, down-sampling, filtering, enveloping, and the clustering method (e.g. single linkage, complete linkage, flexible method). It also allows the clustering to be based on two or more stations simultaneously, which is important to bridge gaps in the sparsely recorded event sets anticipated in such a large reconciled event set. Current efforts are focusing on tools to help the researcher winnow the clusters defined using the dendrogram tool down to the minimum optimal identification set. This will become critical as the number of reference events in the reconciled event set continually grows. The dendrogram tool is part of the MatSeis analysis package, which is available on the Nuclear Explosion Monitoring Research and Engineering Program Web Site. As part of the research into how to winnow the reference events in these large reconciled event sets, additional database query approaches have been developed to provide windows into these datasets. These custom built content analysis tools help identify dataset characteristics that can potentially aid in providing a basis for comparing similar reference events in these large reconciled event sets. Once these characteristics can be identified, algorithms can be developed to create and add to the reduced set of events used by the Event Search Engine. These content analysis tools have already been useful in providing information on station coverage of the referenced events and basic statistical, information on events in the research datasets. The tools can also provide researchers with a quick way to find interesting and useful events within the research datasets. The tools could also be used as a means to review reference event datasets as part of a dataset delivery verification process. There has also been an effort to explore the usefulness of commercially available web-based software to help with this problem. The advantages of using off-the-shelf software applications, such as Oracle's WebDB, to manipulate, customize and manage research data are being investigated. These types of applications are being examined to provide access to large integrated data sets for regional seismic research in Asia. All of these software tools would provide the researcher with unprecedented power without having to learn the intricacies and complexities of relational database systems.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)

Clancey, P.; Logg, C.

DEPOT has been developed to provide tracking for the Stanford Linear Collider (SLC) control system equipment. For each piece of equipment entered into the database, complete location, service, maintenance, modification, certification, and radiation exposure histories can be maintained. To facilitate data entry accuracy, efficiency, and consistency, barcoding technology has been used extensively. DEPOT has been an important tool in improving the reliability of the microsystems controlling SLC. This document describes the components of the DEPOT database, the elements in the database records, and the use of the supporting programs for entering data, searching the database, and producing reports from themore » information.« less
Image

DOE Office of Scientific and Technical Information (OSTI.GOV)

Marsh, Amber; Harsch, Tim; Pitt, Julie

2007-08-31

The computer side of the IMAGE project consists of a collection of Perl scripts that perform a variety of tasks; scripts are available to insert, update and delete data from the underlying Oracle database, download data from NCBI's Genbank and other sources, and generate data files for download by interested parties. Web scripts make up the tracking interface, and various tools available on the project web-site (image.llnl.gov) that provide a search interface to the database.
CampusGIS of the University of Cologne: a tool for orientation, navigation, and management

NASA Astrophysics Data System (ADS)

Baaser, U.; Gnyp, M. L.; Hennig, S.; Hoffmeister, D.; Köhn, N.; Laudien, R.; Bareth, G.

2006-10-01

The working group for GIS and Remote Sensing at the Department of Geography at the University of Cologne has established a WebGIS called CampusGIS of the University of Cologne. The overall task of the CampusGIS is the connection of several existing databases at the University of Cologne with spatial data. These existing databases comprise data about staff, buildings, rooms, lectures, and general infrastructure like bus stops etc. These information were yet not linked to their spatial relation. Therefore, a GIS-based method is developed to link all the different databases to spatial entities. Due to the philosophy of the CampusGIS, an online-GUI is programmed which enables users to search for staff, buildings, or institutions. The query results are linked to the GIS database which allows the visualization of the spatial location of the searched entity. This system was established in 2005 and is operational since early 2006. In this contribution, the focus is on further developments. First results of (i) including routing services in, (ii) programming GUIs for mobile devices for, and (iii) including infrastructure management tools in the CampusGIS are presented. Consequently, the CampusGIS is not only available for spatial information retrieval and orientation. It also serves for on-campus navigation and administrative management.
Coverage of Google Scholar, Scopus, and Web of Science: a case study of the h-index in nursing.

PubMed

De Groote, Sandra L; Raszewski, Rebecca

2012-01-01

This study compares the articles cited in CINAHL, Scopus, Web of Science (WOS), and Google Scholar and the h-index ratings provided by Scopus, WOS, and Google Scholar. The publications of 30 College of Nursing faculty at a large urban university were examined. Searches by author name were executed in Scopus, WOS, and POP (Publish or Perish, which searches Google Scholar), and the h-index for each author from each database was recorded. In addition, the citing articles of their published articles were imported into a bibliographic management program. This data was used to determine an aggregated h-index for each author. Scopus, WOS, and Google Scholar provided different h-index ratings for authors and each database found unique and duplicate citing references. More than one tool should be used to calculate the h-index for nursing faculty because one tool alone cannot be relied on to provide a thorough assessment of a researcher's impact. If researchers are interested in a comprehensive h-index, they should aggregate the citing references located by WOS and Scopus. Because h-index rankings differ among databases, comparisons between researchers should be done only within a specified database. Copyright © 2012 Elsevier Inc. All rights reserved.
Database resources of the National Center for Biotechnology Information

PubMed Central

2015-01-01

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (Bookshelf, PubMed Central (PMC) and PubReader); medical genetics (ClinVar, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen); genes and genomics (BioProject, BioSample, dbSNP, dbVar, Epigenomics, Gene, Gene Expression Omnibus (GEO), Genome, HomoloGene, the Map Viewer, Nucleotide, PopSet, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser, Trace Archive and UniGene); and proteins and chemicals (Biosystems, COBALT, the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB), Protein Clusters, Protein and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for many of these databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. PMID:25398906

Database resources of the National Center for Biotechnology Information

PubMed Central

2016-01-01

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (PubMed Central (PMC), Bookshelf and PubReader), health (ClinVar, dbGaP, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen), genomes (BioProject, Assembly, Genome, BioSample, dbSNP, dbVar, Epigenomics, the Map Viewer, Nucleotide, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser and the Trace Archive), genes (Gene, Gene Expression Omnibus (GEO), HomoloGene, PopSet and UniGene), proteins (Protein, the Conserved Domain Database (CDD), COBALT, Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB) and Protein Clusters) and chemicals (Biosystems and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for most of these databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:26615191
Search and Graph Database Technologies for Biomedical Semantic Indexing: Experimental Analysis.

PubMed

Segura Bedmar, Isabel; Martínez, Paloma; Carruana Martín, Adrián

2017-12-01

Biomedical semantic indexing is a very useful support tool for human curators in their efforts for indexing and cataloging the biomedical literature. The aim of this study was to describe a system to automatically assign Medical Subject Headings (MeSH) to biomedical articles from MEDLINE. Our approach relies on the assumption that similar documents should be classified by similar MeSH terms. Although previous work has already exploited the document similarity by using a k-nearest neighbors algorithm, we represent documents as document vectors by search engine indexing and then compute the similarity between documents using cosine similarity. Once the most similar documents for a given input document are retrieved, we rank their MeSH terms to choose the most suitable set for the input document. To do this, we define a scoring function that takes into account the frequency of the term into the set of retrieved documents and the similarity between the input document and each retrieved document. In addition, we implement guidelines proposed by human curators to annotate MEDLINE articles; in particular, the heuristic that says if 3 MeSH terms are proposed to classify an article and they share the same ancestor, they should be replaced by this ancestor. The representation of the MeSH thesaurus as a graph database allows us to employ graph search algorithms to quickly and easily capture hierarchical relationships such as the lowest common ancestor between terms. Our experiments show promising results with an F1 of 69% on the test dataset. To the best of our knowledge, this is the first work that combines search and graph database technologies for the task of biomedical semantic indexing. Due to its horizontal scalability, ElasticSearch becomes a real solution to index large collections of documents (such as the bibliographic database MEDLINE). Moreover, the use of graph search algorithms for accessing MeSH information could provide a support tool for cataloging MEDLINE abstracts in real time. ©Isabel Segura Bedmar, Paloma Martínez, Adrián Carruana Martín. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 01.12.2017.
ExtraTrain: a database of Extragenic regions and Transcriptional information in prokaryotic organisms

PubMed Central

Pareja, Eduardo; Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Bonal, Javier; Tobes, Raquel

2006-01-01

Background Transcriptional regulation processes are the principal mechanisms of adaptation in prokaryotes. In these processes, the regulatory proteins and the regulatory DNA signals located in extragenic regions are the key elements involved. As all extragenic spaces are putative regulatory regions, ExtraTrain covers all extragenic regions of available genomes and regulatory proteins from bacteria and archaea included in the UniProt database. Description ExtraTrain provides integrated and easily manageable information for 679816 extragenic regions and for the genes delimiting each of them. In addition ExtraTrain supplies a tool to explore extragenic regions, named Palinsight, oriented to detect and search palindromic patterns. This interactive visual tool is totally integrated in the database, allowing the search for regulatory signals in user defined sets of extragenic regions. The 26046 regulatory proteins included in ExtraTrain belong to the families AraC/XylS, ArsR, AsnC, Cold shock domain, CRP-FNR, DeoR, GntR, IclR, LacI, LuxR, LysR, MarR, MerR, NtrC/Fis, OmpR and TetR. The database follows the InterPro criteria to define these families. The information about regulators includes manually curated sets of references specifically associated to regulator entries. In order to achieve a sustainable and maintainable knowledge database ExtraTrain is a platform open to the contribution of knowledge by the scientific community providing a system for the incorporation of textual knowledge. Conclusion ExtraTrain is a new database for exploring Extragenic regions and Transcriptional information in bacteria and archaea. ExtraTrain database is available at . PMID:16539733
The comparative recall of Google Scholar versus PubMed in identical searches for biomedical systematic reviews: a review of searches used in systematic reviews.

PubMed

Bramer, Wichor M; Giustini, Dean; Kramer, Bianca Mr; Anderson, Pf

2013-12-23

The usefulness of Google Scholar (GS) as a bibliographic database for biomedical systematic review (SR) searching is a subject of current interest and debate in research circles. Recent research has suggested GS might even be used alone in SR searching. This assertion is challenged here by testing whether GS can locate all studies included in 21 previously published SRs. Second, it examines the recall of GS, taking into account the maximum number of items that can be viewed, and tests whether more complete searches created by an information specialist will improve recall compared to the searches used in the 21 published SRs. The authors identified 21 biomedical SRs that had used GS and PubMed as information sources and reported their use of identical, reproducible search strategies in both databases. These search strategies were rerun in GS and PubMed, and analyzed as to their coverage and recall. Efforts were made to improve searches that underperformed in each database. GS' overall coverage was higher than PubMed (98% versus 91%) and overall recall is higher in GS: 80% of the references included in the 21 SRs were returned by the original searches in GS versus 68% in PubMed. Only 72% of the included references could be used as they were listed among the first 1,000 hits (the maximum number shown). Practical precision (the number of included references retrieved in the first 1,000, divided by 1,000) was on average 1.9%, which is only slightly lower than in other published SRs. Improving searches with the lowest recall resulted in an increase in recall from 48% to 66% in GS and, in PubMed, from 60% to 85%. Although its coverage and precision are acceptable, GS, because of its incomplete recall, should not be used as a single source in SR searching. A specialized, curated medical database such as PubMed provides experienced searchers with tools and functionality that help improve recall, and numerous options in order to optimize precision. Searches for SRs should be performed by experienced searchers creating searches that maximize recall for as many databases as deemed necessary by the search expert.
The comparative recall of Google Scholar versus PubMed in identical searches for biomedical systematic reviews: a review of searches used in systematic reviews

PubMed Central

2013-01-01

Background The usefulness of Google Scholar (GS) as a bibliographic database for biomedical systematic review (SR) searching is a subject of current interest and debate in research circles. Recent research has suggested GS might even be used alone in SR searching. This assertion is challenged here by testing whether GS can locate all studies included in 21 previously published SRs. Second, it examines the recall of GS, taking into account the maximum number of items that can be viewed, and tests whether more complete searches created by an information specialist will improve recall compared to the searches used in the 21 published SRs. Methods The authors identified 21 biomedical SRs that had used GS and PubMed as information sources and reported their use of identical, reproducible search strategies in both databases. These search strategies were rerun in GS and PubMed, and analyzed as to their coverage and recall. Efforts were made to improve searches that underperformed in each database. Results GS’ overall coverage was higher than PubMed (98% versus 91%) and overall recall is higher in GS: 80% of the references included in the 21 SRs were returned by the original searches in GS versus 68% in PubMed. Only 72% of the included references could be used as they were listed among the first 1,000 hits (the maximum number shown). Practical precision (the number of included references retrieved in the first 1,000, divided by 1,000) was on average 1.9%, which is only slightly lower than in other published SRs. Improving searches with the lowest recall resulted in an increase in recall from 48% to 66% in GS and, in PubMed, from 60% to 85%. Conclusions Although its coverage and precision are acceptable, GS, because of its incomplete recall, should not be used as a single source in SR searching. A specialized, curated medical database such as PubMed provides experienced searchers with tools and functionality that help improve recall, and numerous options in order to optimize precision. Searches for SRs should be performed by experienced searchers creating searches that maximize recall for as many databases as deemed necessary by the search expert. PMID:24360284
Using the Saccharomyces Genome Database (SGD) for analysis of genomic information

PubMed Central

Skrzypek, Marek S.; Hirschman, Jodi

2011-01-01

Analysis of genomic data requires access to software tools that place the sequence-derived information in the context of biology. The Saccharomyces Genome Database (SGD) integrates functional information about budding yeast genes and their products with a set of analysis tools that facilitate exploring their biological details. This unit describes how the various types of functional data available at SGD can be searched, retrieved, and analyzed. Starting with the guided tour of the SGD Home page and Locus Summary page, this unit highlights how to retrieve data using YeastMine, how to visualize genomic information with GBrowse, how to explore gene expression patterns with SPELL, and how to use Gene Ontology tools to characterize large-scale datasets. PMID:21901739
Electronic Biomedical Literature Search for Budding Researcher

PubMed Central

Thakre, Subhash B.; Thakre S, Sushama S.; Thakre, Amol D.

2013-01-01

Search for specific and well defined literature related to subject of interest is the foremost step in research. When we are familiar with topic or subject then we can frame appropriate research question. Appropriate research question is the basis for study objectives and hypothesis. The Internet provides a quick access to an overabundance of the medical literature, in the form of primary, secondary and tertiary literature. It is accessible through journals, databases, dictionaries, textbooks, indexes, and e-journals, thereby allowing access to more varied, individualised, and systematic educational opportunities. Web search engine is a tool designed to search for information on the World Wide Web, which may be in the form of web pages, images, information, and other types of files. Search engines for internet-based search of medical literature include Google, Google scholar, Scirus, Yahoo search engine, etc., and databases include MEDLINE, PubMed, MEDLARS, etc. Several web-libraries (National library Medicine, Cochrane, Web of Science, Medical matrix, Emory libraries) have been developed as meta-sites, providing useful links to health resources globally. A researcher must keep in mind the strengths and limitations of a particular search engine/database while searching for a particular type of data. Knowledge about types of literature, levels of evidence, and detail about features of search engine as available, user interface, ease of access, reputable content, and period of time covered allow their optimal use and maximal utility in the field of medicine. Literature search is a dynamic and interactive process; there is no one way to conduct a search and there are many variables involved. It is suggested that a systematic search of literature that uses available electronic resource effectively, is more likely to produce quality research. PMID:24179937
Electronic biomedical literature search for budding researcher.

PubMed

Thakre, Subhash B; Thakre S, Sushama S; Thakre, Amol D

2013-09-01

Search for specific and well defined literature related to subject of interest is the foremost step in research. When we are familiar with topic or subject then we can frame appropriate research question. Appropriate research question is the basis for study objectives and hypothesis. The Internet provides a quick access to an overabundance of the medical literature, in the form of primary, secondary and tertiary literature. It is accessible through journals, databases, dictionaries, textbooks, indexes, and e-journals, thereby allowing access to more varied, individualised, and systematic educational opportunities. Web search engine is a tool designed to search for information on the World Wide Web, which may be in the form of web pages, images, information, and other types of files. Search engines for internet-based search of medical literature include Google, Google scholar, Scirus, Yahoo search engine, etc., and databases include MEDLINE, PubMed, MEDLARS, etc. Several web-libraries (National library Medicine, Cochrane, Web of Science, Medical matrix, Emory libraries) have been developed as meta-sites, providing useful links to health resources globally. A researcher must keep in mind the strengths and limitations of a particular search engine/database while searching for a particular type of data. Knowledge about types of literature, levels of evidence, and detail about features of search engine as available, user interface, ease of access, reputable content, and period of time covered allow their optimal use and maximal utility in the field of medicine. Literature search is a dynamic and interactive process; there is no one way to conduct a search and there are many variables involved. It is suggested that a systematic search of literature that uses available electronic resource effectively, is more likely to produce quality research.
Pepitome: evaluating improved spectral library search for identification complementarity and quality assessment

PubMed Central

Dasari, Surendra; Chambers, Matthew C.; Martinez, Misti A.; Carpenter, Kristin L.; Ham, Amy-Joan L.; Vega-Montoto, Lorenzo J.; Tabb, David L.

2012-01-01

Spectral libraries have emerged as a viable alternative to protein sequence databases for peptide identification. These libraries contain previously detected peptide sequences and their corresponding tandem mass spectra (MS/MS). Search engines can then identify peptides by comparing experimental MS/MS scans to those in the library. Many of these algorithms employ the dot product score for measuring the quality of a spectrum-spectrum match (SSM). This scoring system does not offer a clear statistical interpretation and ignores fragment ion m/z discrepancies in the scoring. We developed a new spectral library search engine, Pepitome, which employs statistical systems for scoring SSMs. Pepitome outperformed the leading library search tool, SpectraST, when analyzing data sets acquired on three different mass spectrometry platforms. We characterized the reliability of spectral library searches by confirming shotgun proteomics identifications through RNA-Seq data. Applying spectral library and database searches on the same sample revealed their complementary nature. Pepitome identifications enabled the automation of quality analysis and quality control (QA/QC) for shotgun proteomics data acquisition pipelines. PMID:22217208
Seismic Search Engine: A distributed database for mining large scale seismic data

NASA Astrophysics Data System (ADS)

Liu, Y.; Vaidya, S.; Kuzma, H. A.

2009-12-01

The International Monitoring System (IMS) of the CTBTO collects terabytes worth of seismic measurements from many receiver stations situated around the earth with the goal of detecting underground nuclear testing events and distinguishing them from other benign, but more common events such as earthquakes and mine blasts. The International Data Center (IDC) processes and analyzes these measurements, as they are collected by the IMS, to summarize event detections in daily bulletins. Thereafter, the data measurements are archived into a large format database. Our proposed Seismic Search Engine (SSE) will facilitate a framework for data exploration of the seismic database as well as the development of seismic data mining algorithms. Analogous to GenBank, the annotated genetic sequence database maintained by NIH, through SSE, we intend to provide public access to seismic data and a set of processing and analysis tools, along with community-generated annotations and statistical models to help interpret the data. SSE will implement queries as user-defined functions composed from standard tools and models. Each query is compiled and executed over the database internally before reporting results back to the user. Since queries are expressed with standard tools and models, users can easily reproduce published results within this framework for peer-review and making metric comparisons. As an illustration, an example query is “what are the best receiver stations in East Asia for detecting events in the Middle East?” Evaluating this query involves listing all receiver stations in East Asia, characterizing known seismic events in that region, and constructing a profile for each receiver station to determine how effective its measurements are at predicting each event. The results of this query can be used to help prioritize how data is collected, identify defective instruments, and guide future sensor placements.
SkyDOT: a publicly accessible variability database, containing multiple sky surveys and real-time data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Starr, D. L.; Wozniak, P. R.; Vestrand, W. T.

2002-01-01

SkyDOT (Sky Database for Objects in Time-Domain) is a Virtual Observatory currently comprised of data from the RAPTOR, ROTSE I, and OGLE I1 survey projects. This makes it a very large time domain database. In addition, the RAPTOR project provides SkyDOT with real-time variability data as well as stereoscopic information. With its web interface, we believe SkyDOT will be a very useful tool for both astronomers, and the public. Our main task has been to construct an efficient relational database containing all existing data, while handling a real-time inflow of data. We also provide a useful web interface allowing easymore » access to both astronomers and the public. Initially, this server will allow common searches, specific queries, and access to light curves. In the future we will include machine learning classification tools and access to spectral information.« less
CRAVE: a database, middleware and visualization system for phenotype ontologies.

PubMed

Gkoutos, Georgios V; Green, Eain C J; Greenaway, Simon; Blake, Andrew; Mallon, Ann-Marie; Hancock, John M

2005-04-01

A major challenge in modern biology is to link genome sequence information to organismal function. In many organisms this is being done by characterizing phenotypes resulting from mutations. Efficiently expressing phenotypic information requires combinatorial use of ontologies. However tools are not currently available to visualize combinations of ontologies. Here we describe CRAVE (Concept Relation Assay Value Explorer), a package allowing storage, active updating and visualization of multiple ontologies. CRAVE is a web-accessible JAVA application that accesses an underlying MySQL database of ontologies via a JAVA persistent middleware layer (Chameleon). This maps the database tables into discrete JAVA classes and creates memory resident, interlinked objects corresponding to the ontology data. These JAVA objects are accessed via calls through the middleware's application programming interface. CRAVE allows simultaneous display and linking of multiple ontologies and searching using Boolean and advanced searches.
The Majorana Parts Tracking Database

DOE PAGES

Abgrall, N.; Aguayo, E.; Avignone, F. T.; ...

2015-01-16

The Majorana Demonstrator is an ultra-low background physics experiment searching for the neutrinoless double beta decay of 76Ge. The Majorana Parts Tracking Database is used to record the history of components used in the construction of the Demonstrator. The tracking implementation takes a novel approach based on the schema-free database technology CouchDB. Transportation, storage, and processes undergone by parts such as machining or cleaning are linked to part records. Tracking parts provides a great logistics benefit and an important quality assurance reference during construction. In addition, the location history of parts provides an estimate of their exposure to cosmic radiation.more » In summary, a web application for data entry and a radiation exposure calculator have been developed as tools for achieving the extreme radio-purity required for this rare decay search.« less
RISSC: a novel database for ribosomal 16S–23S RNA genes spacer regions

PubMed Central

García-Martínez, Jesús; Bescós, Ignacio; Rodríguez-Sala, Jesús Javier; Rodríguez-Valera, Francisco

2001-01-01

A novel database, under the acronym RISSC (Ribosomal Intergenic Spacer Sequence Collection), has been created. It compiles more than 1600 entries of edited DNA sequence data from the 16S–23S ribosomal spacers present in most prokaryotes and organelles (e.g. mitochondria and chloroplasts) and is accessible through the Internet (http://ulises.umh.es/RISSC), where systematic searches for specific words can be conducted, as well as BLAST-type sequence searches. Additionally, a characteristic feature of this region, the presence/absence and nature of tRNA genes within the spacer, is included in all the entries, even when not previously indicated in the original database. All these combined features could provide a useful documentation tool for studies on evolution, identification, typing and strain characterization, among others. PMID:11125084
Energy science and technology database (on the internet). Online data

DOE Office of Scientific and Technical Information (OSTI.GOV)

NONE

The Energy Science and Technology Database (EDB) is a multidisciplinary file containing worldwide references to basic and applied scientific and technical research literature. The information is collected for use by government managers, researchers at the national laboratories, and other research efforts sponsored by the U.S. Department of Energy, and the results of this research are transferred to the public. Abstracts are included for records from 1976 to the present. The EDB also contains the Nuclear Science Abstracts which is a comprehensive abstract and index collection to the international nuclear science and technology literature for the period 1948 through 1976. Includedmore » are scientific and technical reports of the U.S. Atomic Energy Commission, U.S. Energy Research and Development Administration and its contractors, other agencies, universities, and industrial and research organizations. Approximately 25% of the records in the file contain abstracts. Nuclear Science Abstracts contains over 900,000 bibliographic records. The entire Energy Science and Technology Database contains over 3 million bibliographic records. This database is now available for searching through the GOV. Research-Center (GRC) service. GRC is a single online web-based search service to well known Government databases. Featuring powerful search and retrieval software, GRC is an important research tool. The GRC web site is at http://grc.ntis.gov.« less
New Tools to Document and Manage Data/Metadata: Example NGEE Arctic and UrbIS

NASA Astrophysics Data System (ADS)

Crow, M. C.; Devarakonda, R.; Hook, L.; Killeffer, T.; Krassovski, M.; Boden, T.; King, A. W.; Wullschleger, S. D.

2016-12-01

Tools used for documenting, archiving, cataloging, and searching data are critical pieces of informatics. This discussion describes tools being used in two different projects at Oak Ridge National Laboratory (ORNL), but at different stages of the data lifecycle. The Metadata Entry and Data Search Tool is being used for the documentation, archival, and data discovery stages for the Next Generation Ecosystem Experiment - Arctic (NGEE Arctic) project while the Urban Information Systems (UrbIS) Data Catalog is being used to support indexing, cataloging, and searching. The NGEE Arctic Online Metadata Entry Tool [1] provides a method by which researchers can upload their data and provide original metadata with each upload. The tool is built upon a Java SPRING framework to parse user input into, and from, XML output. Many aspects of the tool require use of a relational database including encrypted user-login, auto-fill functionality for predefined sites and plots, and file reference storage and sorting. The UrbIS Data Catalog is a data discovery tool supported by the Mercury cataloging framework [2] which aims to compile urban environmental data from around the world into one location, and be searchable via a user-friendly interface. Each data record conveniently displays its title, source, and date range, and features: (1) a button for a quick view of the metadata, (2) a direct link to the data and, for some data sets, (3) a button for visualizing the data. The search box incorporates autocomplete capabilities for search terms and sorted keyword filters are available on the side of the page, including a map for searching by area. References: [1] Devarakonda, Ranjeet, et al. "Use of a metadata documentation and search tool for large data volumes: The NGEE arctic example." Big Data (Big Data), 2015 IEEE International Conference on. IEEE, 2015. [2] Devarakonda, R., Palanisamy, G., Wilson, B. E., & Green, J. M. (2010). Mercury: reusable metadata management, data discovery and access system. Earth Science Informatics, 3(1-2), 87-94.
IMGT/3Dstructure-DB and IMGT/StructuralQuery, a database and a tool for immunoglobulin, T cell receptor and MHC structural data

PubMed Central

Kaas, Quentin; Ruiz, Manuel; Lefranc, Marie-Paule

2004-01-01

IMGT/3Dstructure-DB and IMGT/Structural-Query are a novel 3D structure database and a new tool for immunological proteins. They are part of IMGT, the international ImMunoGenetics information system®, a high-quality integrated knowledge resource specializing in immunoglobulins (IG), T cell receptors (TR), major histocompatibility complex (MHC) and related proteins of the immune system (RPI) of human and other vertebrate species, which consists of databases, Web resources and interactive on-line tools. IMGT/3Dstructure-DB data are described according to the IMGT Scientific chart rules based on the IMGT-ONTOLOGY concepts. IMGT/3Dstructure-DB provides IMGT gene and allele identification of IG, TR and MHC proteins with known 3D structures, domain delimitations, amino acid positions according to the IMGT unique numbering and renumbered coordinate flat files. Moreover IMGT/3Dstructure-DB provides 2D graphical representations (or Collier de Perles) and results of contact analysis. The IMGT/StructuralQuery tool allows search of this database based on specific structural characteristics. IMGT/3Dstructure-DB and IMGT/StructuralQuery are freely available at http://imgt.cines.fr. PMID:14681396
The gene expression database for mouse development (GXD): putting developmental expression information at your fingertips.

PubMed

Smith, Constance M; Finger, Jacqueline H; Kadin, James A; Richardson, Joel E; Ringwald, Martin

2014-10-01

Because molecular mechanisms of development are extraordinarily complex, the understanding of these processes requires the integration of pertinent research data. Using the Gene Expression Database for Mouse Development (GXD) as an example, we illustrate the progress made toward this goal, and discuss relevant issues that apply to developmental databases and developmental research in general. Since its first release in 1998, GXD has served the scientific community by integrating multiple types of expression data from publications and electronic submissions and by making these data freely and widely available. Focusing on endogenous gene expression in wild-type and mutant mice and covering data from RNA in situ hybridization, in situ reporter (knock-in), immunohistochemistry, reverse transcriptase-polymerase chain reaction, Northern blot, and Western blot experiments, the database has grown tremendously over the years in terms of data content and search utilities. Currently, GXD includes over 1.4 million annotated expression results and over 260,000 images. All these data and images are readily accessible to many types of database searches. Here we describe the data and search tools of GXD; explain how to use the database most effectively; discuss how we acquire, curate, and integrate developmental expression information; and describe how the research community can help in this process. Copyright © 2014 The Authors Developmental Dynamics published by Wiley Periodicals, Inc. on behalf of American Association of Anatomists.
YPED: An Integrated Bioinformatics Suite and Database for Mass Spectrometry-based Proteomics Research

PubMed Central

Colangelo, Christopher M.; Shifman, Mark; Cheung, Kei-Hoi; Stone, Kathryn L.; Carriero, Nicholas J.; Gulcicek, Erol E.; Lam, TuKiet T.; Wu, Terence; Bjornson, Robert D.; Bruce, Can; Nairn, Angus C.; Rinehart, Jesse; Miller, Perry L.; Williams, Kenneth R.

2015-01-01

We report a significantly-enhanced bioinformatics suite and database for proteomics research called Yale Protein Expression Database (YPED) that is used by investigators at more than 300 institutions worldwide. YPED meets the data management, archival, and analysis needs of a high-throughput mass spectrometry-based proteomics research ranging from a single laboratory, group of laboratories within and beyond an institution, to the entire proteomics community. The current version is a significant improvement over the first version in that it contains new modules for liquid chromatography–tandem mass spectrometry (LC–MS/MS) database search results, label and label-free quantitative proteomic analysis, and several scoring outputs for phosphopeptide site localization. In addition, we have added both peptide and protein comparative analysis tools to enable pairwise analysis of distinct peptides/proteins in each sample and of overlapping peptides/proteins between all samples in multiple datasets. We have also implemented a targeted proteomics module for automated multiple reaction monitoring (MRM)/selective reaction monitoring (SRM) assay development. We have linked YPED’s database search results and both label-based and label-free fold-change analysis to the Skyline Panorama repository for online spectra visualization. In addition, we have built enhanced functionality to curate peptide identifications into an MS/MS peptide spectral library for all of our protein database search identification results. PMID:25712262
YPED: an integrated bioinformatics suite and database for mass spectrometry-based proteomics research.

PubMed

Colangelo, Christopher M; Shifman, Mark; Cheung, Kei-Hoi; Stone, Kathryn L; Carriero, Nicholas J; Gulcicek, Erol E; Lam, TuKiet T; Wu, Terence; Bjornson, Robert D; Bruce, Can; Nairn, Angus C; Rinehart, Jesse; Miller, Perry L; Williams, Kenneth R

2015-02-01

We report a significantly-enhanced bioinformatics suite and database for proteomics research called Yale Protein Expression Database (YPED) that is used by investigators at more than 300 institutions worldwide. YPED meets the data management, archival, and analysis needs of a high-throughput mass spectrometry-based proteomics research ranging from a single laboratory, group of laboratories within and beyond an institution, to the entire proteomics community. The current version is a significant improvement over the first version in that it contains new modules for liquid chromatography-tandem mass spectrometry (LC-MS/MS) database search results, label and label-free quantitative proteomic analysis, and several scoring outputs for phosphopeptide site localization. In addition, we have added both peptide and protein comparative analysis tools to enable pairwise analysis of distinct peptides/proteins in each sample and of overlapping peptides/proteins between all samples in multiple datasets. We have also implemented a targeted proteomics module for automated multiple reaction monitoring (MRM)/selective reaction monitoring (SRM) assay development. We have linked YPED's database search results and both label-based and label-free fold-change analysis to the Skyline Panorama repository for online spectra visualization. In addition, we have built enhanced functionality to curate peptide identifications into an MS/MS peptide spectral library for all of our protein database search identification results. Copyright © 2015 The Authors. Production and hosting by Elsevier Ltd.. All rights reserved.

Evidence-based practice: extending the search to find material for the systematic review

PubMed Central

Helmer, Diane; Savoie, Isabelle; Green, Carolyn; Kazanjian, Arminée

2001-01-01

Background: Cochrane-style systematic reviews increasingly require the participation of librarians. Guidelines on the appropriate search strategy to use for systematic reviews have been proposed. However, research evidence supporting these recommendations is limited. Objective: This study investigates the effectiveness of various systematic search methods used to uncover randomized controlled trials (RCTs) for systematic reviews. Effectiveness is defined as the proportion of relevant material uncovered for the systematic review using extended systematic review search methods. The following extended systematic search methods are evaluated: searching subject-specific or specialized databases (including trial registries), hand searching, scanning reference lists, and communicating personally. Methods: Two systematic review projects were prospectively monitored regarding the method used to identify items as well as the type of items retrieved. The proportion of RCTs identified by each systematic search method was calculated. Results: The extended systematic search methods uncovered 29.2% of all items retrieved for the systematic reviews. The search of specialized databases was the most effective method, followed by scanning of reference lists, communicating personally, and hand searching. Although the number of items identified through hand searching was small, these unique items would otherwise have been missed. Conclusions: Extended systematic search methods are effective tools for uncovering material for the systematic review. The quality of the items uncovered has yet to be assessed and will be key in evaluating the value of the systematic search methods. PMID:11837256
The Advent of Portals.

ERIC Educational Resources Information Center

Jackson, Mary E.

2002-01-01

Explains portals as tools that gather a variety of electronic information resources, including local library resources, into a single Web page. Highlights include cross-database searching; integration with university portals and course management software; the ARL (Association of Research Libraries) Scholars Portal Initiative; and selected vendors…
Development of a Searchable Metabolite Database and Simulator of Xenobiotic Metabolism

EPA Science Inventory

A computational tool (MetaPath) has been developed for storage and analysis of metabolic pathways and associated metadata. The system is capable of sophisticated text and chemical structure/substructure searching as well as rapid comparison of metabolites formed across chemicals,...
Toolbox for Evaluating Residents as Teachers

ERIC Educational Resources Information Center

Coverdale, John H.; Ismail, Nadia; Mian, Ayesha; Dewey, Charlene

2010-01-01

Objective: The authors review existing assessment tools related to evaluating residents' teaching skills and teaching effectiveness. Methods: PubMed and PsycInfo databases were searched using combinations of keywords including "residents," "residents as teachers," "teaching skills," and "assessments" or "rating scales." Results: Eleven evaluation…
CUDASW++: optimizing Smith-Waterman sequence database searches for CUDA-enabled graphics processing units

PubMed Central

Liu, Yongchao; Maskell, Douglas L; Schmidt, Bertil

2009-01-01

Background The Smith-Waterman algorithm is one of the most widely used tools for searching biological sequence databases due to its high sensitivity. Unfortunately, the Smith-Waterman algorithm is computationally demanding, which is further compounded by the exponential growth of sequence databases. The recent emergence of many-core architectures, and their associated programming interfaces, provides an opportunity to accelerate sequence database searches using commonly available and inexpensive hardware. Findings Our CUDASW++ implementation (benchmarked on a single-GPU NVIDIA GeForce GTX 280 graphics card and a dual-GPU GeForce GTX 295 graphics card) provides a significant performance improvement compared to other publicly available implementations, such as SWPS3, CBESW, SW-CUDA, and NCBI-BLAST. CUDASW++ supports query sequences of length up to 59K and for query sequences ranging in length from 144 to 5,478 in Swiss-Prot release 56.6, the single-GPU version achieves an average performance of 9.509 GCUPS with a lowest performance of 9.039 GCUPS and a highest performance of 9.660 GCUPS, and the dual-GPU version achieves an average performance of 14.484 GCUPS with a lowest performance of 10.660 GCUPS and a highest performance of 16.087 GCUPS. Conclusion CUDASW++ is publicly available open-source software. It provides a significant performance improvement for Smith-Waterman-based protein sequence database searches by fully exploiting the compute capability of commonly used CUDA-enabled low-cost GPUs. PMID:19416548
41. DISCOVERY, SEARCH, AND COMMUNICATION OF TEXTUAL KNOWLEDGE RESOURCES IN DISTRIBUTED SYSTEMS a. Discovering and Utilizing Knowledge Sources for Metasearch Knowledge Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zamora, Antonio

Advanced Natural Language Processing Tools for Web Information Retrieval, Content Analysis, and Synthesis. The goal of this SBIR was to implement and evaluate several advanced Natural Language Processing (NLP) tools and techniques to enhance the precision and relevance of search results by analyzing and augmenting search queries and by helping to organize the search output obtained from heterogeneous databases and web pages containing textual information of interest to DOE and the scientific-technical user communities in general. The SBIR investigated 1) the incorporation of spelling checkers in search applications, 2) identification of significant phrases and concepts using a combination of linguisticmore » and statistical techniques, and 3) enhancement of the query interface and search retrieval results through the use of semantic resources, such as thesauri. A search program with a flexible query interface was developed to search reference databases with the objective of enhancing search results from web queries or queries of specialized search systems such as DOE's Information Bridge. The DOE ETDE/INIS Joint Thesaurus was processed to create a searchable database. Term frequencies and term co-occurrences were used to enhance the web information retrieval by providing algorithmically-derived objective criteria to organize relevant documents into clusters containing significant terms. A thesaurus provides an authoritative overview and classification of a field of knowledge. By organizing the results of a search using the thesaurus terminology, the output is more meaningful than when the results are just organized based on the terms that co-occur in the retrieved documents, some of which may not be significant. An attempt was made to take advantage of the hierarchy provided by broader and narrower terms, as well as other field-specific information in the thesauri. The search program uses linguistic morphological routines to find relevant entries regardless of whether terms are stored in singular or plural form. Implementation of additional inflectional morphology processes for verbs can enhance retrieval further, but this has to be balanced by the possibility of broadening the results too much. In addition to the DOE energy thesaurus, other sources of specialized organized knowledge such as the Medical Subject Headings (MeSH), the Unified Medical Language System (UMLS), and Wikipedia were investigated. The supporting role of the NLP thesaurus search program was enhanced by incorporating spelling aid and a part-of-speech tagger to cope with misspellings in the queries and to determine the grammatical roles of the query words and identify nouns for special processing. To improve precision, multiple modes of searching were implemented including Boolean operators, and field-specific searches. Programs to convert a thesaurus or reference file into searchable support files can be deployed easily, and the resulting files are immediately searchable to produce relevance-ranked results with builtin spelling aid, morphological processing, and advanced search logic. Demonstration systems were built for several databases, including the DOE energy thesaurus.« less
The Majorana Parts Tracking Database

DOE Office of Scientific and Technical Information (OSTI.GOV)

Abgrall, N.; Aguayo, E.; Avignone, F. T.

2015-04-01

The MAJORANA DEMONSTRATOR is an ultra-low background physics experiment searching for the neutrinoless double beta decay of 76Ge. The MAJORANA Parts Tracking Database is used to record the history of components used in the construction of the DEMONSTRATOR. Transportation, storage, and processes undergone by parts such as machining or cleaning are linked to part records. Tracking parts provides a great logistics benefit and an important quality assurance reference during construction. In addition, the location history of parts provides an estimate of their exposure to cosmic radiation. A web application for data entry and a radiation exposure calculator have been developedmore » as tools for achieving the extreme radiopurity required for this rare decay search.« less
Structure-Based Characterization of Multiprotein Complexes

PubMed Central

Wiederstein, Markus; Gruber, Markus; Frank, Karl; Melo, Francisco; Sippl, Manfred J.

2014-01-01

Summary Multiprotein complexes govern virtually all cellular processes. Their 3D structures provide important clues to their biological roles, especially through structural correlations among protein molecules and complexes. The detection of such correlations generally requires comprehensive searches in databases of known protein structures by means of appropriate structure-matching techniques. Here, we present a high-speed structure search engine capable of instantly matching large protein oligomers against the complete and up-to-date database of biologically functional assemblies of protein molecules. We use this tool to reveal unseen structural correlations on the level of protein quaternary structure and demonstrate its general usefulness for efficiently exploring complex structural relationships among known protein assemblies. PMID:24954616
MassSieve: Panning MS/MS peptide data for proteins

PubMed Central

Slotta, Douglas J.; McFarland, Melinda A.; Markey, Sanford P.

2010-01-01

We present MassSieve, a Java-based platform for visualization and parsimony analysis of single and comparative LC-MS/MS database search engine results. The success of mass spectrometric peptide sequence assignment algorithms has led to the need for a tool to merge and evaluate the increasing data set sizes that result from LC-MS/MS-based shotgun proteomic experiments. MassSieve supports reports from multiple search engines with differing search characteristics, which can increase peptide sequence coverage and/or identify conflicting or ambiguous spectral assignments. PMID:20564260
A neotropical Miocene pollen database employing image-based search and semantic modeling.

PubMed

Han, Jing Ginger; Cao, Hongfei; Barb, Adrian; Punyasena, Surangi W; Jaramillo, Carlos; Shyu, Chi-Ren

2014-08-01

Digital microscopic pollen images are being generated with increasing speed and volume, producing opportunities to develop new computational methods that increase the consistency and efficiency of pollen analysis and provide the palynological community a computational framework for information sharing and knowledge transfer. • Mathematical methods were used to assign trait semantics (abstract morphological representations) of the images of neotropical Miocene pollen and spores. Advanced database-indexing structures were built to compare and retrieve similar images based on their visual content. A Web-based system was developed to provide novel tools for automatic trait semantic annotation and image retrieval by trait semantics and visual content. • Mathematical models that map visual features to trait semantics can be used to annotate images with morphology semantics and to search image databases with improved reliability and productivity. Images can also be searched by visual content, providing users with customized emphases on traits such as color, shape, and texture. • Content- and semantic-based image searches provide a powerful computational platform for pollen and spore identification. The infrastructure outlined provides a framework for building a community-wide palynological resource, streamlining the process of manual identification, analysis, and species discovery.
Influenza Virus Database (IVDB): an integrated information resource and analysis platform for influenza virus research.

PubMed

Chang, Suhua; Zhang, Jiajie; Liao, Xiaoyun; Zhu, Xinxing; Wang, Dahai; Zhu, Jiang; Feng, Tao; Zhu, Baoli; Gao, George F; Wang, Jian; Yang, Huanming; Yu, Jun; Wang, Jing

2007-01-01

Frequent outbreaks of highly pathogenic avian influenza and the increasing data available for comparative analysis require a central database specialized in influenza viruses (IVs). We have established the Influenza Virus Database (IVDB) to integrate information and create an analysis platform for genetic, genomic, and phylogenetic studies of the virus. IVDB hosts complete genome sequences of influenza A virus generated by Beijing Institute of Genomics (BIG) and curates all other published IV sequences after expert annotation. Our Q-Filter system classifies and ranks all nucleotide sequences into seven categories according to sequence content and integrity. IVDB provides a series of tools and viewers for comparative analysis of the viral genomes, genes, genetic polymorphisms and phylogenetic relationships. A search system has been developed for users to retrieve a combination of different data types by setting search options. To facilitate analysis of global viral transmission and evolution, the IV Sequence Distribution Tool (IVDT) has been developed to display the worldwide geographic distribution of chosen viral genotypes and to couple genomic data with epidemiological data. The BLAST, multiple sequence alignment and phylogenetic analysis tools were integrated for online data analysis. Furthermore, IVDB offers instant access to pre-computed alignments and polymorphisms of IV genes and proteins, and presents the results as SNP distribution plots and minor allele distributions. IVDB is publicly available at http://influenza.genomics.org.cn.
A Guided Tour of Saada

NASA Astrophysics Data System (ADS)

Michel, L.; Motch, C.; Nguyen Ngoc, H.; Pineau, F. X.

2009-09-01

Saada (http://amwdb.u-strasbg.fr/saada) is a tool for helping astronomers build local archives without writing any code (Michel et al. 2004). Databases created by Saada can host collections of heterogeneous data files. These data collections can also be published in the VO. An overview of the main Saada features is presented in this demo: creation of a basic database, creation of relationships, data searches using SaadaQL, metadata tagging, and use of VO services.
BioMart Central Portal: an open database network for the biological community

PubMed Central

Guberman, Jonathan M.; Ai, J.; Arnaiz, O.; Baran, Joachim; Blake, Andrew; Baldock, Richard; Chelala, Claude; Croft, David; Cros, Anthony; Cutts, Rosalind J.; Di Génova, A.; Forbes, Simon; Fujisawa, T.; Gadaleta, E.; Goodstein, D. M.; Gundem, Gunes; Haggarty, Bernard; Haider, Syed; Hall, Matthew; Harris, Todd; Haw, Robin; Hu, S.; Hubbard, Simon; Hsu, Jack; Iyer, Vivek; Jones, Philip; Katayama, Toshiaki; Kinsella, R.; Kong, Lei; Lawson, Daniel; Liang, Yong; Lopez-Bigas, Nuria; Luo, J.; Lush, Michael; Mason, Jeremy; Moreews, Francois; Ndegwa, Nelson; Oakley, Darren; Perez-Llamas, Christian; Primig, Michael; Rivkin, Elena; Rosanoff, S.; Shepherd, Rebecca; Simon, Reinhard; Skarnes, B.; Smedley, Damian; Sperling, Linda; Spooner, William; Stevenson, Peter; Stone, Kevin; Teague, J.; Wang, Jun; Wang, Jianxin; Whitty, Brett; Wong, D. T.; Wong-Erasmus, Marie; Yao, L.; Youens-Clark, Ken; Yung, Christina; Zhang, Junjun; Kasprzyk, Arek

2011-01-01

BioMart Central Portal is a first of its kind, community-driven effort to provide unified access to dozens of biological databases spanning genomics, proteomics, model organisms, cancer data, ontology information and more. Anybody can contribute an independently maintained resource to the Central Portal, allowing it to be exposed to and shared with the research community, and linking it with the other resources in the portal. Users can take advantage of the common interface to quickly utilize different sources without learning a new system for each. The system also simplifies cross-database searches that might otherwise require several complicated steps. Several integrated tools streamline common tasks, such as converting between ID formats and retrieving sequences. The combination of a wide variety of databases, an easy-to-use interface, robust programmatic access and the array of tools make Central Portal a one-stop shop for biological data querying. Here, we describe the structure of Central Portal and show example queries to demonstrate its capabilities. Database URL: http://central.biomart.org. PMID:21930507
PoMaMo--a comprehensive database for potato genome data.

PubMed

Meyer, Svenja; Nagel, Axel; Gebhardt, Christiane

2005-01-01

A database for potato genome data (PoMaMo, Potato Maps and More) was established. The database contains molecular maps of all twelve potato chromosomes with about 1000 mapped elements, sequence data, putative gene functions, results from BLAST analysis, SNP and InDel information from different diploid and tetraploid potato genotypes, publication references, links to other public databases like GenBank (http://www.ncbi.nlm.nih.gov/) or SGN (Solanaceae Genomics Network, http://www.sgn.cornell.edu/), etc. Flexible search and data visualization interfaces enable easy access to the data via internet (https://gabi.rzpd.de/PoMaMo.html). The Java servlet tool YAMB (Yet Another Map Browser) was designed to interactively display chromosomal maps. Maps can be zoomed in and out, and detailed information about mapped elements can be obtained by clicking on an element of interest. The GreenCards interface allows a text-based data search by marker-, sequence- or genotype name, by sequence accession number, gene function, BLAST Hit or publication reference. The PoMaMo database is a comprehensive database for different potato genome data, and to date the only database containing SNP and InDel data from diploid and tetraploid potato genotypes.
PoMaMo—a comprehensive database for potato genome data

PubMed Central

Meyer, Svenja; Nagel, Axel; Gebhardt, Christiane

2005-01-01

A database for potato genome data (PoMaMo, Potato Maps and More) was established. The database contains molecular maps of all twelve potato chromosomes with about 1000 mapped elements, sequence data, putative gene functions, results from BLAST analysis, SNP and InDel information from different diploid and tetraploid potato genotypes, publication references, links to other public databases like GenBank (http://www.ncbi.nlm.nih.gov/) or SGN (Solanaceae Genomics Network, http://www.sgn.cornell.edu/), etc. Flexible search and data visualization interfaces enable easy access to the data via internet (https://gabi.rzpd.de/PoMaMo.html). The Java servlet tool YAMB (Yet Another Map Browser) was designed to interactively display chromosomal maps. Maps can be zoomed in and out, and detailed information about mapped elements can be obtained by clicking on an element of interest. The GreenCards interface allows a text-based data search by marker-, sequence- or genotype name, by sequence accession number, gene function, BLAST Hit or publication reference. The PoMaMo database is a comprehensive database for different potato genome data, and to date the only database containing SNP and InDel data from diploid and tetraploid potato genotypes. PMID:15608284
Measurement properties of self-report physical activity assessment tools in stroke: a protocol for a systematic review

PubMed Central

Martins, Júlia Caetano; Aguiar, Larissa Tavares; Nadeau, Sylvie; Scianni, Aline Alvim; Teixeira-Salmela, Luci Fuscaldi; Faria, Christina Danielli Coelho de Morais

2017-01-01

Introduction Self-report physical activity assessment tools are commonly used for the evaluation of physical activity levels in individuals with stroke. A great variety of these tools have been developed and widely used in recent years, which justify the need to examine their measurement properties and clinical utility. Therefore, the main objectives of this systematic review are to examine the measurement properties and clinical utility of self-report measures of physical activity and discuss the strengths and limitations of the identified tools. Methods and analysis A systematic review of studies that investigated the measurement properties and/or clinical utility of self-report physical activity assessment tools in stroke will be conducted. Electronic searches will be performed in five databases: Medical Literature Analysis and Retrieval System Online (MEDLINE) (PubMed), Excerpta Medica Database (EMBASE), Physiotherapy Evidence Database (PEDro), Literatura Latino-Americana e do Caribe em Ciências da Saúde (LILACS) and Scientific Electronic Library Online (SciELO), followed by hand searches of the reference lists of the included studies. Two independent reviewers will screen all retrieve titles, abstracts, and full texts, according to the inclusion criteria and will also extract the data. A third reviewer will be referred to solve any disagreement. A descriptive summary of the included studies will contain the design, participants, as well as the characteristics, measurement properties, and clinical utility of the self-report tools. The methodological quality of the studies will be evaluated using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist and the clinical utility of the identified tools will be assessed considering predefined criteria. This systematic review will follow the Preferred Reporting Items for Systematic Review and Meta-Analyses (PRISMA) statement. Discussion This systematic review will provide an extensive review of the measurement properties and clinical utility of self-report physical activity assessment tools used in individuals with stroke, which would benefit clinicians and researchers. Trial registration number PROSPERO CRD42016037146. PMID:28193848
Application of ChemDraw NMR Tool: Correlation of Program-Generated 13C Chemical Shifts and pKa Values of para-Substituted Benzoic Acids

NASA Astrophysics Data System (ADS)

Wang, Hongyi

2005-09-01

An application of ChemDraw NMR Tool was demonstrated by correlation of program-generated 13 C NMR chemical shifts and p K a values of para-substituted benzoic acids. Experimental 13 C NMR chemical shifts were analyzed in the same way for comparison. The project can be used as an assignment at the end of the first-year organic chemistry course to review topics or explore new techniques: Hammett equation, acid base equilibrium theory, electronic nature of functional groups, inductive and resonance effects, structure reactivity relationship, NMR spectroscopy, literature search, database search, and ChemDraw software.
A geodata warehouse: Using denormalisation techniques as a tool for delivering spatially enabled integrated geological information to geologists

NASA Astrophysics Data System (ADS)

Kingdon, Andrew; Nayembil, Martin L.; Richardson, Anne E.; Smith, A. Graham

2016-11-01

New requirements to understand geological properties in three dimensions have led to the development of PropBase, a data structure and delivery tools to deliver this. At the BGS, relational database management systems (RDBMS) has facilitated effective data management using normalised subject-based database designs with business rules in a centralised, vocabulary controlled, architecture. These have delivered effective data storage in a secure environment. However, isolated subject-oriented designs prevented efficient cross-domain querying of datasets. Additionally, the tools provided often did not enable effective data discovery as they struggled to resolve the complex underlying normalised structures providing poor data access speeds. Users developed bespoke access tools to structures they did not fully understand sometimes delivering them incorrect results. Therefore, BGS has developed PropBase, a generic denormalised data structure within an RDBMS to store property data, to facilitate rapid and standardised data discovery and access, incorporating 2D and 3D physical and chemical property data, with associated metadata. This includes scripts to populate and synchronise the layer with its data sources through structured input and transcription standards. A core component of the architecture includes, an optimised query object, to deliver geoscience information from a structure equivalent to a data warehouse. This enables optimised query performance to deliver data in multiple standardised formats using a web discovery tool. Semantic interoperability is enforced through vocabularies combined from all data sources facilitating searching of related terms. PropBase holds 28.1 million spatially enabled property data points from 10 source databases incorporating over 50 property data types with a vocabulary set that includes 557 property terms. By enabling property data searches across multiple databases PropBase has facilitated new scientific research, previously considered impractical. PropBase is easily extended to incorporate 4D data (time series) and is providing a baseline for new "big data" monitoring projects.
ProtaBank: A repository for protein design and engineering data.

PubMed

Wang, Connie Y; Chang, Paul M; Ary, Marie L; Allen, Benjamin D; Chica, Roberto A; Mayo, Stephen L; Olafson, Barry D

2018-03-25

We present ProtaBank, a repository for storing, querying, analyzing, and sharing protein design and engineering data in an actively maintained and updated database. ProtaBank provides a format to describe and compare all types of protein mutational data, spanning a wide range of properties and techniques. It features a user-friendly web interface and programming layer that streamlines data deposition and allows for batch input and queries. The database schema design incorporates a standard format for reporting protein sequences and experimental data that facilitates comparison of results across different data sets. A suite of analysis and visualization tools are provided to facilitate discovery, to guide future designs, and to benchmark and train new predictive tools and algorithms. ProtaBank will provide a valuable resource to the protein engineering community by storing and safeguarding newly generated data, allowing for fast searching and identification of relevant data from the existing literature, and exploring correlations between disparate data sets. ProtaBank invites researchers to contribute data to the database to make it accessible for search and analysis. ProtaBank is available at https://protabank.org. © 2018 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
The availability and effectiveness of tools supporting shared decision making in metastatic breast cancer care: a review.

PubMed

Spronk, Inge; Burgers, Jako S; Schellevis, François G; van Vliet, Liesbeth M; Korevaar, Joke C

2018-05-11

Shared decision-making (SDM) in the management of metastatic breast cancer care is associated with positive patient outcomes. In daily clinical practice, however, SDM is not fully integrated yet. Initiatives to improve the implementation of SDM would be helpful. The aim of this review was to assess the availability and effectiveness of tools supporting SDM in metastatic breast cancer care. Literature databases were systematically searched for articles published since 2006 focusing on the development or evaluation of tools to improve information-provision and to support decision-making in metastatic breast cancer care. Internet searches and experts identified additional tools. Data from included tools were extracted and the evaluation of tools was appraised using the GRADE grading system. The literature search yielded five instruments. In addition, two tools were identified via internet searches and consultation of experts. Four tools were specifically developed for supporting SDM in metastatic breast cancer, the other three tools focused on metastatic cancer in general. Tools were mainly applicable across the care process, and usable for decisions on supportive care with or without chemotherapy. All tools were designed for patients to be used before a consultation with the physician. Effects on patient outcomes were generally weakly positive although most tools were not studied in well-designed studies. Despite its recognized importance, only two tools were positively evaluated on effectiveness and are available to support patients with metastatic breast cancer in SDM. These tools show promising results in pilot studies and focus on different aspects of care. However, their effectiveness should be confirmed in well-designed studies before implementation in clinical practice. Innovation and development of SDM tools targeting clinicians as well as patients during a clinical encounter is recommended.

The National NeuroAIDS Tissue Consortium (NNTC) Database: an integrated database for HIV-related studies

PubMed Central

Cserhati, Matyas F.; Pandey, Sanjit; Beaudoin, James J.; Baccaglini, Lorena; Guda, Chittibabu; Fox, Howard S.

2015-01-01

We herein present the National NeuroAIDS Tissue Consortium-Data Coordinating Center (NNTC-DCC) database, which is the only available database for neuroAIDS studies that contains data in an integrated, standardized form. This database has been created in conjunction with the NNTC, which provides human tissue and biofluid samples to individual researchers to conduct studies focused on neuroAIDS. The database contains experimental datasets from 1206 subjects for the following categories (which are further broken down into subcategories): gene expression, genotype, proteins, endo-exo-chemicals, morphometrics and other (miscellaneous) data. The database also contains a wide variety of downloadable data and metadata for 95 HIV-related studies covering 170 assays from 61 principal investigators. The data represent 76 tissue types, 25 measurement types, and 38 technology types, and reaches a total of 33 017 407 data points. We used the ISA platform to create the database and develop a searchable web interface for querying the data. A gene search tool is also available, which searches for NCBI GEO datasets associated with selected genes. The database is manually curated with many user-friendly features, and is cross-linked to the NCBI, HUGO and PubMed databases. A free registration is required for qualified users to access the database. Database URL: http://nntc-dcc.unmc.edu PMID:26228431
maxdLoad2 and maxdBrowse: standards-compliant tools for microarray experimental annotation, data management and dissemination.

PubMed

Hancock, David; Wilson, Michael; Velarde, Giles; Morrison, Norman; Hayes, Andrew; Hulme, Helen; Wood, A Joseph; Nashar, Karim; Kell, Douglas B; Brass, Andy

2005-11-03

maxdLoad2 is a relational database schema and Java application for microarray experimental annotation and storage. It is compliant with all standards for microarray meta-data capture; including the specification of what data should be recorded, extensive use of standard ontologies and support for data exchange formats. The output from maxdLoad2 is of a form acceptable for submission to the ArrayExpress microarray repository at the European Bioinformatics Institute. maxdBrowse is a PHP web-application that makes contents of maxdLoad2 databases accessible via web-browser, the command-line and web-service environments. It thus acts as both a dissemination and data-mining tool. maxdLoad2 presents an easy-to-use interface to an underlying relational database and provides a full complement of facilities for browsing, searching and editing. There is a tree-based visualization of data connectivity and the ability to explore the links between any pair of data elements, irrespective of how many intermediate links lie between them. Its principle novel features are: the flexibility of the meta-data that can be captured, the tools provided for importing data from spreadsheets and other tabular representations, the tools provided for the automatic creation of structured documents, the ability to browse and access the data via web and web-services interfaces. Within maxdLoad2 it is very straightforward to customise the meta-data that is being captured or change the definitions of the meta-data. These meta-data definitions are stored within the database itself allowing client software to connect properly to a modified database without having to be specially configured. The meta-data definitions (configuration file) can also be centralized allowing changes made in response to revisions of standards or terminologies to be propagated to clients without user intervention.maxdBrowse is hosted on a web-server and presents multiple interfaces to the contents of maxd databases. maxdBrowse emulates many of the browse and search features available in the maxdLoad2 application via a web-browser. This allows users who are not familiar with maxdLoad2 to browse and export microarray data from the database for their own analysis. The same browse and search features are also available via command-line and SOAP server interfaces. This both enables scripting of data export for use embedded in data repositories and analysis environments, and allows access to the maxd databases via web-service architectures. maxdLoad2 http://www.bioinf.man.ac.uk/microarray/maxd/ and maxdBrowse http://dbk.ch.umist.ac.uk/maxdBrowse are portable and compatible with all common operating systems and major database servers. They provide a powerful, flexible package for annotation of microarray experiments and a convenient dissemination environment. They are available for download and open sourced under the Artistic License.
Database resources of the National Center for Biotechnology Information.

PubMed

Sayers, Eric W; Barrett, Tanya; Benson, Dennis A; Bolton, Evan; Bryant, Stephen H; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M; DiCuccio, Michael; Federhen, Scott; Feolo, Michael; Fingerman, Ian M; Geer, Lewis Y; Helmberg, Wolfgang; Kapustin, Yuri; Landsman, David; Lipman, David J; Lu, Zhiyong; Madden, Thomas L; Madej, Tom; Maglott, Donna R; Marchler-Bauer, Aron; Miller, Vadim; Mizrachi, Ilene; Ostell, James; Panchenko, Anna; Phan, Lon; Pruitt, Kim D; Schuler, Gregory D; Sequeira, Edwin; Sherry, Stephen T; Shumway, Martin; Sirotkin, Karl; Slotta, Douglas; Souvorov, Alexandre; Starchenko, Grigory; Tatusova, Tatiana A; Wagner, Lukas; Wang, Yanli; Wilbur, W John; Yaschenko, Eugene; Ye, Jian

2011-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central (PMC), Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Primer-BLAST, COBALT, Electronic PCR, OrfFinder, Splign, ProSplign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, Cancer Chromosomes, Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Trace Archive, Sequence Read Archive, Retroviral Genotyping Tools, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus (GEO), Entrez Probe, GENSAT, Online Mendelian Inheritance in Man (OMIM), Online Mendelian Inheritance in Animals (OMIA), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), IBIS, Biosystems, Peptidome, OMSSA, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.
The antiSMASH database, a comprehensive database of microbial secondary metabolite biosynthetic gene clusters.

PubMed

Blin, Kai; Medema, Marnix H; Kottmann, Renzo; Lee, Sang Yup; Weber, Tilmann

2017-01-04

Secondary metabolites produced by microorganisms are the main source of bioactive compounds that are in use as antimicrobial and anticancer drugs, fungicides, herbicides and pesticides. In the last decade, the increasing availability of microbial genomes has established genome mining as a very important method for the identification of their biosynthetic gene clusters (BGCs). One of the most popular tools for this task is antiSMASH. However, so far, antiSMASH is limited to de novo computing results for user-submitted genomes and only partially connects these with BGCs from other organisms. Therefore, we developed the antiSMASH database, a simple but highly useful new resource to browse antiSMASH-annotated BGCs in the currently 3907 bacterial genomes in the database and perform advanced search queries combining multiple search criteria. antiSMASH-DB is available at http://antismash-db.secondarymetabolites.org/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Interactive Multi-Instrument Database of Solar Flares

NASA Technical Reports Server (NTRS)

Ranjan, Shubha S.; Spaulding, Ryan; Deardorff, Donald G.

2018-01-01

The fundamental motivation of the project is that the scientific output of solar research can be greatly enhanced by better exploitation of the existing solar/heliosphere space-data products jointly with ground-based observations. Our primary focus is on developing a specific innovative methodology based on recent advances in "big data" intelligent databases applied to the growing amount of high-spatial and multi-wavelength resolution, high-cadence data from NASA's missions and supporting ground-based observatories. Our flare database is not simply a manually searchable time-based catalog of events or list of web links pointing to data. It is a preprocessed metadata repository enabling fast search and automatic identification of all recorded flares sharing a specifiable set of characteristics, features, and parameters. The result is a new and unique database of solar flares and data search and classification tools for the Heliophysics community, enabling multi-instrument/multi-wavelength investigations of flare physics and supporting further development of flare-prediction methodologies.
Improved orthologous databases to ease protozoan targets inference.

PubMed

Kotowski, Nelson; Jardim, Rodrigo; Dávila, Alberto M R

2015-09-29

Homology inference helps on identifying similarities, as well as differences among organisms, which provides a better insight on how closely related one might be to another. In addition, comparative genomics pipelines are widely adopted tools designed using different bioinformatics applications and algorithms. In this article, we propose a methodology to build improved orthologous databases with the potential to aid on protozoan target identification, one of the many tasks which benefit from comparative genomics tools. Our analyses are based on OrthoSearch, a comparative genomics pipeline originally designed to infer orthologs through protein-profile comparison, supported by an HMM, reciprocal best hits based approach. Our methodology allows OrthoSearch to confront two orthologous databases and to generate an improved new one. Such can be later used to infer potential protozoan targets through a similarity analysis against the human genome. The protein sequences of Cryptosporidium hominis, Entamoeba histolytica and Leishmania infantum genomes were comparatively analyzed against three orthologous databases: (i) EggNOG KOG, (ii) ProtozoaDB and (iii) Kegg Orthology (KO). That allowed us to create two new orthologous databases, "KO + EggNOG KOG" and "KO + EggNOG KOG + ProtozoaDB", with 16,938 and 27,701 orthologous groups, respectively. Such new orthologous databases were used for a regular OrthoSearch run. By confronting "KO + EggNOG KOG" and "KO + EggNOG KOG + ProtozoaDB" databases and protozoan species we were able to detect the following total of orthologous groups and coverage (relation between the inferred orthologous groups and the species total number of proteins): Cryptosporidium hominis: 1,821 (11 %) and 3,254 (12 %); Entamoeba histolytica: 2,245 (13 %) and 5,305 (19 %); Leishmania infantum: 2,702 (16 %) and 4,760 (17 %). Using our HMM-based methodology and the largest created orthologous database, it was possible to infer 13 orthologous groups which represent potential protozoan targets; these were found because of our distant homology approach. We also provide the number of species-specific, pair-to-pair and core groups from such analyses, depicted in Venn diagrams. The orthologous databases generated by our HMM-based methodology provide a broader dataset, with larger amounts of orthologous groups when compared to the original databases used as input. Those may be used for several homology inference analyses, annotation tasks and protozoan targets identification.
TokSearch: A search engine for fusion experimental data

DOE PAGES

Sammuli, Brian S.; Barr, Jayson L.; Eidietis, Nicholas W.; ...

2018-04-01

At a typical fusion research site, experimental data is stored using archive technologies that deal with each discharge as an independent set of data. These technologies (e.g. MDSplus or HDF5) are typically supplemented with a database that aggregates metadata for multiple shots to allow for efficient querying of certain predefined quantities. Often, however, a researcher will need to extract information from the archives, possibly for many shots, that is not available in the metadata store or otherwise indexed for quick retrieval. To address this need, a new search tool called TokSearch has been added to the General Atomics TokSys controlmore » design and analysis suite [1]. This tool provides the ability to rapidly perform arbitrary, parallelized queries of archived tokamak shot data (both raw and analyzed) over large numbers of shots. The TokSearch query API borrows concepts from SQL, and users can choose to implement queries in either MatlabTM or Python.« less
TokSearch: A search engine for fusion experimental data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sammuli, Brian S.; Barr, Jayson L.; Eidietis, Nicholas W.

At a typical fusion research site, experimental data is stored using archive technologies that deal with each discharge as an independent set of data. These technologies (e.g. MDSplus or HDF5) are typically supplemented with a database that aggregates metadata for multiple shots to allow for efficient querying of certain predefined quantities. Often, however, a researcher will need to extract information from the archives, possibly for many shots, that is not available in the metadata store or otherwise indexed for quick retrieval. To address this need, a new search tool called TokSearch has been added to the General Atomics TokSys controlmore » design and analysis suite [1]. This tool provides the ability to rapidly perform arbitrary, parallelized queries of archived tokamak shot data (both raw and analyzed) over large numbers of shots. The TokSearch query API borrows concepts from SQL, and users can choose to implement queries in either MatlabTM or Python.« less
Technology in Science and Mathematics Education.

ERIC Educational Resources Information Center

Buccino, Alphonse

Provided are several perspectives on technology, addressing changes in learners related to technology, changes in contemporary life related to technology, and changes in subject areas related to technology (indicating that technology has created such new tools for inquiry as computer programming, word processing, online database searches, and…
Peptide Mass Fingerprinting of Egg White Proteins

ERIC Educational Resources Information Center

Alty, Lisa T.; LaRiviere, Frederick J.

2016-01-01

Use of advanced mass spectrometry techniques in the undergraduate setting has burgeoned in the past decade. However, relatively few undergraduate experiments examine the proteomics tools of protein digestion, peptide accurate mass determination, and database searching, also known as peptide mass fingerprinting. In this experiment, biochemistry…
Keeping Up with the Internet.

ERIC Educational Resources Information Center

Dealy, Jacqueline

1994-01-01

Offers instructions and resources for Internet novices wanting to access Internet services. Instructions are offered for connecting to 13 education listservs, 9 electronic journals and newsletters, 3 education databases, 7 Telnet gopher sites, Veronica and Archie search tools, and File Transfer Protocol (FTP). (Contains 16 references.) (SLW)
Advanced Composition and the Computerized Library.

ERIC Educational Resources Information Center

Hult, Christine

1989-01-01

Discusses four kinds of computerized access tools: online catalogs; computerized reference; online database searching; and compact disks and read only memory (CD-ROM). Examines how these technologies are changing research. Suggests how research instruction in advanced writing courses can be refocused to include the new technologies. (RS)
ALDB: a domestic-animal long noncoding RNA database.

PubMed

Li, Aimin; Zhang, Junying; Zhou, Zhongyin; Wang, Lei; Liu, Yujuan; Liu, Yajun

2015-01-01

Long noncoding RNAs (lncRNAs) have attracted significant attention in recent years due to their important roles in many biological processes. Domestic animals constitute a unique resource for understanding the genetic basis of phenotypic variation and are ideal models relevant to diverse areas of biomedical research. With improving sequencing technologies, numerous domestic-animal lncRNAs are now available. Thus, there is an immediate need for a database resource that can assist researchers to store, organize, analyze and visualize domestic-animal lncRNAs. The domestic-animal lncRNA database, named ALDB, is the first comprehensive database with a focus on the domestic-animal lncRNAs. It currently archives 12,103 pig intergenic lncRNAs (lincRNAs), 8,923 chicken lincRNAs and 8,250 cow lincRNAs. In addition to the annotations of lincRNAs, it offers related data that is not available yet in existing lncRNA databases (lncRNAdb and NONCODE), such as genome-wide expression profiles and animal quantitative trait loci (QTLs) of domestic animals. Moreover, a collection of interfaces and applications, such as the Basic Local Alignment Search Tool (BLAST), the Generic Genome Browser (GBrowse) and flexible search functionalities, are available to help users effectively explore, analyze and download data related to domestic-animal lncRNAs. ALDB enables the exploration and comparative analysis of lncRNAs in domestic animals. A user-friendly web interface, integrated information and tools make it valuable to researchers in their studies. ALDB is freely available from http://res.xaut.edu.cn/aldb/index.jsp.
Protein Identification Using Top-Down Spectra*

PubMed Central

Liu, Xiaowen; Sirotkin, Yakov; Shen, Yufeng; Anderson, Gordon; Tsai, Yihsuan S.; Ting, Ying S.; Goodlett, David R.; Smith, Richard D.; Bafna, Vineet; Pevzner, Pavel A.

2012-01-01

In the last two years, because of advances in protein separation and mass spectrometry, top-down mass spectrometry moved from analyzing single proteins to analyzing complex samples and identifying hundreds and even thousands of proteins. However, computational tools for database search of top-down spectra against protein databases are still in their infancy. We describe MS-Align+, a fast algorithm for top-down protein identification based on spectral alignment that enables searches for unexpected post-translational modifications. We also propose a method for evaluating statistical significance of top-down protein identifications and further benchmark various software tools on two top-down data sets from Saccharomyces cerevisiae and Salmonella typhimurium. We demonstrate that MS-Align+ significantly increases the number of identified spectra as compared with MASCOT and OMSSA on both data sets. Although MS-Align+ and ProSightPC have similar performance on the Salmonella typhimurium data set, MS-Align+ outperforms ProSightPC on the (more complex) Saccharomyces cerevisiae data set. PMID:22027200
NeisseriaBase: a specialised Neisseria genomic resource and analysis platform.

PubMed

Zheng, Wenning; Mutha, Naresh V R; Heydari, Hamed; Dutta, Avirup; Siow, Cheuk Chuen; Jakubovics, Nicholas S; Wee, Wei Yee; Tan, Shi Yang; Ang, Mia Yang; Wong, Guat Jah; Choo, Siew Woh

2016-01-01

Background. The gram-negative Neisseria is associated with two of the most potent human epidemic diseases: meningococcal meningitis and gonorrhoea. In both cases, disease is caused by bacteria colonizing human mucosal membrane surfaces. Overall, the genus shows great diversity and genetic variation mainly due to its ability to acquire and incorporate genetic material from a diverse range of sources through horizontal gene transfer. Although a number of databases exist for the Neisseria genomes, they are mostly focused on the pathogenic species. In this present study we present the freely available NeisseriaBase, a database dedicated to the genus Neisseria encompassing the complete and draft genomes of 15 pathogenic and commensal Neisseria species. Methods. The genomic data were retrieved from National Center for Biotechnology Information (NCBI) and annotated using the RAST server which were then stored into the MySQL database. The protein-coding genes were further analyzed to obtain information such as calculation of GC content (%), predicted hydrophobicity and molecular weight (Da) using in-house Perl scripts. The web application was developed following the secure four-tier web application architecture: (1) client workstation, (2) web server, (3) application server, and (4) database server. The web interface was constructed using PHP, JavaScript, jQuery, AJAX and CSS, utilizing the model-view-controller (MVC) framework. The in-house developed bioinformatics tools implemented in NeisseraBase were developed using Python, Perl, BioPerl and R languages. Results. Currently, NeisseriaBase houses 603,500 Coding Sequences (CDSs), 16,071 RNAs and 13,119 tRNA genes from 227 Neisseria genomes. The database is equipped with interactive web interfaces. Incorporation of the JBrowse genome browser in the database enables fast and smooth browsing of Neisseria genomes. NeisseriaBase includes the standard BLAST program to facilitate homology searching, and for Virulence Factor Database (VFDB) specific homology searches, the VFDB BLAST is also incorporated into the database. In addition, NeisseriaBase is equipped with in-house designed tools such as the Pairwise Genome Comparison tool (PGC) for comparative genomic analysis and the Pathogenomics Profiling Tool (PathoProT) for the comparative pathogenomics analysis of Neisseria strains. Discussion. This user-friendly database not only provides access to a host of genomic resources on Neisseria but also enables high-quality comparative genome analysis, which is crucial for the expanding scientific community interested in Neisseria research. This database is freely available at http://neisseria.um.edu.my.
NeisseriaBase: a specialised Neisseria genomic resource and analysis platform

PubMed Central

Zheng, Wenning; Mutha, Naresh V.R.; Heydari, Hamed; Dutta, Avirup; Siow, Cheuk Chuen; Jakubovics, Nicholas S.; Wee, Wei Yee; Tan, Shi Yang; Ang, Mia Yang; Wong, Guat Jah

2016-01-01

Background. The gram-negative Neisseria is associated with two of the most potent human epidemic diseases: meningococcal meningitis and gonorrhoea. In both cases, disease is caused by bacteria colonizing human mucosal membrane surfaces. Overall, the genus shows great diversity and genetic variation mainly due to its ability to acquire and incorporate genetic material from a diverse range of sources through horizontal gene transfer. Although a number of databases exist for the Neisseria genomes, they are mostly focused on the pathogenic species. In this present study we present the freely available NeisseriaBase, a database dedicated to the genus Neisseria encompassing the complete and draft genomes of 15 pathogenic and commensal Neisseria species. Methods. The genomic data were retrieved from National Center for Biotechnology Information (NCBI) and annotated using the RAST server which were then stored into the MySQL database. The protein-coding genes were further analyzed to obtain information such as calculation of GC content (%), predicted hydrophobicity and molecular weight (Da) using in-house Perl scripts. The web application was developed following the secure four-tier web application architecture: (1) client workstation, (2) web server, (3) application server, and (4) database server. The web interface was constructed using PHP, JavaScript, jQuery, AJAX and CSS, utilizing the model-view-controller (MVC) framework. The in-house developed bioinformatics tools implemented in NeisseraBase were developed using Python, Perl, BioPerl and R languages. Results. Currently, NeisseriaBase houses 603,500 Coding Sequences (CDSs), 16,071 RNAs and 13,119 tRNA genes from 227 Neisseria genomes. The database is equipped with interactive web interfaces. Incorporation of the JBrowse genome browser in the database enables fast and smooth browsing of Neisseria genomes. NeisseriaBase includes the standard BLAST program to facilitate homology searching, and for Virulence Factor Database (VFDB) specific homology searches, the VFDB BLAST is also incorporated into the database. In addition, NeisseriaBase is equipped with in-house designed tools such as the Pairwise Genome Comparison tool (PGC) for comparative genomic analysis and the Pathogenomics Profiling Tool (PathoProT) for the comparative pathogenomics analysis of Neisseria strains. Discussion. This user-friendly database not only provides access to a host of genomic resources on Neisseria but also enables high-quality comparative genome analysis, which is crucial for the expanding scientific community interested in Neisseria research. This database is freely available at http://neisseria.um.edu.my. PMID:27017950
Using PHP/MySQL to Manage Potential Mass Impacts

NASA Technical Reports Server (NTRS)

Hager, Benjamin I.

2010-01-01

This paper presents a new application using commercially available software to manage mass properties for spaceflight vehicles. PHP/MySQL(PHP: Hypertext Preprocessor and My Structured Query Language) are a web scripting language and a database language commonly used in concert with each other. They open up new opportunities to develop cutting edge mass properties tools, and in particular, tools for the management of potential mass impacts (threats and opportunities). The paper begins by providing an overview of the functions and capabilities of PHP/MySQL. The focus of this paper is on how PHP/MySQL are being used to develop an advanced "web accessible" database system for identifying and managing mass impacts on NASA's Ares I Upper Stage program, managed by the Marshall Space Flight Center. To fully describe this application, examples of the data, search functions, and views are provided to promote, not only the function, but the security, ease of use, simplicity, and eye-appeal of this new application. This paper concludes with an overview of the other potential mass properties applications and tools that could be developed using PHP/MySQL. The premise behind this paper is that PHP/MySQL are software tools that are easy to use and readily available for the development of cutting edge mass properties applications. These tools are capable of providing "real-time" searching and status of an active database, automated report generation, and other capabilities to streamline and enhance mass properties management application. By using PHP/MySQL, proven existing methods for managing mass properties can be adapted to present-day information technology to accelerate mass properties data gathering, analysis, and reporting, allowing mass property management to keep pace with today's fast-pace design and development processes.
Engineering With Nature Geographic Project Mapping Tool (EWN ProMap)

DTIC Science & Technology

2015-07-01

EWN ProMap database provides numerous case studies for infrastructure projects such as breakwaters, river engineering dikes, and seawalls that have...the EWN Project Mapping Tool (EWN ProMap) is to assist users in their search for case study information that can be valuable for developing EWN ideas...Essential elements of EWN include: (1) using science and engineering to produce operational efficiencies supporting sustainable delivery of
Efficient RNA structure comparison algorithms.

PubMed

Arslan, Abdullah N; Anandan, Jithendar; Fry, Eric; Monschke, Keith; Ganneboina, Nitin; Bowerman, Jason

2017-12-01

Recently proposed relative addressing-based ([Formula: see text]) RNA secondary structure representation has important features by which an RNA structure database can be stored into a suffix array. A fast substructure search algorithm has been proposed based on binary search on this suffix array. Using this substructure search algorithm, we present a fast algorithm that finds the largest common substructure of given multiple RNA structures in [Formula: see text] format. The multiple RNA structure comparison problem is NP-hard in its general formulation. We introduced a new problem for comparing multiple RNA structures. This problem has more strict similarity definition and objective, and we propose an algorithm that solves this problem efficiently. We also develop another comparison algorithm that iteratively calls this algorithm to locate nonoverlapping large common substructures in compared RNAs. With the new resulting tools, we improved the RNASSAC website (linked from http://faculty.tamuc.edu/aarslan ). This website now also includes two drawing tools: one specialized for preparing RNA substructures that can be used as input by the search tool, and another one for automatically drawing the entire RNA structure from a given structure sequence.
E-MSD: an integrated data resource for bioinformatics.

PubMed

Golovin, A; Oldfield, T J; Tate, J G; Velankar, S; Barton, G J; Boutselakis, H; Dimitropoulos, D; Fillon, J; Hussain, A; Ionides, J M C; John, M; Keller, P A; Krissinel, E; McNeil, P; Naim, A; Newman, R; Pajon, A; Pineda, J; Rachedi, A; Copeland, J; Sitnov, A; Sobhany, S; Suarez-Uruena, A; Swaminathan, G J; Tagari, M; Tromm, S; Vranken, W; Henrick, K

2004-01-01

The Macromolecular Structure Database (MSD) group (http://www.ebi.ac.uk/msd/) continues to enhance the quality and consistency of macromolecular structure data in the Protein Data Bank (PDB) and to work towards the integration of various bioinformatics data resources. We have implemented a simple form-based interface that allows users to query the MSD directly. The MSD 'atlas pages' show all of the information in the MSD for a particular PDB entry. The group has designed new search interfaces aimed at specific areas of interest, such as the environment of ligands and the secondary structures of proteins. We have also implemented a novel search interface that begins to integrate separate MSD search services in a single graphical tool. We have worked closely with collaborators to build a new visualization tool that can present both structure and sequence data in a unified interface, and this data viewer is now used throughout the MSD services for the visualization and presentation of search results. Examples showcasing the functionality and power of these tools are available from tutorial webpages (http://www. ebi.ac.uk/msd-srv/docs/roadshow_tutorial/).

E-MSD: an integrated data resource for bioinformatics

PubMed Central

Golovin, A.; Oldfield, T. J.; Tate, J. G.; Velankar, S.; Barton, G. J.; Boutselakis, H.; Dimitropoulos, D.; Fillon, J.; Hussain, A.; Ionides, J. M. C.; John, M.; Keller, P. A.; Krissinel, E.; McNeil, P.; Naim, A.; Newman, R.; Pajon, A.; Pineda, J.; Rachedi, A.; Copeland, J.; Sitnov, A.; Sobhany, S.; Suarez-Uruena, A.; Swaminathan, G. J.; Tagari, M.; Tromm, S.; Vranken, W.; Henrick, K.

2004-01-01

The Macromolecular Structure Database (MSD) group (http://www.ebi.ac.uk/msd/) continues to enhance the quality and consistency of macromolecular structure data in the Protein Data Bank (PDB) and to work towards the integration of various bioinformatics data resources. We have implemented a simple form-based interface that allows users to query the MSD directly. The MSD ‘atlas pages’ show all of the information in the MSD for a particular PDB entry. The group has designed new search interfaces aimed at specific areas of interest, such as the environment of ligands and the secondary structures of proteins. We have also implemented a novel search interface that begins to integrate separate MSD search services in a single graphical tool. We have worked closely with collaborators to build a new visualization tool that can present both structure and sequence data in a unified interface, and this data viewer is now used throughout the MSD services for the visualization and presentation of search results. Examples showcasing the functionality and power of these tools are available from tutorial webpages (http://www.ebi.ac.uk/msd-srv/docs/roadshow_tutorial/). PMID:14681397
The National NeuroAIDS Tissue Consortium (NNTC) Database: an integrated database for HIV-related studies.

PubMed

Cserhati, Matyas F; Pandey, Sanjit; Beaudoin, James J; Baccaglini, Lorena; Guda, Chittibabu; Fox, Howard S

2015-01-01

We herein present the National NeuroAIDS Tissue Consortium-Data Coordinating Center (NNTC-DCC) database, which is the only available database for neuroAIDS studies that contains data in an integrated, standardized form. This database has been created in conjunction with the NNTC, which provides human tissue and biofluid samples to individual researchers to conduct studies focused on neuroAIDS. The database contains experimental datasets from 1206 subjects for the following categories (which are further broken down into subcategories): gene expression, genotype, proteins, endo-exo-chemicals, morphometrics and other (miscellaneous) data. The database also contains a wide variety of downloadable data and metadata for 95 HIV-related studies covering 170 assays from 61 principal investigators. The data represent 76 tissue types, 25 measurement types, and 38 technology types, and reaches a total of 33,017,407 data points. We used the ISA platform to create the database and develop a searchable web interface for querying the data. A gene search tool is also available, which searches for NCBI GEO datasets associated with selected genes. The database is manually curated with many user-friendly features, and is cross-linked to the NCBI, HUGO and PubMed databases. A free registration is required for qualified users to access the database. © The Author(s) 2015. Published by Oxford University Press.
Structator: fast index-based search for RNA sequence-structure patterns

PubMed Central

2011-01-01

Background The secondary structure of RNA molecules is intimately related to their function and often more conserved than the sequence. Hence, the important task of searching databases for RNAs requires to match sequence-structure patterns. Unfortunately, current tools for this task have, in the best case, a running time that is only linear in the size of sequence databases. Furthermore, established index data structures for fast sequence matching, like suffix trees or arrays, cannot benefit from the complementarity constraints introduced by the secondary structure of RNAs. Results We present a novel method and readily applicable software for time efficient matching of RNA sequence-structure patterns in sequence databases. Our approach is based on affix arrays, a recently introduced index data structure, preprocessed from the target database. Affix arrays support bidirectional pattern search, which is required for efficiently handling the structural constraints of the pattern. Structural patterns like stem-loops can be matched inside out, such that the loop region is matched first and then the pairing bases on the boundaries are matched consecutively. This allows to exploit base pairing information for search space reduction and leads to an expected running time that is sublinear in the size of the sequence database. The incorporation of a new chaining approach in the search of RNA sequence-structure patterns enables the description of molecules folding into complex secondary structures with multiple ordered patterns. The chaining approach removes spurious matches from the set of intermediate results, in particular of patterns with little specificity. In benchmark experiments on the Rfam database, our method runs up to two orders of magnitude faster than previous methods. Conclusions The presented method's sublinear expected running time makes it well suited for RNA sequence-structure pattern matching in large sequence databases. RNA molecules containing several stem-loop substructures can be described by multiple sequence-structure patterns and their matches are efficiently handled by a novel chaining method. Beyond our algorithmic contributions, we provide with Structator a complete and robust open-source software solution for index-based search of RNA sequence-structure patterns. The Structator software is available at http://www.zbh.uni-hamburg.de/Structator. PMID:21619640
Facilitating quality control for spectra assignments of small organic molecules: nmrshiftdb2--a free in-house NMR database with integrated LIMS for academic service laboratories.

PubMed

Kuhn, Stefan; Schlörer, Nils E

2015-08-01

nmrshiftdb2 supports with its laboratory information management system the integration of an electronic lab administration and management into academic NMR facilities. Also, it offers the setup of a local database, while full access to nmrshiftdb2's World Wide Web database is granted. This freely available system allows on the one hand the submission of orders for measurement, transfers recorded data automatically or manually, and enables download of spectra via web interface, as well as the integrated access to prediction, search, and assignment tools of the NMR database for lab users. On the other hand, for the staff and lab administration, flow of all orders can be supervised; administrative tools also include user and hardware management, a statistic functionality for accounting purposes, and a 'QuickCheck' function for assignment control, to facilitate quality control of assignments submitted to the (local) database. Laboratory information management system and database are based on a web interface as front end and are therefore independent of the operating system in use. Copyright © 2015 John Wiley & Sons, Ltd.
Database resources of the National Center for Biotechnology Information

PubMed Central

Acland, Abigail; Agarwala, Richa; Barrett, Tanya; Beck, Jeff; Benson, Dennis A.; Bollin, Colleen; Bolton, Evan; Bryant, Stephen H.; Canese, Kathi; Church, Deanna M.; Clark, Karen; DiCuccio, Michael; Dondoshansky, Ilya; Federhen, Scott; Feolo, Michael; Geer, Lewis Y.; Gorelenkov, Viatcheslav; Hoeppner, Marilu; Johnson, Mark; Kelly, Christopher; Khotomlianski, Viatcheslav; Kimchi, Avi; Kimelman, Michael; Kitts, Paul; Krasnov, Sergey; Kuznetsov, Anatoliy; Landsman, David; Lipman, David J.; Lu, Zhiyong; Madden, Thomas L.; Madej, Tom; Maglott, Donna R.; Marchler-Bauer, Aron; Karsch-Mizrachi, Ilene; Murphy, Terence; Ostell, James; O'Sullivan, Christopher; Panchenko, Anna; Phan, Lon; Pruitt, Don Preussm Kim D.; Rubinstein, Wendy; Sayers, Eric W.; Schneider, Valerie; Schuler, Gregory D.; Sequeira, Edwin; Sherry, Stephen T.; Shumway, Martin; Sirotkin, Karl; Siyan, Karanjit; Slotta, Douglas; Soboleva, Alexandra; Soussov, Vladimir; Starchenko, Grigory; Tatusova, Tatiana A.; Trawick, Bart W.; Vakatov, Denis; Wang, Yanli; Ward, Minghong; John Wilbur, W.; Yaschenko, Eugene; Zbicz, Kerry

2014-01-01

In addition to maintaining the GenBank® nucleic acid sequence database, the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI Web site. NCBI resources include Entrez, the Entrez Programming Utilities, MyNCBI, PubMed, PubMed Central, PubReader, Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link, Primer-BLAST, COBALT, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, dbVar, Epigenomics, the Genetic Testing Registry, Genome and related tools, the Map Viewer, Trace Archive, Sequence Read Archive, BioProject, BioSample, ClinVar, MedGen, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Probe, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool, Biosystems, Protein Clusters and the PubChem suite of small molecule databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All these resources can be accessed through the NCBI home page. PMID:24259429
IntegromeDB: an integrated system and biological search engine.

PubMed

Baitaluk, Michael; Kozhenkov, Sergey; Dubinina, Yulia; Ponomarenko, Julia

2012-01-19

With the growth of biological data in volume and heterogeneity, web search engines become key tools for researchers. However, general-purpose search engines are not specialized for the search of biological data. Here, we present an approach at developing a biological web search engine based on the Semantic Web technologies and demonstrate its implementation for retrieving gene- and protein-centered knowledge. The engine is available at http://www.integromedb.org. The IntegromeDB search engine allows scanning data on gene regulation, gene expression, protein-protein interactions, pathways, metagenomics, mutations, diseases, and other gene- and protein-related data that are automatically retrieved from publicly available databases and web pages using biological ontologies. To perfect the resource design and usability, we welcome and encourage community feedback.
Capturing citation activity in three health sciences departments: a comparison study of Scopus and Web of Science.

PubMed

Sarkozy, Alexandra; Slyman, Alison; Wu, Wendy

2015-01-01

Scopus and Web of Science are the two major citation databases that collect and disseminate bibliometric statistics about research articles, journals, institutions, and individual authors. Liaison librarians are now regularly called upon to utilize these databases to assist faculty in finding citation activity on their published works for tenure and promotion, grant applications, and more. But questions about the accuracy, scope, and coverage of these tools deserve closer scrutiny. Discrepancies in citation capture led to a systematic study on how Scopus and Web of Science compared in a real-life situation encountered by liaisons: comparing three different disciplines at a medical school and nursing program. How many articles would each database retrieve for each faculty member using the author-searching tools provided? How many cited references for each faculty member would each tool generate? Results demonstrated troubling differences in publication and citation activity capture between Scopus and Web of Science. Implications for librarians are discussed.
PhAST: pharmacophore alignment search tool.

PubMed

Hähnke, Volker; Hofmann, Bettina; Grgat, Tomislav; Proschak, Ewgenij; Steinhilber, Dieter; Schneider, Gisbert

2009-04-15

We present a ligand-based virtual screening technique (PhAST) for rapid hit and lead structure searching in large compound databases. Molecules are represented as strings encoding the distribution of pharmacophoric features on the molecular graph. In contrast to other text-based methods using SMILES strings, we introduce a new form of text representation that describes the pharmacophore of molecules. This string representation opens the opportunity for revealing functional similarity between molecules by sequence alignment techniques in analogy to homology searching in protein or nucleic acid sequence databases. We favorably compared PhAST with other current ligand-based virtual screening methods in a retrospective analysis using the BEDROC metric. In a prospective application, PhAST identified two novel inhibitors of 5-lipoxygenase product formation with minimal experimental effort. This outcome demonstrates the applicability of PhAST to drug discovery projects and provides an innovative concept of sequence-based compound screening with substantial scaffold hopping potential. 2008 Wiley Periodicals, Inc.
Initial development of a computer-aided diagnosis tool for solitary pulmonary nodules

NASA Astrophysics Data System (ADS)

Catarious, David M., Jr.; Baydush, Alan H.; Floyd, Carey E., Jr.

2001-07-01

This paper describes the development of a computer-aided diagnosis (CAD) tool for solitary pulmonary nodules. This CAD tool is built upon physically meaningful features that were selected because of their relevance to shape and texture. These features included a modified version of the Hotelling statistic (HS), a channelized HS, three measures of fractal properties, two measures of spicularity, and three manually measured shape features. These features were measured from a difficult database consisting of 237 regions of interest (ROIs) extracted from digitized chest radiographs. The center of each 256x256 pixel ROI contained a suspicious lesion which was sent to follow-up by a radiologist and whose nature was later clinically determined. Linear discriminant analysis (LDA) was used to search the feature space via sequential forward search using percentage correct as the performance metric. An optimized feature subset, selected for the highest accuracy, was then fed into a three layer artificial neural network (ANN). The ANN's performance was assessed by receiver operating characteristic (ROC) analysis. A leave-one-out testing/training methodology was employed for the ROC analysis. The performance of this system is competitive with that of three radiologists on the same database.
Choosing Assessments that Matter

ERIC Educational Resources Information Center

Abilock, Debbie, Ed.

2007-01-01

Professionally, school librarians are faced with an explosion of choices--search engines, online catalogs, media types, subscription databases, and Web tools--all requiring scrutiny, evaluation, and selection. In turn, this support "stuff" forms a basis for making additional choices about how and what they teach and what they assess. Whereas once…
Informetrics: Exploring Databases as Analytical Tools.

ERIC Educational Resources Information Center

Wormell, Irene

1998-01-01

Advanced online search facilities and information retrieval techniques have increased the potential of bibliometric research. Discusses three case studies carried out by the Centre for Informetric Studies at the Royal School of Library Science (Denmark) on the internationality of international journals, informetric analyses on the World Wide Web,…
BRAD, the genetics and genomics database for Brassica plants.

PubMed

Cheng, Feng; Liu, Shengyi; Wu, Jian; Fang, Lu; Sun, Silong; Liu, Bo; Li, Pingxia; Hua, Wei; Wang, Xiaowu

2011-10-13

Brassica species include both vegetable and oilseed crops, which are very important to the daily life of common human beings. Meanwhile, the Brassica species represent an excellent system for studying numerous aspects of plant biology, specifically for the analysis of genome evolution following polyploidy, so it is also very important for scientific research. Now, the genome of Brassica rapa has already been assembled, it is the time to do deep mining of the genome data. BRAD, the Brassica database, is a web-based resource focusing on genome scale genetic and genomic data for important Brassica crops. BRAD was built based on the first whole genome sequence and on further data analysis of the Brassica A genome species, Brassica rapa (Chiifu-401-42). It provides datasets, such as the complete genome sequence of B. rapa, which was de novo assembled from Illumina GA II short reads and from BAC clone sequences, predicted genes and associated annotations, non coding RNAs, transposable elements (TE), B. rapa genes' orthologous to those in A. thaliana, as well as genetic markers and linkage maps. BRAD offers useful searching and data mining tools, including search across annotation datasets, search for syntenic or non-syntenic orthologs, and to search the flanking regions of a certain target, as well as the tools of BLAST and Gbrowse. BRAD allows users to enter almost any kind of information, such as a B. rapa or A. thaliana gene ID, physical position or genetic marker. BRAD, a new database which focuses on the genetics and genomics of the Brassica plants has been developed, it aims at helping scientists and breeders to fully and efficiently use the information of genome data of Brassica plants. BRAD will be continuously updated and can be accessed through http://brassicadb.org.
[Design of computerised database for clinical and basic management of uveal melanoma].

PubMed

Bande Rodríguez, M F; Santiago Varela, M; Blanco Teijeiro, M J; Mera Yañez, P; Pardo Perez, M; Capeans Tome, C; Piñeiro Ces, A

2012-09-01

The uveal melanoma is the most common primary intraocular tumour in adults. The objective of this work is to show how a computerised database has been formed with specific applications, for clinical and research use, to an extensive group of patients diagnosed with uveal melanoma. For the design of the database a selection of categories, attributes and values was created based on the classifications and parameters given by various authors of articles which have had great relevance in the field of uveal melanoma in recent years. The database has over 250 patient entries with specific information on their clinical history, diagnosis, treatment and progress. It enables us to search any parameter of the entry and make quick and simple statistical studies of them. The database models have been transformed into a basic tool for clinical practice, as they are an efficient way of storing, compiling and selective searching of information. When creating a database it is very important to define a common strategy and the use of a standard language. Copyright © 2011 Sociedad Española de Oftalmología. Published by Elsevier Espana. All rights reserved.
Fish Karyome: A karyological information network database of Indian Fishes.

PubMed

Nagpure, Naresh Sahebrao; Pathak, Ajey Kumar; Pati, Rameshwar; Singh, Shri Prakash; Singh, Mahender; Sarkar, Uttam Kumar; Kushwaha, Basdeo; Kumar, Ravindra

2012-01-01

'Fish Karyome', a database on karyological information of Indian fishes have been developed that serves as central source for karyotype data about Indian fishes compiled from the published literature. Fish Karyome has been intended to serve as a liaison tool for the researchers and contains karyological information about 171 out of 2438 finfish species reported in India and is publically available via World Wide Web. The database provides information on chromosome number, morphology, sex chromosomes, karyotype formula and cytogenetic markers etc. Additionally, it also provides the phenotypic information that includes species name, its classification, and locality of sample collection, common name, local name, sex, geographical distribution, and IUCN Red list status. Besides, fish and karyotype images, references for 171 finfish species have been included in the database. Fish Karyome has been developed using SQL Server 2008, a relational database management system, Microsoft's ASP.NET-2008 and Macromedia's FLASH Technology under Windows 7 operating environment. The system also enables users to input new information and images into the database, search and view the information and images of interest using various search options. Fish Karyome has wide range of applications in species characterization and identification, sex determination, chromosomal mapping, karyo-evolution and systematics of fishes.
Systematic review of the properties of tools used to measure outcomes in anxiety intervention studies for children with autism spectrum disorders.

PubMed

Wigham, Sarah; McConachie, Helen

2014-01-01

Evidence about relevant outcomes is required in the evaluation of clinical interventions for children with autism spectrum disorders (ASD). However, to date, the variety of outcome measurement tools being used, and lack of knowledge about the measurement properties of some, compromise conclusions regarding the most effective interventions. This two-stage systematic review aimed to identify the tools used in studies evaluating interventions for anxiety for high-functioning children with ASD in middle childhood, and then to evaluate the tools for their appropriateness and measurement properties. Electronic databases including Medline, PsychInfo, Embase, and the Cochrane database and registers were searched for anxiety intervention studies for children with ASD in middle childhood. Articles examining the measurement properties of the tools used were then searched for using a methodological filter in PubMed, and the quality of the papers evaluated using the COSMIN checklist. Ten intervention studies were identified in which six tools measuring anxiety and one of overall symptom change were used as primary outcomes. One further tool was included as it is recommended for standard use in UK children's mental health services. Sixty three articles on the properties of the tools were evaluated for the quality of evidence, and the quality of the measurement properties of each tool was summarised. Overall three questionnaires were found robust in their measurement properties, the Spence Children's Anxiety Scale, its revised version - the Revised Children's Anxiety and Depression Scale, and also the Screen for Child Anxiety Related Emotional Disorders. Crucially the articles on measurement properties provided almost no evidence on responsiveness to change, nor on the validity of use of the tools for evaluation of interventions for children with ASD. CRD42012002684.
Tools used to assess medical students competence in procedural skills at the end of a primary medical degree: a systematic review.

PubMed

Morris, Marie C; Gallagher, Tom K; Ridgway, Paul F

2012-01-01

The objective was to systematically review the literature to identify and grade tools used for the end point assessment of procedural skills (e.g., phlebotomy, IV cannulation, suturing) competence in medical students prior to certification. The authors searched eight bibliographic databases electronically - ERIC, Medline, CINAHL, EMBASE, Psychinfo, PsychLIT, EBM Reviews and the Cochrane databases. Two reviewers independently reviewed the literature to identify procedural assessment tools used specifically for assessing medical students within the PRISMA framework, the inclusion/exclusion criteria and search period. Papers on OSATS and DOPS were excluded as they focused on post-registration assessment and clinical rather than simulated competence. Of 659 abstracted articles 56 identified procedural assessment tools. Only 11 specifically assessed medical students. The final 11 studies consisted of 1 randomised controlled trial, 4 comparative and 6 descriptive studies yielding 12 heterogeneous procedural assessment tools for analysis. Seven tools addressed four discrete pre-certification skills, basic suture (3), airway management (2), nasogastric tube insertion (1) and intravenous cannulation (1). One tool used a generic assessment of procedural skills. Two tools focused on postgraduate laparoscopic skills and one on osteopathic students and thus were not included in this review. The levels of evidence are low with regard to reliability - κ = 0.65-0.71 and minimum validity is achieved - face and content. In conclusion, there are no tools designed specifically to assess competence of procedural skills in a final certification examination. There is a need to develop standardised tools with proven reliability and validity for assessment of procedural skills competence at the end of medical training. Medicine graduates must have comparable levels of procedural skills acquisition entering the clinical workforce irrespective of the country of training.
Systematic Review of the Properties of Tools Used to Measure Outcomes in Anxiety Intervention Studies for Children with Autism Spectrum Disorders

PubMed Central

Wigham, Sarah; McConachie, Helen

2014-01-01

Background Evidence about relevant outcomes is required in the evaluation of clinical interventions for children with autism spectrum disorders (ASD). However, to date, the variety of outcome measurement tools being used, and lack of knowledge about the measurement properties of some, compromise conclusions regarding the most effective interventions. Objectives This two-stage systematic review aimed to identify the tools used in studies evaluating interventions for anxiety for high-functioning children with ASD in middle childhood, and then to evaluate the tools for their appropriateness and measurement properties. Methods Electronic databases including Medline, PsychInfo, Embase, and the Cochrane database and registers were searched for anxiety intervention studies for children with ASD in middle childhood. Articles examining the measurement properties of the tools used were then searched for using a methodological filter in PubMed, and the quality of the papers evaluated using the COSMIN checklist. Results Ten intervention studies were identified in which six tools measuring anxiety and one of overall symptom change were used as primary outcomes. One further tool was included as it is recommended for standard use in UK children's mental health services. Sixty three articles on the properties of the tools were evaluated for the quality of evidence, and the quality of the measurement properties of each tool was summarised. Conclusions Overall three questionnaires were found robust in their measurement properties, the Spence Children's Anxiety Scale, its revised version – the Revised Children's Anxiety and Depression Scale, and also the Screen for Child Anxiety Related Emotional Disorders. Crucially the articles on measurement properties provided almost no evidence on responsiveness to change, nor on the validity of use of the tools for evaluation of interventions for children with ASD. PROSPERO Registration number CRD42012002684. PMID:24465519
Freva - Freie Univ Evaluation System Framework for Scientific Infrastructures in Earth System Modeling

NASA Astrophysics Data System (ADS)

Kadow, Christopher; Illing, Sebastian; Kunst, Oliver; Schartner, Thomas; Kirchner, Ingo; Rust, Henning W.; Cubasch, Ulrich; Ulbrich, Uwe

2016-04-01

The Freie Univ Evaluation System Framework (Freva - freva.met.fu-berlin.de) is a software infrastructure for standardized data and tool solutions in Earth system science. Freva runs on high performance computers to handle customizable evaluation systems of research projects, institutes or universities. It combines different software technologies into one common hybrid infrastructure, including all features present in the shell and web environment. The database interface satisfies the international standards provided by the Earth System Grid Federation (ESGF). Freva indexes different data projects into one common search environment by storing the meta data information of the self-describing model, reanalysis and observational data sets in a database. This implemented meta data system with its advanced but easy-to-handle search tool supports users, developers and their plugins to retrieve the required information. A generic application programming interface (API) allows scientific developers to connect their analysis tools with the evaluation system independently of the programming language used. Users of the evaluation techniques benefit from the common interface of the evaluation system without any need to understand the different scripting languages. Facilitation of the provision and usage of tools and climate data automatically increases the number of scientists working with the data sets and identifying discrepancies. The integrated web-shell (shellinabox) adds a degree of freedom in the choice of the working environment and can be used as a gate to the research projects HPC. Plugins are able to integrate their e.g. post-processed results into the database of the user. This allows e.g. post-processing plugins to feed statistical analysis plugins, which fosters an active exchange between plugin developers of a research project. Additionally, the history and configuration sub-system stores every analysis performed with the evaluation system in a database. Configurations and results of the tools can be shared among scientists via shell or web system. Therefore, plugged-in tools benefit from transparency and reproducibility. Furthermore, if configurations match while starting an evaluation plugin, the system suggests to use results already produced by other users - saving CPU/h, I/O, disk space and time. The efficient interaction between different technologies improves the Earth system modeling science framed by Freva.
Structure-based characterization of multiprotein complexes.

PubMed

Wiederstein, Markus; Gruber, Markus; Frank, Karl; Melo, Francisco; Sippl, Manfred J

2014-07-08

Multiprotein complexes govern virtually all cellular processes. Their 3D structures provide important clues to their biological roles, especially through structural correlations among protein molecules and complexes. The detection of such correlations generally requires comprehensive searches in databases of known protein structures by means of appropriate structure-matching techniques. Here, we present a high-speed structure search engine capable of instantly matching large protein oligomers against the complete and up-to-date database of biologically functional assemblies of protein molecules. We use this tool to reveal unseen structural correlations on the level of protein quaternary structure and demonstrate its general usefulness for efficiently exploring complex structural relationships among known protein assemblies. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
PubMed and beyond: a survey of web tools for searching biomedical literature

PubMed Central

Lu, Zhiyong

2011-01-01

The past decade has witnessed the modern advances of high-throughput technology and rapid growth of research capacity in producing large-scale biological data, both of which were concomitant with an exponential growth of biomedical literature. This wealth of scholarly knowledge is of significant importance for researchers in making scientific discoveries and healthcare professionals in managing health-related matters. However, the acquisition of such information is becoming increasingly difficult due to its large volume and rapid growth. In response, the National Center for Biotechnology Information (NCBI) is continuously making changes to its PubMed Web service for improvement. Meanwhile, different entities have devoted themselves to developing Web tools for helping users quickly and efficiently search and retrieve relevant publications. These practices, together with maturity in the field of text mining, have led to an increase in the number and quality of various Web tools that provide comparable literature search service to PubMed. In this study, we review 28 such tools, highlight their respective innovations, compare them to the PubMed system and one another, and discuss directions for future development. Furthermore, we have built a website dedicated to tracking existing systems and future advances in the field of biomedical literature search. Taken together, our work serves information seekers in choosing tools for their needs and service providers and developers in keeping current in the field. Database URL: http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/search PMID:21245076

The Yak genome database: an integrative database for studying yak biology and high-altitude adaption

PubMed Central

2012-01-01

Background The yak (Bos grunniens) is a long-haired bovine that lives at high altitudes and is an important source of milk, meat, fiber and fuel. The recent sequencing, assembly and annotation of its genome are expected to further our understanding of the means by which it has adapted to life at high altitudes and its ecologically important traits. Description The Yak Genome Database (YGD) is an internet-based resource that provides access to genomic sequence data and predicted functional information concerning the genes and proteins of Bos grunniens. The curated data stored in the YGD includes genome sequences, predicted genes and associated annotations, non-coding RNA sequences, transposable elements, single nucleotide variants, and three-way whole-genome alignments between human, cattle and yak. YGD offers useful searching and data mining tools, including the ability to search for genes by name or using function keywords as well as GBrowse genome browsers and/or BLAST servers, which can be used to visualize genome regions and identify similar sequences. Sequence data from the YGD can also be downloaded to perform local searches. Conclusions A new yak genome database (YGD) has been developed to facilitate studies on high-altitude adaption and bovine genomics. The database will be continuously updated to incorporate new information such as transcriptome data and population resequencing data. The YGD can be accessed at http://me.lzu.edu.cn/yak. PMID:23134687
Nuclear science abstracts (NSA) database 1948--1974 (on the Internet)

DOE Office of Scientific and Technical Information (OSTI.GOV)

NONE

Nuclear Science Abstracts (NSA) is a comprehensive abstract and index collection of the International Nuclear Science and Technology literature for the period 1948 through 1976. Included are scientific and technical reports of the US Atomic Energy Commission, US Energy Research and Development Administration and its contractors, other agencies, universities, and industrial and research organizations. Coverage of the literature since 1976 is provided by Energy Science and Technology Database. Approximately 25% of the records in the file contain abstracts. These are from the following volumes of the print Nuclear Science Abstracts: Volumes 12--18, Volume 29, and Volume 33. The database containsmore » over 900,000 bibliographic records. All aspects of nuclear science and technology are covered, including: Biomedical Sciences; Metals, Ceramics, and Other Materials; Chemistry; Nuclear Materials and Waste Management; Environmental and Earth Sciences; Particle Accelerators; Engineering; Physics; Fusion Energy; Radiation Effects; Instrumentation; Reactor Technology; Isotope and Radiation Source Technology. The database includes all records contained in Volume 1 (1948) through Volume 33 (1976) of the printed version of Nuclear Science Abstracts (NSA). This worldwide coverage includes books, conference proceedings, papers, patents, dissertations, engineering drawings, and journal literature. This database is now available for searching through the GOV. Research Center (GRC) service. GRC is a single online web-based search service to well known Government databases. Featuring powerful search and retrieval software, GRC is an important research tool. The GRC web site is at http://grc.ntis.gov.« less
POPcorn: An Online Resource Providing Access to Distributed and Diverse Maize Project Data.

PubMed

Cannon, Ethalinda K S; Birkett, Scott M; Braun, Bremen L; Kodavali, Sateesh; Jennewein, Douglas M; Yilmaz, Alper; Antonescu, Valentin; Antonescu, Corina; Harper, Lisa C; Gardiner, Jack M; Schaeffer, Mary L; Campbell, Darwin A; Andorf, Carson M; Andorf, Destri; Lisch, Damon; Koch, Karen E; McCarty, Donald R; Quackenbush, John; Grotewold, Erich; Lushbough, Carol M; Sen, Taner Z; Lawrence, Carolyn J

2011-01-01

The purpose of the online resource presented here, POPcorn (Project Portal for corn), is to enhance accessibility of maize genetic and genomic resources for plant biologists. Currently, many online locations are difficult to find, some are best searched independently, and individual project websites often degrade over time-sometimes disappearing entirely. The POPcorn site makes available (1) a centralized, web-accessible resource to search and browse descriptions of ongoing maize genomics projects, (2) a single, stand-alone tool that uses web Services and minimal data warehousing to search for sequence matches in online resources of diverse offsite projects, and (3) a set of tools that enables researchers to migrate their data to the long-term model organism database for maize genetic and genomic information: MaizeGDB. Examples demonstrating POPcorn's utility are provided herein.
POPcorn: An Online Resource Providing Access to Distributed and Diverse Maize Project Data

PubMed Central

Cannon, Ethalinda K. S.; Birkett, Scott M.; Braun, Bremen L.; Kodavali, Sateesh; Jennewein, Douglas M.; Yilmaz, Alper; Antonescu, Valentin; Antonescu, Corina; Harper, Lisa C.; Gardiner, Jack M.; Schaeffer, Mary L.; Campbell, Darwin A.; Andorf, Carson M.; Andorf, Destri; Lisch, Damon; Koch, Karen E.; McCarty, Donald R.; Quackenbush, John; Grotewold, Erich; Lushbough, Carol M.; Sen, Taner Z.; Lawrence, Carolyn J.

2011-01-01

The purpose of the online resource presented here, POPcorn (Project Portal for corn), is to enhance accessibility of maize genetic and genomic resources for plant biologists. Currently, many online locations are difficult to find, some are best searched independently, and individual project websites often degrade over time—sometimes disappearing entirely. The POPcorn site makes available (1) a centralized, web-accessible resource to search and browse descriptions of ongoing maize genomics projects, (2) a single, stand-alone tool that uses web Services and minimal data warehousing to search for sequence matches in online resources of diverse offsite projects, and (3) a set of tools that enables researchers to migrate their data to the long-term model organism database for maize genetic and genomic information: MaizeGDB. Examples demonstrating POPcorn's utility are provided herein. PMID:22253616
orthoFind Facilitates the Discovery of Homologous and Orthologous Proteins.

PubMed

Mier, Pablo; Andrade-Navarro, Miguel A; Pérez-Pulido, Antonio J

2015-01-01

Finding homologous and orthologous protein sequences is often the first step in evolutionary studies, annotation projects, and experiments of functional complementation. Despite all currently available computational tools, there is a requirement for easy-to-use tools that provide functional information. Here, a new web application called orthoFind is presented, which allows a quick search for homologous and orthologous proteins given one or more query sequences, allowing a recurrent and exhaustive search against reference proteomes, and being able to include user databases. It addresses the protein multidomain problem, searching for homologs with the same domain architecture, and gives a simple functional analysis of the results to help in the annotation process. orthoFind is easy to use and has been proven to provide accurate results with different datasets. Availability: http://www.bioinfocabd.upo.es/orthofind/.
Database resources of the National Center for Biotechnology Information

PubMed Central

Wheeler, David L.; Barrett, Tanya; Benson, Dennis A.; Bryant, Stephen H.; Canese, Kathi; Chetvernin, Vyacheslav; Church, Deanna M.; DiCuccio, Michael; Edgar, Ron; Federhen, Scott; Feolo, Michael; Geer, Lewis Y.; Helmberg, Wolfgang; Kapustin, Yuri; Khovayko, Oleg; Landsman, David; Lipman, David J.; Madden, Thomas L.; Maglott, Donna R.; Miller, Vadim; Ostell, James; Pruitt, Kim D.; Schuler, Gregory D.; Shumway, Martin; Sequeira, Edwin; Sherry, Steven T.; Sirotkin, Karl; Souvorov, Alexandre; Starchenko, Grigory; Tatusov, Roman L.; Tatusova, Tatiana A.; Wagner, Lukas; Yaschenko, Eugene

2008-01-01

In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data available through NCBI's web site. NCBI resources include Entrez, the Entrez Programming Utilities, My NCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link, Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genome, Genome Project and related tools, the Trace, Assembly, and Short Read Archives, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups, Influenza Viral Resources, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Entrez Probe, GENSAT, Database of Genotype and Phenotype, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool and the PubChem suite of small molecule databases. Augmenting the web applications are custom implementations of the BLAST program optimized to search specialized data sets. These resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:18045790
Analysing patent landscapes in plant biotechnology and new plant breeding techniques.

PubMed

Parisi, Claudia; Rodríguez-Cerezo, Emilio; Thangaraj, Harry

2013-02-01

This article aims to inform the reader of the importance of searching patent landscapes in plant biotechnology and the use of basic tools to perform a patent search. The recommendations for a patent search strategy are illustrated with the specific example of zinc finger nuclease technology for genetic engineering in plants. Within this scope, we provide a general introduction to searching using two online and free-access patent databases esp@cenet and PatentScope. The essential features of the two databases, and their functionality is described, together with short descriptions to enable the reader to understand patents, searching, their content, patent families, and their territorial scope. We mostly stress the value of patent searching for mining scientific, rather than legal information. Search methods through the use of keywords and patent codes are elucidated together with suggestions about how to search with or combine codes with keywords and we also comment on limitations of each method. We stress the importance of patent literature to complement more mainstream scientific literature, and the relative complexities and difficulties in searching patents compared to the latter. A parallel online resource where we describe detailed search exercises is available through reference for those intending further exploration. In essence this is aimed at a novice patent searcher who may want to examine accessory patent literature to complement knowledge gained from mainstream journal resources.
Physical activity in advanced cancer patients: a systematic review protocol.

PubMed

Lowe, Sonya S; Tan, Maria; Faily, Joan; Watanabe, Sharon M; Courneya, Kerry S

2016-03-11

Progressive, incurable cancer is associated with increased fatigue, increased muscle weakness, and reduced physical functioning, all of which negatively impact quality of life. Physical activity has demonstrated benefits on cancer-related fatigue and physical functioning in early-stage cancer patients; however, its impact on these outcomes in end-stage cancer has not been established. The aim of this systematic review is to determine the potential benefits, harms, and effects of physical activity interventions on quality of life outcomes in advanced cancer patients. A systematic review of peer-reviewed literature on physical activity in advanced cancer patients will be undertaken. Empirical quantitative studies will be considered for inclusion if they present interventional or observational data on physical activity in advanced cancer patients. Searches will be conducted in the following electronic databases: CINAHL; CIRRIE Database of International Rehabilitation Research; Cochrane Database of Systematic Reviews (CDSR); Database of Abstracts of Reviews of Effects (DARE); Cochrane Central Register of Controlled Trials (CENTRAL); EMBASE; MEDLINE; PEDro: the Physiotherapy Evidence Database; PQDT; PsycInfo; PubMed; REHABDATA; Scopus; SPORTDiscus; and Web of Science, to identify relevant studies of interest. Additional strategies to identify relevant studies will include citation searches and evaluation of reference lists of included articles. Titles, abstracts, and keywords of identified studies from the search strategies will be screened for inclusion criteria. Two independent reviewers will conduct quality appraisal using the Effective Public Health Practice Project Quality Assessment Tool for Quantitative Studies (EPHPP) and the Cochrane risk of bias tool. A descriptive summary of included studies will describe the study designs, participant and activity characteristics, and objective and patient-reported outcomes. This systematic review will summarize the current evidence base on physical activity interventions in advanced cancer patients. The findings from this systematic review will identify gaps to be explored by future research studies and inform future practice guideline development of physical activity interventions in advanced cancer patients. PROSPERO CRD42015026281.
Examining Student Research Choices and Processes in a Disintermediated Searching Environment

ERIC Educational Resources Information Center

Rempel, Hannah Gascho; Buck, Stefanie; Deitering, Anne-Marie

2013-01-01

Students today perform research in a disintermediated environment, which often allows them to struggle directly with the process of selecting research tools and choosing scholarly sources. The authors conducted a qualitative study with twenty students, using structured observations to ascertain the processes students use to select databases and…
Dice and DNA

ERIC Educational Resources Information Center

Wernersson, Rasmus

2007-01-01

An important part of teaching students how to use the BLAST tool for searching large sequence databases, is to train the students to think critically about the quality of the sequence hits found--both in terms of the statistical significance and how informative the individual hits are. This paper describes how generating truly random sequences by…
SABRE--A Novel Software Tool for Bibliographic Post-Processing.

ERIC Educational Resources Information Center

Burge, Cecil D.

1989-01-01

Describes the software architecture and application of SABRE (Semi-Automated Bibliographic Environment), which is one of the first products to provide a semi-automatic environment for relevancy ranking of citations obtained from searches of bibliographic databases. Features designed to meet the review, categorization, culling, and reporting needs…
First Toronto Conference on Database Users. Systems that Enhance User Performance.

ERIC Educational Resources Information Center

Doszkocs, Tamas E.; Toliver, David

1987-01-01

The first of two papers discusses natural language searching as a user performance enhancement tool, focusing on artificial intelligence applications for information retrieval and problems with natural language processing. The second presents a conceptual framework for further development and future design of front ends to online bibliographic…
Specialized microbial databases for inductive exploration of microbial genome sequences

PubMed Central

Fang, Gang; Ho, Christine; Qiu, Yaowu; Cubas, Virginie; Yu, Zhou; Cabau, Cédric; Cheung, Frankie; Moszer, Ivan; Danchin, Antoine

2005-01-01

Background The enormous amount of genome sequence data asks for user-oriented databases to manage sequences and annotations. Queries must include search tools permitting function identification through exploration of related objects. Methods The GenoList package for collecting and mining microbial genome databases has been rewritten using MySQL as the database management system. Functions that were not available in MySQL, such as nested subquery, have been implemented. Results Inductive reasoning in the study of genomes starts from "islands of knowledge", centered around genes with some known background. With this concept of "neighborhood" in mind, a modified version of the GenoList structure has been used for organizing sequence data from prokaryotic genomes of particular interest in China. GenoChore , a set of 17 specialized end-user-oriented microbial databases (including one instance of Microsporidia, Encephalitozoon cuniculi, a member of Eukarya) has been made publicly available. These databases allow the user to browse genome sequence and annotation data using standard queries. In addition they provide a weekly update of searches against the world-wide protein sequences data libraries, allowing one to monitor annotation updates on genes of interest. Finally, they allow users to search for patterns in DNA or protein sequences, taking into account a clustering of genes into formal operons, as well as providing extra facilities to query sequences using predefined sequence patterns. Conclusion This growing set of specialized microbial databases organize data created by the first Chinese bacterial genome programs (ThermaList, Thermoanaerobacter tencongensis, LeptoList, with two different genomes of Leptospira interrogans and SepiList, Staphylococcus epidermidis) associated to related organisms for comparison. PMID:15698474
Computer applications making rapid advances in high throughput microbial proteomics (HTMP).

PubMed

Anandkumar, Balakrishna; Haga, Steve W; Wu, Hui-Fen

2014-02-01

The last few decades have seen the rise of widely-available proteomics tools. From new data acquisition devices, such as MALDI-MS and 2DE to new database searching softwares, these new products have paved the way for high throughput microbial proteomics (HTMP). These tools are enabling researchers to gain new insights into microbial metabolism, and are opening up new areas of study, such as protein-protein interactions (interactomics) discovery. Computer software is a key part of these emerging fields. This current review considers: 1) software tools for identifying the proteome, such as MASCOT or PDQuest, 2) online databases of proteomes, such as SWISS-PROT, Proteome Web, or the Proteomics Facility of the Pathogen Functional Genomics Resource Center, and 3) software tools for applying proteomic data, such as PSI-BLAST or VESPA. These tools allow for research in network biology, protein identification, functional annotation, target identification/validation, protein expression, protein structural analysis, metabolic pathway engineering and drug discovery.
HONselect: multilingual assistant search engine operated by a concept-based interface system to decentralized heterogeneous sources.

PubMed

Boyer, C; Baujard, V; Scherrer, J R

2001-01-01

Any new user to the Internet will think that to retrieve the relevant document is an easy task especially with the wealth of sources available on this medium, but this is not the case. Even experienced users have difficulty formulating the right query for making the most of a search tool in order to efficiently obtain an accurate result. The goal of this work is to reduce the time and the energy necessary in searching and locating medical and health information. To reach this goal we have developed HONselect [1]. The aim of HONselect is not only to improve efficiency in retrieving documents but to respond to an increased need for obtaining a selection of relevant and accurate documents from a breadth of various knowledge databases including scientific bibliographical references, clinical trials, daily news, multimedia illustrations, conferences, forum, Web sites, clinical cases, and others. The authors based their approach on the knowledge representation using the National Library of Medicine's Medical Subject Headings (NLM, MeSH) vocabulary and classification [2,3]. The innovation is to propose a multilingual "one-stop searching" (one Web interface to databases currently in English, French and German) with full navigational and connectivity capabilities. The user may choose from a given selection of related terms the one that best suit his search, navigate in the term's hierarchical tree, and access directly to a selection of documents from high quality knowledge suppliers such as the MEDLINE database, the NLM's ClinicalTrials.gov server, the NewsPage's daily news, the HON's media gallery, conference listings and MedHunt's Web sites [4, 5, 6, 7, 8, 9]. HONselect, developed by HON, a non-profit organisation [10], is a free online available multilingual tool based on the MeSH thesaurus to index, select, retrieve and display accurate, up to date, high-level and quality documents.
Creating a FIESTA (Framework for Integrated Earth Science and Technology Applications) with MagIC

NASA Astrophysics Data System (ADS)

Minnett, R.; Koppers, A. A. P.; Jarboe, N.; Tauxe, L.; Constable, C.

2017-12-01

The Magnetics Information Consortium (https://earthref.org/MagIC) has recently developed a containerized web application to considerably reduce the friction in contributing, exploring and combining valuable and complex datasets for the paleo-, geo- and rock magnetic scientific community. The data produced in this scientific domain are inherently hierarchical and the communities evolving approaches to this scientific workflow, from sampling to taking measurements to multiple levels of interpretations, require a large and flexible data model to adequately annotate the results and ensure reproducibility. Historically, contributing such detail in a consistent format has been prohibitively time consuming and often resulted in only publishing the highly derived interpretations. The new open-source (https://github.com/earthref/MagIC) application provides a flexible upload tool integrated with the data model to easily create a validated contribution and a powerful search interface for discovering datasets and combining them to enable transformative science. MagIC is hosted at EarthRef.org along with several interdisciplinary geoscience databases. A FIESTA (Framework for Integrated Earth Science and Technology Applications) is being created by generalizing MagIC's web application for reuse in other domains. The application relies on a single configuration document that describes the routing, data model, component settings and external services integrations. The container hosts an isomorphic Meteor JavaScript application, MongoDB database and ElasticSearch search engine. Multiple containers can be configured as microservices to serve portions of the application or rely on externally hosted MongoDB, ElasticSearch, or third-party services to efficiently scale computational demands. FIESTA is particularly well suited for many Earth Science disciplines with its flexible data model, mapping, account management, upload tool to private workspaces, reference metadata, image galleries, full text searches and detailed filters. EarthRef's Seamount Catalog of bathymetry and morphology data, EarthRef's Geochemical Earth Reference Model (GERM) databases, and Oregon State University's Marine and Geology Repository (http://osu-mgr.org) will benefit from custom adaptations of FIESTA.
New glycoproteomics software, GlycoPep Evaluator, generates decoy glycopeptides de novo and enables accurate false discovery rate analysis for small data sets.

PubMed

Zhu, Zhikai; Su, Xiaomeng; Go, Eden P; Desaire, Heather

2014-09-16

Glycoproteins are biologically significant large molecules that participate in numerous cellular activities. In order to obtain site-specific protein glycosylation information, intact glycopeptides, with the glycan attached to the peptide sequence, are characterized by tandem mass spectrometry (MS/MS) methods such as collision-induced dissociation (CID) and electron transfer dissociation (ETD). While several emerging automated tools are developed, no consensus is present in the field about the best way to determine the reliability of the tools and/or provide the false discovery rate (FDR). A common approach to calculate FDRs for glycopeptide analysis, adopted from the target-decoy strategy in proteomics, employs a decoy database that is created based on the target protein sequence database. Nonetheless, this approach is not optimal in measuring the confidence of N-linked glycopeptide matches, because the glycopeptide data set is considerably smaller compared to that of peptides, and the requirement of a consensus sequence for N-glycosylation further limits the number of possible decoy glycopeptides tested in a database search. To address the need to accurately determine FDRs for automated glycopeptide assignments, we developed GlycoPep Evaluator (GPE), a tool that helps to measure FDRs in identifying glycopeptides without using a decoy database. GPE generates decoy glycopeptides de novo for every target glycopeptide, in a 1:20 target-to-decoy ratio. The decoys, along with target glycopeptides, are scored against the ETD data, from which FDRs can be calculated accurately based on the number of decoy matches and the ratio of the number of targets to decoys, for small data sets. GPE is freely accessible for download and can work with any search engine that interprets ETD data of N-linked glycopeptides. The software is provided at https://desairegroup.ku.edu/research.
Conformational flexibility of two RNA trimers explored by computational tools and database search.

PubMed

Fadrná, Eva; Koca, Jaroslav

2003-04-01

Two RNA sequences, AAA and AUG, were studied by the conformational search program CICADA and by molecular dynamics (MD) in the framework of the AMBER force field, and also via thorough PDB database search. CICADA was used to provide detailed information about conformers and conformational interconversions on the energy surfaces of the above molecules. Several conformational families were found for both sequences. Analysis of the results shows differences, especially between the energy of the single families, and also in flexibility and concerted conformational movement. Therefore, several MD trajectories (altogether 16 ns) were run to obtain more details about both the stability of conformers belonging to different conformational families and about the dynamics of the two systems. Results show that the trajectories strongly depend on the starting structure. When the MD start from the global minimum found by CICADA, they provide a stable run, while MD starting from another conformational family generates a trajectory where several different conformational families are visited. The results obtained by theoretical methods are compared with the thorough database search data. It is concluded that all except for the highest energy conformational families found in theoretical result also appear in experimental data. Registry numbers: adenylyl-(3' --> 5')-adenylyl-(3' --> 5')-adenosine [917-44-2] adenylyl-(3' --> 5')-uridylyl-(3' --> 5')-guanosine [3494-35-7].
A neotropical Miocene pollen database employing image-based search and semantic modeling1

PubMed Central

Han, Jing Ginger; Cao, Hongfei; Barb, Adrian; Punyasena, Surangi W.; Jaramillo, Carlos; Shyu, Chi-Ren

2014-01-01

• Premise of the study: Digital microscopic pollen images are being generated with increasing speed and volume, producing opportunities to develop new computational methods that increase the consistency and efficiency of pollen analysis and provide the palynological community a computational framework for information sharing and knowledge transfer. • Methods: Mathematical methods were used to assign trait semantics (abstract morphological representations) of the images of neotropical Miocene pollen and spores. Advanced database-indexing structures were built to compare and retrieve similar images based on their visual content. A Web-based system was developed to provide novel tools for automatic trait semantic annotation and image retrieval by trait semantics and visual content. • Results: Mathematical models that map visual features to trait semantics can be used to annotate images with morphology semantics and to search image databases with improved reliability and productivity. Images can also be searched by visual content, providing users with customized emphases on traits such as color, shape, and texture. • Discussion: Content- and semantic-based image searches provide a powerful computational platform for pollen and spore identification. The infrastructure outlined provides a framework for building a community-wide palynological resource, streamlining the process of manual identification, analysis, and species discovery. PMID:25202648
Biosequence Similarity Search on the Mercury System

PubMed Central

Krishnamurthy, Praveen; Buhler, Jeremy; Chamberlain, Roger; Franklin, Mark; Gyang, Kwame; Jacob, Arpith; Lancaster, Joseph

2007-01-01

Biosequence similarity search is an important application in modern molecular biology. Search algorithms aim to identify sets of sequences whose extensional similarity suggests a common evolutionary origin or function. The most widely used similarity search tool for biosequences is BLAST, a program designed to compare query sequences to a database. Here, we present the design of BLASTN, the version of BLAST that searches DNA sequences, on the Mercury system, an architecture that supports high-volume, high-throughput data movement off a data store and into reconfigurable hardware. An important component of application deployment on the Mercury system is the functional decomposition of the application onto both the reconfigurable hardware and the traditional processor. Both the Mercury BLASTN application design and its performance analysis are described. PMID:18846267

dbHiMo: a web-based epigenomics platform for histone-modifying enzymes

PubMed Central

Choi, Jaeyoung; Kim, Ki-Tae; Huh, Aram; Kwon, Seomun; Hong, Changyoung; Asiegbu, Fred O.; Jeon, Junhyun; Lee, Yong-Hwan

2015-01-01

Over the past two decades, epigenetics has evolved into a key concept for understanding regulation of gene expression. Among many epigenetic mechanisms, covalent modifications such as acetylation and methylation of lysine residues on core histones emerged as a major mechanism in epigenetic regulation. Here, we present the database for histone-modifying enzymes (dbHiMo; http://hme.riceblast.snu.ac.kr/) aimed at facilitating functional and comparative analysis of histone-modifying enzymes (HMEs). HMEs were identified by applying a search pipeline built upon profile hidden Markov model (HMM) to proteomes. The database incorporates 11 576 HMEs identified from 603 proteomes including 483 fungal, 32 plants and 51 metazoan species. The dbHiMo provides users with web-based personalized data browsing and analysis tools, supporting comparative and evolutionary genomics. With comprehensive data entries and associated web-based tools, our database will be a valuable resource for future epigenetics/epigenomics studies. Database URL: http://hme.riceblast.snu.ac.kr/ PMID:26055100
Chemical Space: Big Data Challenge for Molecular Diversity.

PubMed

Awale, Mahendra; Visini, Ricardo; Probst, Daniel; Arús-Pous, Josep; Reymond, Jean-Louis

2017-10-25

Chemical space describes all possible molecules as well as multi-dimensional conceptual spaces representing the structural diversity of these molecules. Part of this chemical space is available in public databases ranging from thousands to billions of compounds. Exploiting these databases for drug discovery represents a typical big data problem limited by computational power, data storage and data access capacity. Here we review recent developments of our laboratory, including progress in the chemical universe databases (GDB) and the fragment subset FDB-17, tools for ligand-based virtual screening by nearest neighbor searches, such as our multi-fingerprint browser for the ZINC database to select purchasable screening compounds, and their application to discover potent and selective inhibitors for calcium channel TRPV6 and Aurora A kinase, the polypharmacology browser (PPB) for predicting off-target effects, and finally interactive 3D-chemical space visualization using our online tools WebDrugCS and WebMolCS. All resources described in this paper are available for public use at www.gdb.unibe.ch.
BioMart Central Portal: an open database network for the biological community.

PubMed

Guberman, Jonathan M; Ai, J; Arnaiz, O; Baran, Joachim; Blake, Andrew; Baldock, Richard; Chelala, Claude; Croft, David; Cros, Anthony; Cutts, Rosalind J; Di Génova, A; Forbes, Simon; Fujisawa, T; Gadaleta, E; Goodstein, D M; Gundem, Gunes; Haggarty, Bernard; Haider, Syed; Hall, Matthew; Harris, Todd; Haw, Robin; Hu, S; Hubbard, Simon; Hsu, Jack; Iyer, Vivek; Jones, Philip; Katayama, Toshiaki; Kinsella, R; Kong, Lei; Lawson, Daniel; Liang, Yong; Lopez-Bigas, Nuria; Luo, J; Lush, Michael; Mason, Jeremy; Moreews, Francois; Ndegwa, Nelson; Oakley, Darren; Perez-Llamas, Christian; Primig, Michael; Rivkin, Elena; Rosanoff, S; Shepherd, Rebecca; Simon, Reinhard; Skarnes, B; Smedley, Damian; Sperling, Linda; Spooner, William; Stevenson, Peter; Stone, Kevin; Teague, J; Wang, Jun; Wang, Jianxin; Whitty, Brett; Wong, D T; Wong-Erasmus, Marie; Yao, L; Youens-Clark, Ken; Yung, Christina; Zhang, Junjun; Kasprzyk, Arek

2011-01-01

BioMart Central Portal is a first of its kind, community-driven effort to provide unified access to dozens of biological databases spanning genomics, proteomics, model organisms, cancer data, ontology information and more. Anybody can contribute an independently maintained resource to the Central Portal, allowing it to be exposed to and shared with the research community, and linking it with the other resources in the portal. Users can take advantage of the common interface to quickly utilize different sources without learning a new system for each. The system also simplifies cross-database searches that might otherwise require several complicated steps. Several integrated tools streamline common tasks, such as converting between ID formats and retrieving sequences. The combination of a wide variety of databases, an easy-to-use interface, robust programmatic access and the array of tools make Central Portal a one-stop shop for biological data querying. Here, we describe the structure of Central Portal and show example queries to demonstrate its capabilities.
IntegromeDB: an integrated system and biological search engine

PubMed Central

2012-01-01

Background With the growth of biological data in volume and heterogeneity, web search engines become key tools for researchers. However, general-purpose search engines are not specialized for the search of biological data. Description Here, we present an approach at developing a biological web search engine based on the Semantic Web technologies and demonstrate its implementation for retrieving gene- and protein-centered knowledge. The engine is available at http://www.integromedb.org. Conclusions The IntegromeDB search engine allows scanning data on gene regulation, gene expression, protein-protein interactions, pathways, metagenomics, mutations, diseases, and other gene- and protein-related data that are automatically retrieved from publicly available databases and web pages using biological ontologies. To perfect the resource design and usability, we welcome and encourage community feedback. PMID:22260095
Textpresso Central: a customizable platform for searching, text mining, viewing, and curating biomedical literature.

PubMed

Müller, H-M; Van Auken, K M; Li, Y; Sternberg, P W

2018-03-09

The biomedical literature continues to grow at a rapid pace, making the challenge of knowledge retrieval and extraction ever greater. Tools that provide a means to search and mine the full text of literature thus represent an important way by which the efficiency of these processes can be improved. We describe the next generation of the Textpresso information retrieval system, Textpresso Central (TPC). TPC builds on the strengths of the original system by expanding the full text corpus to include the PubMed Central Open Access Subset (PMC OA), as well as the WormBase C. elegans bibliography. In addition, TPC allows users to create a customized corpus by uploading and processing documents of their choosing. TPC is UIMA compliant, to facilitate compatibility with external processing modules, and takes advantage of Lucene indexing and search technology for efficient handling of millions of full text documents. Like Textpresso, TPC searches can be performed using keywords and/or categories (semantically related groups of terms), but to provide better context for interpreting and validating queries, search results may now be viewed as highlighted passages in the context of full text. To facilitate biocuration efforts, TPC also allows users to select text spans from the full text and annotate them, create customized curation forms for any data type, and send resulting annotations to external curation databases. As an example of such a curation form, we describe integration of TPC with the Noctua curation tool developed by the Gene Ontology (GO) Consortium. Textpresso Central is an online literature search and curation platform that enables biocurators and biomedical researchers to search and mine the full text of literature by integrating keyword and category searches with viewing search results in the context of the full text. It also allows users to create customized curation interfaces, use those interfaces to make annotations linked to supporting evidence statements, and then send those annotations to any database in the world. Textpresso Central URL: http://www.textpresso.org/tpc.
Astronomical database and VO-tools of Nikolaev Astronomical Observatory

NASA Astrophysics Data System (ADS)

Mazhaev, A. E.; Protsyuk, Yu. I.

2010-05-01

Results of work in 2006-2009 on creation of astronomical databases aiming at development of Nikolaev Virtual Observatory (NVO) are presented in this abstract. Results of observations and theirreduction, which were obtained during the whole history of Nikolaev Astronomical Observatory (NAO), are included in the databases. The databases may be considered as a basis for construction of a data centre. Images of different regions of the celestial sphere have been stored in NAO since 1929. About 8000 photo plates were obtained during observations in the 20th century. Observations with CCD have been started since 1996. Annually, telescopes of NAO, using CCD cameras, create data volume of several tens of gigabytes (GB) in the form of CCD images and up to 100 GB of video records. At the end of 2008, the volume of accumulated data in the form of CCD images was about 300 GB. Problems of data volume growth are common in astronomy, nuclear physics and bioinformatics. Therefore, the astronomical community needs to use archives, databases and distributed grid computing to cope with this problem in astronomy. The International Virtual Observatory Alliance (IVOA) was formed in June 2002 with a mission to "enable the international utilization of astronomical archives..." The NVO was created at the NAO website in 2008, and consists of three main parts. The first part contains 27 astrometric stellar catalogues with short descriptions. The files of catalogues were compiled in the standard VOTable format using eXtensible Markup Language (XML), and they are available for downloading. This is an example of the so-called science-ready product. The VOTable format was developed by the International Virtual Observatory Alliance (IVOA) for exchange of tabular data. A user may download these catalogues and open them using any standalone application that supports standards of the IVOA. There are several directions of development for such applications, for example, search of catalogues and images, search and visualisation of spectra, spectral energy distribution (SED) building, search of cross-correlation between objects in different catalogues, statistical data processing of large data volumes etc. The second part includes database of observations, accumulated in NAO, with access via a browser. The database has a common interface for searching of textual and graphical information concerning photographic and CCD observations. The database contains: textual information about 7437 plates as well as 2700 preview images in JPEG format with resolution of 300 DPI (dots per inch); textual information about 16660 CCD frames as well as 1100 preview images in JPEG format. Absent preview images will be added to the database as soon as they will be ready after plates scanning and CCD frames processing. The user has to define the equatorial coordinates of search centre, a search radius and a period of observations. Then he or she may also specify additional filters, such as: any combination of objects given separately for plates and CCD frames, output parameters for plates, telescope names for CCD observations. Results of search are generated in the form of two tables for photographic and CCD observations. To obtain access to the source images in FITS format with support of World Coordinate System (WCS), the user has to fill and submit electronic form given after the tables. The third part includes database of observations with access via a standalone application such as Aladin, which has been developed by Strasbourg Astronomical Data Centre. To obtain access to the database, the user has to perform a series of simple actions, which are described on a corresponding site page. Then he or she may get access to the database via a server selector of Aladin, which has a menu with wide range of image and catalogue servers located world wide, including two menu items for photographic and CCD observations of a NVO image server. The user has to define the equatorial coordinates of search centre and a search radius. The search results are outputted into a main window of Aladin in textual and graphical forms using XML and Simple Object Access Protocol (SOAP). In this way, the NVO image server is integrated with other astronomical servers, using a special configuration file. The user may conveniently request information from many servers using the same server selector of Aladin, although the servers are located in different countries. Aladin has a wide range of special tools for data analysis and handling, including connection with other standalone applications. As a conclusion, we should note that a research team of a data centre, which provides the infrastructure for data output to the internet, is responsible for creation of corresponding archives. Therefore, each observatory or data centre has to provide an access to its archives in accordance with the IVOA standards and a resolution adopted by the IAU XXV General Assembly #B.1, titled: Public Access to Astronomical Archives. A research team of NAO copes successfully with this task and continues to develop the NVO. Using our databases and VO-tools, we also take part in development of the Ukrainian Virtual Observatory (UkrVO). All three main parts of the NVO are used as prototypes for the UkrVO. Informational resources provided by other astronomical institutions from Ukraine will be included in corresponding databases and VO interfaces.
GenoMycDB: a database for comparative analysis of mycobacterial genes and genomes.

PubMed

Catanho, Marcos; Mascarenhas, Daniel; Degrave, Wim; Miranda, Antonio Basílio de

2006-03-31

Several databases and computational tools have been created with the aim of organizing, integrating and analyzing the wealth of information generated by large-scale sequencing projects of mycobacterial genomes and those of other organisms. However, with very few exceptions, these databases and tools do not allow for massive and/or dynamic comparison of these data. GenoMycDB (http://www.dbbm.fiocruz.br/GenoMycDB) is a relational database built for large-scale comparative analyses of completely sequenced mycobacterial genomes, based on their predicted protein content. Its central structure is composed of the results obtained after pair-wise sequence alignments among all the predicted proteins coded by the genomes of six mycobacteria: Mycobacterium tuberculosis (strains H37Rv and CDC1551), M. bovis AF2122/97, M. avium subsp. paratuberculosis K10, M. leprae TN, and M. smegmatis MC2 155. The database stores the computed similarity parameters of every aligned pair, providing for each protein sequence the predicted subcellular localization, the assigned cluster of orthologous groups, the features of the corresponding gene, and links to several important databases. Tables containing pairs or groups of potential homologs between selected species/strains can be produced dynamically by user-defined criteria, based on one or multiple sequence similarity parameters. In addition, searches can be restricted according to the predicted subcellular localization of the protein, the DNA strand of the corresponding gene and/or the description of the protein. Massive data search and/or retrieval are available, and different ways of exporting the result are offered. GenoMycDB provides an on-line resource for the functional classification of mycobacterial proteins as well as for the analysis of genome structure, organization, and evolution.
Hmrbase: a database of hormones and their receptors

PubMed Central

Rashid, Mamoon; Singla, Deepak; Sharma, Arun; Kumar, Manish; Raghava, Gajendra PS

2009-01-01

Background Hormones are signaling molecules that play vital roles in various life processes, like growth and differentiation, physiology, and reproduction. These molecules are mostly secreted by endocrine glands, and transported to target organs through the bloodstream. Deficient, or excessive, levels of hormones are associated with several diseases such as cancer, osteoporosis, diabetes etc. Thus, it is important to collect and compile information about hormones and their receptors. Description This manuscript describes a database called Hmrbase which has been developed for managing information about hormones and their receptors. It is a highly curated database for which information has been collected from the literature and the public databases. The current version of Hmrbase contains comprehensive information about ~2000 hormones, e.g., about their function, source organism, receptors, mature sequences, structures etc. Hmrbase also contains information about ~3000 hormone receptors, in terms of amino acid sequences, subcellular localizations, ligands, and post-translational modifications etc. One of the major features of this database is that it provides data about ~4100 hormone-receptor pairs. A number of online tools have been integrated into the database, to provide the facilities like keyword search, structure-based search, mapping of a given peptide(s) on the hormone/receptor sequence, sequence similarity search. This database also provides a number of external links to other resources/databases in order to help in the retrieving of further related information. Conclusion Owing to the high impact of endocrine research in the biomedical sciences, the Hmrbase could become a leading data portal for researchers. The salient features of Hmrbase are hormone-receptor pair-related information, mapping of peptide stretches on the protein sequences of hormones and receptors, Pfam domain annotations, categorical browsing options, online data submission, DrugPedia linkage etc. Hmrbase is available online for public from . PMID:19589147
COREMIC: a web-tool to search for a niche associated CORE MICrobiome.

PubMed

Rodrigues, Richard R; Rodgers, Nyle C; Wu, Xiaowei; Williams, Mark A

2018-01-01

Microbial diversity on earth is extraordinary, and soils alone harbor thousands of species per gram of soil. Understanding how this diversity is sorted and selected into habitat niches is a major focus of ecology and biotechnology, but remains only vaguely understood. A systems-biology approach was used to mine information from databases to show how it can be used to answer questions related to the core microbiome of habitat-microbe relationships. By making use of the burgeoning growth of information from databases, our tool "COREMIC" meets a great need in the search for understanding niche partitioning and habitat-function relationships. The work is unique, furthermore, because it provides a user-friendly statistically robust web-tool (http://coremic2.appspot.com or http://core-mic.com), developed using Google App Engine, to help in the process of database mining to identify the "core microbiome" associated with a given habitat. A case study is presented using data from 31 switchgrass rhizosphere community habitats across a diverse set of soil and sampling environments. The methodology utilizes an outgroup of 28 non-switchgrass (other grasses and forbs) to identify a core switchgrass microbiome. Even across a diverse set of soils (five environments), and conservative statistical criteria (presence in more than 90% samples and FDR q -val <0.05% for Fisher's exact test) a core set of bacteria associated with switchgrass was observed. These included, among others, closely related taxa from Lysobacter spp., Mesorhizobium spp , and Chitinophagaceae . These bacteria have been shown to have functions related to the production of bacterial and fungal antibiotics and plant growth promotion. COREMIC can be used as a hypothesis generating or confirmatory tool that shows great potential for identifying taxa that may be important to the functioning of a habitat (e.g. host plant). The case study, in conclusion, shows that COREMIC can identify key habitat-specific microbes across diverse samples, using currently available databases and a unique freely available software.
COREMIC: a web-tool to search for a niche associated CORE MICrobiome

PubMed Central

Rodgers, Nyle C.; Wu, Xiaowei; Williams, Mark A.

2018-01-01

Microbial diversity on earth is extraordinary, and soils alone harbor thousands of species per gram of soil. Understanding how this diversity is sorted and selected into habitat niches is a major focus of ecology and biotechnology, but remains only vaguely understood. A systems-biology approach was used to mine information from databases to show how it can be used to answer questions related to the core microbiome of habitat-microbe relationships. By making use of the burgeoning growth of information from databases, our tool “COREMIC” meets a great need in the search for understanding niche partitioning and habitat-function relationships. The work is unique, furthermore, because it provides a user-friendly statistically robust web-tool (http://coremic2.appspot.com or http://core-mic.com), developed using Google App Engine, to help in the process of database mining to identify the “core microbiome” associated with a given habitat. A case study is presented using data from 31 switchgrass rhizosphere community habitats across a diverse set of soil and sampling environments. The methodology utilizes an outgroup of 28 non-switchgrass (other grasses and forbs) to identify a core switchgrass microbiome. Even across a diverse set of soils (five environments), and conservative statistical criteria (presence in more than 90% samples and FDR q-val <0.05% for Fisher’s exact test) a core set of bacteria associated with switchgrass was observed. These included, among others, closely related taxa from Lysobacter spp., Mesorhizobium spp, and Chitinophagaceae. These bacteria have been shown to have functions related to the production of bacterial and fungal antibiotics and plant growth promotion. COREMIC can be used as a hypothesis generating or confirmatory tool that shows great potential for identifying taxa that may be important to the functioning of a habitat (e.g. host plant). The case study, in conclusion, shows that COREMIC can identify key habitat-specific microbes across diverse samples, using currently available databases and a unique freely available software. PMID:29473009
TRENDS: A flight test relational database user's guide and reference manual

NASA Technical Reports Server (NTRS)

Bondi, M. J.; Bjorkman, W. S.; Cross, J. L.

1994-01-01

This report is designed to be a user's guide and reference manual for users intending to access rotocraft test data via TRENDS, the relational database system which was developed as a tool for the aeronautical engineer with no programming background. This report has been written to assist novice and experienced TRENDS users. TRENDS is a complete system for retrieving, searching, and analyzing both numerical and narrative data, and for displaying time history and statistical data in graphical and numerical formats. This manual provides a 'guided tour' and a 'user's guide' for the new and intermediate-skilled users. Examples for the use of each menu item within TRENDS is provided in the Menu Reference section of the manual, including full coverage for TIMEHIST, one of the key tools. This manual is written around the XV-15 Tilt Rotor database, but does include an appendix on the UH-60 Blackhawk database. This user's guide and reference manual establishes a referrable source for the research community and augments NASA TM-101025, TRENDS: The Aeronautical Post-Test, Database Management System, Jan. 1990, written by the same authors.
Policy implications for familial searching

PubMed Central

2011-01-01

In the United States, several states have made policy decisions regarding whether and how to use familial searching of the Combined DNA Index System (CODIS) database in criminal investigations. Familial searching pushes DNA typing beyond merely identifying individuals to detecting genetic relatedness, an application previously reserved for missing persons identifications and custody battles. The intentional search of CODIS for partial matches to an item of evidence offers law enforcement agencies a powerful tool for developing investigative leads, apprehending criminals, revitalizing cold cases and exonerating wrongfully convicted individuals. As familial searching involves a range of logistical, social, ethical and legal considerations, states are now grappling with policy options for implementing familial searching to balance crime fighting with its potential impact on society. When developing policies for familial searching, legislators should take into account the impact of familial searching on select populations and the need to minimize personal intrusion on relatives of individuals in the DNA database. This review describes the approaches used to narrow a suspect pool from a partial match search of CODIS and summarizes the economic, ethical, logistical and political challenges of implementing familial searching. We examine particular US state policies and the policy options adopted to address these issues. The aim of this review is to provide objective background information on the controversial approach of familial searching to inform policy decisions in this area. Herein we highlight key policy options and recommendations regarding effective utilization of familial searching that minimize harm to and afford maximum protection of US citizens. PMID:22040348
Policy implications for familial searching.

PubMed

Kim, Joyce; Mammo, Danny; Siegel, Marni B; Katsanis, Sara H

2011-11-01

In the United States, several states have made policy decisions regarding whether and how to use familial searching of the Combined DNA Index System (CODIS) database in criminal investigations. Familial searching pushes DNA typing beyond merely identifying individuals to detecting genetic relatedness, an application previously reserved for missing persons identifications and custody battles. The intentional search of CODIS for partial matches to an item of evidence offers law enforcement agencies a powerful tool for developing investigative leads, apprehending criminals, revitalizing cold cases and exonerating wrongfully convicted individuals. As familial searching involves a range of logistical, social, ethical and legal considerations, states are now grappling with policy options for implementing familial searching to balance crime fighting with its potential impact on society. When developing policies for familial searching, legislators should take into account the impact of familial searching on select populations and the need to minimize personal intrusion on relatives of individuals in the DNA database. This review describes the approaches used to narrow a suspect pool from a partial match search of CODIS and summarizes the economic, ethical, logistical and political challenges of implementing familial searching. We examine particular US state policies and the policy options adopted to address these issues. The aim of this review is to provide objective background information on the controversial approach of familial searching to inform policy decisions in this area. Herein we highlight key policy options and recommendations regarding effective utilization of familial searching that minimize harm to and afford maximum protection of US citizens.
A new approach for detecting adventitious viruses shows Sf-rhabdovirus-negative Sf-RVN cells are suitable for safe biologicals production.

PubMed

Geisler, Christoph

2018-02-07

Adventitious viral contamination in cell substrates used for biologicals production is a major safety concern. A powerful new approach that can be used to identify adventitious viruses is a combination of bioinformatics tools with massively parallel sequencing technology. Typically, this involves mapping or BLASTN searching individual reads against viral nucleotide databases. Although extremely sensitive for known viruses, this approach can easily miss viruses that are too dissimilar to viruses in the database. Moreover, it is computationally intensive and requires reference cell genome databases. To avoid these drawbacks, we set out to develop an alternative approach. We reasoned that searching genome and transcriptome assemblies for adventitious viral contaminants using TBLASTN with a compact viral protein database covering extant viral diversity as the query could be fast and sensitive without a requirement for high performance computing hardware. We tested our approach on Spodoptera frugiperda Sf-RVN, a recently isolated insect cell line, to determine if it was contaminated with one or more adventitious viruses. We used Illumina reads to assemble the Sf-RVN genome and transcriptome and searched them for adventitious viral contaminants using TBLASTN with our viral protein database. We found no evidence of viral contamination, which was substantiated by the fact that our searches otherwise identified diverse sequences encoding virus-like proteins. These sequences included Maverick, R1 LINE, and errantivirus transposons, all of which are common in insect genomes. We also identified previously described as well as novel endogenous viral elements similar to ORFs encoded by diverse insect viruses. Our results demonstrate TBLASTN searching massively parallel sequencing (MPS) assemblies with a compact, manually curated viral protein database is more sensitive for adventitious virus detection than BLASTN, as we identified various sequences that encoded virus-like proteins, but had no similarity to viral sequences at the nucleotide level. Moreover, searches were fast without requiring high performance computing hardware. Our study also documents the enhanced biosafety profile of Sf-RVN as compared to other Sf cell lines, and supports the notion that Sf-RVN is highly suitable for the production of safe biologicals.
The Tropical Biominer Project: mining old sources for new drugs.

PubMed

Artiguenave, François; Lins, André; Maciel, Wesley Dias; Junior, Antonio Celso Caldeira; Nacif-Coelho, Carla; de Souza Linhares, Maria Margarida Ribeiro; de Oliveira, Guilherme Correa; Barbosa, Luis Humberto Rezende; Lopes, Júlio César Dias; Junior, Claudionor Nunes Coelho

2005-01-01

The Tropical Biominer Project is a recent initiative from the Federal University of Minas Gerais (UFMG) and the Oswaldo Cruz foundation, with the participation of the Biominas Foundation (Belo Horizonte, Minas Gerais, Brazil) and the start-up Homologix. The main objective of the project is to build a new resource for the chemogenomics research, on chemical compounds, with a strong emphasis on natural molecules. Adopted technologies include the search of information from structured, semi-structured, and non-structured documents (the last two from the web) and datamining tools in order to gather information from different sources. The database is the support for developing applications to find new potential treatments for parasitic infections by using virtual screening tools. We present here the midpoint of the project: the conception and implementation of the Tropical Biominer Database. This is a Federated Database designed to store data from different resources. Connected to the database, a web crawler is able to gather information from distinct, patented web sites and store them after automatic classification using datamining tools. Finally, we demonstrate the interest of the approach, by formulating new hypotheses on specific targets of a natural compound, violacein, using inferences from a Virtual Screening procedure.
The Comprehensive Phytopathogen Genomics Resource: a web-based resource for data-mining plant pathogen genomes.

PubMed

Hamilton, John P; Neeno-Eckwall, Eric C; Adhikari, Bishwo N; Perna, Nicole T; Tisserat, Ned; Leach, Jan E; Lévesque, C André; Buell, C Robin

2011-01-01

The Comprehensive Phytopathogen Genomics Resource (CPGR) provides a web-based portal for plant pathologists and diagnosticians to view the genome and trancriptome sequence status of 806 bacterial, fungal, oomycete, nematode, viral and viroid plant pathogens. Tools are available to search and analyze annotated genome sequences of 74 bacterial, fungal and oomycete pathogens. Oomycete and fungal genomes are obtained directly from GenBank, whereas bacterial genome sequences are downloaded from the A Systematic Annotation Package (ASAP) database that provides curation of genomes using comparative approaches. Curated lists of bacterial genes relevant to pathogenicity and avirulence are also provided. The Plant Pathogen Transcript Assemblies Database provides annotated assemblies of the transcribed regions of 82 eukaryotic genomes from publicly available single pass Expressed Sequence Tags. Data-mining tools are provided along with tools to create candidate diagnostic markers, an emerging use for genomic sequence data in plant pathology. The Plant Pathogen Ribosomal DNA (rDNA) database is a resource for pathogens that lack genome or transcriptome data sets and contains 131 755 rDNA sequences from GenBank for 17 613 species identified as plant pathogens and related genera. Database URL: http://cpgr.plantbiology.msu.edu.
PGSB PlantsDB: updates to the database framework for comparative plant genome research.

PubMed

Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai C; Martis, Mihaela M; Seidel, Michael; Kugler, Karl G; Gundlach, Heidrun; Mayer, Klaus F X

2016-01-04

PGSB (Plant Genome and Systems Biology: formerly MIPS) PlantsDB (http://pgsb.helmholtz-muenchen.de/plant/index.jsp) is a database framework for the comparative analysis and visualization of plant genome data. The resource has been updated with new data sets and types as well as specialized tools and interfaces to address user demands for intuitive access to complex plant genome data. In its latest incarnation, we have re-worked both the layout and navigation structure and implemented new keyword search options and a new BLAST sequence search functionality. Actively involved in corresponding sequencing consortia, PlantsDB has dedicated special efforts to the integration and visualization of complex triticeae genome data, especially for barley, wheat and rye. We enhanced CrowsNest, a tool to visualize syntenic relationships between genomes, with data from the wheat sub-genome progenitor Aegilops tauschii and added functionality to the PGSB RNASeqExpressionBrowser. GenomeZipper results were integrated for the genomes of barley, rye, wheat and perennial ryegrass and interactive access is granted through PlantsDB interfaces. Data exchange and cross-linking between PlantsDB and other plant genome databases is stimulated by the transPLANT project (http://transplantdb.eu/). © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
The PARIGA server for real time filtering and analysis of reciprocal BLAST results.

PubMed

Orsini, Massimiliano; Carcangiu, Simone; Cuccuru, Gianmauro; Uva, Paolo; Tramontano, Anna

2013-01-01

BLAST-based similarity searches are commonly used in several applications involving both nucleotide and protein sequences. These applications span from simple tasks such as mapping sequences over a database to more complex procedures as clustering or annotation processes. When the amount of analysed data increases, manual inspection of BLAST results become a tedious procedure. Tools for parsing or filtering BLAST results for different purposes are then required. We describe here PARIGA (http://resources.bioinformatica.crs4.it/pariga/), a server that enables users to perform all-against-all BLAST searches on two sets of sequences selected by the user. Moreover, since it stores the two BLAST output in a python-serialized-objects database, results can be filtered according to several parameters in real-time fashion, without re-running the process and avoiding additional programming efforts. Results can be interrogated by the user using logical operations, for example to retrieve cases where two queries match same targets, or when sequences from the two datasets are reciprocal best hits, or when a query matches a target in multiple regions. The Pariga web server is designed to be a helpful tool for managing the results of sequence similarity searches. The design and implementation of the server renders all operations very fast and easy to use.
The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research

PubMed Central

Estivalet, Gustavo L.; Meunier, Fanny

2015-01-01

In this article, we present the Brazilian Portuguese Lexicon, a new word-based corpus for psycholinguistic and computational linguistic research in Brazilian Portuguese. We describe the corpus development, the specific characteristics on the internet site and database for user access. We also perform distributional analyses of the corpus and comparisons to other current databases. Our main objective was to provide a large, reliable, and useful word-based corpus with a dynamic, easy-to-use, and intuitive interface with free internet access for word and word-criteria searches. We used the Núcleo Interinstitucional de Linguística Computacional’s corpus as the basic data source and developed the Brazilian Portuguese Lexicon by deriving and adding metalinguistic and psycholinguistic information about Brazilian Portuguese words. We obtained a final corpus with more than 30 million word tokens, 215 thousand word types and 25 categories of information about each word. This corpus was made available on the internet via a free-access site with two search engines: a simple search and a complex search. The simple engine basically searches for a list of words, while the complex engine accepts all types of criteria in the corpus categories. The output result presents all entries found in the corpus with the criteria specified in the input search and can be downloaded as a.csv file. We created a module in the results that delivers basic statistics about each search. The Brazilian Portuguese Lexicon also provides a pseudoword engine and specific tools for linguistic and statistical analysis. Therefore, the Brazilian Portuguese Lexicon is a convenient instrument for stimulus search, selection, control, and manipulation in psycholinguistic experiments, as also it is a powerful database for computational linguistics research and language modeling related to lexicon distribution, functioning, and behavior. PMID:26630138
The Brazilian Portuguese Lexicon: An Instrument for Psycholinguistic Research.

PubMed

Estivalet, Gustavo L; Meunier, Fanny

2015-01-01

In this article, we present the Brazilian Portuguese Lexicon, a new word-based corpus for psycholinguistic and computational linguistic research in Brazilian Portuguese. We describe the corpus development, the specific characteristics on the internet site and database for user access. We also perform distributional analyses of the corpus and comparisons to other current databases. Our main objective was to provide a large, reliable, and useful word-based corpus with a dynamic, easy-to-use, and intuitive interface with free internet access for word and word-criteria searches. We used the Núcleo Interinstitucional de Linguística Computacional's corpus as the basic data source and developed the Brazilian Portuguese Lexicon by deriving and adding metalinguistic and psycholinguistic information about Brazilian Portuguese words. We obtained a final corpus with more than 30 million word tokens, 215 thousand word types and 25 categories of information about each word. This corpus was made available on the internet via a free-access site with two search engines: a simple search and a complex search. The simple engine basically searches for a list of words, while the complex engine accepts all types of criteria in the corpus categories. The output result presents all entries found in the corpus with the criteria specified in the input search and can be downloaded as a.csv file. We created a module in the results that delivers basic statistics about each search. The Brazilian Portuguese Lexicon also provides a pseudoword engine and specific tools for linguistic and statistical analysis. Therefore, the Brazilian Portuguese Lexicon is a convenient instrument for stimulus search, selection, control, and manipulation in psycholinguistic experiments, as also it is a powerful database for computational linguistics research and language modeling related to lexicon distribution, functioning, and behavior.

UCbase 2.0: ultraconserved sequences database (2014 update)

PubMed Central

Lomonaco, Vincenzo; Martoglia, Riccardo; Mandreoli, Federica; Anderlucci, Laura; Emmett, Warren; Bicciato, Silvio; Taccioli, Cristian

2014-01-01

UCbase 2.0 (http://ucbase.unimore.it) is an update, extension and evolution of UCbase, a Web tool dedicated to the analysis of ultraconserved sequences (UCRs). UCRs are 481 sequences >200 bases sharing 100% identity among human, mouse and rat genomes. They are frequently located in genomic regions known to be involved in cancer or differentially expressed in human leukemias and carcinomas. UCbase 2.0 is a platform-independent Web resource that includes the updated version of the human genome annotation (hg19), information linking disorders to chromosomal coordinates based on the Systematized Nomenclature of Medicine classification, a query tool to search for Single Nucleotide Polymorphisms (SNPs) and a new text box to directly interrogate the database using a MySQL interface. To facilitate the interactive visual interpretation of UCR chromosomal positioning, UCbase 2.0 now includes a graph visualization interface directly linked to UCSC genome browser. Database URL: http://ucbase.unimore.it PMID:24951797
CoryneBase: Corynebacterium Genomic Resources and Analysis Tools at Your Fingertips

PubMed Central

Tan, Mui Fern; Jakubovics, Nick S.; Wee, Wei Yee; Mutha, Naresh V. R.; Wong, Guat Jah; Ang, Mia Yang; Yazdi, Amir Hessam; Choo, Siew Woh

2014-01-01

Corynebacteria are used for a wide variety of industrial purposes but some species are associated with human diseases. With increasing number of corynebacterial genomes having been sequenced, comparative analysis of these strains may provide better understanding of their biology, phylogeny, virulence and taxonomy that may lead to the discoveries of beneficial industrial strains or contribute to better management of diseases. To facilitate the ongoing research of corynebacteria, a specialized central repository and analysis platform for the corynebacterial research community is needed to host the fast-growing amount of genomic data and facilitate the analysis of these data. Here we present CoryneBase, a genomic database for Corynebacterium with diverse functionality for the analysis of genomes aimed to provide: (1) annotated genome sequences of Corynebacterium where 165,918 coding sequences and 4,180 RNAs can be found in 27 species; (2) access to comprehensive Corynebacterium data through the use of advanced web technologies for interactive web interfaces; and (3) advanced bioinformatic analysis tools consisting of standard BLAST for homology search, VFDB BLAST for sequence homology search against the Virulence Factor Database (VFDB), Pairwise Genome Comparison (PGC) tool for comparative genomic analysis, and a newly designed Pathogenomics Profiling Tool (PathoProT) for comparative pathogenomic analysis. CoryneBase offers the access of a range of Corynebacterium genomic resources as well as analysis tools for comparative genomics and pathogenomics. It is publicly available at http://corynebacterium.um.edu.my/. PMID:24466021
Stereoselective virtual screening of the ZINC database using atom pair 3D-fingerprints.

PubMed

Awale, Mahendra; Jin, Xian; Reymond, Jean-Louis

2015-01-01

Tools to explore large compound databases in search for analogs of query molecules provide a strategically important support in drug discovery to help identify available analogs of any given reference or hit compound by ligand based virtual screening (LBVS). We recently showed that large databases can be formatted for very fast searching with various 2D-fingerprints using the city-block distance as similarity measure, in particular a 2D-atom pair fingerprint (APfp) and the related category extended atom pair fingerprint (Xfp) which efficiently encode molecular shape and pharmacophores, but do not perceive stereochemistry. Here we investigated related 3D-atom pair fingerprints to enable rapid stereoselective searches in the ZINC database (23.2 million 3D structures). Molecular fingerprints counting atom pairs at increasing through-space distance intervals were designed using either all atoms (16-bit 3DAPfp) or different atom categories (80-bit 3DXfp). These 3D-fingerprints retrieved molecular shape and pharmacophore analogs (defined by OpenEye ROCS scoring functions) of 110,000 compounds from the Cambridge Structural Database with equal or better accuracy than the 2D-fingerprints APfp and Xfp, and showed comparable performance in recovering actives from decoys in the DUD database. LBVS by 3DXfp or 3DAPfp similarity was stereoselective and gave very different analogs when starting from different diastereomers of the same chiral drug. Results were also different from LBVS with the parent 2D-fingerprints Xfp or APfp. 3D- and 2D-fingerprints also gave very different results in LBVS of folded molecules where through-space distances between atom pairs are much shorter than topological distances. 3DAPfp and 3DXfp are suitable for stereoselective searches for shape and pharmacophore analogs of query molecules in large databases. Web-browsers for searching ZINC by 3DAPfp and 3DXfp similarity are accessible at www.gdb.unibe.ch and should provide useful assistance to drug discovery projects. Graphical abstractAtom pair fingerprints based on through-space distances (3DAPfp) provide better shape encoding than atom pair fingerprints based on topological distances (APfp) as measured by the recovery of ROCS shape analogs by fp similarity.
Using Enabling Technologies to Advance Data Intensive Analysis Tools in the JPL Tropical Cyclone Information System

NASA Astrophysics Data System (ADS)

Knosp, B.; Gangl, M. E.; Hristova-Veleva, S. M.; Kim, R. M.; Lambrigtsen, B.; Li, P.; Niamsuwan, N.; Shen, T. P. J.; Turk, F. J.; Vu, Q. A.

2014-12-01

The JPL Tropical Cyclone Information System (TCIS) brings together satellite, aircraft, and model forecast data from several NASA, NOAA, and other data centers to assist researchers in comparing and analyzing data related to tropical cyclones. The TCIS has been supporting specific science field campaigns, such as the Genesis and Rapid Intensification Processes (GRIP) campaign and the Hurricane and Severe Storm Sentinel (HS3) campaign, by creating near real-time (NRT) data visualization portals. These portals are intended to assist in mission planning, enhance the understanding of current physical processes, and improve model data by comparing it to satellite and aircraft observations. The TCIS NRT portals allow the user to view plots on a Google Earth interface. To compliment these visualizations, the team has been working on developing data analysis tools to let the user actively interrogate areas of Level 2 swath and two-dimensional plots they see on their screen. As expected, these observation and model data are quite voluminous and bottlenecks in the system architecture can occur when the databases try to run geospatial searches for data files that need to be read by the tools. To improve the responsiveness of the data analysis tools, the TCIS team has been conducting studies on how to best store Level 2 swath footprints and run sub-second geospatial searches to discover data. The first objective was to improve the sampling accuracy of the footprints being stored in the TCIS database by comparing the Java-based NASA PO.DAAC Level 2 Swath Generator with a TCIS Python swath generator. The second objective was to compare the performance of four database implementations - MySQL, MySQL+Solr, MongoDB, and PostgreSQL - to see which database management system would yield the best geospatial query and storage performance. The final objective was to integrate our chosen technologies with our Joint Probability Density Function (Joint PDF), Wave Number Analysis, and Automated Rotational Center Hurricane Eye Retrieval (ARCHER) tools. In this presentation, we will compare the enabling technologies we tested and discuss which ones we selected for integration into the TCIS' data analysis tool architecture. We will also show how these techniques have been automated to provide access to NRT data through our analysis tools.
Database resources of the National Center for Biotechnology Information.

PubMed

2016-01-04

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank(®) nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (PubMed Central (PMC), Bookshelf and PubReader), health (ClinVar, dbGaP, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen), genomes (BioProject, Assembly, Genome, BioSample, dbSNP, dbVar, Epigenomics, the Map Viewer, Nucleotide, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser and the Trace Archive), genes (Gene, Gene Expression Omnibus (GEO), HomoloGene, PopSet and UniGene), proteins (Protein, the Conserved Domain Database (CDD), COBALT, Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB) and Protein Clusters) and chemicals (Biosystems and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for most of these databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. Published by Oxford University Press on behalf of Nucleic Acids Research 2015. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Database resources of the National Center for Biotechnology Information.

PubMed

2015-01-01

The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank(®) nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (Bookshelf, PubMed Central (PMC) and PubReader); medical genetics (ClinVar, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen); genes and genomics (BioProject, BioSample, dbSNP, dbVar, Epigenomics, Gene, Gene Expression Omnibus (GEO), Genome, HomoloGene, the Map Viewer, Nucleotide, PopSet, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser, Trace Archive and UniGene); and proteins and chemicals (Biosystems, COBALT, the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB), Protein Clusters, Protein and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for many of these databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Image-based query-by-example for big databases of galaxy images

NASA Astrophysics Data System (ADS)

Shamir, Lior; Kuminski, Evan

2017-01-01

Very large astronomical databases containing millions or even billions of galaxy images have been becoming increasingly important tools in astronomy research. However, in many cases the very large size makes it more difficult to analyze these data manually, reinforcing the need for computer algorithms that can automate the data analysis process. An example of such task is the identification of galaxies of a certain morphology of interest. For instance, if a rare galaxy is identified it is reasonable to expect that more galaxies of similar morphology exist in the database, but it is virtually impossible to manually search these databases to identify such galaxies. Here we describe computer vision and pattern recognition methodology that receives a galaxy image as an input, and searches automatically a large dataset of galaxies to return a list of galaxies that are visually similar to the query galaxy. The returned list is not necessarily complete or clean, but it provides a substantial reduction of the original database into a smaller dataset, in which the frequency of objects visually similar to the query galaxy is much higher. Experimental results show that the algorithm can identify rare galaxies such as ring galaxies among datasets of 10,000 astronomical objects.
G-Bean: an ontology-graph based web tool for biomedical literature retrieval

PubMed Central

2014-01-01

Background Currently, most people use NCBI's PubMed to search the MEDLINE database, an important bibliographical information source for life science and biomedical information. However, PubMed has some drawbacks that make it difficult to find relevant publications pertaining to users' individual intentions, especially for non-expert users. To ameliorate the disadvantages of PubMed, we developed G-Bean, a graph based biomedical search engine, to search biomedical articles in MEDLINE database more efficiently. Methods G-Bean addresses PubMed's limitations with three innovations: (1) Parallel document index creation: a multithreaded index creation strategy is employed to generate the document index for G-Bean in parallel; (2) Ontology-graph based query expansion: an ontology graph is constructed by merging four major UMLS (Version 2013AA) vocabularies, MeSH, SNOMEDCT, CSP and AOD, to cover all concepts in National Library of Medicine (NLM) database; a Personalized PageRank algorithm is used to compute concept relevance in this ontology graph and the Term Frequency - Inverse Document Frequency (TF-IDF) weighting scheme is used to re-rank the concepts. The top 500 ranked concepts are selected for expanding the initial query to retrieve more accurate and relevant information; (3) Retrieval and re-ranking of documents based on user's search intention: after the user selects any article from the existing search results, G-Bean analyzes user's selections to determine his/her true search intention and then uses more relevant and more specific terms to retrieve additional related articles. The new articles are presented to the user in the order of their relevance to the already selected articles. Results Performance evaluation with 106 OHSUMED benchmark queries shows that G-Bean returns more relevant results than PubMed does when using these queries to search the MEDLINE database. PubMed could not even return any search result for some OHSUMED queries because it failed to form the appropriate Boolean query statement automatically from the natural language query strings. G-Bean is available at http://bioinformatics.clemson.edu/G-Bean/index.php. Conclusions G-Bean addresses PubMed's limitations with ontology-graph based query expansion, automatic document indexing, and user search intention discovery. It shows significant advantages in finding relevant articles from the MEDLINE database to meet the information need of the user. PMID:25474588
G-Bean: an ontology-graph based web tool for biomedical literature retrieval.

PubMed

Wang, James Z; Zhang, Yuanyuan; Dong, Liang; Li, Lin; Srimani, Pradip K; Yu, Philip S

2014-01-01

Currently, most people use NCBI's PubMed to search the MEDLINE database, an important bibliographical information source for life science and biomedical information. However, PubMed has some drawbacks that make it difficult to find relevant publications pertaining to users' individual intentions, especially for non-expert users. To ameliorate the disadvantages of PubMed, we developed G-Bean, a graph based biomedical search engine, to search biomedical articles in MEDLINE database more efficiently. G-Bean addresses PubMed's limitations with three innovations: (1) Parallel document index creation: a multithreaded index creation strategy is employed to generate the document index for G-Bean in parallel; (2) Ontology-graph based query expansion: an ontology graph is constructed by merging four major UMLS (Version 2013AA) vocabularies, MeSH, SNOMEDCT, CSP and AOD, to cover all concepts in National Library of Medicine (NLM) database; a Personalized PageRank algorithm is used to compute concept relevance in this ontology graph and the Term Frequency - Inverse Document Frequency (TF-IDF) weighting scheme is used to re-rank the concepts. The top 500 ranked concepts are selected for expanding the initial query to retrieve more accurate and relevant information; (3) Retrieval and re-ranking of documents based on user's search intention: after the user selects any article from the existing search results, G-Bean analyzes user's selections to determine his/her true search intention and then uses more relevant and more specific terms to retrieve additional related articles. The new articles are presented to the user in the order of their relevance to the already selected articles. Performance evaluation with 106 OHSUMED benchmark queries shows that G-Bean returns more relevant results than PubMed does when using these queries to search the MEDLINE database. PubMed could not even return any search result for some OHSUMED queries because it failed to form the appropriate Boolean query statement automatically from the natural language query strings. G-Bean is available at http://bioinformatics.clemson.edu/G-Bean/index.php. G-Bean addresses PubMed's limitations with ontology-graph based query expansion, automatic document indexing, and user search intention discovery. It shows significant advantages in finding relevant articles from the MEDLINE database to meet the information need of the user.
A browsing tool for the Internet Logical Library of the HPCC Software Exchange

NASA Technical Reports Server (NTRS)

Biro, Ross

1993-01-01

As the quantity of information available on the Internet grows, locating a particular piece of information becomes more difficult. One possible solution is for a database of pointers to all available information to be maintained at a central site. Subject classifications for all the information could also be maintained in order to make searching possible. This paper describes one possible method of searching such an index. In particular a prototype browsing tool has been created using TCL/TK to demonstrate several possible features: rapidly scanning at any rank of the index, narrowing the index to any scope, regular-expression searching, and creation of a list of pointers answering to any set of index terms. The prototype browser is an easy-to-use independent X application designed for use in the Catalog of Repositories of the HPCC (High Performance Computing and Communications) Software Exchange.
Measurement properties of self-report physical activity assessment tools in stroke: a protocol for a systematic review.

PubMed

Martins, Júlia Caetano; Aguiar, Larissa Tavares; Nadeau, Sylvie; Scianni, Aline Alvim; Teixeira-Salmela, Luci Fuscaldi; Faria, Christina Danielli Coelho de Morais

2017-02-13

Self-report physical activity assessment tools are commonly used for the evaluation of physical activity levels in individuals with stroke. A great variety of these tools have been developed and widely used in recent years, which justify the need to examine their measurement properties and clinical utility. Therefore, the main objectives of this systematic review are to examine the measurement properties and clinical utility of self-report measures of physical activity and discuss the strengths and limitations of the identified tools. A systematic review of studies that investigated the measurement properties and/or clinical utility of self-report physical activity assessment tools in stroke will be conducted. Electronic searches will be performed in five databases: Medical Literature Analysis and Retrieval System Online (MEDLINE) (PubMed), Excerpta Medica Database (EMBASE), Physiotherapy Evidence Database (PEDro), Literatura Latino-Americana e do Caribe em Ciências da Saúde (LILACS) and Scientific Electronic Library Online (SciELO), followed by hand searches of the reference lists of the included studies. Two independent reviewers will screen all retrieve titles, abstracts, and full texts, according to the inclusion criteria and will also extract the data. A third reviewer will be referred to solve any disagreement. A descriptive summary of the included studies will contain the design, participants, as well as the characteristics, measurement properties, and clinical utility of the self-report tools. The methodological quality of the studies will be evaluated using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist and the clinical utility of the identified tools will be assessed considering predefined criteria. This systematic review will follow the Preferred Reporting Items for Systematic Review and Meta-Analyses (PRISMA) statement. This systematic review will provide an extensive review of the measurement properties and clinical utility of self-report physical activity assessment tools used in individuals with stroke, which would benefit clinicians and researchers. PROSPERO CRD42016037146. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.
Characteristics and use of urban health indicator tools by municipal built environment policy and decision-makers: a systematic review protocol.

PubMed

Pineo, Helen; Glonti, Ketevan; Rutter, Harry; Zimmermann, Nicole; Wilkinson, Paul; Davies, Michael

2017-01-13

There is wide agreement that there is a lack of attention to health in municipal environmental policy-making, such as urban planning and regeneration. Explanations for this include differing professional norms between health and urban environment professionals, system complexity and limited evidence for causality between attributes of the built environment and health outcomes. Data from urban health indicator (UHI) tools are potentially a valuable form of evidence for local government policy and decision-makers. Although many UHI tools have been specifically developed to inform policy, there is poor understanding of how they are used. This study aims to identify the nature and characteristics of UHI tools and their use by municipal built environment policy and decision-makers. Health and social sciences databases (ASSIA, Campbell Library, EMBASE, MEDLINE, Scopus, Social Policy and Practice and Web of Science Core Collection) will be searched for studies using UHI tools alongside hand-searching of key journals and citation searches of included studies. Advanced searches of practitioner websites and Google will also be used to find grey literature. Search results will be screened for UHI tools, and for studies which report on or evaluate the use of such tools. Data about UHI tools will be extracted to compile a census and taxonomy of existing tools based on their specific characteristics and purpose. In addition, qualitative and quantitative studies about the use of these tools will be appraised using quality appraisal tools produced by the UK National Institute for Health and Care Excellence (NICE) and synthesised in order to gain insight into the perceptions, value and use of UHI tools in the municipal built environment policy and decision-making process. This review is not registered with PROSPERO. This systematic review focuses specifically on UHI tools that assess the physical environment's impact on health (such as transport, housing, air quality and greenspace). This study will help indicator producers understand whether this form of evidence is of value to built environment policy and decision-makers and how such tools should be tailored for this audience. N/A.
Epistemonikos: a free, relational, collaborative, multilingual database of health evidence.

PubMed

Rada, Gabriel; Pérez, Daniel; Capurro, Daniel

2013-01-01

Epistemonikos (www.epistemonikos.org) is a free, multilingual database of the best available health evidence. This paper describes the design, development and implementation of the Epistemonikos project. Using several web technologies to store systematic reviews, their included articles, overviews of reviews and structured summaries, Epistemonikos is able to provide a simple and powerful search tool to access health evidence for sound decision making. Currently, Epistemonikos stores more than 115,000 unique documents and more than 100,000 relationships between documents. In addition, since its database is translated into 9 different languages, Epistemonikos ensures that non-English speaking decision-makers can access the best available evidence without language barriers.
Strategies to explore functional genomics data sets in NCBI's GEO database.

PubMed

Wilhite, Stephen E; Barrett, Tanya

2012-01-01

The Gene Expression Omnibus (GEO) database is a major repository that stores high-throughput functional genomics data sets that are generated using both microarray-based and sequence-based technologies. Data sets are submitted to GEO primarily by researchers who are publishing their results in journals that require original data to be made freely available for review and analysis. In addition to serving as a public archive for these data, GEO has a suite of tools that allow users to identify, analyze, and visualize data relevant to their specific interests. These tools include sample comparison applications, gene expression profile charts, data set clusters, genome browser tracks, and a powerful search engine that enables users to construct complex queries.
Strategies to Explore Functional Genomics Data Sets in NCBI’s GEO Database

PubMed Central

Wilhite, Stephen E.; Barrett, Tanya

2012-01-01

The Gene Expression Omnibus (GEO) database is a major repository that stores high-throughput functional genomics data sets that are generated using both microarray-based and sequence-based technologies. Data sets are submitted to GEO primarily by researchers who are publishing their results in journals that require original data to be made freely available for review and analysis. In addition to serving as a public archive for these data, GEO has a suite of tools that allow users to identify, analyze and visualize data relevant to their specific interests. These tools include sample comparison applications, gene expression profile charts, data set clusters, genome browser tracks, and a powerful search engine that enables users to construct complex queries. PMID:22130872
Uploading, Searching and Visualizing of Paleomagnetic and Rock Magnetic Data in the Online MagIC Database

NASA Astrophysics Data System (ADS)

Minnett, R.; Koppers, A.; Tauxe, L.; Constable, C.; Donadini, F.

2007-12-01

The Magnetics Information Consortium (MagIC) is commissioned to implement and maintain an online portal to a relational database populated by both rock and paleomagnetic data. The goal of MagIC is to archive all available measurements and derived properties from paleomagnetic studies of directions and intensities, and for rock magnetic experiments (hysteresis, remanence, susceptibility, anisotropy). MagIC is hosted under EarthRef.org at http://earthref.org/MAGIC/ and will soon implement two search nodes, one for paleomagnetism and one for rock magnetism. Currently the PMAG node is operational. Both nodes provide query building based on location, reference, methods applied, material type and geological age, as well as a visual map interface to browse and select locations. Users can also browse the database by data type or by data compilation to view all contributions associated with well known earlier collections like PINT, GMPDB or PSVRL. The query result set is displayed in a digestible tabular format allowing the user to descend from locations to sites, samples, specimens and measurements. At each stage, the result set can be saved and, where appropriate, can be visualized by plotting global location maps, equal area, XY, age, and depth plots, or typical Zijderveld, hysteresis, magnetization and remanence diagrams. User contributions to the MagIC database are critical to achieving a useful research tool. We have developed a standard data and metadata template (version 2.3) that can be used to format and upload all data at the time of publication in Earth Science journals. Software tools are provided to facilitate population of these templates within Microsoft Excel. These tools allow for the import/export of text files and provide advanced functionality to manage and edit the data, and to perform various internal checks to maintain data integrity and prepare for uploading. The MagIC Contribution Wizard at http://earthref.org/MAGIC/upload.htm executes the upload and takes only a few minutes to process tens of thousands of data records. The standardized MagIC template files are stored in the digital archives of EarthRef.org where they remain available for download by the public (in both text and Excel format). Finally, the contents of these template files are automatically parsed into the online relational database, making the data available for online searches in the paleomagnetic and rock magnetic search nodes. During the upload process the owner has the option of keeping the contribution private so it can be viewed in the context of other data sets and visualized using the suite of MagIC plotting tools. Alternatively, the new data can be password protected and shared with a group of users at the contributor's discretion. Once they are published and the owner is comfortable making the upload publicly accessible, the MagIC Editing Committee reviews the contribution for adherence to the MagIC data model and conventions to ensure a high level of data integrity.
Hermes, the Information Messenger, Integrating Information Services and Delivering Them to the End User.

ERIC Educational Resources Information Center

Coello-Coutino, Gerardo; Ainsworth, Shirley; Escalante-Gonzalbo, Ana Marie

2002-01-01

Describes Hermes, a research tool that uses specially designed acquisition, parsing and presentation methods to integrate information resources on the Internet, from searching in disparate bibliographic databases, to accessing full text articles online, and developing a web of information associated with each reference via one common interface.…
Assessment of the Risk of Bias in Rehabilitation Reviews

ERIC Educational Resources Information Center

Farmer, Sybil E.; Wood, Duncan; Swain, Ian D.; Pandyan, Anand D.

2012-01-01

Systematic reviews are used to inform practice, and develop guidelines and protocols. A questionnaire to quantify the risk of bias in systematic reviews, the review paper assessment (RPA) tool, was developed and tested. A search of electronic databases provided a data set of review articles that were then independently reviewed by two assessors…
PPDB - A tool for investigation of plants physiology based on gene ontology.

PubMed

Sharma, Ajay Shiv; Gupta, Hari Om; Prasad, Rajendra

2014-09-02

Representing the way forward, from functional genomics and its ontology to functional understanding and physiological model, in a computationally tractable fashion is one of the ongoing challenges faced by computational biology. To tackle the standpoint, we herein feature the applications of contemporary database management to the development of PPDB, a searching and browsing tool for the Plants Physiology Database that is based upon the mining of a large amount of gene ontology data currently available. The working principles and search options associated with the PPDB are publicly available and freely accessible on-line ( http://www.iitr.ernet.in/ajayshiv/ ) through a user friendly environment generated by means of Drupal-6.24. By knowing that genes are expressed in temporally and spatially characteristic patterns and that their functionally distinct products often reside in specific cellular compartments and may be part of one or more multi-component complexes, this sort of work is intended to be relevant for investigating the functional relationships of gene products at a system level and, thus, helps us approach to the full physiology.
PPDB: A Tool for Investigation of Plants Physiology Based on Gene Ontology.

PubMed

Sharma, Ajay Shiv; Gupta, Hari Om; Prasad, Rajendra

2015-09-01

Representing the way forward, from functional genomics and its ontology to functional understanding and physiological model, in a computationally tractable fashion is one of the ongoing challenges faced by computational biology. To tackle the standpoint, we herein feature the applications of contemporary database management to the development of PPDB, a searching and browsing tool for the Plants Physiology Database that is based upon the mining of a large amount of gene ontology data currently available. The working principles and search options associated with the PPDB are publicly available and freely accessible online ( http://www.iitr.ac.in/ajayshiv/ ) through a user-friendly environment generated by means of Drupal-6.24. By knowing that genes are expressed in temporally and spatially characteristic patterns and that their functionally distinct products often reside in specific cellular compartments and may be part of one or more multicomponent complexes, this sort of work is intended to be relevant for investigating the functional relationships of gene products at a system level and, thus, helps us approach to the full physiology.

Multi-lingual search engine to access PubMed monolingual subsets: a feasibility study.

PubMed

Darmoni, Stéfan J; Soualmia, Lina F; Griffon, Nicolas; Grosjean, Julien; Kerdelhué, Gaétan; Kergourlay, Ivan; Dahamna, Badisse

2013-01-01

PubMed contains many articles in languages other than English but it is difficult to find them using the English version of the Medical Subject Headings (MeSH) Thesaurus. The aim of this work is to propose a tool allowing access to a PubMed subset in one language, and to evaluate its performance. Translations of MeSH were enriched and gathered in the information system. PubMed subsets in main European languages were also added in our database, using a dedicated parser. The CISMeF generic semantic search engine was evaluated on the response time for simple queries. MeSH descriptors are currently available in 11 languages in the information system. All the 654,000 PubMed citations in French were integrated into CISMeF database. None of the response times exceed the threshold defined for usability (2 seconds). It is now possible to freely access biomedical literature in French using a tool in French; health professionals and lay people with a low English language may find it useful. It will be expended to several European languages: German, Spanish, Norwegian and Portuguese.
Epsilon-Q: An Automated Analyzer Interface for Mass Spectral Library Search and Label-Free Protein Quantification.

PubMed

Cho, Jin-Young; Lee, Hyoung-Joo; Jeong, Seul-Ki; Paik, Young-Ki

2017-12-01

Mass spectrometry (MS) is a widely used proteome analysis tool for biomedical science. In an MS-based bottom-up proteomic approach to protein identification, sequence database (DB) searching has been routinely used because of its simplicity and convenience. However, searching a sequence DB with multiple variable modification options can increase processing time, false-positive errors in large and complicated MS data sets. Spectral library searching is an alternative solution, avoiding the limitations of sequence DB searching and allowing the detection of more peptides with high sensitivity. Unfortunately, this technique has less proteome coverage, resulting in limitations in the detection of novel and whole peptide sequences in biological samples. To solve these problems, we previously developed the "Combo-Spec Search" method, which uses manually multiple references and simulated spectral library searching to analyze whole proteomes in a biological sample. In this study, we have developed a new analytical interface tool called "Epsilon-Q" to enhance the functions of both the Combo-Spec Search method and label-free protein quantification. Epsilon-Q performs automatically multiple spectral library searching, class-specific false-discovery rate control, and result integration. It has a user-friendly graphical interface and demonstrates good performance in identifying and quantifying proteins by supporting standard MS data formats and spectrum-to-spectrum matching powered by SpectraST. Furthermore, when the Epsilon-Q interface is combined with the Combo-Spec search method, called the Epsilon-Q system, it shows a synergistic function by outperforming other sequence DB search engines for identifying and quantifying low-abundance proteins in biological samples. The Epsilon-Q system can be a versatile tool for comparative proteome analysis based on multiple spectral libraries and label-free quantification.
Fungal genome resources at NCBI.

PubMed

Robbertse, B; Tatusova, T

2011-09-01

The National Center for Biotechnology Information (NCBI) is well known for the nucleotide sequence archive, GenBank and sequence analysis tool BLAST. However, NCBI integrates many types of biomolecular data from variety of sources and makes it available to the scientific community as interactive web resources as well as organized releases of bulk data. These tools are available to explore and compare fungal genomes. Searching all databases with Fungi [organism] at http://www.ncbi.nlm.nih.gov/ is the quickest way to find resources of interest with fungal entries. Some tools though are resources specific and can be indirectly accessed from a particular database in the Entrez system. These include graphical viewers and comparative analysis tools such as TaxPlot, TaxMap and UniGene DDD (found via UniGene Homepage). Gene and BioProject pages also serve as portals to external data such as community annotation websites, BioGrid and UniProt. There are many different ways of accessing genomic data at NCBI. Depending on the focus and goal of research projects or the level of interest, a user would select a particular route for accessing genomic databases and resources. This review article describes methods of accessing fungal genome data and provides examples that illustrate the use of analysis tools.
CTGA: the database for genetic disorders in Arab populations.

PubMed

Tadmouri, Ghazi O; Al Ali, Mahmoud Taleb; Al-Haj Ali, Sarah; Al Khaja, Najib

2006-01-01

The Arabs comprise a genetically heterogeneous group that resulted from the admixture of different populations throughout history. They share many common characteristics responsible for a considerable proportion of perinatal and neonatal mortalities. To this end, the Centre for Arab Genomic Studies (CAGS) launched a pilot project to construct the 'Catalogue of Transmission Genetics in Arabs' (CTGA) database for genetic disorders in Arabs. Information in CTGA is drawn from published research and mined hospital records. The database offers web-based basic and advanced search approaches. In either case, the final search result is a detailed HTML record that includes text-, URL- and graphic-based fields. At present, CTGA hosts entries for 692 phenotypes and 235 related genes described in Arab individuals. Of these, 213 phenotypic descriptions and 22 related genes were observed in the Arab population of the United Arab Emirates (UAE). These results emphasize the role of CTGA as an essential tool to promote scientific research on genetic disorders in the region. The priority of CTGA is to provide timely information on the occurrence of genetic disorders in Arab individuals. It is anticipated that data from Arab countries other than the UAE will be exhaustively searched and incorporated in CTGA (http://www.cags.org.ae).
RNA Bricks—a database of RNA 3D motifs and their interactions

PubMed Central

Chojnowski, Grzegorz; Waleń, Tomasz; Bujnicki, Janusz M.

2014-01-01

The RNA Bricks database (http://iimcb.genesilico.pl/rnabricks), stores information about recurrent RNA 3D motifs and their interactions, found in experimentally determined RNA structures and in RNA–protein complexes. In contrast to other similar tools (RNA 3D Motif Atlas, RNA Frabase, Rloom) RNA motifs, i.e. ‘RNA bricks’ are presented in the molecular environment, in which they were determined, including RNA, protein, metal ions, water molecules and ligands. All nucleotide residues in RNA bricks are annotated with structural quality scores that describe real-space correlation coefficients with the electron density data (if available), backbone geometry and possible steric conflicts, which can be used to identify poorly modeled residues. The database is also equipped with an algorithm for 3D motif search and comparison. The algorithm compares spatial positions of backbone atoms of the user-provided query structure and of stored RNA motifs, without relying on sequence or secondary structure information. This enables the identification of local structural similarities among evolutionarily related and unrelated RNA molecules. Besides, the search utility enables searching ‘RNA bricks’ according to sequence similarity, and makes it possible to identify motifs with modified ribonucleotide residues at specific positions. PMID:24220091
CTGA: the database for genetic disorders in Arab populations

PubMed Central

Tadmouri, Ghazi O.; Ali, Mahmoud Taleb Al; Ali, Sarah Al-Haj; Khaja, Najib Al

2006-01-01

The Arabs comprise a genetically heterogeneous group that resulted from the admixture of different populations throughout history. They share many common characteristics responsible for a considerable proportion of perinatal and neonatal mortalities. To this end, the Centre for Arab Genomic Studies (CAGS) launched a pilot project to construct the ‘Catalogue of Transmission Genetics in Arabs’ (CTGA) database for genetic disorders in Arabs. Information in CTGA is drawn from published research and mined hospital records. The database offers web-based basic and advanced search approaches. In either case, the final search result is a detailed HTML record that includes text-, URL- and graphic-based fields. At present, CTGA hosts entries for 692 phenotypes and 235 related genes described in Arab individuals. Of these, 213 phenotypic descriptions and 22 related genes were observed in the Arab population of the United Arab Emirates (UAE). These results emphasize the role of CTGA as an essential tool to promote scientific research on genetic disorders in the region. The priority of CTGA is to provide timely information on the occurrence of genetic disorders in Arab individuals. It is anticipated that data from Arab countries other than the UAE will be exhaustively searched and incorporated in CTGA (). PMID:16381941
De-MetaST-BLAST: A Tool for the Validation of Degenerate Primer Sets and Data Mining of Publicly Available Metagenomes

PubMed Central

Gulvik, Christopher A.; Effler, T. Chad; Wilhelm, Steven W.; Buchan, Alison

2012-01-01

Development and use of primer sets to amplify nucleic acid sequences of interest is fundamental to studies spanning many life science disciplines. As such, the validation of primer sets is essential. Several computer programs have been created to aid in the initial selection of primer sequences that may or may not require multiple nucleotide combinations (i.e., degeneracies). Conversely, validation of primer specificity has remained largely unchanged for several decades, and there are currently few available programs that allows for an evaluation of primers containing degenerate nucleotide bases. To alleviate this gap, we developed the program De-MetaST that performs an in silico amplification using user defined nucleotide sequence dataset(s) and primer sequences that may contain degenerate bases. The program returns an output file that contains the in silico amplicons. When De-MetaST is paired with NCBI’s BLAST (De-MetaST-BLAST), the program also returns the top 10 nr NCBI database hits for each recovered in silico amplicon. While the original motivation for development of this search tool was degenerate primer validation using the wealth of nucleotide sequences available in environmental metagenome and metatranscriptome databases, this search tool has potential utility in many data mining applications. PMID:23189198
Search for 5'-leader regulatory RNA structures based on gene annotation aided by the RiboGap database.

PubMed

Naghdi, Mohammad Reza; Smail, Katia; Wang, Joy X; Wade, Fallou; Breaker, Ronald R; Perreault, Jonathan

2017-03-15

The discovery of noncoding RNAs (ncRNAs) and their importance for gene regulation led us to develop bioinformatics tools to pursue the discovery of novel ncRNAs. Finding ncRNAs de novo is challenging, first due to the difficulty of retrieving large numbers of sequences for given gene activities, and second due to exponential demands on calculation needed for comparative genomics on a large scale. Recently, several tools for the prediction of conserved RNA secondary structure were developed, but many of them are not designed to uncover new ncRNAs, or are too slow for conducting analyses on a large scale. Here we present various approaches using the database RiboGap as a primary tool for finding known ncRNAs and for uncovering simple sequence motifs with regulatory roles. This database also can be used to easily extract intergenic sequences of eubacteria and archaea to find conserved RNA structures upstream of given genes. We also show how to extend analysis further to choose the best candidate ncRNAs for experimental validation. Copyright © 2017 Elsevier Inc. All rights reserved.
Gene Unprediction with Spurio: A tool to identify spurious protein sequences.

PubMed

Höps, Wolfram; Jeffryes, Matt; Bateman, Alex

2018-01-01

We now have access to the sequences of tens of millions of proteins. These protein sequences are essential for modern molecular biology and computational biology. The vast majority of protein sequences are derived from gene prediction tools and have no experimental supporting evidence for their translation. Despite the increasing accuracy of gene prediction tools there likely exists a large number of spurious protein predictions in the sequence databases. We have developed the Spurio tool to help identify spurious protein predictions in prokaryotes. Spurio searches the query protein sequence against a prokaryotic nucleotide database using tblastn and identifies homologous sequences. The tblastn matches are used to score the query sequence's likelihood of being a spurious protein prediction using a Gaussian process model. The most informative feature is the appearance of stop codons within the presumed translation of homologous DNA sequences. Benchmarking shows that the Spurio tool is able to distinguish spurious from true proteins. However, transposon proteins are prone to be predicted as spurious because of the frequency of degraded homologs found in the DNA sequence databases. Our initial experiments suggest that less than 1% of the proteins in the UniProtKB sequence database are likely to be spurious and that Spurio is able to identify over 60 times more spurious proteins than the AntiFam resource. The Spurio software and source code is available under an MIT license at the following URL: https://bitbucket.org/bateman-group/spurio.
RadSearch: a RIS/PACS integrated query tool

NASA Astrophysics Data System (ADS)

Tsao, Sinchai; Documet, Jorge; Moin, Paymann; Wang, Kevin; Liu, Brent J.

2008-03-01

Radiology Information Systems (RIS) contain a wealth of information that can be used for research, education, and practice management. However, the sheer amount of information available makes querying specific data difficult and time consuming. Previous work has shown that a clinical RIS database and its RIS text reports can be extracted, duplicated and indexed for searches while complying with HIPAA and IRB requirements. This project's intent is to provide a software tool, the RadSearch Toolkit, to allow intelligent indexing and parsing of RIS reports for easy yet powerful searches. In addition, the project aims to seamlessly query and retrieve associated images from the Picture Archiving and Communication System (PACS) in situations where an integrated RIS/PACS is in place - even subselecting individual series, such as in an MRI study. RadSearch's application of simple text parsing techniques to index text-based radiology reports will allow the search engine to quickly return relevant results. This powerful combination will be useful in both private practice and academic settings; administrators can easily obtain complex practice management information such as referral patterns; researchers can conduct retrospective studies with specific, multiple criteria; teaching institutions can quickly and effectively create thorough teaching files.
PGSB/MIPS PlantsDB Database Framework for the Integration and Analysis of Plant Genome Data.

PubMed

Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai; Gundlach, Heidrun; Mayer, Klaus F X

2017-01-01

Plant Genome and Systems Biology (PGSB), formerly Munich Institute for Protein Sequences (MIPS) PlantsDB, is a database framework for the integration and analysis of plant genome data, developed and maintained for more than a decade now. Major components of that framework are genome databases and analysis resources focusing on individual (reference) genomes providing flexible and intuitive access to data. Another main focus is the integration of genomes from both model and crop plants to form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny). Data exchange and integrated search functionality with/over many plant genome databases is provided within the transPLANT project.
Lynx web services for annotations and systems analysis of multi-gene disorders.

PubMed

Sulakhe, Dinanath; Taylor, Andrew; Balasubramanian, Sandhya; Feng, Bo; Xie, Bingqing; Börnigen, Daniela; Dave, Utpal J; Foster, Ian T; Gilliam, T Conrad; Maltsev, Natalia

2014-07-01

Lynx is a web-based integrated systems biology platform that supports annotation and analysis of experimental data and generation of weighted hypotheses on molecular mechanisms contributing to human phenotypes and disorders of interest. Lynx has integrated multiple classes of biomedical data (genomic, proteomic, pathways, phenotypic, toxicogenomic, contextual and others) from various public databases as well as manually curated data from our group and collaborators (LynxKB). Lynx provides tools for gene list enrichment analysis using multiple functional annotations and network-based gene prioritization. Lynx provides access to the integrated database and the analytical tools via REST based Web Services (http://lynx.ci.uchicago.edu/webservices.html). This comprises data retrieval services for specific functional annotations, services to search across the complete LynxKB (powered by Lucene), and services to access the analytical tools built within the Lynx platform. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
SSTAR, a Stand-Alone Easy-To-Use Antimicrobial Resistance Gene Predictor.

PubMed

de Man, Tom J B; Limbago, Brandi M

2016-01-01

We present the easy-to-use Sequence Search Tool for Antimicrobial Resistance, SSTAR. It combines a locally executed BLASTN search against a customizable database with an intuitive graphical user interface for identifying antimicrobial resistance (AR) genes from genomic data. Although the database is initially populated from a public repository of acquired resistance determinants (i.e., ARG-ANNOT), it can be customized for particular pathogen groups and resistance mechanisms. For instance, outer membrane porin sequences associated with carbapenem resistance phenotypes can be added, and known intrinsic mechanisms can be included. Unique about this tool is the ability to easily detect putative new alleles and truncated versions of existing AR genes. Variants and potential new alleles are brought to the attention of the user for further investigation. For instance, SSTAR is able to identify modified or truncated versions of porins, which may be of great importance in carbapenemase-negative carbapenem-resistant Enterobacteriaceae. SSTAR is written in Java and is therefore platform independent and compatible with both Windows and Unix operating systems. SSTAR and its manual, which includes a simple installation guide, are freely available from https://github.com/tomdeman-bio/Sequence-Search-Tool-for-Antimicrobial-Resistance-SSTAR-. IMPORTANCE Whole-genome sequencing (WGS) is quickly becoming a routine method for identifying genes associated with antimicrobial resistance (AR). However, for many microbiologists, the use and analysis of WGS data present a substantial challenge. We developed SSTAR, software with a graphical user interface that enables the identification of known AR genes from WGS and has the unique capacity to easily detect new variants of known AR genes, including truncated protein variants. Current software solutions do not notify the user when genes are truncated and, therefore, likely nonfunctional, which makes phenotype predictions less accurate. SSTAR users can apply any AR database of interest as a reference comparator and can manually add genes that impact resistance, even if such genes are not resistance determinants per se (e.g., porins and efflux pumps).
FunSimMat: a comprehensive functional similarity database

PubMed Central

Schlicker, Andreas; Albrecht, Mario

2008-01-01

Functional similarity based on Gene Ontology (GO) annotation is used in diverse applications like gene clustering, gene expression data analysis, protein interaction prediction and evaluation. However, there exists no comprehensive resource of functional similarity values although such a database would facilitate the use of functional similarity measures in different applications. Here, we describe FunSimMat (Functional Similarity Matrix, http://funsimmat.bioinf.mpi-inf.mpg.de/), a large new database that provides several different semantic similarity measures for GO terms. It offers various precomputed functional similarity values for proteins contained in UniProtKB and for protein families in Pfam and SMART. The web interface allows users to efficiently perform both semantic similarity searches with GO terms and functional similarity searches with proteins or protein families. All results can be downloaded in tab-delimited files for use with other tools. An additional XML–RPC interface gives automatic online access to FunSimMat for programs and remote services. PMID:17932054
The BaMM web server for de-novo motif discovery and regulatory sequence analysis.

PubMed

Kiesel, Anja; Roth, Christian; Ge, Wanwan; Wess, Maximilian; Meier, Markus; Söding, Johannes

2018-05-28

The BaMM web server offers four tools: (i) de-novo discovery of enriched motifs in a set of nucleotide sequences, (ii) scanning a set of nucleotide sequences with motifs to find motif occurrences, (iii) searching with an input motif for similar motifs in our BaMM database with motifs for >1000 transcription factors, trained from the GTRD ChIP-seq database and (iv) browsing and keyword searching the motif database. In contrast to most other servers, we represent sequence motifs not by position weight matrices (PWMs) but by Bayesian Markov Models (BaMMs) of order 4, which we showed previously to perform substantially better in ROC analyses than PWMs or first order models. To address the inadequacy of P- and E-values as measures of motif quality, we introduce the AvRec score, the average recall over the TP-to-FP ratio between 1 and 100. The BaMM server is freely accessible without registration at https://bammmotif.mpibpc.mpg.de.
Interactive and Versatile Navigation of Structural Databases.

PubMed

Korb, Oliver; Kuhn, Bernd; Hert, Jérôme; Taylor, Neil; Cole, Jason; Groom, Colin; Stahl, Martin

2016-05-12

We present CSD-CrossMiner, a novel tool for pharmacophore-based searches in crystal structure databases. Intuitive pharmacophore queries describing, among others, protein-ligand interaction patterns, ligand scaffolds, or protein environments can be built and modified interactively. Matching crystal structures are overlaid onto the query and visualized as soon as they are available, enabling the researcher to quickly modify a hypothesis on the fly. We exemplify the utility of the approach by showing applications relevant to real-world drug discovery projects, including the identification of novel fragments for a specific protein environment or scaffold hopping. The ability to concurrently search protein-ligand binding sites extracted from the Protein Data Bank (PDB) and small organic molecules from the Cambridge Structural Database (CSD) using the same pharmacophore query further emphasizes the flexibility of CSD-CrossMiner. We believe that CSD-CrossMiner closes an important gap in mining structural data and will allow users to extract more value from the growing number of available crystal structures.
Using the Textpresso Site-Specific Recombinases Web server to identify Cre expressing mouse strains and floxed alleles.

PubMed

Condie, Brian G; Urbanski, William M

2014-01-01

Effective tools for searching the biomedical literature are essential for identifying reagents or mouse strains as well as for effective experimental design and informed interpretation of experimental results. We have built the Textpresso Site Specific Recombinases (Textpresso SSR) Web server to enable researchers who use mice to perform in-depth searches of a rapidly growing and complex part of the mouse literature. Our Textpresso Web server provides an interface for searching the full text of most of the peer-reviewed publications that report the characterization or use of mouse strains that express Cre or Flp recombinase. The database also contains most of the publications that describe the characterization or analysis of strains carrying conditional alleles or transgenes that can be inactivated or activated by site-specific recombinases such as Cre or Flp. Textpresso SSR complements the existing online databases that catalog Cre and Flp expression patterns by providing a unique online interface for the in-depth text mining of the site specific recombinase literature.
The Biomolecular Crystallization Database Version 4: expanded content and new features.

PubMed

Tung, Michael; Gallagher, D Travis

2009-01-01

The Biological Macromolecular Crystallization Database (BMCD) has been a publicly available resource since 1988, providing a curated archive of information on crystal growth for proteins and other biological macromolecules. The BMCD content has recently been expanded to include 14 372 crystal entries. The resource continues to be freely available at http://xpdb.nist.gov:8060/BMCD4. In addition, the software has been adapted to support the Java-based Lucene query language, enabling detailed searching over specific parameters, and explicit search of parameter ranges is offered for five numeric variables. Extensive tools have been developed for import and handling of data from the RCSB Protein Data Bank. The updated BMCD is called version 4.02 or BMCD4. BMCD4 entries have been expanded to include macromolecule sequence, enabling more elaborate analysis of relations among protein properties, crystal-growth conditions and the geometric and diffraction properties of the crystals. The BMCD version 4.02 contains greatly expanded content and enhanced search capabilities to facilitate scientific analysis and design of crystal-growth strategies.
Testing search strategies for systematic reviews in the Medline literature database through PubMed.

PubMed

Volpato, Enilze S N; Betini, Marluci; El Dib, Regina

2014-04-01

A high-quality electronic search is essential in ensuring accuracy and completeness in retrieved records for the conducting of a systematic review. We analysed the available sample of search strategies to identify the best method for searching in Medline through PubMed, considering the use or not of parenthesis, double quotation marks, truncation and use of a simple search or search history. In our cross-sectional study of search strategies, we selected and analysed the available searches performed during evidence-based medicine classes and in systematic reviews conducted in the Botucatu Medical School, UNESP, Brazil. We analysed 120 search strategies. With regard to the use of phrase searches with parenthesis, there was no difference between the results with and without parenthesis and simple searches or search history tools in 100% of the sample analysed (P = 1.0). The number of results retrieved by the searches analysed was smaller using double quotations marks and using truncation compared with the standard strategy (P = 0.04 and P = 0.08, respectively). There is no need to use phrase-searching parenthesis to retrieve studies; however, we recommend the use of double quotation marks when an investigator attempts to retrieve articles in which a term appears to be exactly the same as what was proposed in the search form. Furthermore, we do not recommend the use of truncation in search strategies in the Medline via PubMed. Although the results of simple searches or search history tools were the same, we recommend using the latter.
Evaluation and comparison of bioinformatic tools for the enrichment analysis of metabolomics data.

PubMed

Marco-Ramell, Anna; Palau-Rodriguez, Magali; Alay, Ania; Tulipani, Sara; Urpi-Sarda, Mireia; Sanchez-Pla, Alex; Andres-Lacueva, Cristina

2018-01-02

Bioinformatic tools for the enrichment of 'omics' datasets facilitate interpretation and understanding of data. To date few are suitable for metabolomics datasets. The main objective of this work is to give a critical overview, for the first time, of the performance of these tools. To that aim, datasets from metabolomic repositories were selected and enriched data were created. Both types of data were analysed with these tools and outputs were thoroughly examined. An exploratory multivariate analysis of the most used tools for the enrichment of metabolite sets, based on a non-metric multidimensional scaling (NMDS) of Jaccard's distances, was performed and mirrored their diversity. Codes (identifiers) of the metabolites of the datasets were searched in different metabolite databases (HMDB, KEGG, PubChem, ChEBI, BioCyc/HumanCyc, LipidMAPS, ChemSpider, METLIN and Recon2). The databases that presented more identifiers of the metabolites of the dataset were PubChem, followed by METLIN and ChEBI. However, these databases had duplicated entries and might present false positives. The performance of over-representation analysis (ORA) tools, including BioCyc/HumanCyc, ConsensusPathDB, IMPaLA, MBRole, MetaboAnalyst, Metabox, MetExplore, MPEA, PathVisio and Reactome and the mapping tool KEGGREST, was examined. Results were mostly consistent among tools and between real and enriched data despite the variability of the tools. Nevertheless, a few controversial results such as differences in the total number of metabolites were also found. Disease-based enrichment analyses were also assessed, but they were not found to be accurate probably due to the fact that metabolite disease sets are not up-to-date and the difficulty of predicting diseases from a list of metabolites. We have extensively reviewed the state-of-the-art of the available range of tools for metabolomic datasets, the completeness of metabolite databases, the performance of ORA methods and disease-based analyses. Despite the variability of the tools, they provided consistent results independent of their analytic approach. However, more work on the completeness of metabolite and pathway databases is required, which strongly affects the accuracy of enrichment analyses. Improvements will be translated into more accurate and global insights of the metabolome.

Literature searches on Ayurveda: An update.

PubMed

Aggithaya, Madhur G; Narahari, Saravu R

2015-01-01

The journals that publish on Ayurveda are increasingly indexed by popular medical databases in recent years. However, many Eastern journals are not indexed biomedical journal databases such as PubMed. Literature searches for Ayurveda continue to be challenging due to the nonavailability of active, unbiased dedicated databases for Ayurvedic literature. In 2010, authors identified 46 databases that can be used for systematic search of Ayurvedic papers and theses. This update reviewed our previous recommendation and identified current and relevant databases. To update on Ayurveda literature search and strategy to retrieve maximum publications. Author used psoriasis as an example to search previously listed databases and identify new. The population, intervention, control, and outcome table included keywords related to psoriasis and Ayurvedic terminologies for skin diseases. Current citation update status, search results, and search options of previous databases were assessed. Eight search strategies were developed. Hundred and five journals, both biomedical and Ayurveda, which publish on Ayurveda, were identified. Variability in databases was explored to identify bias in journal citation. Five among 46 databases are now relevant - AYUSH research portal, Annotated Bibliography of Indian Medicine, Digital Helpline for Ayurveda Research Articles (DHARA), PubMed, and Directory of Open Access Journals. Search options in these databases are not uniform, and only PubMed allows complex search strategy. "The Researches in Ayurveda" and "Ayurvedic Research Database" (ARD) are important grey resources for hand searching. About 44/105 (41.5%) journals publishing Ayurvedic studies are not indexed in any database. Only 11/105 (10.4%) exclusive Ayurveda journals are indexed in PubMed. AYUSH research portal and DHARA are two major portals after 2010. It is mandatory to search PubMed and four other databases because all five carry citations from different groups of journals. The hand searching is important to identify Ayurveda publications that are not indexed elsewhere. Availability information of citations in Ayurveda libraries from National Union Catalogue of Scientific Serials in India if regularly updated will improve the efficacy of hand searching. A grey database (ARD) contains unpublished PG/Ph.D. theses. The AYUSH portal, DHARA (funded by Ministry of AYUSH), and ARD should be merged to form single larger database to limit Ayurveda literature searches.
BioCarian: search engine for exploratory searches in heterogeneous biological databases.

PubMed

Zaki, Nazar; Tennakoon, Chandana

2017-10-02

There are a large number of biological databases publicly available for scientists in the web. Also, there are many private databases generated in the course of research projects. These databases are in a wide variety of formats. Web standards have evolved in the recent times and semantic web technologies are now available to interconnect diverse and heterogeneous sources of data. Therefore, integration and querying of biological databases can be facilitated by techniques used in semantic web. Heterogeneous databases can be converted into Resource Description Format (RDF) and queried using SPARQL language. Searching for exact queries in these databases is trivial. However, exploratory searches need customized solutions, especially when multiple databases are involved. This process is cumbersome and time consuming for those without a sufficient background in computer science. In this context, a search engine facilitating exploratory searches of databases would be of great help to the scientific community. We present BioCarian, an efficient and user-friendly search engine for performing exploratory searches on biological databases. The search engine is an interface for SPARQL queries over RDF databases. We note that many of the databases can be converted to tabular form. We first convert the tabular databases to RDF. The search engine provides a graphical interface based on facets to explore the converted databases. The facet interface is more advanced than conventional facets. It allows complex queries to be constructed, and have additional features like ranking of facet values based on several criteria, visually indicating the relevance of a facet value and presenting the most important facet values when a large number of choices are available. For the advanced users, SPARQL queries can be run directly on the databases. Using this feature, users will be able to incorporate federated searches of SPARQL endpoints. We used the search engine to do an exploratory search on previously published viral integration data and were able to deduce the main conclusions of the original publication. BioCarian is accessible via http://www.biocarian.com . We have developed a search engine to explore RDF databases that can be used by both novice and advanced users.
Cancer Internet search activity on a major search engine, United States 2001-2003.

PubMed

Cooper, Crystale Purvis; Mallon, Kenneth P; Leadbetter, Steven; Pollack, Lori A; Peipins, Lucy A

2005-07-01

To locate online health information, Internet users typically use a search engine, such as Yahoo! or Google. We studied Yahoo! search activity related to the 23 most common cancers in the United States. The objective was to test three potential correlates of Yahoo! cancer search activity--estimated cancer incidence, estimated cancer mortality, and the volume of cancer news coverage--and to study the periodicity of and peaks in Yahoo! cancer search activity. Yahoo! cancer search activity was obtained from a proprietary database called the Yahoo! Buzz Index. The American Cancer Society's estimates of cancer incidence and mortality were used. News reports associated with specific cancer types were identified using the LexisNexis "US News" database, which includes more than 400 national and regional newspapers and a variety of newswire services. The Yahoo! search activity associated with specific cancers correlated with their estimated incidence (Spearman rank correlation, rho = 0.50, P = .015), estimated mortality (rho = 0.66, P = .001), and volume of related news coverage (rho = 0.88, P < .001). Yahoo! cancer search activity tended to be higher on weekdays and during national cancer awareness months but lower during summer months; cancer news coverage also tended to follow these trends. Sharp increases in Yahoo! search activity scores from one day to the next appeared to be associated with increases in relevant news coverage. Media coverage appears to play a powerful role in prompting online searches for cancer information. Internet search activity offers an innovative tool for passive surveillance of health information-seeking behavior.
Cancer Internet Search Activity on a Major Search Engine, United States 2001-2003

PubMed Central

Cooper, Crystale Purvis; Mallon, Kenneth P; Leadbetter, Steven; Peipins, Lucy A

2005-01-01

Background To locate online health information, Internet users typically use a search engine, such as Yahoo! or Google. We studied Yahoo! search activity related to the 23 most common cancers in the United States. Objective The objective was to test three potential correlates of Yahoo! cancer search activity—estimated cancer incidence, estimated cancer mortality, and the volume of cancer news coverage—and to study the periodicity of and peaks in Yahoo! cancer search activity. Methods Yahoo! cancer search activity was obtained from a proprietary database called the Yahoo! Buzz Index. The American Cancer Society's estimates of cancer incidence and mortality were used. News reports associated with specific cancer types were identified using the LexisNexis “US News” database, which includes more than 400 national and regional newspapers and a variety of newswire services. Results The Yahoo! search activity associated with specific cancers correlated with their estimated incidence (Spearman rank correlation, ρ = 0.50, P = .015), estimated mortality (ρ = 0.66, P = .001), and volume of related news coverage (ρ = 0.88, P < .001). Yahoo! cancer search activity tended to be higher on weekdays and during national cancer awareness months but lower during summer months; cancer news coverage also tended to follow these trends. Sharp increases in Yahoo! search activity scores from one day to the next appeared to be associated with increases in relevant news coverage. Conclusions Media coverage appears to play a powerful role in prompting online searches for cancer information. Internet search activity offers an innovative tool for passive surveillance of health information–seeking behavior. PMID:15998627
Web 2.0 Tools in the Prevention of Curable Sexually Transmitted Diseases: Scoping Review

PubMed Central

2018-01-01

Background The internet is now the primary source of information that young people use to get information on issues related to sex, contraception, and sexually transmitted infections. Objective The goal of the research was to review the scientific literature related to the use of Web 2.0 tools as opposed to other strategies in the prevention of curable sexually transmitted diseases (STDs). Methods A scoping review was performed on the documentation indexed in the bibliographic databases MEDLINE, Cochrane Library, Scopus, Cumulative Index to Nursing and Allied Health Literature, Web of Science, Literatura Latinoamericana y del Caribe en Ciencias de la Salud, PsycINFO, Educational Resources Information Center, the databases of Centro Superior de Investigaciones Científicas in Spain, and the Índice Bibliográfico Español de Ciencias de la Salud from the first available date according to the characteristics of each database until April 2017. The equation search was realized by means of the using of descriptors together with the consultation of the fields of title register and summary with free terms. Bibliographies of the selected papers were searched for additional articles. Results A total of 627 references were retrieved, of which 6 papers were selected after applying the inclusion and exclusion criteria. The STDs studied were chlamydia, gonorrhea, and syphilis. The Web 2.0 tools used were Facebook, Twitter, Instagram, and YouTube. The 6 papers used Web 2.0 in the promotion of STD detection. Conclusions Web 2.0 tools have demonstrated a positive effect on the promotion of prevention strategies for STDs and can help attract and link youth to campaigns related to sexual health. These tools can be combined with other interventions. In any case, Web 2.0 and especially Facebook have all the potential to become essential instruments for public health. PMID:29567633
ZINC: A Free Tool to Discover Chemistry for Biology

PubMed Central

2012-01-01

ZINC is a free public resource for ligand discovery. The database contains over twenty million commercially available molecules in biologically relevant representations that may be downloaded in popular ready-to-dock formats and subsets. The Web site also enables searches by structure, biological activity, physical property, vendor, catalog number, name, and CAS number. Small custom subsets may be created, edited, shared, docked, downloaded, and conveyed to a vendor for purchase. The database is maintained and curated for a high purchasing success rate and is freely available at zinc.docking.org. PMID:22587354
MerCat: a versatile k-mer counter and diversity estimator for database-independent property analysis obtained from metagenomic and/or metatranscriptomic sequencing data

DOE Office of Scientific and Technical Information (OSTI.GOV)

White, Richard A.; Panyala, Ajay R.; Glass, Kevin A.

MerCat is a parallel, highly scalable and modular property software package for robust analysis of features in next-generation sequencing data. MerCat inputs include assembled contigs and raw sequence reads from any platform resulting in feature abundance counts tables. MerCat allows for direct analysis of data properties without reference sequence database dependency commonly used by search tools such as BLAST and/or DIAMOND for compositional analysis of whole community shotgun sequencing (e.g. metagenomes and metatranscriptomes).
Towards the design of novel cuprate-based superconductors

NASA Astrophysics Data System (ADS)

Yee, Chuck-Hou

The rapid maturation of materials databases combined with recent development of theories seeking to quantitatively link chemical properties to superconductivity in the cuprates provide the context to design novel superconductors. In this talk, we describe a framework designed to search for new superconductors, which combines chemical rules-of-thumb, insights of transition temperatures from dynamical mean-field theory, first-principles electronic structure tools, materials databases and structure prediction via evolutionary algorithms. We apply the framework to design a family of copper oxysulfides and evaluate the prospects of superconductivity.
Open, Cross Platform Chemistry Application Unifying Structure Manipulation, External Tools, Databases and Visualization

DTIC Science & Technology

2014-05-30

mol.addBond(o1, h2, 1); Avogadro ::Core::Bond b2 = mol.addBond(o1, h3, 1); The QtGui::Molecule class inherits from Core::Molecule and Qt’s QObject...populated as an input (although they are all implemented in terms of the Core::Molecule class. The third is QtGui::RWMolecule which inherits from just...shown in Figure 16. The use of molecule fingerprinting techniques gives the database the ability to be searched by similarity to a desired structure, as
Adjacency and Proximity Searching in the Science Citation Index and Google

DTIC Science & Technology

2005-01-01

major database search engines , including commercial S&T database search engines (e.g., Science Citation Index (SCI), Engineering Compendex (EC...PubMed, OVID), Federal agency award database search engines (e.g., NSF, NIH, DOE, EPA, as accessed in Federal R&D Project Summaries), Web search Engines (e.g...searching. Some database search engines allow strict constrained co- occurrence searching as a user option (e.g., OVID, EC), while others do not (e.g., SCI
Faster Smith-Waterman database searches with inter-sequence SIMD parallelisation

PubMed Central

2011-01-01

Background The Smith-Waterman algorithm for local sequence alignment is more sensitive than heuristic methods for database searching, but also more time-consuming. The fastest approach to parallelisation with SIMD technology has previously been described by Farrar in 2007. The aim of this study was to explore whether further speed could be gained by other approaches to parallelisation. Results A faster approach and implementation is described and benchmarked. In the new tool SWIPE, residues from sixteen different database sequences are compared in parallel to one query residue. Using a 375 residue query sequence a speed of 106 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon X5650 six-core processor system, which is over six times more rapid than software based on Farrar's 'striped' approach. SWIPE was about 2.5 times faster when the programs used only a single thread. For shorter queries, the increase in speed was larger. SWIPE was about twice as fast as BLAST when using the BLOSUM50 score matrix, while BLAST was about twice as fast as SWIPE for the BLOSUM62 matrix. The software is designed for 64 bit Linux on processors with SSSE3. Source code is available from http://dna.uio.no/swipe/ under the GNU Affero General Public License. Conclusions Efficient parallelisation using SIMD on standard hardware makes it possible to run Smith-Waterman database searches more than six times faster than before. The approach described here could significantly widen the potential application of Smith-Waterman searches. Other applications that require optimal local alignment scores could also benefit from improved performance. PMID:21631914
Faster Smith-Waterman database searches with inter-sequence SIMD parallelisation.

PubMed

Rognes, Torbjørn

2011-06-01

The Smith-Waterman algorithm for local sequence alignment is more sensitive than heuristic methods for database searching, but also more time-consuming. The fastest approach to parallelisation with SIMD technology has previously been described by Farrar in 2007. The aim of this study was to explore whether further speed could be gained by other approaches to parallelisation. A faster approach and implementation is described and benchmarked. In the new tool SWIPE, residues from sixteen different database sequences are compared in parallel to one query residue. Using a 375 residue query sequence a speed of 106 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon X5650 six-core processor system, which is over six times more rapid than software based on Farrar's 'striped' approach. SWIPE was about 2.5 times faster when the programs used only a single thread. For shorter queries, the increase in speed was larger. SWIPE was about twice as fast as BLAST when using the BLOSUM50 score matrix, while BLAST was about twice as fast as SWIPE for the BLOSUM62 matrix. The software is designed for 64 bit Linux on processors with SSSE3. Source code is available from http://dna.uio.no/swipe/ under the GNU Affero General Public License. Efficient parallelisation using SIMD on standard hardware makes it possible to run Smith-Waterman database searches more than six times faster than before. The approach described here could significantly widen the potential application of Smith-Waterman searches. Other applications that require optimal local alignment scores could also benefit from improved performance.
Database Search Engines: Paradigms, Challenges and Solutions.

PubMed

Verheggen, Kenneth; Martens, Lennart; Berven, Frode S; Barsnes, Harald; Vaudel, Marc

2016-01-01

The first step in identifying proteins from mass spectrometry based shotgun proteomics data is to infer peptides from tandem mass spectra, a task generally achieved using database search engines. In this chapter, the basic principles of database search engines are introduced with a focus on open source software, and the use of database search engines is demonstrated using the freely available SearchGUI interface. This chapter also discusses how to tackle general issues related to sequence database searching and shows how to minimize their impact.
SIRSALE: integrated video database management tools

NASA Astrophysics Data System (ADS)

Brunie, Lionel; Favory, Loic; Gelas, J. P.; Lefevre, Laurent; Mostefaoui, Ahmed; Nait-Abdesselam, F.

2002-07-01

Video databases became an active field of research during the last decade. The main objective in such systems is to provide users with capabilities to friendly search, access and playback distributed stored video data in the same way as they do for traditional distributed databases. Hence, such systems need to deal with hard issues : (a) video documents generate huge volumes of data and are time sensitive (streams must be delivered at a specific bitrate), (b) contents of video data are very hard to be automatically extracted and need to be humanly annotated. To cope with these issues, many approaches have been proposed in the literature including data models, query languages, video indexing etc. In this paper, we present SIRSALE : a set of video databases management tools that allow users to manipulate video documents and streams stored in large distributed repositories. All the proposed tools are based on generic models that can be customized for specific applications using ad-hoc adaptation modules. More precisely, SIRSALE allows users to : (a) browse video documents by structures (sequences, scenes, shots) and (b) query the video database content by using a graphical tool, adapted to the nature of the target video documents. This paper also presents an annotating interface which allows archivists to describe the content of video documents. All these tools are coupled to a video player integrating remote VCR functionalities and are based on active network technology. So, we present how dedicated active services allow an optimized video transport for video streams (with Tamanoir active nodes). We then describe experiments of using SIRSALE on an archive of news video and soccer matches. The system has been demonstrated to professionals with a positive feedback. Finally, we discuss open issues and present some perspectives.
GENEASE: Real time bioinformatics tool for multi-omics and disease ontology exploration, analysis and visualization.

PubMed

Ghandikota, Sudhir; Hershey, Gurjit K Khurana; Mersha, Tesfaye B

2018-03-24

Advances in high-throughput sequencing technologies have made it possible to generate multiple omics data at an unprecedented rate and scale. The accumulation of these omics data far outpaces the rate at which biologists can mine and generate new hypothesis to test experimentally. There is an urgent need to develop a myriad of powerful tools to efficiently and effectively search and filter these resources to address specific post-GWAS functional genomics questions. However, to date, these resources are scattered across several databases and often lack a unified portal for data annotation and analytics. In addition, existing tools to analyze and visualize these databases are highly fragmented, resulting researchers to access multiple applications and manual interventions for each gene or variant in an ad hoc fashion until all the questions are answered. In this study, we present GENEASE, a web-based one-stop bioinformatics tool designed to not only query and explore multi-omics and phenotype databases (e.g., GTEx, ClinVar, dbGaP, GWAS Catalog, ENCODE, Roadmap Epigenomics, KEGG, Reactome, Gene and Phenotype Ontology) in a single web interface but also to perform seamless post genome-wide association downstream functional and overlap analysis for non-coding regulatory variants. GENEASE accesses over 50 different databases in public domain including model organism-specific databases to facilitate gene/variant and disease exploration, enrichment and overlap analysis in real time. It is a user-friendly tool with point-and-click interface containing links for support information including user manual and examples. GENEASE can be accessed freely at http://research.cchmc.org/mershalab/genease_new/login.html. Tesfaye.Mersha@cchmc.org, Sudhir.Ghandikota@cchmc.org. Supplementary data are available at Bioinformatics online.
Accessing Biomedical Literature in the Current Information Landscape

PubMed Central

Khare, Ritu; Leaman, Robert; Lu, Zhiyong

2015-01-01

i. Summary Biomedical and life sciences literature is unique because of its exponentially increasing volume and interdisciplinary nature. Biomedical literature access is essential for several types of users including biomedical researchers, clinicians, database curators, and bibliometricians. In the past few decades, several online search tools and literature archives, generic as well as biomedicine-specific, have been developed. We present this chapter in the light of three consecutive steps of literature access: searching for citations, retrieving full-text, and viewing the article. The first section presents the current state of practice of biomedical literature access, including an analysis of the search tools most frequently used by the users, including PubMed, Google Scholar, Web of Science, Scopus, and Embase, and a study on biomedical literature archives such as PubMed Central. The next section describes current research and the state-of-the-art systems motivated by the challenges a user faces during query formulation and interpretation of search results. The research solutions are classified into five key areas related to text and data mining, text similarity search, semantic search, query support, relevance ranking, and clustering results. Finally, the last section describes some predicted future trends for improving biomedical literature access, such as searching and reading articles on portable devices, and adoption of the open access policy. PMID:24788259
Inferring transposons activity chronology by TRANScendence - TEs database and de-novo mining tool.

PubMed

Startek, Michał Piotr; Nogły, Jakub; Gromadka, Agnieszka; Grzebelus, Dariusz; Gambin, Anna

2017-10-16

The constant progress in sequencing technology leads to ever increasing amounts of genomic data. In the light of current evidence transposable elements (TEs for short) are becoming useful tools for learning about the evolution of host genome. Therefore the software for genome-wide detection and analysis of TEs is of great interest. Here we describe the computational tool for mining, classifying and storing TEs from newly sequenced genomes. This is an online, web-based, user-friendly service, enabling users to upload their own genomic data, and perform de-novo searches for TEs. The detected TEs are automatically analyzed, compared to reference databases, annotated, clustered into families, and stored in TEs repository. Also, the genome-wide nesting structure of found elements are detected and analyzed by new method for inferring evolutionary history of TEs. We illustrate the functionality of our tool by performing a full-scale analyses of TE landscape in Medicago truncatula genome. TRANScendence is an effective tool for the de-novo annotation and classification of transposable elements in newly-acquired genomes. Its streamlined interface makes it well-suited for evolutionary studies.
FCDD: A Database for Fruit Crops Diseases.

PubMed

Chauhan, Rupal; Jasrai, Yogesh; Pandya, Himanshu; Chaudhari, Suman; Samota, Chand Mal

2014-01-01

Fruit Crops Diseases Database (FCDD) requires a number of biotechnology and bioinformatics tools. The FCDD is a unique bioinformatics resource that compiles information about 162 details on fruit crops diseases, diseases type, its causal organism, images, symptoms and their control. The FCDD contains 171 phytochemicals from 25 fruits, their 2D images and their 20 possible sequences. This information has been manually extracted and manually verified from numerous sources, including other electronic databases, textbooks and scientific journals. FCDD is fully searchable and supports extensive text search. The main focus of the FCDD is on providing possible information of fruit crops diseases, which will help in discovery of potential drugs from one of the common bioresource-fruits. The database was developed using MySQL. The database interface is developed in PHP, HTML and JAVA. FCDD is freely available. http://www.fruitcropsdd.com/
Effectiveness of structured multidisciplinary rounding in acute care units on length of stay and satisfaction of patients and staff: a quantitative systematic review.

PubMed

Mercedes, Angela; Fairman, Precillia; Hogan, Lisa; Thomas, Rexi; Slyer, Jason T

2016-07-01

Consistent, concise and timely communication between a multidisciplinary team of healthcare providers, patients and families is necessary for the delivery of quality care. Structured multidisciplinary rounding (MDR) using a structured communication tool may positively impact length of stay (LOS) and satisfaction of patients and staff by improving communication, coordination and collaboration among the healthcare team. To evaluate the effectiveness of structured MDR using a structured communication tool in acute care units on LOS and satisfaction of patients and staff. Adult patients admitted to acute care units and healthcare providers who provide direct care for adult patients hospitalized in in-patient acute care units. The implementation of structured MDR utilizing a structured communication tool to enhance and/or guide communication. Quasi-experimental studies and descriptive studies. Length of stay, patient satisfaction and staff satisfaction. The comprehensive search strategy aimed to find relevant published and unpublished quantitative English language studies from the inception of each database searched through June 30, 2015. Databases searched include Cumulative Index to Nursing and Allied Health Literature, PubMed, Excerpta Medica Database, Health Source, Cochrane Central Register of Controlled Trials and Scopus. A search of gray literature was also performed. All reviewers independently evaluated the included studies for methodological quality using critical appraisal tools from the Joanna Briggs Institute (JBI). Data related to the methods, participants, interventions and findings were extracted using a standardized data extraction tool from the JBI. Due to clinical and methodological heterogeneity in the interventions and outcome measures of the included studies, statistical meta-analysis was not possible. Results are presented in narrative form. Eight studies were included, three quasi-experimental studies and five descriptive studies of quality improvement projects. In the three quasi-experimental studies, one had a statistically significant decrease (p = 0.01), one no change (p = 0.1) and one had an increase (p = 0.03) in LOS; in the two descriptive studies, one had a statistically significant decrease (p = 0.02) and the other reported a trend toward reduced LOS. Two studies evaluated patient satisfaction, one showed no change (p = 0.76) and one showed a trend toward increased patient satisfaction at 12 months. Six studies demonstrated an improvement in staff satisfaction (p < 0.05) after implementation of structured MDR. The evidence suggests that MDR utilizing a structured communication tool may have contributed to an improvement in staff satisfaction. There was inconclusive evidence to support the use of structured MDR to improve LOS or patient satisfaction. The use of a structured communication tool during MDR is one means to facilitate communication and collaboration, thus improving satisfaction among the multidisciplinary team. More rigorous research using higher level study designs on larger samples of diverse patient populations is needed to further evaluate the effectiveness of structured MDR on patient care outcomes and satisfaction of patients and providers.
An approach in building a chemical compound search engine in oracle database.

PubMed

Wang, H; Volarath, P; Harrison, R

2005-01-01

A searching or identifying of chemical compounds is an important process in drug design and in chemistry research. An efficient search engine involves a close coupling of the search algorithm and database implementation. The database must process chemical structures, which demands the approaches to represent, store, and retrieve structures in a database system. In this paper, a general database framework for working as a chemical compound search engine in Oracle database is described. The framework is devoted to eliminate data type constrains for potential search algorithms, which is a crucial step toward building a domain specific query language on top of SQL. A search engine implementation based on the database framework is also demonstrated. The convenience of the implementation emphasizes the efficiency and simplicity of the framework.

Improvements in the Protein Identifier Cross-Reference service.

PubMed

Wein, Samuel P; Côté, Richard G; Dumousseau, Marine; Reisinger, Florian; Hermjakob, Henning; Vizcaíno, Juan A

2012-07-01

The Protein Identifier Cross-Reference (PICR) service is a tool that allows users to map protein identifiers, protein sequences and gene identifiers across over 100 different source databases. PICR takes input through an interactive website as well as Representational State Transfer (REST) and Simple Object Access Protocol (SOAP) services. It returns the results as HTML pages, XLS and CSV files. It has been in production since 2007 and has been recently enhanced to add new functionality and increase the number of databases it covers. Protein subsequences can be Basic Local Alignment Search Tool (BLAST) against the UniProt Knowledgebase (UniProtKB) to provide an entry point to the standard PICR mapping algorithm. In addition, gene identifiers from UniProtKB and Ensembl can now be submitted as input or mapped to as output from PICR. We have also implemented a 'best-guess' mapping algorithm for UniProt. In this article, we describe the usefulness of PICR, how these changes have been implemented, and the corresponding additions to the web services. Finally, we explain that the number of source databases covered by PICR has increased from the initial 73 to the current 102. New resources include several new species-specific Ensembl databases as well as the Ensembl Genome ones. PICR can be accessed at http://www.ebi.ac.uk/Tools/picr/.
GDR (Genome Database for Rosaceae): integrated web-database for Rosaceae genomics and genetics data

PubMed Central

Jung, Sook; Staton, Margaret; Lee, Taein; Blenda, Anna; Svancara, Randall; Abbott, Albert; Main, Dorrie

2008-01-01

The Genome Database for Rosaceae (GDR) is a central repository of curated and integrated genetics and genomics data of Rosaceae, an economically important family which includes apple, cherry, peach, pear, raspberry, rose and strawberry. GDR contains annotated databases of all publicly available Rosaceae ESTs, the genetically anchored peach physical map, Rosaceae genetic maps and comprehensively annotated markers and traits. The ESTs are assembled to produce unigene sets of each genus and the entire Rosaceae. Other annotations include putative function, microsatellites, open reading frames, single nucleotide polymorphisms, gene ontology terms and anchored map position where applicable. Most of the published Rosaceae genetic maps can be viewed and compared through CMap, the comparative map viewer. The peach physical map can be viewed using WebFPC/WebChrom, and also through our integrated GDR map viewer, which serves as a portal to the combined genetic, transcriptome and physical mapping information. ESTs, BACs, markers and traits can be queried by various categories and the search result sites are linked to the mapping visualization tools. GDR also provides online analysis tools such as a batch BLAST/FASTA server for the GDR datasets, a sequence assembly server and microsatellite and primer detection tools. GDR is available at http://www.rosaceae.org. PMID:17932055
Textpresso site-specific recombinases: A text-mining server for the recombinase literature including Cre mice and conditional alleles.

PubMed

Urbanski, William M; Condie, Brian G

2009-12-01

Textpresso Site Specific Recombinases (http://ssrc.genetics.uga.edu/) is a text-mining web server for searching a database of more than 9,000 full-text publications. The papers and abstracts in this database represent a wide range of topics related to site-specific recombinase (SSR) research tools. Included in the database are most of the papers that report the characterization or use of mouse strains that express Cre recombinase as well as papers that describe or analyze mouse lines that carry conditional (floxed) alleles or SSR-activated transgenes/knockins. The database also includes reports describing SSR-based cloning methods such as the Gateway or the Creator systems, papers reporting the development or use of SSR-based tools in systems such as Drosophila, bacteria, parasites, stem cells, yeast, plants, zebrafish, and Xenopus as well as publications that describe the biochemistry, genetics, or molecular structure of the SSRs themselves. Textpresso Site Specific Recombinases is the only comprehensive text-mining resource available for the literature describing the biology and technical applications of SSRs. (c) 2009 Wiley-Liss, Inc.
ESCAPE: database for integrating high-content published data collected from human and mouse embryonic stem cells.

PubMed

Xu, Huilei; Baroukh, Caroline; Dannenfelser, Ruth; Chen, Edward Y; Tan, Christopher M; Kou, Yan; Kim, Yujin E; Lemischka, Ihor R; Ma'ayan, Avi

2013-01-01

High content studies that profile mouse and human embryonic stem cells (m/hESCs) using various genome-wide technologies such as transcriptomics and proteomics are constantly being published. However, efforts to integrate such data to obtain a global view of the molecular circuitry in m/hESCs are lagging behind. Here, we present an m/hESC-centered database called Embryonic Stem Cell Atlas from Pluripotency Evidence integrating data from many recent diverse high-throughput studies including chromatin immunoprecipitation followed by deep sequencing, genome-wide inhibitory RNA screens, gene expression microarrays or RNA-seq after knockdown (KD) or overexpression of critical factors, immunoprecipitation followed by mass spectrometry proteomics and phosphoproteomics. The database provides web-based interactive search and visualization tools that can be used to build subnetworks and to identify known and novel regulatory interactions across various regulatory layers. The web-interface also includes tools to predict the effects of combinatorial KDs by additive effects controlled by sliders, or through simulation software implemented in MATLAB. Overall, the Embryonic Stem Cell Atlas from Pluripotency Evidence database is a comprehensive resource for the stem cell systems biology community. Database URL: http://www.maayanlab.net/ESCAPE
Tidying Up International Nucleotide Sequence Databases: Ecological, Geographical and Sequence Quality Annotation of ITS Sequences of Mycorrhizal Fungi

PubMed Central

Tedersoo, Leho; Abarenkov, Kessy; Nilsson, R. Henrik; Schüssler, Arthur; Grelet, Gwen-Aëlle; Kohout, Petr; Oja, Jane; Bonito, Gregory M.; Veldre, Vilmar; Jairus, Teele; Ryberg, Martin; Larsson, Karl-Henrik; Kõljalg, Urmas

2011-01-01

Sequence analysis of the ribosomal RNA operon, particularly the internal transcribed spacer (ITS) region, provides a powerful tool for identification of mycorrhizal fungi. The sequence data deposited in the International Nucleotide Sequence Databases (INSD) are, however, unfiltered for quality and are often poorly annotated with metadata. To detect chimeric and low-quality sequences and assign the ectomycorrhizal fungi to phylogenetic lineages, fungal ITS sequences were downloaded from INSD, aligned within family-level groups, and examined through phylogenetic analyses and BLAST searches. By combining the fungal sequence database UNITE and the annotation and search tool PlutoF, we also added metadata from the literature to these accessions. Altogether 35,632 sequences belonged to mycorrhizal fungi or originated from ericoid and orchid mycorrhizal roots. Of these sequences, 677 were considered chimeric and 2,174 of low read quality. Information detailing country of collection, geographical coordinates, interacting taxon and isolation source were supplemented to cover 78.0%, 33.0%, 41.7% and 96.4% of the sequences, respectively. These annotated sequences are publicly available via UNITE (http://unite.ut.ee/) for downstream biogeographic, ecological and taxonomic analyses. In European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/), the annotated sequences have a special link-out to UNITE. We intend to expand the data annotation to additional genes and all taxonomic groups and functional guilds of fungi. PMID:21949797
POLLUX: a program for simulated cloning, mutagenesis and database searching of DNA constructs.

PubMed

Dayringer, H E; Sammons, S A

1991-04-01

Computer support for research in biotechnology has developed rapidly and has provided several tools to aid the researcher. This report describes the capabilities of new computer software developed in this laboratory to aid in the documentation and planning of experiments in molecular biology. The program, POLLUX, provides a graphical medium for the entry, edit and manipulation of DNA constructs and a textual format for display and edit of construct descriptive data. Program operation and procedures are designed to mimic the actual laboratory experiments with respect to capability and the order in which they are performed. Flexible control over the content of the computer-generated displays and program facilities is provided by a mouse-driven menu interface. Programmed facilities for mutagenesis, simulated cloning and searching of the database from networked workstations are described.
The Impact of Online Bibliographic Databases on Teaching and Research in Political Science.

ERIC Educational Resources Information Center

Reichel, Mary

The availability of online bibliographic databases greatly facilitates literature searching in political science. The advantages to searching databases online include combination of concepts, comprehensiveness, multiple database searching, free-text searching, currency, current awareness services, document delivery service, and convenience.…
The MIGenAS integrated bioinformatics toolkit for web-based sequence analysis

PubMed Central

Rampp, Markus; Soddemann, Thomas; Lederer, Hermann

2006-01-01

We describe a versatile and extensible integrated bioinformatics toolkit for the analysis of biological sequences over the Internet. The web portal offers convenient interactive access to a growing pool of chainable bioinformatics software tools and databases that are centrally installed and maintained by the RZG. Currently, supported tasks comprise sequence similarity searches in public or user-supplied databases, computation and validation of multiple sequence alignments, phylogenetic analysis and protein–structure prediction. Individual tools can be seamlessly chained into pipelines allowing the user to conveniently process complex workflows without the necessity to take care of any format conversions or tedious parsing of intermediate results. The toolkit is part of the Max-Planck Integrated Gene Analysis System (MIGenAS) of the Max Planck Society available at (click ‘Start Toolkit’). PMID:16844980
Microlithography and resist technology information at your fingertips via SciFinder

NASA Astrophysics Data System (ADS)

Konuk, Rengin; Macko, John R.; Staggenborg, Lisa

1997-07-01

Finding and retrieving the information you need about microlithography and resist technology in a timely fashion can make or break your competitive edge in today's business environment. Chemical Abstracts Service (CAS) provides the most complete and comprehensive database of the chemical literature in the CAplus, REGISTRY, and CASREACT files including 13 million document references, 15 million substance records and over 1.2 million reactions. This includes comprehensive coverage of positive and negative resist formulations and processing, photoacid generation, silylation, single and multilayer resist systems, photomasks, dry and wet etching, photolithography, electron-beam, ion-beam and x-ray lithography technologies and process control, optical tools, exposure systems, radiation sources and steppers. Journal articles, conference proceedings and patents related to microlithography and resist technology are analyzed and indexed by scientific information analysts with strong technical background in these areas. The full CAS database, which is updated weekly with new information, is now available at your desktop, via a convenient, user-friendly tool called 'SciFinder.' Author, subject and chemical substance searching is simplified by SciFinder's smart search features. Chemical substances can be searched by chemical structure, chemical name, CAS registry number or molecular formula. Drawing chemical structures in SciFinder is easy and does not require compliance with CA conventions. Built-in intelligence of SciFinder enables users to retrieve substances with multiple components, tautomeric forms and salts.
Kazusa Marker DataBase: a database for genomics, genetics, and molecular breeding in plants.

PubMed

Shirasawa, Kenta; Isobe, Sachiko; Tabata, Satoshi; Hirakawa, Hideki

2014-09-01

In order to provide useful genomic information for agronomical plants, we have established a database, the Kazusa Marker DataBase (http://marker.kazusa.or.jp). This database includes information on DNA markers, e.g., SSR and SNP markers, genetic linkage maps, and physical maps, that were developed at the Kazusa DNA Research Institute. Keyword searches for the markers, sequence data used for marker development, and experimental conditions are also available through this database. Currently, 10 plant species have been targeted: tomato (Solanum lycopersicum), pepper (Capsicum annuum), strawberry (Fragaria × ananassa), radish (Raphanus sativus), Lotus japonicus, soybean (Glycine max), peanut (Arachis hypogaea), red clover (Trifolium pratense), white clover (Trifolium repens), and eucalyptus (Eucalyptus camaldulensis). In addition, the number of plant species registered in this database will be increased as our research progresses. The Kazusa Marker DataBase will be a useful tool for both basic and applied sciences, such as genomics, genetics, and molecular breeding in crops.
Functional Analysis of OMICs Data and Small Molecule Compounds in an Integrated "Knowledge-Based" Platform.

PubMed

Dubovenko, Alexey; Nikolsky, Yuri; Rakhmatulin, Eugene; Nikolskaya, Tatiana

2017-01-01

Analysis of NGS and other sequencing data, gene variants, gene expression, proteomics, and other high-throughput (OMICs) data is challenging because of its biological complexity and high level of technical and biological noise. One way to deal with both problems is to perform analysis with a high fidelity annotated knowledgebase of protein interactions, pathways, and functional ontologies. This knowledgebase has to be structured in a computer-readable format and must include software tools for managing experimental data, analysis, and reporting. Here, we present MetaCore™ and Key Pathway Advisor (KPA), an integrated platform for functional data analysis. On the content side, MetaCore and KPA encompass a comprehensive database of molecular interactions of different types, pathways, network models, and ten functional ontologies covering human, mouse, and rat genes. The analytical toolkit includes tools for gene/protein list enrichment analysis, statistical "interactome" tool for the identification of over- and under-connected proteins in the dataset, and a biological network analysis module made up of network generation algorithms and filters. The suite also features Advanced Search, an application for combinatorial search of the database content, as well as a Java-based tool called Pathway Map Creator for drawing and editing custom pathway maps. Applications of MetaCore and KPA include molecular mode of action of disease research, identification of potential biomarkers and drug targets, pathway hypothesis generation, analysis of biological effects for novel small molecule compounds and clinical applications (analysis of large cohorts of patients, and translational and personalized medicine).
High-performance metadata indexing and search in petascale data storage systems

NASA Astrophysics Data System (ADS)

Leung, A. W.; Shao, M.; Bisson, T.; Pasupathy, S.; Miller, E. L.

2008-07-01

Large-scale storage systems used for scientific applications can store petabytes of data and billions of files, making the organization and management of data in these systems a difficult, time-consuming task. The ability to search file metadata in a storage system can address this problem by allowing scientists to quickly navigate experiment data and code while allowing storage administrators to gather the information they need to properly manage the system. In this paper, we present Spyglass, a file metadata search system that achieves scalability by exploiting storage system properties, providing the scalability that existing file metadata search tools lack. In doing so, Spyglass can achieve search performance up to several thousand times faster than existing database solutions. We show that Spyglass enables important functionality that can aid data management for scientists and storage administrators.
Finding research information on the web: how to make the most of Google and other free search tools.

PubMed

Blakeman, Karen

2013-01-01

The Internet and the World Wide Web has had a major impact on the accessibility of research information. The move towards open access and development of institutional repositories has resulted in increasing amounts of information being made available free of charge. Many of these resources are not included in conventional subscription databases and Google is not always the best way to ensure that one is picking up all relevant material on a topic. This article will look at how Google's search engine works, how to use Google more effectively for identifying research information, alternatives to Google and will review some of the specialist tools that have evolved to cope with the diverse forms of information that now exist in electronic form.
Semantic-JSON: a lightweight web service interface for Semantic Web contents integrating multiple life science databases.

PubMed

Kobayashi, Norio; Ishii, Manabu; Takahashi, Satoshi; Mochizuki, Yoshiki; Matsushima, Akihiro; Toyoda, Tetsuro

2011-07-01

Global cloud frameworks for bioinformatics research databases become huge and heterogeneous; solutions face various diametric challenges comprising cross-integration, retrieval, security and openness. To address this, as of March 2011 organizations including RIKEN published 192 mammalian, plant and protein life sciences databases having 8.2 million data records, integrated as Linked Open or Private Data (LOD/LPD) using SciNetS.org, the Scientists' Networking System. The huge quantity of linked data this database integration framework covers is based on the Semantic Web, where researchers collaborate by managing metadata across public and private databases in a secured data space. This outstripped the data query capacity of existing interface tools like SPARQL. Actual research also requires specialized tools for data analysis using raw original data. To solve these challenges, in December 2009 we developed the lightweight Semantic-JSON interface to access each fragment of linked and raw life sciences data securely under the control of programming languages popularly used by bioinformaticians such as Perl and Ruby. Researchers successfully used the interface across 28 million semantic relationships for biological applications including genome design, sequence processing, inference over phenotype databases, full-text search indexing and human-readable contents like ontology and LOD tree viewers. Semantic-JSON services of SciNetS.org are provided at http://semanticjson.org.
Selecting CD-ROM databases for nursing students: a comparison of MEDLINE and the Cumulative Index to Nursing and Allied Health Literature (CINAHL).

PubMed

Okuma, E

1994-01-01

With the introduction of the Cumulative Index to Nursing and Allied Health Literature (CINAHL) on CD-ROM, research was initiated to compare coverage of nursing journals by CINAHL and MEDLINE in this format, expanding on previous comparison of these databases in print and online. The study assessed search results for eight topics in 1989 and 1990 citations in both databases, each produced by SilverPlatter. Results were tallied and analyzed for number of records retrieved, unique and overlapping records, relevance, and appropriateness. An overall precision score was developed. The goal of the research was to develop quantifiable tools to help determine which database to purchase for an academic library serving an undergraduate nursing program.
Selecting CD-ROM databases for nursing students: a comparison of MEDLINE and the Cumulative Index to Nursing and Allied Health Literature (CINAHL).

PubMed Central

Okuma, E

1994-01-01

With the introduction of the Cumulative Index to Nursing and Allied Health Literature (CINAHL) on CD-ROM, research was initiated to compare coverage of nursing journals by CINAHL and MEDLINE in this format, expanding on previous comparison of these databases in print and online. The study assessed search results for eight topics in 1989 and 1990 citations in both databases, each produced by SilverPlatter. Results were tallied and analyzed for number of records retrieved, unique and overlapping records, relevance, and appropriateness. An overall precision score was developed. The goal of the research was to develop quantifiable tools to help determine which database to purchase for an academic library serving an undergraduate nursing program. PMID:8136757
PubChemSR: A search and retrieval tool for PubChem

PubMed Central

Hur, Junguk; Wild, David J

2008-01-01

Background Recent years have seen an explosion in the amount of publicly available chemical and related biological information. A significant step has been the emergence of PubChem, which contains property information for millions of chemical structures, and acts as a repository of compounds and bioassay screening data for the NIH Roadmap. There is a strong need for tools designed for scientists that permit easy download and use of these data. We present one such tool, PubChemSR. Implementation PubChemSR (Search and Retrieve) is a freely available desktop application written for Windows using Microsoft .NET that is designed to assist scientists in search, retrieval and organization of chemical and biological data from the PubChem database. It employs SOAP web services made available by NCBI for extraction of information from PubChem. Results and Discussion The program supports a wide range of searching techniques, including queries based on assay or compound keywords and chemical substructures. Results can be examined individually or downloaded and exported in batch for use in other programs such as Microsoft Excel. We believe that PubChemSR makes it straightforward for researchers to utilize the chemical, biological and screening data available in PubChem. We present several examples of how it can be used. PMID:18482452
Updates to the Virtual Atomic and Molecular Data Centre

NASA Astrophysics Data System (ADS)

Hill, Christian; Tennyson, Jonathan; Gordon, Iouli E.; Rothman, Laurence S.; Dubernet, Marie-Lise

2014-06-01

The Virtual Atomic and Molecular Data Centre (VAMDC) has established a set of standards for the storage and transmission of atomic and molecular data and an SQL-based query language (VSS2) for searching online databases, known as nodes. The project has also created an online service, the VAMDC Portal, through which all of these databases may be searched and their results compared and aggregated. Since its inception four years ago, the VAMDC e-infrastructure has grown to encompass over 40 databases, including HITRAN, in more than 20 countries and engages actively with scientists in six continents. Associated with the portal are a growing suite of software tools for the transformation of data from its native, XML-based, XSAMS format, to a range of more convenient human-readable (such as HTML) and machinereadable (such as CSV) formats. The relational database for HITRAN1, created as part of the VAMDC project is a flexible and extensible data model which is able to represent a wider range of parameters than the current fixed-format text-based one. Over the next year, a new online interface to this database will be tested, released and fully documented - this web application, HITRANonline2, will fully replace the ageing and incomplete JavaHAWKS software suite.
CCDB: a curated database of genes involved in cervix cancer.

PubMed

Agarwal, Subhash M; Raghav, Dhwani; Singh, Harinder; Raghava, G P S

2011-01-01

The Cervical Cancer gene DataBase (CCDB, http://crdd.osdd.net/raghava/ccdb) is a manually curated catalog of experimentally validated genes that are thought, or are known to be involved in the different stages of cervical carcinogenesis. In spite of the large women population that is presently affected from this malignancy still at present, no database exists that catalogs information on genes associated with cervical cancer. Therefore, we have compiled 537 genes in CCDB that are linked with cervical cancer causation processes such as methylation, gene amplification, mutation, polymorphism and change in expression level, as evident from published literature. Each record contains details related to gene like architecture (exon-intron structure), location, function, sequences (mRNA/CDS/protein), ontology, interacting partners, homology to other eukaryotic genomes, structure and links to other public databases, thus augmenting CCDB with external data. Also, manually curated literature references have been provided to support the inclusion of the gene in the database and establish its association with cervix cancer. In addition, CCDB provides information on microRNA altered in cervical cancer as well as search facility for querying, several browse options and an online tool for sequence similarity search, thereby providing researchers with easy access to the latest information on genes involved in cervix cancer.
Biological data integration: wrapping data and tools.

PubMed

Lacroix, Zoé

2002-06-01

Nowadays scientific data is inevitably digital and stored in a wide variety of formats in heterogeneous systems. Scientists need to access an integrated view of remote or local heterogeneous data sources with advanced data accessing, analyzing, and visualization tools. Building a digital library for scientific data requires accessing and manipulating data extracted from flat files or databases, documents retrieved from the Web as well as data generated by software. We present an approach to wrapping web data sources, databases, flat files, or data generated by tools through a database view mechanism. Generally, a wrapper has two tasks: it first sends a query to the source to retrieve data and, second builds the expected output with respect to the virtual structure. Our wrappers are composed of a retrieval component based on an intermediate object view mechanism called search views mapping the source capabilities to attributes, and an eXtensible Markup Language (XML) engine, respectively, to perform these two tasks. The originality of the approach consists of: 1) a generic view mechanism to access seamlessly data sources with limited capabilities and 2) the ability to wrap data sources as well as the useful specific tools they may provide. Our approach has been developed and demonstrated as part of the multidatabase system supporting queries via uniform object protocol model (OPM) interfaces.

Consolidating Russia and Eurasia Antibiotic Resistance Data for 1992-2014 Using Search Engine.

PubMed

Bedenkov, Alexander; Shpinev, Vitaly; Suvorov, Nikolay; Sokolov, Evgeny; Riabenko, Evgeniy

2016-01-01

The World Health Organization recognizes the antibiotic resistance problem as a major health threat in the twenty first century. The paper describes an effort to fight it undertaken at the verge of two industries-healthcare and Data Science. One of the major difficulties in monitoring antibiotic resistance is low availability of comprehensive research data. Our aim is to develop a nation-wide antibiotic resistance database using Internet search and data processing algorithms using Russian language publications. An interdisciplinary team built an intelligent Internet search filter to locate all publicly available research data on antibiotic resistance in Russia and Eurasia countries, extracted it, and collated it for analysis. A database was constructed using data from 850 original studies conducted at 153 locations in 12 countries between 1992 and 2014. The studies contained susceptibility and resistance rates of 156 microorganisms to 157 antibiotic drugs. The applied search methodology was highly robust in that it yielded search precision of 58 vs. 20% in a typical Internet search. It allowed finding and collating within the database the following data items (among many others): publication details including title, source, date, authors, etc.; study details: time period, locations, research organization, therapy area, etc.; microorganisms and antibiotic drugs included in the study along with prevalence values of resistant and susceptible strains, and numbers of isolates. The next stage in project development will try to validate the data by matching it to major benchmark studies; in addition, a panel of experts will be convened to evaluate the outcomes. The work provides a supplementary tool to national surveillance systems in antibiotic resistance, and consolidates fragmented research data available for 12 countries for a period of more than 20 years.
Integrated Proteomic Pipeline Using Multiple Search Engines for a Proteogenomic Study with a Controlled Protein False Discovery Rate.

PubMed

Park, Gun Wook; Hwang, Heeyoun; Kim, Kwang Hoe; Lee, Ju Yeon; Lee, Hyun Kyoung; Park, Ji Yeong; Ji, Eun Sun; Park, Sung-Kyu Robin; Yates, John R; Kwon, Kyung-Hoon; Park, Young Mok; Lee, Hyoung-Joo; Paik, Young-Ki; Kim, Jin Young; Yoo, Jong Shin

2016-11-04

In the Chromosome-Centric Human Proteome Project (C-HPP), false-positive identification by peptide spectrum matches (PSMs) after database searches is a major issue for proteogenomic studies using liquid-chromatography and mass-spectrometry-based large proteomic profiling. Here we developed a simple strategy for protein identification, with a controlled false discovery rate (FDR) at the protein level, using an integrated proteomic pipeline (IPP) that consists of four engrailed steps as follows. First, using three different search engines, SEQUEST, MASCOT, and MS-GF+, individual proteomic searches were performed against the neXtProt database. Second, the search results from the PSMs were combined using statistical evaluation tools including DTASelect and Percolator. Third, the peptide search scores were converted into E-scores normalized using an in-house program. Last, ProteinInferencer was used to filter the proteins containing two or more peptides with a controlled FDR of 1.0% at the protein level. Finally, we compared the performance of the IPP to a conventional proteomic pipeline (CPP) for protein identification using a controlled FDR of <1% at the protein level. Using the IPP, a total of 5756 proteins (vs 4453 using the CPP) including 477 alternative splicing variants (vs 182 using the CPP) were identified from human hippocampal tissue. In addition, a total of 10 missing proteins (vs 7 using the CPP) were identified with two or more unique peptides, and their tryptic peptides were validated using MS/MS spectral pattern from a repository database or their corresponding synthetic peptides. This study shows that the IPP effectively improved the identification of proteins, including alternative splicing variants and missing proteins, in human hippocampal tissues for the C-HPP. All RAW files used in this study were deposited in ProteomeXchange (PXD000395).
Consolidating Russia and Eurasia Antibiotic Resistance Data for 1992–2014 Using Search Engine

PubMed Central

Bedenkov, Alexander; Shpinev, Vitaly; Suvorov, Nikolay; Sokolov, Evgeny; Riabenko, Evgeniy

2016-01-01

Background: The World Health Organization recognizes the antibiotic resistance problem as a major health threat in the twenty first century. The paper describes an effort to fight it undertaken at the verge of two industries—healthcare and Data Science. One of the major difficulties in monitoring antibiotic resistance is low availability of comprehensive research data. Our aim is to develop a nation-wide antibiotic resistance database using Internet search and data processing algorithms using Russian language publications. Materials and Methods: An interdisciplinary team built an intelligent Internet search filter to locate all publicly available research data on antibiotic resistance in Russia and Eurasia countries, extracted it, and collated it for analysis. A database was constructed using data from 850 original studies conducted at 153 locations in 12 countries between 1992 and 2014. The studies contained susceptibility and resistance rates of 156 microorganisms to 157 antibiotic drugs. Results: The applied search methodology was highly robust in that it yielded search precision of 58 vs. 20% in a typical Internet search. It allowed finding and collating within the database the following data items (among many others): publication details including title, source, date, authors, etc.; study details: time period, locations, research organization, therapy area, etc.; microorganisms and antibiotic drugs included in the study along with prevalence values of resistant and susceptible strains, and numbers of isolates. The next stage in project development will try to validate the data by matching it to major benchmark studies; in addition, a panel of experts will be convened to evaluate the outcomes. Conclusions: The work provides a supplementary tool to national surveillance systems in antibiotic resistance, and consolidates fragmented research data available for 12 countries for a period of more than 20 years. PMID:27014217
SABIO-RK: an updated resource for manually curated biochemical reaction kinetics

PubMed Central

Rey, Maja; Weidemann, Andreas; Kania, Renate; Müller, Wolfgang

2018-01-01

Abstract SABIO-RK (http://sabiork.h-its.org/) is a manually curated database containing data about biochemical reactions and their reaction kinetics. The data are primarily extracted from scientific literature and stored in a relational database. The content comprises both naturally occurring and alternatively measured biochemical reactions and is not restricted to any organism class. The data are made available to the public by a web-based search interface and by web services for programmatic access. In this update we describe major improvements and extensions of SABIO-RK since our last publication in the database issue of Nucleic Acid Research (2012). (i) The website has been completely revised and (ii) allows now also free text search for kinetics data. (iii) Additional interlinkages with other databases in our field have been established; this enables users to gain directly comprehensive knowledge about the properties of enzymes and kinetics beyond SABIO-RK. (iv) Vice versa, direct access to SABIO-RK data has been implemented in several systems biology tools and workflows. (v) On request of our experimental users, the data can be exported now additionally in spreadsheet formats. (vi) The newly established SABIO-RK Curation Service allows to respond to specific data requirements. PMID:29092055
Integrative medicine for managing the symptoms of lupus nephritis: A protocol for systematic review and meta-analysis.

PubMed

Choi, Tae-Young; Jun, Ji Hee; Lee, Myeong Soo

2018-03-01

Integrative medicine is claimed to improve symptoms of lupus nephritis. No systematic reviews have been performed for the application of integrative medicine for lupus nephritis on patients with systemic lupus erythematosus (SLE). Thus, this review will aim to evaluate the current evidence on the efficacy of integrative medicine for the management of lupus nephritis in patients with SLE. The following electronic databases will be searched for studies published from their dates of inception February 2018: Medline, EMBASE and the Cochrane Central Register of Controlled Trials (CENTRAL), as well as 6 Korean medical databases (Korea Med, the Oriental Medicine Advanced Search Integrated System [OASIS], DBpia, the Korean Medical Database [KM base], the Research Information Service System [RISS], and the Korean Studies Information Services System [KISS]), and 1 Chinese medical database (the China National Knowledge Infrastructure [CNKI]). Study selection, data extraction, and assessment will be performed independently by 2 researchers. The risk of bias (ROB) will be assessed using the Cochrane ROB tool. This systematic review will be published in a peer-reviewed journal and disseminated both electronically and in print. The review will be updated to inform and guide healthcare practice and policy. PROSPERO 2018 CRD42018085205.
Cazymes Analysis Toolkit (CAT): Webservice for searching and analyzing carbohydrateactive enzymes in a newly sequenced organism using CAZy database

DOE Office of Scientific and Technical Information (OSTI.GOV)

Karpinets, Tatiana V; Park, Byung; Syed, Mustafa H

2010-01-01

The Carbohydrate-Active Enzyme (CAZy) database provides a rich set of manually annotated enzymes that degrade, modify, or create glycosidic bonds. Despite rich and invaluable information stored in the database, software tools utilizing this information for annotation of newly sequenced genomes by CAZy families are limited. We have employed two annotation approaches to fill the gap between manually curated high-quality protein sequences collected in the CAZy database and the growing number of other protein sequences produced by genome or metagenome sequencing projects. The first approach is based on a similarity search against the entire non-redundant sequences of the CAZy database. Themore » second approach performs annotation using links or correspondences between the CAZy families and protein family domains. The links were discovered using the association rule learning algorithm applied to sequences from the CAZy database. The approaches complement each other and in combination achieved high specificity and sensitivity when cross-evaluated with the manually curated genomes of Clostridium thermocellum ATCC 27405 and Saccharophagus degradans 2-40. The capability of the proposed framework to predict the function of unknown protein domains (DUF) and of hypothetical proteins in the genome of Neurospora crassa is demonstrated. The framework is implemented as a Web service, the CAZymes Analysis Toolkit (CAT), and is available at http://cricket.ornl.gov/cgi-bin/cat.cgi.« less
CAZymes Analysis Toolkit (CAT): web service for searching and analyzing carbohydrate-active enzymes in a newly sequenced organism using CAZy database.

PubMed

Park, Byung H; Karpinets, Tatiana V; Syed, Mustafa H; Leuze, Michael R; Uberbacher, Edward C

2010-12-01

The Carbohydrate-Active Enzyme (CAZy) database provides a rich set of manually annotated enzymes that degrade, modify, or create glycosidic bonds. Despite rich and invaluable information stored in the database, software tools utilizing this information for annotation of newly sequenced genomes by CAZy families are limited. We have employed two annotation approaches to fill the gap between manually curated high-quality protein sequences collected in the CAZy database and the growing number of other protein sequences produced by genome or metagenome sequencing projects. The first approach is based on a similarity search against the entire nonredundant sequences of the CAZy database. The second approach performs annotation using links or correspondences between the CAZy families and protein family domains. The links were discovered using the association rule learning algorithm applied to sequences from the CAZy database. The approaches complement each other and in combination achieved high specificity and sensitivity when cross-evaluated with the manually curated genomes of Clostridium thermocellum ATCC 27405 and Saccharophagus degradans 2-40. The capability of the proposed framework to predict the function of unknown protein domains and of hypothetical proteins in the genome of Neurospora crassa is demonstrated. The framework is implemented as a Web service, the CAZymes Analysis Toolkit, and is available at http://cricket.ornl.gov/cgi-bin/cat.cgi.
Literature searches on Ayurveda: An update

PubMed Central

Aggithaya, Madhur G.; Narahari, Saravu R.

2015-01-01

Introduction: The journals that publish on Ayurveda are increasingly indexed by popular medical databases in recent years. However, many Eastern journals are not indexed biomedical journal databases such as PubMed. Literature searches for Ayurveda continue to be challenging due to the nonavailability of active, unbiased dedicated databases for Ayurvedic literature. In 2010, authors identified 46 databases that can be used for systematic search of Ayurvedic papers and theses. This update reviewed our previous recommendation and identified current and relevant databases. Aims: To update on Ayurveda literature search and strategy to retrieve maximum publications. Methods: Author used psoriasis as an example to search previously listed databases and identify new. The population, intervention, control, and outcome table included keywords related to psoriasis and Ayurvedic terminologies for skin diseases. Current citation update status, search results, and search options of previous databases were assessed. Eight search strategies were developed. Hundred and five journals, both biomedical and Ayurveda, which publish on Ayurveda, were identified. Variability in databases was explored to identify bias in journal citation. Results: Five among 46 databases are now relevant – AYUSH research portal, Annotated Bibliography of Indian Medicine, Digital Helpline for Ayurveda Research Articles (DHARA), PubMed, and Directory of Open Access Journals. Search options in these databases are not uniform, and only PubMed allows complex search strategy. “The Researches in Ayurveda” and “Ayurvedic Research Database” (ARD) are important grey resources for hand searching. About 44/105 (41.5%) journals publishing Ayurvedic studies are not indexed in any database. Only 11/105 (10.4%) exclusive Ayurveda journals are indexed in PubMed. Conclusion: AYUSH research portal and DHARA are two major portals after 2010. It is mandatory to search PubMed and four other databases because all five carry citations from different groups of journals. The hand searching is important to identify Ayurveda publications that are not indexed elsewhere. Availability information of citations in Ayurveda libraries from National Union Catalogue of Scientific Serials in India if regularly updated will improve the efficacy of hand searching. A grey database (ARD) contains unpublished PG/Ph.D. theses. The AYUSH portal, DHARA (funded by Ministry of AYUSH), and ARD should be merged to form single larger database to limit Ayurveda literature searches. PMID:27313409
The Plant Ontology: A Tool for Plant Genomics.

PubMed

Cooper, Laurel; Jaiswal, Pankaj

2016-01-01

The use of controlled, structured vocabularies (ontologies) has become a critical tool for scientists in the post-genomic era of massive datasets. Adoption and integration of common vocabularies and annotation practices enables cross-species comparative analyses and increases data sharing and reusability. The Plant Ontology (PO; http://www.plantontology.org/ ) describes plant anatomy, morphology, and the stages of plant development, and offers a database of plant genomics annotations associated to the PO terms. The scope of the PO has grown from its original design covering only rice, maize, and Arabidopsis, and now includes terms to describe all green plants from angiosperms to green algae.This chapter introduces how the PO and other related ontologies are constructed and organized, including languages and software used for ontology development, and provides an overview of the key features. Detailed instructions illustrate how to search and browse the PO database and access the associated annotation data. Users are encouraged to provide input on the ontology through the online term request form and contribute datasets for integration in the PO database.
dbHiMo: a web-based epigenomics platform for histone-modifying enzymes.

PubMed

Choi, Jaeyoung; Kim, Ki-Tae; Huh, Aram; Kwon, Seomun; Hong, Changyoung; Asiegbu, Fred O; Jeon, Junhyun; Lee, Yong-Hwan

2015-01-01

Over the past two decades, epigenetics has evolved into a key concept for understanding regulation of gene expression. Among many epigenetic mechanisms, covalent modifications such as acetylation and methylation of lysine residues on core histones emerged as a major mechanism in epigenetic regulation. Here, we present the database for histone-modifying enzymes (dbHiMo; http://hme.riceblast.snu.ac.kr/) aimed at facilitating functional and comparative analysis of histone-modifying enzymes (HMEs). HMEs were identified by applying a search pipeline built upon profile hidden Markov model (HMM) to proteomes. The database incorporates 11,576 HMEs identified from 603 proteomes including 483 fungal, 32 plants and 51 metazoan species. The dbHiMo provides users with web-based personalized data browsing and analysis tools, supporting comparative and evolutionary genomics. With comprehensive data entries and associated web-based tools, our database will be a valuable resource for future epigenetics/epigenomics studies. © The Author(s) 2015. Published by Oxford University Press.
UCbase 2.0: ultraconserved sequences database (2014 update).

PubMed

Lomonaco, Vincenzo; Martoglia, Riccardo; Mandreoli, Federica; Anderlucci, Laura; Emmett, Warren; Bicciato, Silvio; Taccioli, Cristian

2014-01-01

UCbase 2.0 (http://ucbase.unimore.it) is an update, extension and evolution of UCbase, a Web tool dedicated to the analysis of ultraconserved sequences (UCRs). UCRs are 481 sequences >200 bases sharing 100% identity among human, mouse and rat genomes. They are frequently located in genomic regions known to be involved in cancer or differentially expressed in human leukemias and carcinomas. UCbase 2.0 is a platform-independent Web resource that includes the updated version of the human genome annotation (hg19), information linking disorders to chromosomal coordinates based on the Systematized Nomenclature of Medicine classification, a query tool to search for Single Nucleotide Polymorphisms (SNPs) and a new text box to directly interrogate the database using a MySQL interface. To facilitate the interactive visual interpretation of UCR chromosomal positioning, UCbase 2.0 now includes a graph visualization interface directly linked to UCSC genome browser. Database URL: http://ucbase.unimore.it. © The Author(s) 2014. Published by Oxford University Press.
Web-Based Tools for Text-Based Patient-Provider Communication in Chronic Conditions: Scoping Review

PubMed Central

Grunfeld, Eva; Makuwaza, Tutsirai; Bender, Jacqueline L

2017-01-01

Background Patients with chronic conditions require ongoing care which not only necessitates support from health care providers outside appointments but also self-management. Web-based tools for text-based patient-provider communication, such as secure messaging, allow for sharing of contextual information and personal narrative in a simple accessible medium, empowering patients and enabling their providers to address emerging care needs. Objective The objectives of this study were to (1) conduct a systematic search of the published literature and the Internet for Web-based tools for text-based communication between patients and providers; (2) map tool characteristics, their intended use, contexts in which they were used, and by whom; (3) describe the nature of their evaluation; and (4) understand the terminology used to describe the tools. Methods We conducted a scoping review using the MEDLINE (Medical Literature Analysis and Retrieval System Online) and EMBASE (Excerpta Medica Database) databases. We summarized information on the characteristics of the tools (structure, functions, and communication paradigm), intended use, context and users, evaluation (study design and outcomes), and terminology. We performed a parallel search of the Internet to compare with tools identified in the published literature. Results We identified 54 papers describing 47 unique tools from 13 countries studied in the context of 68 chronic health conditions. The majority of tools (77%, 36/47) had functions in addition to communication (eg, viewable care plan, symptom diary, or tracker). Eight tools (17%, 8/47) were described as allowing patients to communicate with the team or multiple health care providers. Most of the tools were intended to support communication regarding symptom reporting (49%, 23/47), and lifestyle or behavior modification (36%, 17/47). The type of health care providers who used tools to communicate with patients were predominantly allied health professionals of various disciplines (30%, 14/47), nurses (23%, 11/47), and physicians (19%, 9/47), among others. Over half (52%, 25/48) of the tools were evaluated in randomized controlled trials, and 23 tools (48%, 23/48) were evaluated in nonrandomized studies. Terminology of tools varied by intervention type and functionality and did not consistently reflect a theme of communication. The majority of tools found in the Internet search were patient portals from 6 developers; none were found among published articles. Conclusions Web-based tools for text-based patient-provider communication were identified from a wide variety of clinical contexts and with varied functionality. Tools were most prevalent in contexts where intended use was self-management. Few tools for team-based communication were found, but this may become increasingly important as chronic disease care becomes more interdisciplinary. PMID:29079552
Web-Based Tools for Text-Based Patient-Provider Communication in Chronic Conditions: Scoping Review.

PubMed

Voruganti, Teja; Grunfeld, Eva; Makuwaza, Tutsirai; Bender, Jacqueline L

2017-10-27

Patients with chronic conditions require ongoing care which not only necessitates support from health care providers outside appointments but also self-management. Web-based tools for text-based patient-provider communication, such as secure messaging, allow for sharing of contextual information and personal narrative in a simple accessible medium, empowering patients and enabling their providers to address emerging care needs. The objectives of this study were to (1) conduct a systematic search of the published literature and the Internet for Web-based tools for text-based communication between patients and providers; (2) map tool characteristics, their intended use, contexts in which they were used, and by whom; (3) describe the nature of their evaluation; and (4) understand the terminology used to describe the tools. We conducted a scoping review using the MEDLINE (Medical Literature Analysis and Retrieval System Online) and EMBASE (Excerpta Medica Database) databases. We summarized information on the characteristics of the tools (structure, functions, and communication paradigm), intended use, context and users, evaluation (study design and outcomes), and terminology. We performed a parallel search of the Internet to compare with tools identified in the published literature. We identified 54 papers describing 47 unique tools from 13 countries studied in the context of 68 chronic health conditions. The majority of tools (77%, 36/47) had functions in addition to communication (eg, viewable care plan, symptom diary, or tracker). Eight tools (17%, 8/47) were described as allowing patients to communicate with the team or multiple health care providers. Most of the tools were intended to support communication regarding symptom reporting (49%, 23/47), and lifestyle or behavior modification (36%, 17/47). The type of health care providers who used tools to communicate with patients were predominantly allied health professionals of various disciplines (30%, 14/47), nurses (23%, 11/47), and physicians (19%, 9/47), among others. Over half (52%, 25/48) of the tools were evaluated in randomized controlled trials, and 23 tools (48%, 23/48) were evaluated in nonrandomized studies. Terminology of tools varied by intervention type and functionality and did not consistently reflect a theme of communication. The majority of tools found in the Internet search were patient portals from 6 developers; none were found among published articles. Web-based tools for text-based patient-provider communication were identified from a wide variety of clinical contexts and with varied functionality. Tools were most prevalent in contexts where intended use was self-management. Few tools for team-based communication were found, but this may become increasingly important as chronic disease care becomes more interdisciplinary. ©Teja Voruganti, Eva Grunfeld, Tutsirai Makuwaza, Jacqueline L Bender. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 27.10.2017.
Are Bibliographic Management Software Search Interfaces Reliable?: A Comparison between Search Results Obtained Using Database Interfaces and the EndNote Online Search Function

ERIC Educational Resources Information Center

Fitzgibbons, Megan; Meert, Deborah

2010-01-01

The use of bibliographic management software and its internal search interfaces is now pervasive among researchers. This study compares the results between searches conducted in academic databases' search interfaces versus the EndNote search interface. The results show mixed search reliability, depending on the database and type of search…
[SCREENING OF NUTRITIONAL STATUS AMONG ELDERLY PEOPLE AT FAMILY MEDICINE].

PubMed

Račić, M; Ivković, N; Kusmuk, S

2015-11-01

The prevalence of malnutrition in elderly is high. Malnutrition or risk of malnutrition can be detected by use of nutritional screening or assessment tools. This systematic review aimed to identify tools that would be reliable, valid, sensitive and specific for nutritional status screening in patients older than 65 at family medicine. The review was performed following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. Studies were retrieved using MEDLINE (via Ovid), PubMed and Cochrane Library electronic databases and by manual searching of relevant articles listed in reference list of key publications. The electronic databases were searched using defined key words adapted to each database and using MESH terms. Manual revision of reviews and original articles was performed using Electronic Journals Library. Included studies involved development and validation of screening tools in the community-dwelling elderly population. The tools, subjected to validity and reliability testing for use in the community-dwelling elderly population were Mini Nutritional Assessment (MNA), Mini Nutritional Assessment-Short Form (MNA-SF), Nutrition Screening Initiative (NSI), which includes DETERMINE list, Level I and II Screen, Seniors in the Community: Risk Evaluation for Eating, and Nutrition (SCREEN I and SCREEN II), Subjective Global Assessment (SGA), Nutritional Risk Index (NRI), and Malaysian and South African tool. MNA and MNA-SF appear to have highest reliability and validity for screening of community-dwelling elderly, while the reliability and validity of SCREEN II are good. The authors conclude that whilst several tools have been developed, most have not undergone extensive testing to demonstrate their ability to identify nutritional risk. MNA and MNA-SF have the highest reliability and validity for screening of nutritional status in the community-dwelling elderly, and the reliability and validity of SCREEN II are satisfactory. These instruments also contain all three nutritional status indicators and are practical for use in family medicine. However, the gold standard for screening cannot be set because testing of reliability and continuous validation in the study with a higher level of evidence need to be conducted in family medicine.
An Interactive Multi-instrument Database of Solar Flares

NASA Astrophysics Data System (ADS)

Sadykov, Viacheslav M.; Kosovichev, Alexander G.; Oria, Vincent; Nita, Gelu M.

2017-07-01

Solar flares are complicated physical phenomena that are observable in a broad range of the electromagnetic spectrum, from radio waves to γ-rays. For a more comprehensive understanding of flares, it is necessary to perform a combined multi-wavelength analysis using observations from many satellites and ground-based observatories. For an efficient data search, integration of different flare lists, and representation of observational data, we have developed the Interactive Multi-Instrument Database of Solar Flares (IMIDSF, https://solarflare.njit.edu/). The web-accessible database is fully functional and allows the user to search for uniquely identified flare events based on their physical descriptors and the availability of observations by a particular set of instruments. Currently, the data from three primary flare lists (Geostationary Operational Environmental Satellites, RHESSI, and HEK) and a variety of other event catalogs (Hinode, Fermi GBM, Konus-WIND, the OVSA flare catalogs, the CACTus CME catalog, the Filament eruption catalog) and observing logs (IRIS and Nobeyama coverage) are integrated, and an additional set of physical descriptors (temperature and emission measure) is provided along with an observing summary, data links, and multi-wavelength light curves for each flare event since 2002 January. We envision that this new tool will allow researchers to significantly speed up the search of events of interest for statistical and case studies.
RAId_DbS: Peptide Identification using Database Searches with Realistic Statistics

PubMed Central

Alves, Gelio; Ogurtsov, Aleksey Y; Yu, Yi-Kuo

2007-01-01

Background The key to mass-spectrometry-based proteomics is peptide identification. A major challenge in peptide identification is to obtain realistic E-values when assigning statistical significance to candidate peptides. Results Using a simple scoring scheme, we propose a database search method with theoretically characterized statistics. Taking into account possible skewness in the random variable distribution and the effect of finite sampling, we provide a theoretical derivation for the tail of the score distribution. For every experimental spectrum examined, we collect the scores of peptides in the database, and find good agreement between the collected score statistics and our theoretical distribution. Using Student's t-tests, we quantify the degree of agreement between the theoretical distribution and the score statistics collected. The T-tests may be used to measure the reliability of reported statistics. When combined with reported P-value for a peptide hit using a score distribution model, this new measure prevents exaggerated statistics. Another feature of RAId_DbS is its capability of detecting multiple co-eluted peptides. The peptide identification performance and statistical accuracy of RAId_DbS are assessed and compared with several other search tools. The executables and data related to RAId_DbS are freely available upon request. PMID:17961253
Freva - Freie Univ Evaluation System Framework for Scientific HPC Infrastructures in Earth System Modeling

NASA Astrophysics Data System (ADS)

Kadow, C.; Illing, S.; Schartner, T.; Grieger, J.; Kirchner, I.; Rust, H.; Cubasch, U.; Ulbrich, U.

2017-12-01

The Freie Univ Evaluation System Framework (Freva - freva.met.fu-berlin.de) is a software infrastructure for standardized data and tool solutions in Earth system science (e.g. www-miklip.dkrz.de, cmip-eval.dkrz.de). Freva runs on high performance computers to handle customizable evaluation systems of research projects, institutes or universities. It combines different software technologies into one common hybrid infrastructure, including all features present in the shell and web environment. The database interface satisfies the international standards provided by the Earth System Grid Federation (ESGF). Freva indexes different data projects into one common search environment by storing the meta data information of the self-describing model, reanalysis and observational data sets in a database. This implemented meta data system with its advanced but easy-to-handle search tool supports users, developers and their plugins to retrieve the required information. A generic application programming interface (API) allows scientific developers to connect their analysis tools with the evaluation system independently of the programming language used. Users of the evaluation techniques benefit from the common interface of the evaluation system without any need to understand the different scripting languages. The integrated web-shell (shellinabox) adds a degree of freedom in the choice of the working environment and can be used as a gate to the research projects HPC. Plugins are able to integrate their e.g. post-processed results into the database of the user. This allows e.g. post-processing plugins to feed statistical analysis plugins, which fosters an active exchange between plugin developers of a research project. Additionally, the history and configuration sub-system stores every analysis performed with the evaluation system in a database. Configurations and results of the tools can be shared among scientists via shell or web system. Furthermore, if configurations match while starting an evaluation plugin, the system suggests to use results already produced by other users - saving CPU/h, I/O, disk space and time. The efficient interaction between different technologies improves the Earth system modeling science framed by Freva.
reSpect: Software for Identification of High and Low Abundance Ion Species in Chimeric Tandem Mass Spectra

PubMed Central

Shteynberg, David; Mendoza, Luis; Hoopmann, Michael R.; Sun, Zhi; Schmidt, Frank; Deutsch, Eric W.; Moritz, Robert L.

2016-01-01

Most shotgun proteomics data analysis workflows are based on the assumption that each fragment ion spectrum is explained by a single species of peptide ion isolated by the mass spectrometer; however, in reality mass spectrometers often isolate more than one peptide ion within the window of isolation that contributes to additional peptide fragment peaks in many spectra. We present a new tool called reSpect, implemented in the Trans-Proteomic Pipeline (TPP), that enables an iterative workflow whereby fragment ion peaks explained by a peptide ion identified in one round of sequence searching or spectral library search are attenuated based on the confidence of the identification, and then the altered spectrum is subjected to further rounds of searching. The reSpect tool is not implemented as a search engine, but rather as a post search engine processing step where only fragment ion intensities are altered. This enables the application of any search engine combination in the following iterations. Thus, reSpect is compatible with all other protein sequence database search engines as well as peptide spectral library search engines that are supported by the TPP. We show that while some datasets are highly amenable to chimeric spectrum identification and lead to additional peptide identification boosts of over 30% with as many as four different peptide ions identified per spectrum, datasets with narrow precursor ion selection only benefit from such processing at the level of a few percent. We demonstrate a technique that facilitates the determination of the degree to which a dataset would benefit from chimeric spectrum analysis. The reSpect tool is free and open source, provided within the TPP and available at the TPP website. PMID:26419769
reSpect: software for identification of high and low abundance ion species in chimeric tandem mass spectra.

PubMed

Shteynberg, David; Mendoza, Luis; Hoopmann, Michael R; Sun, Zhi; Schmidt, Frank; Deutsch, Eric W; Moritz, Robert L

2015-11-01

Most shotgun proteomics data analysis workflows are based on the assumption that each fragment ion spectrum is explained by a single species of peptide ion isolated by the mass spectrometer; however, in reality mass spectrometers often isolate more than one peptide ion within the window of isolation that contribute to additional peptide fragment peaks in many spectra. We present a new tool called reSpect, implemented in the Trans-Proteomic Pipeline (TPP), which enables an iterative workflow whereby fragment ion peaks explained by a peptide ion identified in one round of sequence searching or spectral library search are attenuated based on the confidence of the identification, and then the altered spectrum is subjected to further rounds of searching. The reSpect tool is not implemented as a search engine, but rather as a post-search engine processing step where only fragment ion intensities are altered. This enables the application of any search engine combination in the iterations that follow. Thus, reSpect is compatible with all other protein sequence database search engines as well as peptide spectral library search engines that are supported by the TPP. We show that while some datasets are highly amenable to chimeric spectrum identification and lead to additional peptide identification boosts of over 30% with as many as four different peptide ions identified per spectrum, datasets with narrow precursor ion selection only benefit from such processing at the level of a few percent. We demonstrate a technique that facilitates the determination of the degree to which a dataset would benefit from chimeric spectrum analysis. The reSpect tool is free and open source, provided within the TPP and available at the TPP website. Graphical Abstract ᅟ.

reSpect: Software for Identification of High and Low Abundance Ion Species in Chimeric Tandem Mass Spectra

NASA Astrophysics Data System (ADS)

Shteynberg, David; Mendoza, Luis; Hoopmann, Michael R.; Sun, Zhi; Schmidt, Frank; Deutsch, Eric W.; Moritz, Robert L.

2015-11-01

Most shotgun proteomics data analysis workflows are based on the assumption that each fragment ion spectrum is explained by a single species of peptide ion isolated by the mass spectrometer; however, in reality mass spectrometers often isolate more than one peptide ion within the window of isolation that contribute to additional peptide fragment peaks in many spectra. We present a new tool called reSpect, implemented in the Trans-Proteomic Pipeline (TPP), which enables an iterative workflow whereby fragment ion peaks explained by a peptide ion identified in one round of sequence searching or spectral library search are attenuated based on the confidence of the identification, and then the altered spectrum is subjected to further rounds of searching. The reSpect tool is not implemented as a search engine, but rather as a post-search engine processing step where only fragment ion intensities are altered. This enables the application of any search engine combination in the iterations that follow. Thus, reSpect is compatible with all other protein sequence database search engines as well as peptide spectral library search engines that are supported by the TPP. We show that while some datasets are highly amenable to chimeric spectrum identification and lead to additional peptide identification boosts of over 30% with as many as four different peptide ions identified per spectrum, datasets with narrow precursor ion selection only benefit from such processing at the level of a few percent. We demonstrate a technique that facilitates the determination of the degree to which a dataset would benefit from chimeric spectrum analysis. The reSpect tool is free and open source, provided within the TPP and available at the TPP website.
Computer Card Games in Computer Science Education: A 10-Year Review

ERIC Educational Resources Information Center

Kordaki, Maria; Gousiou, Anthi

2016-01-01

This paper presents a 10-year review study that focuses on the investigation of the use of computer card games (CCGs) as learning tools in Computer Science (CS) Education. Specific search terms keyed into 10 large scientific electronic databases identified 24 papers referring to the use of CCGs for the learning of CS matters during the last…
Automating Information Discovery Within the Invisible Web

NASA Astrophysics Data System (ADS)

Sweeney, Edwina; Curran, Kevin; Xie, Ermai

A Web crawler or spider crawls through the Web looking for pages to index, and when it locates a new page it passes the page on to an indexer. The indexer identifies links, keywords, and other content and stores these within its database. This database is searched by entering keywords through an interface and suitable Web pages are returned in a results page in the form of hyperlinks accompanied by short descriptions. The Web, however, is increasingly moving away from being a collection of documents to a multidimensional repository for sounds, images, audio, and other formats. This is leading to a situation where certain parts of the Web are invisible or hidden. The term known as the "Deep Web" has emerged to refer to the mass of information that can be accessed via the Web but cannot be indexed by conventional search engines. The concept of the Deep Web makes searches quite complex for search engines. Google states that the claim that conventional search engines cannot find such documents as PDFs, Word, PowerPoint, Excel, or any non-HTML page is not fully accurate and steps have been taken to address this problem by implementing procedures to search items such as academic publications, news, blogs, videos, books, and real-time information. However, Google still only provides access to a fraction of the Deep Web. This chapter explores the Deep Web and the current tools available in accessing it.
Use of methodological tools for assessing the quality of studies in periodontology and implant dentistry: a systematic review.

PubMed

Faggion, Clovis M; Huda, Fahd; Wasiak, Jason

2014-06-01

To evaluate the methodological approaches used to assess the quality of studies included in systematic reviews (SRs) in periodontology and implant dentistry. Two electronic databases (PubMed and Cochrane Database of Systematic Reviews) were searched independently to identify SRs examining interventions published through 2 September 2013. The reference lists of included SRs and records of 10 specialty dental journals were searched manually. Methodological approaches were assessed using seven criteria based on the Cochrane Handbook for Systematic Reviews of Interventions. Temporal trends in methodological quality were also explored. Of the 159 SRs with meta-analyses included in the analysis, 44 (28%) reported the use of domain-based tools, 15 (9%) reported the use of checklists and 7 (4%) reported the use of scales. Forty-two (26%) SRs reported use of more than one tool. Criteria were met heterogeneously; authors of 15 (9%) publications incorporated the quality of evidence of primary studies into SRs, whereas 69% of SRs reported methodological approaches in the Materials/Methods section. Reporting of four criteria was significantly better in recent (2010-2013) than in previous publications. The analysis identified several methodological limitations of approaches used to assess evidence in studies included in SRs in periodontology and implant dentistry. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Linking Publications to Instruments, Field Campaigns, Sites and Working Groups: The ARM Experience

NASA Astrophysics Data System (ADS)

Lehnert, K.; Parsons, M. A.; Ramachandran, R.; Fils, D.; Narock, T.; Fox, P. A.; Troyan, D.; Cialella, A. T.; Gregory, L.; Lazar, K.; Liang, M.; Ma, L.; Tilp, A.; Wagener, R.

2017-12-01

For the past 25 years, the ARM Climate Research Facility - a US Department of Energy scientific user facility - has been collecting atmospheric data in different climatic regimes using both in situ and remote instrumentation. Configuration of the facility's components has been designed to improve the understanding and representation, in climate and earth system models, of clouds and aerosols. Placing a premium on long-term continuous data collection resulted in terabytes of data having been collected, stored, and made accessible to any interested person. All data is accessible via the ARM.gov website and the ARM Data Discovery Tool. A team of metadata professionals assign appropriate tags to help facilitate searching the databases for desired data. The knowledge organization tools and concepts are used to create connections between data, instruments, field campaigns, sites, and measurements are familiar to informatics professionals. Ontology, taxonomy, classification, and thesauri are among the customized concepts put into practice for ARM's purposes. In addition to the multitude of data available, there have been approximately 3,000 journal articles that utilize ARM data. These have been linked to specific ARM web pages. Searches of the complete ARM publication database can be done using a separate interface. This presentation describes how ARM data is linked to instruments, sites, field campaigns, and publications through the application of standard knowledge organization tools and concepts.
Database documentation of marine mammal stranding and mortality: current status review and future prospects.

PubMed

Chan, Derek K P; Tsui, Henry C L; Kot, Brian C W

2017-11-21

Databases are systematic tools to archive and manage information related to marine mammal stranding and mortality events. Stranding response networks, governmental authorities and non-governmental organizations have established regional or national stranding networks and have developed unique standard stranding response and necropsy protocols to document and track stranded marine mammal demographics, signalment and health data. The objectives of this study were to (1) describe and review the current status of marine mammal stranding and mortality databases worldwide, including the year established, types of database and their goals; and (2) summarize the geographic range included in the database, the number of cases recorded, accessibility, filter and display methods. Peer-reviewed literature was searched, focussing on published databases of live and dead marine mammal strandings and mortality and information released from stranding response organizations (i.e. online updates, journal articles and annual stranding reports). Databases that were not published in the primary literature or recognized by government agencies were excluded. Based on these criteria, 10 marine mammal stranding and mortality databases were identified, and strandings and necropsy data found in these databases were evaluated. We discuss the results, limitations and future prospects of database development. Future prospects include the development and application of virtopsy, a new necropsy investigation tool. A centralized web-accessed database of all available postmortem multimedia from stranded marine mammals may eventually support marine conservation and policy decisions, which will allow the use of marine animals as sentinels of ecosystem health, working towards a 'One Ocean-One Health' ideal.
Analysis of human serum phosphopeptidome by a focused database searching strategy.

PubMed

Zhu, Jun; Wang, Fangjun; Cheng, Kai; Song, Chunxia; Qin, Hongqiang; Hu, Lianghai; Figeys, Daniel; Ye, Mingliang; Zou, Hanfa

2013-01-14

As human serum is an important source for early diagnosis of many serious diseases, analysis of serum proteome and peptidome has been extensively performed. However, the serum phosphopeptidome was less explored probably because the effective method for database searching is lacking. Conventional database searching strategy always uses the whole proteome database, which is very time-consuming for phosphopeptidome search due to the huge searching space resulted from the high redundancy of the database and the setting of dynamic modifications during searching. In this work, a focused database searching strategy using an in-house collected human serum pro-peptidome target/decoy database (HuSPep) was established. It was found that the searching time was significantly decreased without compromising the identification sensitivity. By combining size-selective Ti (IV)-MCM-41 enrichment, RP-RP off-line separation, and complementary CID and ETD fragmentation with the new searching strategy, 143 unique endogenous phosphopeptides and 133 phosphorylation sites (109 novel sites) were identified from human serum with high reliability. Copyright © 2012 Elsevier B.V. All rights reserved.
CyanOmics: an integrated database of omics for the model cyanobacterium Synechococcus sp. PCC 7002.

PubMed

Yang, Yaohua; Feng, Jie; Li, Tao; Ge, Feng; Zhao, Jindong

2015-01-01

Cyanobacteria are an important group of organisms that carry out oxygenic photosynthesis and play vital roles in both the carbon and nitrogen cycles of the Earth. The annotated genome of Synechococcus sp. PCC 7002, as an ideal model cyanobacterium, is available. A series of transcriptomic and proteomic studies of Synechococcus sp. PCC 7002 cells grown under different conditions have been reported. However, no database of such integrated omics studies has been constructed. Here we present CyanOmics, a database based on the results of Synechococcus sp. PCC 7002 omics studies. CyanOmics comprises one genomic dataset, 29 transcriptomic datasets and one proteomic dataset and should prove useful for systematic and comprehensive analysis of all those data. Powerful browsing and searching tools are integrated to help users directly access information of interest with enhanced visualization of the analytical results. Furthermore, Blast is included for sequence-based similarity searching and Cluster 3.0, as well as the R hclust function is provided for cluster analyses, to increase CyanOmics's usefulness. To the best of our knowledge, it is the first integrated omics analysis database for cyanobacteria. This database should further understanding of the transcriptional patterns, and proteomic profiling of Synechococcus sp. PCC 7002 and other cyanobacteria. Additionally, the entire database framework is applicable to any sequenced prokaryotic genome and could be applied to other integrated omics analysis projects. Database URL: http://lag.ihb.ac.cn/cyanomics. © The Author(s) 2015. Published by Oxford University Press.
Dynamic publication model for neurophysiology databases.

PubMed

Gardner, D; Abato, M; Knuth, K H; DeBellis, R; Erde, S M

2001-08-29

We have implemented a pair of database projects, one serving cortical electrophysiology and the other invertebrate neurones and recordings. The design for each combines aspects of two proven schemes for information interchange. The journal article metaphor determined the type, scope, organization and quantity of data to comprise each submission. Sequence databases encouraged intuitive tools for data viewing, capture, and direct submission by authors. Neurophysiology required transcending these models with new datatypes. Time-series, histogram and bivariate datatypes, including illustration-like wrappers, were selected by their utility to the community of investigators. As interpretation of neurophysiological recordings depends on context supplied by metadata attributes, searches are via visual interfaces to sets of controlled-vocabulary metadata trees. Neurones, for example, can be specified by metadata describing functional and anatomical characteristics. Permanence is advanced by data model and data formats largely independent of contemporary technology or implementation, including Java and the XML standard. All user tools, including dynamic data viewers that serve as a virtual oscilloscope, are Java-based, free, multiplatform, and distributed by our application servers to any contemporary networked computer. Copyright is retained by submitters; viewer displays are dynamic and do not violate copyright of related journal figures. Panels of neurophysiologists view and test schemas and tools, enhancing community support.
Clusters of genetic diseases in Brazil.

PubMed

Cardoso, Gabriela Costa; de Oliveira, Marcelo Zagonel; Paixão-Côrtes, Vanessa Rodrigues; Castilla, Eduardo Enrique; Schuler-Faccini, Lavínia

2018-06-02

The aim of this paper is to present a database of isolated communities (CENISO) with high prevalence of genetic disorders or congenital anomalies in Brazil. We used two strategies to identify such communities: (1) a systematic literature review and (2) a "rumor strategy" based on anecdotal accounts. All rumors and reports were validated in a stepwise process. The bibliographical search identified 34 rumors and 245 rumors through the rumor strategy, and 144 were confirmed. A database like this one presented here represents an important tool for the planning of health priorities for rare diseases in low- and middle-income countries with large populations.
EvolMarkers: a database for mining exon and intron markers for evolution, ecology and conservation studies.

PubMed

Li, Chenhong; Riethoven, Jean-Jack M; Naylor, Gavin J P

2012-09-01

Recent innovations in next-generation sequencing have lowered the cost of genome projects. Nevertheless, sequencing entire genomes for all representatives in a study remains expensive and unnecessary for most studies in ecology, evolution and conservation. It is still more cost-effective and efficient to target and sequence single-copy nuclear gene markers for such studies. Many tools have been developed for identifying nuclear markers, but most of these have focused on particular taxonomic groups. We have built a searchable database, EvolMarkers, for developing single-copy coding sequence (CDS) and exon-primed-intron-crossing (EPIC) markers that is designed to work across a broad range of phylogenetic divergences. The database is made up of single-copy CDS derived from BLAST searches of a variety of metazoan genomes. Users can search the database for different types of markers (CDS or EPIC) that are common to different sets of input species with different divergence characteristics. EvolMarkers can be applied to any taxonomic group for which genome data are available for two or more species. We included 82 genomes in the first version of EvolMarkers and have found the methods to be effective across Placozoa, Cnidaria, Arthropod, Nematoda, Annelida, Mollusca, Echinodermata, Hemichordata, Chordata and plants. We demonstrate the effectiveness of searching for CDS markers within annelids and show how to find potentially useful intronic markers within the lizard Anolis. © 2012 Blackwell Publishing Ltd.
rSNPBase 3.0: an updated database of SNP-related regulatory elements, element-gene pairs and SNP-based gene regulatory networks.

PubMed

Guo, Liyuan; Wang, Jing

2018-01-04

Here, we present the updated rSNPBase 3.0 database (http://rsnp3.psych.ac.cn), which provides human SNP-related regulatory elements, element-gene pairs and SNP-based regulatory networks. This database is the updated version of the SNP regulatory annotation database rSNPBase and rVarBase. In comparison to the last two versions, there are both structural and data adjustments in rSNPBase 3.0: (i) The most significant new feature is the expansion of analysis scope from SNP-related regulatory elements to include regulatory element-target gene pairs (E-G pairs), therefore it can provide SNP-based gene regulatory networks. (ii) Web function was modified according to data content and a new network search module is provided in the rSNPBase 3.0 in addition to the previous regulatory SNP (rSNP) search module. The two search modules support data query for detailed information (related-elements, element-gene pairs, and other extended annotations) on specific SNPs and SNP-related graphic networks constructed by interacting transcription factors (TFs), miRNAs and genes. (3) The type of regulatory elements was modified and enriched. To our best knowledge, the updated rSNPBase 3.0 is the first data tool supports SNP functional analysis from a regulatory network prospective, it will provide both a comprehensive understanding and concrete guidance for SNP-related regulatory studies. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
rSNPBase 3.0: an updated database of SNP-related regulatory elements, element-gene pairs and SNP-based gene regulatory networks

PubMed Central

2018-01-01

Abstract Here, we present the updated rSNPBase 3.0 database (http://rsnp3.psych.ac.cn), which provides human SNP-related regulatory elements, element-gene pairs and SNP-based regulatory networks. This database is the updated version of the SNP regulatory annotation database rSNPBase and rVarBase. In comparison to the last two versions, there are both structural and data adjustments in rSNPBase 3.0: (i) The most significant new feature is the expansion of analysis scope from SNP-related regulatory elements to include regulatory element–target gene pairs (E–G pairs), therefore it can provide SNP-based gene regulatory networks. (ii) Web function was modified according to data content and a new network search module is provided in the rSNPBase 3.0 in addition to the previous regulatory SNP (rSNP) search module. The two search modules support data query for detailed information (related-elements, element-gene pairs, and other extended annotations) on specific SNPs and SNP-related graphic networks constructed by interacting transcription factors (TFs), miRNAs and genes. (3) The type of regulatory elements was modified and enriched. To our best knowledge, the updated rSNPBase 3.0 is the first data tool supports SNP functional analysis from a regulatory network prospective, it will provide both a comprehensive understanding and concrete guidance for SNP-related regulatory studies. PMID:29140525
Database Resources of the BIG Data Center in 2018

PubMed Central

Xu, Xingjian; Hao, Lili; Zhu, Junwei; Tang, Bixia; Zhou, Qing; Song, Fuhai; Chen, Tingting; Zhang, Sisi; Dong, Lili; Lan, Li; Wang, Yanqing; Sang, Jian; Hao, Lili; Liang, Fang; Cao, Jiabao; Liu, Fang; Liu, Lin; Wang, Fan; Ma, Yingke; Xu, Xingjian; Zhang, Lijuan; Chen, Meili; Tian, Dongmei; Li, Cuiping; Dong, Lili; Du, Zhenglin; Yuan, Na; Zeng, Jingyao; Zhang, Zhewen; Wang, Jinyue; Shi, Shuo; Zhang, Yadong; Pan, Mengyu; Tang, Bixia; Zou, Dong; Song, Shuhui; Sang, Jian; Xia, Lin; Wang, Zhennan; Li, Man; Cao, Jiabao; Niu, Guangyi; Zhang, Yang; Sheng, Xin; Lu, Mingming; Wang, Qi; Xiao, Jingfa; Zou, Dong; Wang, Fan; Hao, Lili; Liang, Fang; Li, Mengwei; Sun, Shixiang; Zou, Dong; Li, Rujiao; Yu, Chunlei; Wang, Guangyu; Sang, Jian; Liu, Lin; Li, Mengwei; Li, Man; Niu, Guangyi; Cao, Jiabao; Sun, Shixiang; Xia, Lin; Yin, Hongyan; Zou, Dong; Xu, Xingjian; Ma, Lina; Chen, Huanxin; Sun, Yubin; Yu, Lei; Zhai, Shuang; Sun, Mingyuan; Zhang, Zhang; Zhao, Wenming; Xiao, Jingfa; Bao, Yiming; Song, Shuhui; Hao, Lili; Li, Rujiao; Ma, Lina; Sang, Jian; Wang, Yanqing; Tang, Bixia; Zou, Dong; Wang, Fan

2018-01-01

Abstract The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides freely open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. PMID:29036542
Measurement tools of resource use and quality of life in clinical trials for dementia or cognitive impairment interventions: protocol for a scoping review.

PubMed

Yang, Fan; Dawes, Piers; Leroi, Iracema; Gannon, Brenda

2017-01-26

Dementia and cognitive impairment could severely impact patients' life and bring heavy burden to patients, caregivers and societies. Some interventions are suggested for the older patients with these conditions to help them live well, but economic evaluation is needed to assess the cost-effectiveness of these interventions. Trial-based economic evaluation is an ideal method; however, little is known about the tools used to collect data of resource use and quality of life alongside the trials. Therefore, the aim of this review is to identify and describe the resource use and quality of life instruments in clinical trials of interventions for older patients with dementia or cognitive impairment. We will perform a search in main electronic databases (Ovid MEDLINE, PsycINFO, EMBASE, CINAHL, Cochrane Databases of Systematic Reviews, Web of Science and Scopus) using the key terms or their synonyms: older, dementia, cognitive impairment, cost, quality of life, intervention and tools. After removing duplicates, two independent reviewers will screen each entry for eligibility, initially by title and abstract, then by full-text. A hand search of the references of included articles and general search, e.g. Google Scholar, will also be conducted to identify potential relevant studies. All disagreements will be resolved by discussion or consultation with a third reviewer if necessary. Data analysis will be completed and reported in a narrative review. This review will identify the instruments used in clinical trials to collect resource use and quality of life data for dementia or cognitive impairment interventions. This will help to guide the study design of future trial-based economic evaluation of these interventions. PROSPERO CRD42016038495.
The effects of applying information technology on job empowerment dimensions.

PubMed

Ajami, Sima; Arab-Chadegani, Raziyeh

2014-01-01

Information Technology (IT) is known as a valuable tool for information dissemination. Today, information communication technology can be used as a powerful tool to improve employees' quality and efficiency. The increasing development of technology-based tools and their adaptation speed with human requirements has led to a new form of the learning environment and creative, active and inclusive interaction. These days, information is one of the most important power resources in every organization and accordingly, acquiring information, especially central or strategic one can help organizations to build a power base and influence others. The aim of this study was to identify the most important criteria in job empowerment using IT and also the advantages of assessing empowerment. This study was a narrative review. The literature was searched on databases and journals of Springer, Proquest, PubMed, science direct and scientific information database) with keywords including IT, empowerment and employees in the searching areas of titles, keywords, abstracts and full texts. The preliminary search resulted in 85 articles, books and conference proceedings in which published between 1983 and 2013 during July 2013. After a careful analysis of the content of each paper, a total of 40 papers and books were selected based on their relevancy. According to Ardalan Model IT plays a significant role in the fast data collection, global and fast access to a broad range of health information, a quick evaluation of information, better communication among health experts and more awareness through access to various information sources. IT leads to a better performance accompanied by higher efficiency in service providing all of which will cause more satisfaction from fast and high-quality services.
The effects of applying information technology on job empowerment dimensions

PubMed Central

Ajami, Sima; Arab-Chadegani, Raziyeh

2014-01-01

Information Technology (IT) is known as a valuable tool for information dissemination. Today, information communication technology can be used as a powerful tool to improve employees’ quality and efficiency. The increasing development of technology-based tools and their adaptation speed with human requirements has led to a new form of the learning environment and creative, active and inclusive interaction. These days, information is one of the most important power resources in every organization and accordingly, acquiring information, especially central or strategic one can help organizations to build a power base and influence others. The aim of this study was to identify the most important criteria in job empowerment using IT and also the advantages of assessing empowerment. This study was a narrative review. The literature was searched on databases and journals of Springer, Proquest, PubMed, science direct and scientific information database) with keywords including IT, empowerment and employees in the searching areas of titles, keywords, abstracts and full texts. The preliminary search resulted in 85 articles, books and conference proceedings in which published between 1983 and 2013 during July 2013. After a careful analysis of the content of each paper, a total of 40 papers and books were selected based on their relevancy. According to Ardalan Model IT plays a significant role in the fast data collection, global and fast access to a broad range of health information, a quick evaluation of information, better communication among health experts and more awareness through access to various information sources. IT leads to a better performance accompanied by higher efficiency in service providing all of which will cause more satisfaction from fast and high-quality services. PMID:25250350
GEM-TREND: a web tool for gene expression data mining toward relevant network discovery

PubMed Central

Feng, Chunlai; Araki, Michihiro; Kunimoto, Ryo; Tamon, Akiko; Makiguchi, Hiroki; Niijima, Satoshi; Tsujimoto, Gozoh; Okuno, Yasushi

2009-01-01

Background DNA microarray technology provides us with a first step toward the goal of uncovering gene functions on a genomic scale. In recent years, vast amounts of gene expression data have been collected, much of which are available in public databases, such as the Gene Expression Omnibus (GEO). To date, most researchers have been manually retrieving data from databases through web browsers using accession numbers (IDs) or keywords, but gene-expression patterns are not considered when retrieving such data. The Connectivity Map was recently introduced to compare gene expression data by introducing gene-expression signatures (represented by a set of genes with up- or down-regulated labels according to their biological states) and is available as a web tool for detecting similar gene-expression signatures from a limited data set (approximately 7,000 expression profiles representing 1,309 compounds). In order to support researchers to utilize the public gene expression data more effectively, we developed a web tool for finding similar gene expression data and generating its co-expression networks from a publicly available database. Results GEM-TREND, a web tool for searching gene expression data, allows users to search data from GEO using gene-expression signatures or gene expression ratio data as a query and retrieve gene expression data by comparing gene-expression pattern between the query and GEO gene expression data. The comparison methods are based on the nonparametric, rank-based pattern matching approach of Lamb et al. (Science 2006) with the additional calculation of statistical significance. The web tool was tested using gene expression ratio data randomly extracted from the GEO and with in-house microarray data, respectively. The results validated the ability of GEM-TREND to retrieve gene expression entries biologically related to a query from GEO. For further analysis, a network visualization interface is also provided, whereby genes and gene annotations are dynamically linked to external data repositories. Conclusion GEM-TREND was developed to retrieve gene expression data by comparing query gene-expression pattern with those of GEO gene expression data. It could be a very useful resource for finding similar gene expression profiles and constructing its gene co-expression networks from a publicly available database. GEM-TREND was designed to be user-friendly and is expected to support knowledge discovery. GEM-TREND is freely available at . PMID:19728865
GEM-TREND: a web tool for gene expression data mining toward relevant network discovery.

PubMed

Feng, Chunlai; Araki, Michihiro; Kunimoto, Ryo; Tamon, Akiko; Makiguchi, Hiroki; Niijima, Satoshi; Tsujimoto, Gozoh; Okuno, Yasushi

2009-09-03

DNA microarray technology provides us with a first step toward the goal of uncovering gene functions on a genomic scale. In recent years, vast amounts of gene expression data have been collected, much of which are available in public databases, such as the Gene Expression Omnibus (GEO). To date, most researchers have been manually retrieving data from databases through web browsers using accession numbers (IDs) or keywords, but gene-expression patterns are not considered when retrieving such data. The Connectivity Map was recently introduced to compare gene expression data by introducing gene-expression signatures (represented by a set of genes with up- or down-regulated labels according to their biological states) and is available as a web tool for detecting similar gene-expression signatures from a limited data set (approximately 7,000 expression profiles representing 1,309 compounds). In order to support researchers to utilize the public gene expression data more effectively, we developed a web tool for finding similar gene expression data and generating its co-expression networks from a publicly available database. GEM-TREND, a web tool for searching gene expression data, allows users to search data from GEO using gene-expression signatures or gene expression ratio data as a query and retrieve gene expression data by comparing gene-expression pattern between the query and GEO gene expression data. The comparison methods are based on the nonparametric, rank-based pattern matching approach of Lamb et al. (Science 2006) with the additional calculation of statistical significance. The web tool was tested using gene expression ratio data randomly extracted from the GEO and with in-house microarray data, respectively. The results validated the ability of GEM-TREND to retrieve gene expression entries biologically related to a query from GEO. For further analysis, a network visualization interface is also provided, whereby genes and gene annotations are dynamically linked to external data repositories. GEM-TREND was developed to retrieve gene expression data by comparing query gene-expression pattern with those of GEO gene expression data. It could be a very useful resource for finding similar gene expression profiles and constructing its gene co-expression networks from a publicly available database. GEM-TREND was designed to be user-friendly and is expected to support knowledge discovery. GEM-TREND is freely available at http://cgs.pharm.kyoto-u.ac.jp/services/network.
Content and Design Features of Academic Health Sciences Libraries' Home Pages.

PubMed

McConnaughy, Rozalynd P; Wilson, Steven P

2018-01-01

The goal of this content analysis was to identify commonly used content and design features of academic health sciences library home pages. After developing a checklist, data were collected from 135 academic health sciences library home pages. The core components of these library home pages included a contact phone number, a contact email address, an Ask-a-Librarian feature, the physical address listed, a feedback/suggestions link, subject guides, a discovery tool or database-specific search box, multimedia, social media, a site search option, a responsive web design, and a copyright year or update date.

Information System through ANIS at CeSAM

NASA Astrophysics Data System (ADS)

Moreau, C.; Agneray, F.; Gimenez, S.

2015-09-01

ANIS (AstroNomical Information System) is a web generic tool developed at CeSAM to facilitate and standardize the implementation of astronomical data of various kinds through private and/or public dedicated Information Systems. The architecture of ANIS is composed of a database server which contains the project data, a web user interface template which provides high level services (search, extract and display imaging and spectroscopic data using a combination of criteria, an object list, a sql query module or a cone search interfaces), a framework composed of several packages, and a metadata database managed by a web administration entity. The process to implement a new ANIS instance at CeSAM is easy and fast : the scientific project has to submit data or a data secure access, the CeSAM team installs the new instance (web interface template and the metadata database), and the project administrator can configure the instance with the web ANIS-administration entity. Currently, the CeSAM offers through ANIS a web access to VO compliant Information Systems for different projects (HeDaM, HST-COSMOS, CFHTLS-ZPhots, ExoDAT,...).
BIG: a large-scale data integration tool for renal physiology.

PubMed

Zhao, Yue; Yang, Chin-Rang; Raghuram, Viswanathan; Parulekar, Jaya; Knepper, Mark A

2016-10-01

Due to recent advances in high-throughput techniques, we and others have generated multiple proteomic and transcriptomic databases to describe and quantify gene expression, protein abundance, or cellular signaling on the scale of the whole genome/proteome in kidney cells. The existence of so much data from diverse sources raises the following question: "How can researchers find information efficiently for a given gene product over all of these data sets without searching each data set individually?" This is the type of problem that has motivated the "Big-Data" revolution in Data Science, which has driven progress in fields such as marketing. Here we present an online Big-Data tool called BIG (Biological Information Gatherer) that allows users to submit a single online query to obtain all relevant information from all indexed databases. BIG is accessible at http://big.nhlbi.nih.gov/.
New tools for discovery from old databases

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, J.P.

1990-05-01

Very large quantities of information have been accumulated as a result of petroleum exploration and the practice of petroleum geology. New and more powerful methods to build and analyze databases have been developed. The new tools must be tested, and, as quickly as possible, combined with traditional methods to the full advantage of currently limited funds in the search for new and extended hydrocarbon reserves. A recommended combined sequence is (1) database validating, (2) category separating, (3) machine learning, (4) graphic modeling, (5) database filtering, and (6) regression for predicting. To illustrate this procedure, a database from the Railroad Commissionmore » of Texas has been analyzed. Clusters of information have been identified to prevent apples and oranges problems from obscuring the conclusions. Artificial intelligence has checked the database for potentially invalid entries and has identified rules governing the relationship between factors, which can be numeric or nonnumeric (words), or both. Graphic 3-Dimensional modeling has clarified relationships. Database filtering has physically separated the integral parts of the database, which can then be run through the sequence again, increasing the precision. Finally, regressions have been run on separated clusters giving equations, which can be used with confidence in making predictions. Advances in computer systems encourage the learning of much more from past records, and reduce the danger of prejudiced decisions. Soon there will be giant strides beyond current capabilities to the advantage of those who are ready for them.« less
EasyKSORD: A Platform of Keyword Search Over Relational Databases

NASA Astrophysics Data System (ADS)

Peng, Zhaohui; Li, Jing; Wang, Shan

Keyword Search Over Relational Databases (KSORD) enables casual users to use keyword queries (a set of keywords) to search relational databases just like searching the Web, without any knowledge of the database schema or any need of writing SQL queries. Based on our previous work, we design and implement a novel KSORD platform named EasyKSORD for users and system administrators to use and manage different KSORD systems in a novel and simple manner. EasyKSORD supports advanced queries, efficient data-graph-based search engines, multiform result presentations, and system logging and analysis. Through EasyKSORD, users can search relational databases easily and read search results conveniently, and system administrators can easily monitor and analyze the operations of KSORD and manage KSORD systems much better.
TryTransDB: A web-based resource for transport proteins in Trypanosomatidae.

PubMed

Sonar, Krushna; Kabra, Ritika; Singh, Shailza

2018-03-12

TryTransDB is a web-based resource that stores transport protein data which can be retrieved using a standalone BLAST tool. We have attempted to create an integrated database that can be a one-stop shop for the researchers working with transport proteins of Trypanosomatidae family. TryTransDB (Trypanosomatidae Transport Protein Database) is a web based comprehensive resource that can fire a BLAST search against most of the transport protein sequences (protein and nucleotide) from Trypanosomatidae family organisms. This web resource further allows to compute a phylogenetic tree by performing multiple sequence alignment (MSA) using CLUSTALW suite embedded in it. Also, cross-linking to other databases helps in gathering more information for a certain transport protein in a single website.
Lynx: a database and knowledge extraction engine for integrative medicine.

PubMed

Sulakhe, Dinanath; Balasubramanian, Sandhya; Xie, Bingqing; Feng, Bo; Taylor, Andrew; Wang, Sheng; Berrocal, Eduardo; Dave, Utpal; Xu, Jinbo; Börnigen, Daniela; Gilliam, T Conrad; Maltsev, Natalia

2014-01-01

We have developed Lynx (http://lynx.ci.uchicago.edu)--a web-based database and a knowledge extraction engine, supporting annotation and analysis of experimental data and generation of weighted hypotheses on molecular mechanisms contributing to human phenotypes and disorders of interest. Its underlying knowledge base (LynxKB) integrates various classes of information from >35 public databases and private collections, as well as manually curated data from our group and collaborators. Lynx provides advanced search capabilities and a variety of algorithms for enrichment analysis and network-based gene prioritization to assist the user in extracting meaningful knowledge from LynxKB and experimental data, whereas its service-oriented architecture provides public access to LynxKB and its analytical tools via user-friendly web services and interfaces.
Tripal v1.1: a standards-based toolkit for construction of online genetic and genomic databases.

PubMed

Sanderson, Lacey-Anne; Ficklin, Stephen P; Cheng, Chun-Huai; Jung, Sook; Feltus, Frank A; Bett, Kirstin E; Main, Dorrie

2013-01-01

Tripal is an open-source freely available toolkit for construction of online genomic and genetic databases. It aims to facilitate development of community-driven biological websites by integrating the GMOD Chado database schema with Drupal, a popular website creation and content management software. Tripal provides a suite of tools for interaction with a Chado database and display of content therein. The tools are designed to be generic to support the various ways in which data may be stored in Chado. Previous releases of Tripal have supported organisms, genomic libraries, biological stocks, stock collections and genomic features, their alignments and annotations. Also, Tripal and its extension modules provided loaders for commonly used file formats such as FASTA, GFF, OBO, GAF, BLAST XML, KEGG heir files and InterProScan XML. Default generic templates were provided for common views of biological data, which could be customized using an open Application Programming Interface to change the way data are displayed. Here, we report additional tools and functionality that are part of release v1.1 of Tripal. These include (i) a new bulk loader that allows a site curator to import data stored in a custom tab delimited format; (ii) full support of every Chado table for Drupal Views (a powerful tool allowing site developers to construct novel displays and search pages); (iii) new modules including 'Feature Map', 'Genetic', 'Publication', 'Project', 'Contact' and the 'Natural Diversity' modules. Tutorials, mailing lists, download and set-up instructions, extension modules and other documentation can be found at the Tripal website located at http://tripal.info. DATABASE URL: http://tripal.info/.
Tripal v1.1: a standards-based toolkit for construction of online genetic and genomic databases

PubMed Central

Sanderson, Lacey-Anne; Ficklin, Stephen P.; Cheng, Chun-Huai; Jung, Sook; Feltus, Frank A.; Bett, Kirstin E.; Main, Dorrie

2013-01-01

Tripal is an open-source freely available toolkit for construction of online genomic and genetic databases. It aims to facilitate development of community-driven biological websites by integrating the GMOD Chado database schema with Drupal, a popular website creation and content management software. Tripal provides a suite of tools for interaction with a Chado database and display of content therein. The tools are designed to be generic to support the various ways in which data may be stored in Chado. Previous releases of Tripal have supported organisms, genomic libraries, biological stocks, stock collections and genomic features, their alignments and annotations. Also, Tripal and its extension modules provided loaders for commonly used file formats such as FASTA, GFF, OBO, GAF, BLAST XML, KEGG heir files and InterProScan XML. Default generic templates were provided for common views of biological data, which could be customized using an open Application Programming Interface to change the way data are displayed. Here, we report additional tools and functionality that are part of release v1.1 of Tripal. These include (i) a new bulk loader that allows a site curator to import data stored in a custom tab delimited format; (ii) full support of every Chado table for Drupal Views (a powerful tool allowing site developers to construct novel displays and search pages); (iii) new modules including ‘Feature Map’, ‘Genetic’, ‘Publication’, ‘Project’, ‘Contact’ and the ‘Natural Diversity’ modules. Tutorials, mailing lists, download and set-up instructions, extension modules and other documentation can be found at the Tripal website located at http://tripal.info. Database URL: http://tripal.info/ PMID:24163125
Using SQL Databases for Sequence Similarity Searching and Analysis.

PubMed

Pearson, William R; Mackey, Aaron J

2017-09-13

Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 by John Wiley & Sons, Inc. Copyright © 2017 John Wiley & Sons, Inc.
HTAPP: High-Throughput Autonomous Proteomic Pipeline

PubMed Central

Yu, Kebing; Salomon, Arthur R.

2011-01-01

Recent advances in the speed and sensitivity of mass spectrometers and in analytical methods, the exponential acceleration of computer processing speeds, and the availability of genomic databases from an array of species and protein information databases have led to a deluge of proteomic data. The development of a lab-based automated proteomic software platform for the automated collection, processing, storage, and visualization of expansive proteomic datasets is critically important. The high-throughput autonomous proteomic pipeline (HTAPP) described here is designed from the ground up to provide critically important flexibility for diverse proteomic workflows and to streamline the total analysis of a complex proteomic sample. This tool is comprised of software that controls the acquisition of mass spectral data along with automation of post-acquisition tasks such as peptide quantification, clustered MS/MS spectral database searching, statistical validation, and data exploration within a user-configurable lab-based relational database. The software design of HTAPP focuses on accommodating diverse workflows and providing missing software functionality to a wide range of proteomic researchers to accelerate the extraction of biological meaning from immense proteomic data sets. Although individual software modules in our integrated technology platform may have some similarities to existing tools, the true novelty of the approach described here is in the synergistic and flexible combination of these tools to provide an integrated and efficient analysis of proteomic samples. PMID:20336676
Towards Monitoring-as-a-service for Scientific Computing Cloud applications using the ElasticSearch ecosystem

NASA Astrophysics Data System (ADS)

Bagnasco, S.; Berzano, D.; Guarise, A.; Lusso, S.; Masera, M.; Vallero, S.

2015-12-01

The INFN computing centre in Torino hosts a private Cloud, which is managed with the OpenNebula cloud controller. The infrastructure offers Infrastructure-as-a-Service (IaaS) and Platform-as-a-Service (PaaS) services to different scientific computing applications. The main stakeholders of the facility are a grid Tier-2 site for the ALICE collaboration at LHC, an interactive analysis facility for the same experiment and a grid Tier-2 site for the BESIII collaboration, plus an increasing number of other small tenants. The dynamic allocation of resources to tenants is partially automated. This feature requires detailed monitoring and accounting of the resource usage. We set up a monitoring framework to inspect the site activities both in terms of IaaS and applications running on the hosted virtual instances. For this purpose we used the ElasticSearch, Logstash and Kibana (ELK) stack. The infrastructure relies on a MySQL database back-end for data preservation and to ensure flexibility to choose a different monitoring solution if needed. The heterogeneous accounting information is transferred from the database to the ElasticSearch engine via a custom Logstash plugin. Each use-case is indexed separately in ElasticSearch and we setup a set of Kibana dashboards with pre-defined queries in order to monitor the relevant information in each case. For the IaaS metering, we developed sensors for the OpenNebula API. The IaaS level information gathered through the API is sent to the MySQL database through an ad-hoc developed RESTful web service. Moreover, we have developed a billing system for our private Cloud, which relies on the RabbitMQ message queue for asynchronous communication to the database and on the ELK stack for its graphical interface. The Italian Grid accounting framework is also migrating to a similar set-up. Concerning the application level, we used the Root plugin TProofMonSenderSQL to collect accounting data from the interactive analysis facility. The BESIII virtual instances used to be monitored with Zabbix, as a proof of concept we also retrieve the information contained in the Zabbix database. In this way we have achieved a uniform monitoring interface for both the IaaS and the scientific applications, mostly leveraging off-the-shelf tools. At present, we are working to define a model for monitoring-as-a-service, based on the tools described above, which the Cloud tenants can easily configure to suit their specific needs.
Identification and Analysis of Jasmonate Pathway Genes in Coffea canephora (Robusta Coffee) by In Silico Approach.

PubMed

Bharathi, Kosaraju; Sreenath, H L

2017-07-01

Coffea canephora is the commonly cultivated coffee species in the world along with Coffea arabica . Different pests and pathogens affect the production and quality of the coffee. Jasmonic acid (JA) is a plant hormone which plays an important role in plants growth, development, and defense mechanisms, particularly against insect pests. The key enzymes involved in the production of JA are lipoxygenase, allene oxide synthase, allene oxide cyclase, and 12-oxo-phytodienoic reductase. There is no report on the genes involved in JA pathway in coffee plants. We made an attempt to identify and analyze the genes coding for these enzymes in C. canephora . First, protein sequences of jasmonate pathway genes from model plant Arabidopsis thaliana were identified in the National Center for Biotechnology Information (NCBI) database. These protein sequences were used to search the web-based database Coffee Genome Hub to identify homologous protein sequences in C. canephora genome using Basic Local Alignment Search Tool (BLAST). Homologous protein sequences for key genes were identified in the C. canephora genome database. Protein sequences of the top matches were in turn used to search in NCBI database using BLAST tool to confirm the identity of the selected proteins and to identify closely related genes in species. The protein sequences from C. canephora database and the top matches in NCBI were aligned, and phylogenetic trees were constructed using MEGA6 software and identified the genetic distance of the respective genes. The study identified the four key genes of JA pathway in C. canephora , confirming the conserved nature of the pathway in coffee. The study expected to be useful to further explore the defense mechanisms of coffee plants. JA is a plant hormone that plays an important role in plant defense against insect pests. Genes coding for the 4 key enzymes involved in the production of JA viz., LOX, AOS, AOC, and OPR are identified in C. canephora (robusta coffee) by bioinformatic approaches confirming the conserved nature of the pathway in coffee. The findings are useful to understand the defense mechanisms of C. canephora and coffee breeding in the long run. JA is a plant hormone that plays an important role in plant defense against insect pests. Genes coding for the 4 key enzymes involved in the production of JA viz., LOX, AOS, AOC and OPR were identified and analyzed in C. canephora (robusta coffee) by in silico approach. The study has confirmed the conserved nature of JA pathway in coffee; the findings are useful to further explore the defense mechanisms of coffee plants. Abbreviations used: C. canephora : Coffea canephora ; C. arabica : Coffea arabica ; JA: Jasmonic acid; CGH: Coffee Genome Hub; NCBI: National Centre for Biotechnology Information; BLAST: Basic Local Alignment Search Tool; A. thaliana : Arabidopsis thaliana ; LOX: Lipoxygenase, AOS: Allene oxide synthase; AOC: Allene oxide cyclase; OPR: 12 oxo phytodienoic reductase.
Simple re-instantiation of small databases using cloud computing.

PubMed

Tan, Tin Wee; Xie, Chao; De Silva, Mark; Lim, Kuan Siong; Patro, C Pawan K; Lim, Shen Jean; Govindarajan, Kunde Ramamoorthy; Tong, Joo Chuan; Choo, Khar Heng; Ranganathan, Shoba; Khan, Asif M

2013-01-01

Small bioinformatics databases, unlike institutionally funded large databases, are vulnerable to discontinuation and many reported in publications are no longer accessible. This leads to irreproducible scientific work and redundant effort, impeding the pace of scientific progress. We describe a Web-accessible system, available online at http://biodb100.apbionet.org, for archival and future on demand re-instantiation of small databases within minutes. Depositors can rebuild their databases by downloading a Linux live operating system (http://www.bioslax.com), preinstalled with bioinformatics and UNIX tools. The database and its dependencies can be compressed into an ".lzm" file for deposition. End-users can search for archived databases and activate them on dynamically re-instantiated BioSlax instances, run as virtual machines over the two popular full virtualization standard cloud-computing platforms, Xen Hypervisor or vSphere. The system is adaptable to increasing demand for disk storage or computational load and allows database developers to use the re-instantiated databases for integration and development of new databases. Herein, we demonstrate that a relatively inexpensive solution can be implemented for archival of bioinformatics databases and their rapid re-instantiation should the live databases disappear.
Simple re-instantiation of small databases using cloud computing

PubMed Central

2013-01-01

Background Small bioinformatics databases, unlike institutionally funded large databases, are vulnerable to discontinuation and many reported in publications are no longer accessible. This leads to irreproducible scientific work and redundant effort, impeding the pace of scientific progress. Results We describe a Web-accessible system, available online at http://biodb100.apbionet.org, for archival and future on demand re-instantiation of small databases within minutes. Depositors can rebuild their databases by downloading a Linux live operating system (http://www.bioslax.com), preinstalled with bioinformatics and UNIX tools. The database and its dependencies can be compressed into an ".lzm" file for deposition. End-users can search for archived databases and activate them on dynamically re-instantiated BioSlax instances, run as virtual machines over the two popular full virtualization standard cloud-computing platforms, Xen Hypervisor or vSphere. The system is adaptable to increasing demand for disk storage or computational load and allows database developers to use the re-instantiated databases for integration and development of new databases. Conclusions Herein, we demonstrate that a relatively inexpensive solution can be implemented for archival of bioinformatics databases and their rapid re-instantiation should the live databases disappear. PMID:24564380
The South London and Maudsley NHS Foundation Trust Biomedical Research Centre (SLAM BRC) case register: development and descriptive data.

PubMed

Stewart, Robert; Soremekun, Mishael; Perera, Gayan; Broadbent, Matthew; Callard, Felicity; Denis, Mike; Hotopf, Matthew; Thornicroft, Graham; Lovestone, Simon

2009-08-12

Case registers have been used extensively in mental health research. Recent developments in electronic medical records, and in computer software to search and analyse these in anonymised format, have the potential to revolutionise this research tool. We describe the development of the South London and Maudsley NHS Foundation Trust (SLAM) Biomedical Research Centre (BRC) Case Register Interactive Search tool (CRIS) which allows research-accessible datasets to be derived from SLAM, the largest provider of secondary mental healthcare in Europe. All clinical data, including free text, are available for analysis in the form of anonymised datasets. Development involved both the building of the system and setting in place the necessary security (with both functional and procedural elements). Descriptive data are presented for the Register database as of October 2008. The database at that point included 122,440 cases, 35,396 of whom were receiving active case management under the Care Programme Approach. In terms of gender and ethnicity, the database was reasonably representative of the source population. The most common assigned primary diagnoses were within the ICD mood disorders (n = 12,756) category followed by schizophrenia and related disorders (8158), substance misuse (7749), neuroses (7105) and organic disorders (6414). The SLAM BRC Case Register represents a 'new generation' of this research design, built on a long-running system of fully electronic clinical records and allowing in-depth secondary analysis of both numerical, string and free text data, whilst preserving anonymity through technical and procedural safeguards.
Situational Awareness Geospatial Application (iSAGA)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sher, Benjamin

Situational Awareness Geospatial Application (iSAGA) is a geospatial situational awareness software tool that uses an algorithm to extract location data from nearly any internet-based, or custom data source and display it geospatially; allows user-friendly conduct of spatial analysis using custom-developed tools; searches complex Geographic Information System (GIS) databases and accesses high resolution imagery. iSAGA has application at the federal, state and local levels of emergency response, consequence management, law enforcement, emergency operations and other decision makers as a tool to provide complete, visual, situational awareness using data feeds and tools selected by the individual agency or organization. Feeds may bemore » layered and custom tools developed to uniquely suit each subscribing agency or organization. iSAGA may similarly be applied to international agencies and organizations.« less
M2Lite: An Open-source, Light-weight, Pluggable and Fast Proteome Discoverer MSF to mzIdentML Tool.

PubMed

Aiyetan, Paul; Zhang, Bai; Chen, Lily; Zhang, Zhen; Zhang, Hui

2014-04-28

Proteome Discoverer is one of many tools used for protein database search and peptide to spectrum assignment in mass spectrometry-based proteomics. However, the inadequacy of conversion tools makes it challenging to compare and integrate its results to those of other analytical tools. Here we present M2Lite, an open-source, light-weight, easily pluggable and fast conversion tool. M2Lite converts proteome discoverer derived MSF files to the proteomics community defined standard - the mzIdentML file format. M2Lite's source code is available as open-source at https://bitbucket.org/paiyetan/m2lite/src and its compiled binaries and documentation can be freely downloaded at https://bitbucket.org/paiyetan/m2lite/downloads.
The application of health literacy measurement tools (collective or individual domains) in assessing chronic disease management: a systematic review protocol.

PubMed

Shum, Jessica; Poureslami, Iraj; Doyle-Waters, Mary M; FitzGerald, J Mark

2016-06-07

The term "health literacy" (HL) was first coined in 1974, and its most common definition is currently defined as a person's ability to access, understand, evaluate, communicate, and use health information to make decisions for one's health. The previous systematic reviews assessing the effect of existing HL measurement tools on health outcomes have simply searched for the term "health literacy" only to identify measures instead of incorporating either one or more of the five domains in their search. Furthermore, as the domain "use" is fairly new, few studies have actually assessed this domain. In this protocol, we propose to identify and assess HL measures that applied the mentioned five domains either collectively or individually in assessing chronic disease management, in particular for asthma and chronic obstructive pulmonary disease (COPD). The ultimate goal is to provide recommendations towards the development and validation of a patient-centric HL measurement tool for the two diseases. A comprehensive, electronic search will be conducted to identify potential studies dating from 1974 to 2016 from databases such as Embase, MEDLINE, CINAHL, Cochrane Central Register of Controlled Trials, Web of Science, ERIC, PsycINFO, and HAPI. Database searches will be complemented with grey literature. Two independent reviewers will perform tool selection, study selection, data extraction, and quality assessment using pre-designed study forms. Any disagreement will be resolved through discussion or a third reviewer. Only studies that have developed or validated HL measurement tools (including one or more of the five domains mentioned above) among asthma and COPD patients will be included. Information collected from the studies will include instrument details such as versions, purpose, underlying constructs, administration, mapping of items onto the five domains, internal structure, scoring, response processes, standard error of measurement (SEM), correlation with other variables, clinically important difference, and item response theory (IRT)-based analyses. The identified strengths and weaknesses as well as reliability, validity, responsiveness, and interpretability of the tools from the validation studies will also be assessed using the COSMIN checklist. A synthesis will be presented for all tools in relation to asthma and COPD management. This systematic review will be one of several key contributions central to a global evidence-based strategy funded by the Canadian Institutes of Health Research (CIHR) for measuring HL in patients with asthma and COPD, highlighting the gaps and inconsistencies of domains between existing tools. The knowledge generated from this review will provide the team information on (1) the five-domain model and cross domains, (2) underlying constructs, (3) tool length, (4) time for completion, (5) reading level, and (6) format for development of the proposed tool. Other aspects of the published validation studies such as reliability coefficients, SEM, correlations with other variables, clinically important difference, and IRT-based analyses will be important for comparison purposes when testing, interpreting, and validating the developed tool. PROSPERO CRD42016037532.
Still Virtual After All These Years: Recent Developments in the Virtual Solar Observatory

NASA Astrophysics Data System (ADS)

Gurman, J. B.; Bogart, R. S.; Davey, A. R.; Hill, F.; Martens, P. C.; Zarro, D. M.; Team, T. v.

2008-05-01

While continuing to add access to data from new missions, including Hinode and STEREO, the Virtual Solar Observatory is also being enhanced as a research tool by the addition of new features such as the unified representation of catalogs and event lists (to allow joined searches in two or more catalogs) and workable representation and manipulation of large numbers of search results (as are expected from the Solar Dynamics Observatory database). Working with our RHESSI colleagues, we have also been able to improve the performance of IDL-callable vso_search and vso_get functions, to the point that use of those routines is a practical alternative to reproducing large subsets of mission data on one's own LAN.
Still Virtual After All These Years: Recent Developments in the Virtual Solar Observatory

NASA Technical Reports Server (NTRS)

Gurman, Joseph B.; Bogart; Davey; Hill; Masters; Zarro

2008-01-01

While continuing to add access to data from new missions, including Hinode and STEREO, the Virtual Solar Observatory is also being enhanced as a research tool by the addition of new features such as the unified representation of catalogs and event lists (to allow joined searches in two or more catalogs) and workable representation and manipulation of large numbers of search results (as are expected from the Solar Dynamics Observatory database). Working with our RHESSI colleagues, we have also been able to improve the performance of IDL-callable vso_search and vso_get functions, to the point that use of those routines is a practical alternative to reproducing large subsets of mission data on one's own LAN.

Research resource: Update and extension of a glycoprotein hormone receptors web application.

PubMed

Kreuchwig, Annika; Kleinau, Gunnar; Kreuchwig, Franziska; Worth, Catherine L; Krause, Gerd

2011-04-01

The SSFA-GPHR (Sequence-Structure-Function-Analysis of Glycoprotein Hormone Receptors) database provides a comprehensive set of mutation data for the glycoprotein hormone receptors (covering the lutropin, the FSH, and the TSH receptors). Moreover, it provides a platform for comparison and investigation of these homologous receptors and helps in understanding protein malfunctions associated with several diseases. Besides extending the data set (> 1100 mutations), the database has been completely redesigned and several novel features and analysis tools have been added to the web site. These tools allow the focused extraction of semiquantitative mutant data from the GPHR subtypes and different experimental approaches. Functional and structural data of the GPHRs are now linked interactively at the web interface, and new tools for data visualization (on three-dimensional protein structures) are provided. The interpretation of functional findings is supported by receptor morphings simulating intramolecular changes during the activation process, which thus help to trace the potential function of each amino acid and provide clues to the local structural environment, including potentially relocated spatial counterpart residues. Furthermore, double and triple mutations are newly included to allow the analysis of their functional effects related to their spatial interrelationship in structures or homology models. A new important feature is the search option and data visualization by interactive and user-defined snake-plots. These new tools allow fast and easy searches for specific functional data and thereby give deeper insights in the mechanisms of hormone binding, signal transduction, and signaling regulation. The web application "Sequence-Structure-Function-Analysis of GPHRs" is accessible on the internet at http://www.ssfa-gphr.de/.
A scoping review of assessment tools for laparoscopic suturing.

PubMed

Bilgic, Elif; Endo, Satoshi; Lebedeva, Ekaterina; Takao, Madoka; McKendy, Katherine M; Watanabe, Yusuke; Feldman, Liane S; Vassiliou, Melina C

2018-05-03

A needs assessment identified a gap in teaching and assessment of laparoscopic suturing (LS) skills. The purpose of this review is to identify assessment tools that were used to assess LS skills, to evaluate validity evidence available, and to provide guidance for selecting the right assessment tool for specific assessment conditions. Bibliographic databases were searched till April 2017. Full-text articles were included if they reported on assessment tools used in the operating room/simulation to (1) assess procedures that require LS or (2) specifically assess LS skills. Forty-two tools were identified, of which 26 were used for assessing LS skills specifically and 26 for procedures that require LS. Tools had the most evidence in internal structure and relationship to other variables, and least in consequences. Through identification and evaluation of assessment tools, the results of this review could be used as a guideline when implementing assessment tools into training programs.
Database resources of the National Center for Biotechnology Information: 2002 update

PubMed Central

Wheeler, David L.; Church, Deanna M.; Lash, Alex E.; Leipe, Detlef D.; Madden, Thomas L.; Pontius, Joan U.; Schuler, Gregory D.; Schriml, Lynn M.; Tatusova, Tatiana A.; Wagner, Lukas; Rapp, Barbara A.

2002-01-01

In addition to maintaining the GenBank nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources that operate on the data in GenBank and a variety of other biological data made available through NCBI’s web site. NCBI data retrieval resources include Entrez, PubMed, LocusLink and the Taxonomy Browser. Data analysis resources include BLAST, Electronic PCR, OrfFinder, RefSeq, UniGene, HomoloGene, Database of Single Nucleotide Polymorphisms (dbSNP), Human Genome Sequencing, Human MapViewer, Human¡VMouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB) and the Conserved Domain Database (CDD). Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. PMID:11752242
Native Health Research Database

MedlinePlus

... Indian Health Board) Welcome to the Native Health Database. Please enter your search terms. Basic Search Advanced ... To learn more about searching the Native Health Database, click here. Tutorial Video The NHD has made ...
Taking structure searches to the next dimension.

PubMed

Schafferhans, Andrea; Rost, Burkhard

2014-07-08

Structure comparisons are now the first step when a new experimental high-resolution protein structure has been determined. In this issue of Structure, Wiederstein and colleagues describe their latest tool for comparing structures, which gives us the unprecedented power to discover crucial structural connections between whole complexes of proteins in the full structural database in real time. Copyright © 2014 Elsevier Ltd. All rights reserved.
Kinetic parameters of cholinesterase interactions with organophosphates: retrieval and comparison tools available through ESTHER database: ESTerases, alpha/beta Hydrolase Enzymes and Relatives.

PubMed

Chatonnet, A; Hotelier, T; Cousin, X

1999-05-14

Cholinesterases are targets for organophosphorus compounds which are used as insecticides, chemical warfare agents and drugs for the treatment of disease such as glaucoma, or parasitic infections. The widespread use of these chemicals explains the growing of this area of research and the ever increasing number of sequences, structures, or biochemical data available. Future advances will depend upon effective management of existing information as well as upon creation of new knowledge. The ESTHER database goal is to facilitate retrieval and comparison of data about structure and function of proteins presenting the alpha/beta hydrolase fold. Protein engineering and in vitro production of enzymes allow direct comparison of biochemical parameters. Kinetic parameters of enzymatic reactions are now included in the database. These parameters can be searched and compared with a table construction tool. ESTHER can be reached through internet (http://www.ensam.inra.fr/cholinesterase). The full database or the specialised X-window Client-server system can be downloaded from our ftp server (ftp://ftp.toulouse.inra.fr./pub/esther). Forms can be used to send updates or corrections directly from the web.
Chesapeake Bay Program Water Quality Database

EPA Pesticide Factsheets

The Chesapeake Information Management System (CIMS), designed in 1996, is an integrated, accessible information management system for the Chesapeake Bay Region. CIMS is an organized, distributed library of information and software tools designed to increase basin-wide public access to Chesapeake Bay information. The information delivered by CIMS includes technical and public information, educational material, environmental indicators, policy documents, and scientific data. Through the use of relational databases, web-based programming, and web-based GIS a large number of Internet resources have been established. These resources include multiple distributed on-line databases, on-demand graphing and mapping of environmental data, and geographic searching tools for environmental information. Baseline monitoring data, summarized data and environmental indicators that document ecosystem status and trends, confirm linkages between water quality, habitat quality and abundance, and the distribution and integrity of biological populations are also available. One of the major features of the CIMS network is the Chesapeake Bay Program's Data Hub, providing users access to a suite of long- term water quality and living resources databases. Chesapeake Bay mainstem and tidal tributary water quality, benthic macroinvertebrates, toxics, plankton, and fluorescence data can be obtained for a network of over 800 monitoring stations.
MPD: a pathogen genome and metagenome database

PubMed Central

Zhang, Tingting; Miao, Jiaojiao; Han, Na; Qiang, Yujun; Zhang, Wen

2018-01-01

Abstract Advances in high-throughput sequencing have led to unprecedented growth in the amount of available genome sequencing data, especially for bacterial genomes, which has been accompanied by a challenge for the storage and management of such huge datasets. To facilitate bacterial research and related studies, we have developed the Mypathogen database (MPD), which provides access to users for searching, downloading, storing and sharing bacterial genomics data. The MPD represents the first pathogenic database for microbial genomes and metagenomes, and currently covers pathogenic microbial genomes (6604 genera, 11 071 species, 41 906 strains) and metagenomic data from host, air, water and other sources (28 816 samples). The MPD also functions as a management system for statistical and storage data that can be used by different organizations, thereby facilitating data sharing among different organizations and research groups. A user-friendly local client tool is provided to maintain the steady transmission of big sequencing data. The MPD is a useful tool for analysis and management in genomic research, especially for clinical Centers for Disease Control and epidemiological studies, and is expected to contribute to advancing knowledge on pathogenic bacteria genomes and metagenomes. Database URL: http://data.mypathogen.org PMID:29917040
The ASTRAL Compendium in 2004

DOE R&D Accomplishments Database

Chandonia, John-Marc; Hon, Gary; Walker, Nigel S.; Lo Conte, Loredana; Koehl, Patrice; Levitt, Michael; Brenner, Steven E.

2003-09-15

The ASTRAL compendium provides several databases and tools to aid in the analysis of protein structures, particularly through the use of their sequences. Partially derived from the SCOP database of protein structure domains, it includes sequences for each domain and other resources useful for studying these sequences and domain structures. The current release of ASTRAL contains 54,745 domains, more than three times as many as the initial release four years ago. ASTRAL has undergone major transformations in the past two years. In addition to several complete updates each year, ASTRAL is now updated on a weekly basis with preliminary classifications of domains from newly released PDB structures. These classifications are available as a stand-alone database, as well as available integrated into other ASTRAL databases such as representative subsets. To enhance the utility of ASTRAL to structural biologists, all SCOP domains are now made available as PDB-style coordinate files as well as sequences. In addition to sequences and representative subsets based on SCOP domains, sequences and subsets based on PDB chains are newly included in ASTRAL. Several search tools have been added to ASTRAL to facilitate retrieval of data by individual users and automated methods.
probeBase—an online resource for rRNA-targeted oligonucleotide probes and primers: new features 2016

PubMed Central

Greuter, Daniel; Loy, Alexander; Horn, Matthias; Rattei, Thomas

2016-01-01

probeBase http://www.probebase.net is a manually maintained and curated database of rRNA-targeted oligonucleotide probes and primers. Contextual information and multiple options for evaluating in silico hybridization performance against the most recent rRNA sequence databases are provided for each oligonucleotide entry, which makes probeBase an important and frequently used resource for microbiology research and diagnostics. Here we present a major update of probeBase, which was last featured in the NAR Database Issue 2007. This update describes a complete remodeling of the database architecture and environment to accommodate computationally efficient access. Improved search functions, sequence match tools and data output now extend the opportunities for finding suitable hierarchical probe sets that target an organism or taxon at different taxonomic levels. To facilitate the identification of complementary probe sets for organisms represented by short rRNA sequence reads generated by amplicon sequencing or metagenomic analysis with next generation sequencing technologies such as Illumina and IonTorrent, we introduce a novel tool that recovers surrogate near full-length rRNA sequences for short query sequences and finds matching oligonucleotides in probeBase. PMID:26586809
Suitability of measurements used to assess mental health outcomes in men and women trafficked for sexual and labour exploitation: a systematic review.

PubMed

Doherty, S; Oram, S; Siriwardhana, C; Abas, M

2016-05-01

Trafficking is a global human rights violation with multiple and complex mental health consequences. Valid and reliable mental health assessment tools are needed to inform health-care provision. We reviewed mental health assessment tools used in research with men and women trafficked for sexual and labour exploitation. We searched nine electronic databases (PsycINFO, Ovid Medline, PubMed, Embase, Assia, the Web of Science, Global Health, Google Scholar, and Open Grey) and hand-searched the reference lists of relevant identified studies. Seven studies were included in this Review. Six of the studies screened for post-traumatic stress disorder, depression, and anxiety; one study screened for harmful use or abuse of alcohol and used a diagnostic tool to assess post-traumatic stress disorder, depression, and anxiety. Two studies included men in their sample population. Although the reported prevalence of mental health problems was high, little information was provided about the validity, reliability, and cultural appropriateness of assessment tools. Further research is needed to determine which assessment tools are culturally appropriate, valid, and reliable for trafficked people. Copyright © 2016 Elsevier Ltd. All rights reserved.
CUDASW++ 3.0: accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions.

PubMed

Liu, Yongchao; Wirawan, Adrianto; Schmidt, Bertil

2013-04-04

The maximal sensitivity for local alignments makes the Smith-Waterman algorithm a popular choice for protein sequence database search based on pairwise alignment. However, the algorithm is compute-intensive due to a quadratic time complexity. Corresponding runtimes are further compounded by the rapid growth of sequence databases. We present CUDASW++ 3.0, a fast Smith-Waterman protein database search algorithm, which couples CPU and GPU SIMD instructions and carries out concurrent CPU and GPU computations. For the CPU computation, this algorithm employs SSE-based vector execution units as accelerators. For the GPU computation, we have investigated for the first time a GPU SIMD parallelization, which employs CUDA PTX SIMD video instructions to gain more data parallelism beyond the SIMT execution model. Moreover, sequence alignment workloads are automatically distributed over CPUs and GPUs based on their respective compute capabilities. Evaluation on the Swiss-Prot database shows that CUDASW++ 3.0 gains a performance improvement over CUDASW++ 2.0 up to 2.9 and 3.2, with a maximum performance of 119.0 and 185.6 GCUPS, on a single-GPU GeForce GTX 680 and a dual-GPU GeForce GTX 690 graphics card, respectively. In addition, our algorithm has demonstrated significant speedups over other top-performing tools: SWIPE and BLAST+. CUDASW++ 3.0 is written in CUDA C++ and PTX assembly languages, targeting GPUs based on the Kepler architecture. This algorithm obtains significant speedups over its predecessor: CUDASW++ 2.0, by benefiting from the use of CPU and GPU SIMD instructions as well as the concurrent execution on CPUs and GPUs. The source code and the simulated data are available at http://cudasw.sourceforge.net.
Semantic-JSON: a lightweight web service interface for Semantic Web contents integrating multiple life science databases

PubMed Central

Kobayashi, Norio; Ishii, Manabu; Takahashi, Satoshi; Mochizuki, Yoshiki; Matsushima, Akihiro; Toyoda, Tetsuro

2011-01-01

Global cloud frameworks for bioinformatics research databases become huge and heterogeneous; solutions face various diametric challenges comprising cross-integration, retrieval, security and openness. To address this, as of March 2011 organizations including RIKEN published 192 mammalian, plant and protein life sciences databases having 8.2 million data records, integrated as Linked Open or Private Data (LOD/LPD) using SciNetS.org, the Scientists' Networking System. The huge quantity of linked data this database integration framework covers is based on the Semantic Web, where researchers collaborate by managing metadata across public and private databases in a secured data space. This outstripped the data query capacity of existing interface tools like SPARQL. Actual research also requires specialized tools for data analysis using raw original data. To solve these challenges, in December 2009 we developed the lightweight Semantic-JSON interface to access each fragment of linked and raw life sciences data securely under the control of programming languages popularly used by bioinformaticians such as Perl and Ruby. Researchers successfully used the interface across 28 million semantic relationships for biological applications including genome design, sequence processing, inference over phenotype databases, full-text search indexing and human-readable contents like ontology and LOD tree viewers. Semantic-JSON services of SciNetS.org are provided at http://semanticjson.org. PMID:21632604
Library Instruction and Online Database Searching.

ERIC Educational Resources Information Center

Mercado, Heidi

1999-01-01

Reviews changes in online database searching in academic libraries. Topics include librarians conducting all searches; the advent of end-user searching and the need for user instruction; compact disk technology; online public catalogs; the Internet; full text databases; electronic information literacy; user education and the remote library user;…
USGS cold-water coral geographic database-Gulf of Mexico and western North Atlantic Ocean, version 1.0

USGS Publications Warehouse

Scanlon, Kathryn M.; Waller, Rhian G.; Sirotek, Alexander R.; Knisel, Julia M.; O'Malley, John; Alesandrini, Stian

2010-01-01

The USGS Cold-Water Coral Geographic Database (CoWCoG) provides a tool for researchers and managers interested in studying, protecting, and/or utilizing cold-water coral habitats in the Gulf of Mexico and western North Atlantic Ocean. The database makes information about the locations and taxonomy of cold-water corals available to the public in an easy-to-access form while preserving the scientific integrity of the data. The database includes over 1700 entries, mostly from published scientific literature, museum collections, and other databases. The CoWCoG database is easy to search in a variety of ways, and data can be quickly displayed in table form and on a map by using only the software included with this publication. Subsets of the database can be selected on the basis of geographic location, taxonomy, or other criteria and exported to one of several available file formats. Future versions of the database are being planned to cover a larger geographic area and additional taxa.
How I do it: a practical database management system to assist clinical research teams with data collection, organization, and reporting.

PubMed

Lee, Howard; Chapiro, Julius; Schernthaner, Rüdiger; Duran, Rafael; Wang, Zhijun; Gorodetski, Boris; Geschwind, Jean-François; Lin, MingDe

2015-04-01

The objective of this study was to demonstrate that an intra-arterial liver therapy clinical research database system is a more workflow efficient and robust tool for clinical research than a spreadsheet storage system. The database system could be used to generate clinical research study populations easily with custom search and retrieval criteria. A questionnaire was designed and distributed to 21 board-certified radiologists to assess current data storage problems and clinician reception to a database management system. Based on the questionnaire findings, a customized database and user interface system were created to perform automatic calculations of clinical scores including staging systems such as the Child-Pugh and Barcelona Clinic Liver Cancer, and facilitates data input and output. Questionnaire participants were favorable to a database system. The interface retrieved study-relevant data accurately and effectively. The database effectively produced easy-to-read study-specific patient populations with custom-defined inclusion/exclusion criteria. The database management system is workflow efficient and robust in retrieving, storing, and analyzing data. Copyright © 2015 AUR. Published by Elsevier Inc. All rights reserved.
The EMBL nucleotide sequence database

PubMed Central

Stoesser, Guenter; Baker, Wendy; van den Broek, Alexandra; Camon, Evelyn; Garcia-Pastor, Maria; Kanz, Carola; Kulikova, Tamara; Lombard, Vincent; Lopez, Rodrigo; Parkinson, Helen; Redaschi, Nicole; Sterk, Peter; Stoehr, Peter; Tuli, Mary Ann

2001-01-01

The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) is maintained at the European Bioinformatics Institute (EBI) in an international collaboration with the DNA Data Bank of Japan (DDBJ) and GenBank at the NCBI (USA). Data is exchanged amongst the collaborating databases on a daily basis. The major contributors to the EMBL database are individual authors and genome project groups. Webin is the preferred web-based submission system for individual submitters, whilst automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via ftp, email and World Wide Web interfaces. EBI’s Sequence Retrieval System (SRS), a network browser for databanks in molecular biology, integrates and links the main nucleotide and protein databases plus many specialized databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT. PMID:11125039
CRCDA—Comprehensive resources for cancer NGS data analysis

PubMed Central

Thangam, Manonanthini; Gopal, Ramesh Kumar

2015-01-01

Next generation sequencing (NGS) innovations put a compelling landmark in life science and changed the direction of research in clinical oncology with its productivity to diagnose and treat cancer. The aim of our portal comprehensive resources for cancer NGS data analysis (CRCDA) is to provide a collection of different NGS tools and pipelines under diverse classes with cancer pathways and databases and furthermore, literature information from PubMed. The literature data was constrained to 18 most common cancer types such as breast cancer, colon cancer and other cancers that exhibit in worldwide population. NGS-cancer tools for the convenience have been categorized into cancer genomics, cancer transcriptomics, cancer epigenomics, quality control and visualization. Pipelines for variant detection, quality control and data analysis were listed to provide out-of-the box solution for NGS data analysis, which may help researchers to overcome challenges in selecting and configuring individual tools for analysing exome, whole genome and transcriptome data. An extensive search page was developed that can be queried by using (i) type of data [literature, gene data and sequence read archive (SRA) data] and (ii) type of cancer (selected based on global incidence and accessibility of data). For each category of analysis, variety of tools are available and the biggest challenge is in searching and using the right tool for the right application. The objective of the work is collecting tools in each category available at various places and arranging the tools and other data in a simple and user-friendly manner for biologists and oncologists to find information easier. To the best of our knowledge, we have collected and presented a comprehensive package of most of the resources available in cancer for NGS data analysis. Given these factors, we believe that this website will be an useful resource to the NGS research community working on cancer. Database URL: http://bioinfo.au-kbc.org.in/ngs/ngshome.html. PMID:26450948
CoReCG: a comprehensive database of genes associated with colon-rectal cancer

PubMed Central

Agarwal, Rahul; Kumar, Binayak; Jayadev, Msk; Raghav, Dhwani; Singh, Ashutosh

2016-01-01

Cancer of large intestine is commonly referred as colorectal cancer, which is also the third most frequently prevailing neoplasm across the globe. Though, much of work is being carried out to understand the mechanism of carcinogenesis and advancement of this disease but, fewer studies has been performed to collate the scattered information of alterations in tumorigenic cells like genes, mutations, expression changes, epigenetic alteration or post translation modification, genetic heterogeneity. Earlier findings were mostly focused on understanding etiology of colorectal carcinogenesis but less emphasis were given for the comprehensive review of the existing findings of individual studies which can provide better diagnostics based on the suggested markers in discrete studies. Colon Rectal Cancer Gene Database (CoReCG), contains 2056 colon-rectal cancer genes information involved in distinct colorectal cancer stages sourced from published literature with an effective knowledge based information retrieval system. Additionally, interactive web interface enriched with various browsing sections, augmented with advance search facility for querying the database is provided for user friendly browsing, online tools for sequence similarity searches and knowledge based schema ensures a researcher friendly information retrieval mechanism. Colorectal cancer gene database (CoReCG) is expected to be a single point source for identification of colorectal cancer-related genes, thereby helping with the improvement of classification, diagnosis and treatment of human cancers. Database URL: lms.snu.edu.in/corecg PMID:27114494
Prevalence of physical inactivity in Iran: a systematic review.

PubMed

Fakhrzadeh, Hossein; Djalalinia, Shirin; Mirarefin, Mojdeh; Arefirad, Tahereh; Asayesh, Hamid; Safiri, Saeid; Samami, Elham; Mansourian, Morteza; Shamsizadeh, Morteza; Qorbani, Mostafa

2016-01-01

Introduction: Physical inactivity is one of the most important risk factors for chronic diseases, including cardiovascular disease, cancer, and stroke. We aim to conduct a systematic review of the prevalence of physical inactivity in Iran. Methods: We searched international databases; ISI, PubMed/Medline, Scopus, and national databases Irandoc, Barakat knowledge network system, and Scientific Information Database (SID). We collected data for outcome measures of prevalence of physical inactivity by sex, age, province, and year. Quality assessment and data extraction has been conducted independently by two independent research experts. There were no limitations for time and language. Results: We analyzed data for prevalence of physical inactivity in Iranian population. According to our search strategy we found 254 records; of them 185 were from international databases and the remaining 69 were obtained from national databases after refining the data, 34 articles that met eligible criteria remained for data extraction. From them respectively; 9, 20, 2 and 3 studies were at national, provincial, regional and local levels. The estimates for inactivity ranged from approximately 30% to almost 70% and had considerable variation between sexes and studied sub-groups. Conclusion: In Iran, most of studies reported high prevalence of physical inactivity. Our findings reveal a heterogeneity of reported values, often from differences in study design, measurement tools and methods, different target groups and sub-population sampling. These data do not provide the possibility of aggregation of data for a comprehensive inference.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.