Dong, Runze; Pan, Shuo; Peng, Zhenling; Zhang, Yang; Yang, Jianyi
2018-05-21
With the rapid increase in the number of protein structures in the Protein Data Bank, it has become urgent to develop algorithms for efficient protein structure comparison. In this article, we present the mTM-align server, which consists of two closely related modules: one for structure database search and the other for multiple structure alignment. The database search is sped up by a heuristic algorithm and a hierarchical organization of the structures in the database. The multiple structure alignment is performed using the recently developed algorithm mTM-align. Benchmark tests demonstrate that our algorithms outperform peer methods for both modules, in terms of speed and accuracy. One of the unique features of the server is the interplay between database search and multiple structure alignment. The server provides service not only for performing fast database search, but also for making accurate multiple structure alignments with the structures found by the search. For the database search, it takes about 2-5 min for a structure of medium size (∼300 residues). For the multiple structure alignment, it takes a few seconds for ∼10 structures of medium size. The server is freely available at: http://yanglab.nankai.edu.cn/mTM-align/.
The Impact of Online Bibliographic Databases on Teaching and Research in Political Science.
ERIC Educational Resources Information Center
Reichel, Mary
The availability of online bibliographic databases greatly facilitates literature searching in political science. The advantages to searching databases online include combination of concepts, comprehensiveness, multiple database searching, free-text searching, currency, current awareness services, document delivery service, and convenience.…
Evaluation of DNA mixtures from database search.
Chung, Yuk-Ka; Hu, Yue-Qing; Fung, Wing K
2010-03-01
With the aim of bridging the gap between DNA mixture analysis and DNA database search, a novel approach is proposed to evaluate the forensic evidence of DNA mixtures when the suspect is identified by the search of a database of DNA profiles. General formulae are developed for the calculation of the likelihood ratio for a two-person mixture under general situations including multiple matches and imperfect evidence. The influence of the prior probabilities on the weight of evidence under the scenario of multiple matches is demonstrated by a numerical example based on Hong Kong data. Our approach is shown to be capable of presenting the forensic evidence of DNA mixtures in a comprehensive way when the suspect is identified through database search.
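As a toy illustration of how database-search evidence of this kind is weighed, the sketch below combines a likelihood ratio with prior probabilities spread over several matching profiles. It is a minimal Bayesian sketch with invented numbers, not the general two-person-mixture formulae developed in the paper.

def posterior_probability(lr, prior):
    """Posterior probability of contribution, given a likelihood ratio and a prior."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = lr * prior_odds
    return posterior_odds / (1.0 + posterior_odds)

# Suppose the database search returns three matching profiles and a simple
# (simplistic) prior spreads the probability equally among them.
matches = ["profile_A", "profile_B", "profile_C"]   # hypothetical matches
prior = 1.0 / len(matches)
lr = 5.0e4                                          # hypothetical likelihood ratio

for name in matches:
    print(name, round(posterior_probability(lr, prior), 6))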
NASA Astrophysics Data System (ADS)
Piasecki, M.; Beran, B.
2007-12-01
Search engines have changed the way we see the Internet. The ability to find information by just typing in keywords was a big contribution to the overall web experience. While conventional search engine methodology works well for textual documents, locating scientific data remains a problem, since such data are stored in databases that are not readily accessible to search engine bots. Given the different temporal, spatial, and thematic coverage of different databases, it is typically necessary, especially for interdisciplinary research, to work with multiple data sources. These sources can be federal agencies, which generally offer national coverage, or regional sources, which cover a smaller area in higher detail. For a given geographic area of interest there often exists more than one database with relevant data, so being able to query multiple databases simultaneously is a desirable feature that would be tremendously useful for scientists. Development of such a search engine requires dealing with various heterogeneity issues. Scientific database systems often impose controlled vocabularies, which ensure that they are generally homogeneous within themselves but semantically heterogeneous when moving between databases. This bounds the possible semantic problems, making them easier to solve than in conventional search engines that deal with free text. We have developed a search engine that enables querying multiple data sources simultaneously and returns data in a standardized output despite the aforementioned heterogeneity between the underlying systems. The application relies mainly on metadata catalogs or indexing databases, ontologies, and web services, with virtual globe and AJAX technologies for the graphical user interface. Users can trigger a search of dozens of different parameters over hundreds of thousands of stations from multiple agencies by providing a keyword, a spatial extent (i.e., a bounding box), and a temporal bracket. As part of this development we have also added an environment that allows users to do some of the semantic tagging, i.e., the linkage of a variable name (which can be anything they desire) to defined concepts in the ontology structure, which in turn provides the backbone of the search engine.
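A minimal sketch of the fan-out pattern described above: the same keyword, bounding box, and time bracket are sent to several source adapters and the results are merged into one standardized record type. The adapter and field names are invented; this is not the authors' implementation.

from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Record:
    source: str
    station: str
    variable: str
    lat: float
    lon: float

def query_all(sources: List[Callable], keyword: str,
              bbox: Tuple[float, float, float, float],
              start: str, end: str) -> List[Record]:
    """Send the same (keyword, bbox, time bracket) query to every source
    adapter and concatenate the results into one standardized list."""
    results: List[Record] = []
    for source in sources:
        results.extend(source(keyword, bbox, start, end))
    return results

def usgs_like_adapter(keyword, bbox, start, end):
    """Toy adapter standing in for a real agency web service."""
    return [Record("usgs_like", "01646500", keyword, 38.95, -77.12)]

records = query_all([usgs_like_adapter], "discharge",
                    (-78.0, 38.0, -76.0, 40.0), "2007-01-01", "2007-12-31")
print(records)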
Quantum partial search for uneven distribution of multiple target items
NASA Astrophysics Data System (ADS)
Zhang, Kun; Korepin, Vladimir
2018-06-01
The quantum partial search algorithm is an approximate search. It aims to find a target block (the block that contains the target items) and runs a little faster than a full Grover search. In this paper, we consider the quantum partial search algorithm for multiple target items unevenly distributed in a database (target blocks have different numbers of target items). The algorithm we describe can locate one of the target blocks. The efficiency of the algorithm is measured by the number of queries to the oracle, and we optimize the algorithm to improve this efficiency. Using a perturbation method, we find that the algorithm runs fastest when the target items are evenly distributed in the database.
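For reference, the full Grover search that partial search is compared against uses roughly (π/4)√(N/M) oracle queries for M target items among N; the short sketch below computes that number. The partial-search query count derived in the paper is different and is not reproduced here.

import math

def grover_queries(n_items: int, n_targets: int) -> int:
    """Approximate optimal oracle-query count for a full Grover search with
    n_targets marked items among n_items: floor((pi/4) * sqrt(N/M))."""
    return math.floor((math.pi / 4.0) * math.sqrt(n_items / n_targets))

# A database of 2**20 items with 4 targets needs on the order of 400 quantum
# queries, against roughly two hundred thousand expected classical lookups.
print(grover_queries(2**20, 4))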
System for Performing Single Query Searches of Heterogeneous and Dispersed Databases
NASA Technical Reports Server (NTRS)
Maluf, David A. (Inventor); Okimura, Takeshi (Inventor); Gurram, Mohana M. (Inventor); Tran, Vu Hoang (Inventor); Knight, Christopher D. (Inventor); Trinh, Anh Ngoc (Inventor)
2017-01-01
The present invention is a distributed computer system of heterogeneous databases joined in an information grid and configured with an Application Programming Interface hardware which includes a search engine component for performing user-structured queries on multiple heterogeneous databases in real time. This invention reduces overhead associated with the impedance mismatch that commonly occurs in heterogeneous database queries.
Expert searching in public health
Alpi, Kristine M.
2005-01-01
Objective: The article explores the characteristics of public health information needs and the resources available to address those needs that distinguish it as an area of searching requiring particular expertise. Methods: Public health searching activities from reference questions and literature search requests at a large, urban health department library were reviewed to identify the challenges in finding relevant public health information. Results: The terminology of the information request frequently differed from the vocabularies available in the databases. Searches required the use of multiple databases and/or Web resources with diverse interfaces. Issues of the scope and features of the databases relevant to the search questions were considered. Conclusion: Expert searching in public health differs from other types of expert searching in the subject breadth and technical demands of the databases to be searched, the fluidity and lack of standardization of the vocabulary, and the relative scarcity of high-quality investigations at the appropriate level of geographic specificity. Health sciences librarians require a broad exposure to databases, gray literature, and public health terminology to perform as expert searchers in public health. PMID:15685281
Citation searching: a systematic review case study of multiple risk behaviour interventions
2014-01-01
Background The value of citation searches as part of the systematic review process is currently unknown. While the major guides to conducting systematic reviews state that citation searching should be carried out in addition to searching bibliographic databases, there are still few studies in the literature that support this view. Rather than using a predefined search strategy to retrieve studies, citation searching uses known relevant papers to identify further papers. Methods We describe a case study about the effectiveness of using the citation sources Google Scholar, Scopus, Web of Science and OVIDSP MEDLINE to identify records for inclusion in a systematic review. We used the 40 included studies identified by traditional database searches from one systematic review of interventions for multiple risk behaviours. We searched for each of the included studies in the four citation sources to retrieve the details of all papers that have cited these studies. We carried out two analyses; the first was to examine the overlap between the four citation sources to identify which citation tool was the most useful; the second was to investigate whether the citation searches identified any relevant records in addition to those retrieved by the original database searches. Results The highest number of citations was retrieved from Google Scholar (1680), followed by Scopus (1173), then Web of Science (1095) and lastly OVIDSP (213). To retrieve all the records identified by the citation tracking, searching all four resources was required. Google Scholar identified the highest number of unique citations. The citation tracking identified 9 studies that met the review’s inclusion criteria. Eight of these had already been identified by the traditional database searches and identified in the screening process, while the ninth was not available in any of the databases when the original searches were carried out. It would, however, have been identified by two of the database search strategies if searches had been carried out later. Conclusions Based on the results from this investigation, citation searching as a supplementary search method for systematic reviews may not be the best use of valuable time and resources. It would be useful to verify these findings in other reviews. PMID:24893958
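The overlap analysis described above reduces to set operations over the records retrieved from each citation source. A small sketch with invented record identifiers:

# Per-source citation counts, unique records, and the union across sources.
sources = {
    "google_scholar": {"r1", "r2", "r3", "r4"},
    "scopus":         {"r2", "r3", "r5"},
    "web_of_science": {"r2", "r4"},
    "ovidsp_medline": {"r3"},
}

union = set().union(*sources.values())
for name, records in sources.items():
    others = set().union(*(v for k, v in sources.items() if k != name))
    unique = records - others
    print(f"{name}: {len(records)} citations, {len(unique)} unique")
print("total distinct citations:", len(union))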
A Full-Text-Based Search Engine for Finding Highly Matched Documents Across Multiple Categories
NASA Technical Reports Server (NTRS)
Nguyen, Hung D.; Steele, Gynelle C.
2016-01-01
This report demonstrates the full-text-based search engine that works on any Web-based mobile application. The engine has the capability to search databases across multiple categories based on a user's queries and identify the most relevant or similar documents. The search results presented here were found using an Android (Google Co.) mobile device; however, the engine is also compatible with other mobile phones.
NASA Astrophysics Data System (ADS)
Graham, Jim; Jarnevich, Catherine S.; Simpson, Annie; Newman, Gregory J.; Stohlgren, Thomas J.
2011-06-01
Invasive species are a universal global problem, but the information to identify them, manage them, and prevent invasions is stored around the globe in a variety of formats. The Global Invasive Species Information Network is a consortium of organizations working toward providing seamless access to these disparate databases via the Internet. A distributed network of databases can be created using the Internet and a standard web service protocol. There are two options to provide this integration. First, federated searches are being proposed to allow users to search "deep" web documents such as databases for invasive species. A second method is to create a cache of data from the databases for searching. We compare these two methods, and show that federated searches will not provide the performance and flexibility required by users; a central cache of the data is required to improve performance.
Graham, Jim; Jarnevich, Catherine S.; Simpson, Annie; Newman, Gregory J.; Stohlgren, Thomas J.
2011-01-01
Invasive species are a universal global problem, but the information to identify them, manage them, and prevent invasions is stored around the globe in a variety of formats. The Global Invasive Species Information Network is a consortium of organizations working toward providing seamless access to these disparate databases via the Internet. A distributed network of databases can be created using the Internet and a standard web service protocol. There are two options to provide this integration. First, federated searches are being proposed to allow users to search “deep” web documents such as databases for invasive species. A second method is to create a cache of data from the databases for searching. We compare these two methods, and show that federated searches will not provide the performance and flexibility required by users; a central cache of the data is required to improve performance.
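A toy model of the performance argument: a federated search must wait for its slowest remote provider, while a central cache answers from a single local index. The latencies below are invented for illustration.

# Invented latencies (seconds): a federated query is gated by the slowest
# remote provider, while a central cache answers from one local index.
remote_latencies = {"provider_a": 0.8, "provider_b": 2.5, "provider_c": 12.0}
cache_latency = 0.15

federated_time = max(remote_latencies.values())
print(f"federated search: ~{federated_time:.2f} s (waiting on the slowest provider)")
print(f"central cache:    ~{cache_latency:.2f} s")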
Lawrence, D W
2008-12-01
To assess what is lost if only one literature database is searched for articles relevant to injury prevention and safety promotion (IPSP) topics. Serial textword (keyword, free-text) searches using multiple synonym terms for five key IPSP topics (bicycle-related brain injuries, ethanol-impaired driving, house fires, road rage, and suicidal behaviors among adolescents) were conducted in four of the bibliographic databases that are most used by IPSP professionals: EMBASE, MEDLINE, PsycINFO, and Web of Science. Through a systematic procedure, an inventory of articles on each topic in each database was conducted to identify the total unduplicated count of all articles on each topic, the number of articles unique to each database, and the articles available if only one database is searched. No single database included all of the relevant articles on any topic, and the database with the broadest coverage differed by topic. A search of only one literature database will return 16.7-81.5% (median 43.4%) of the available articles on any of five key IPSP topics. Each database contributed unique articles to the total bibliography for each topic. A literature search performed in only one database will, on average, lead to a loss of more than half of the available literature on a topic.
Kwon, Taejoon; Choi, Hyungwon; Vogel, Christine; Nesvizhskii, Alexey I; Marcotte, Edward M
2011-07-01
Shotgun proteomics using mass spectrometry is a powerful method for protein identification but suffers limited sensitivity in complex samples. Integrating peptide identifications from multiple database search engines is a promising strategy to increase the number of peptide identifications and reduce the volume of unassigned tandem mass spectra. Existing methods pool statistical significance scores such as p-values or posterior probabilities of peptide-spectrum matches (PSMs) from multiple search engines after high scoring peptides have been assigned to spectra, but these methods lack reliable control of identification error rates as data are integrated from different search engines. We developed a statistically coherent method for integrative analysis, termed MSblender. MSblender converts raw search scores from search engines into a probability score for every possible PSM and properly accounts for the correlation between search scores. The method reliably estimates false discovery rates and identifies more PSMs than any single search engine at the same false discovery rate. Increased identifications increment spectral counts for most proteins and allow quantification of proteins that would not have been quantified by individual search engines. We also demonstrate that enhanced quantification contributes to improve sensitivity in differential expression analyses.
Kwon, Taejoon; Choi, Hyungwon; Vogel, Christine; Nesvizhskii, Alexey I.; Marcotte, Edward M.
2011-01-01
Shotgun proteomics using mass spectrometry is a powerful method for protein identification but suffers limited sensitivity in complex samples. Integrating peptide identifications from multiple database search engines is a promising strategy to increase the number of peptide identifications and reduce the volume of unassigned tandem mass spectra. Existing methods pool statistical significance scores such as p-values or posterior probabilities of peptide-spectrum matches (PSMs) from multiple search engines after high scoring peptides have been assigned to spectra, but these methods lack reliable control of identification error rates as data are integrated from different search engines. We developed a statistically coherent method for integrative analysis, termed MSblender. MSblender converts raw search scores from search engines into a probability score for all possible PSMs and properly accounts for the correlation between search scores. The method reliably estimates false discovery rates and identifies more PSMs than any single search engine at the same false discovery rate. Increased identifications increment spectral counts for all detected proteins and allow quantification of proteins that would not have been quantified by individual search engines. We also demonstrate that enhanced quantification contributes to improve sensitivity in differential expression analyses. PMID:21488652
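MSblender's statistical model is more involved than this, but the baseline it is judged against can be sketched with the common target-decoy estimate of the false discovery rate at a score threshold (scores below are invented):

def fdr_at_threshold(target_scores, decoy_scores, threshold):
    """Estimate FDR as decoys/targets among PSMs scoring at or above threshold."""
    n_targets = sum(s >= threshold for s in target_scores)
    n_decoys = sum(s >= threshold for s in decoy_scores)
    return (n_decoys / n_targets) if n_targets else 0.0

targets = [12.1, 9.8, 8.5, 7.2, 6.9, 5.0, 4.4]   # invented target PSM scores
decoys = [6.1, 4.9, 3.2, 2.8]                    # invented decoy PSM scores
print(round(fdr_at_threshold(targets, decoys, threshold=5.0), 3))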
BioCarian: search engine for exploratory searches in heterogeneous biological databases.
Zaki, Nazar; Tennakoon, Chandana
2017-10-02
There are a large number of biological databases publicly available to scientists on the web, as well as many private databases generated in the course of research projects. These databases are in a wide variety of formats. Web standards have evolved in recent times, and semantic web technologies are now available to interconnect diverse and heterogeneous sources of data. Therefore, integration and querying of biological databases can be facilitated by techniques used in the semantic web. Heterogeneous databases can be converted into Resource Description Framework (RDF) and queried using the SPARQL language. Searching for exact queries in these databases is trivial. However, exploratory searches need customized solutions, especially when multiple databases are involved. This process is cumbersome and time consuming for those without a sufficient background in computer science. In this context, a search engine facilitating exploratory searches of databases would be of great help to the scientific community. We present BioCarian, an efficient and user-friendly search engine for performing exploratory searches on biological databases. The search engine is an interface for SPARQL queries over RDF databases. We note that many of the databases can be converted to tabular form. We first convert the tabular databases to RDF. The search engine provides a graphical interface based on facets to explore the converted databases. The facet interface is more advanced than conventional facets: it allows complex queries to be constructed and has additional features such as ranking of facet values based on several criteria, visually indicating the relevance of a facet value, and presenting the most important facet values when a large number of choices are available. For advanced users, SPARQL queries can be run directly on the databases; using this feature, users can incorporate federated searches of SPARQL endpoints. We used the search engine to do an exploratory search on previously published viral integration data and were able to deduce the main conclusions of the original publication. BioCarian is accessible via http://www.biocarian.com . We have developed a search engine to explore RDF databases that can be used by both novice and advanced users.
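As an illustration of the kind of query a facet selection might generate, the sketch below sends a SPARQL query to a hypothetical endpoint with the SPARQLWrapper library; the endpoint URL, prefix, and predicate names are invented, and this is not BioCarian's internal code.

from SPARQLWrapper import SPARQLWrapper, JSON

# Hypothetical endpoint and predicates; only the SPARQLWrapper calls are real.
endpoint = SPARQLWrapper("http://example.org/sparql")
endpoint.setQuery("""
    PREFIX ex: <http://example.org/schema#>
    SELECT ?site ?gene WHERE {
        ?site ex:integratesNear ?gene ;
              ex:virus "HBV" .
    } LIMIT 20
""")
endpoint.setReturnFormat(JSON)

results = endpoint.query().convert()
for row in results["results"]["bindings"]:
    print(row["site"]["value"], row["gene"]["value"])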
A review on quantum search algorithms
NASA Astrophysics Data System (ADS)
Giri, Pulak Ranjan; Korepin, Vladimir E.
2017-12-01
The use of superposition of states in quantum computation, known as quantum parallelism, has a significant advantage in terms of speed over classical computation. This is evident from early quantum algorithms such as Deutsch's algorithm, the Deutsch-Jozsa algorithm and its variation the Bernstein-Vazirani algorithm, Simon's algorithm, Shor's algorithms, etc. Quantum parallelism also significantly speeds up database search, which is important in computer science because it appears as a subroutine in many important algorithms. Grover's quantum database search achieves the task of finding the target element in an unsorted database in a time quadratically faster than a classical computer. We review Grover's quantum search algorithms for single and multiple target elements in a database. The partial search algorithm of Grover and Radhakrishnan and its optimization by Korepin, called the GRK algorithm, are also discussed.
Large-scale feature searches of collections of medical imagery
NASA Astrophysics Data System (ADS)
Hedgcock, Marcus W.; Karshat, Walter B.; Levitt, Tod S.; Vosky, D. N.
1993-09-01
Large scale feature searches of accumulated collections of medical imagery are required for multiple purposes, including clinical studies, administrative planning, epidemiology, teaching, quality improvement, and research. To perform a feature search of large collections of medical imagery, one can either search text descriptors of the imagery in the collection (usually the interpretation), or (if the imagery is in digital format) the imagery itself. At our institution, text interpretations of medical imagery are all available in our VA Hospital Information System. These are downloaded daily into an off-line computer. The text descriptors of most medical imagery are usually formatted as free text, and so require a user friendly database search tool to make searches quick and easy for any user to design and execute. We are tailoring such a database search tool (Liveview), developed by one of the authors (Karshat). To further facilitate search construction, we are constructing (from our accumulated interpretation data) a dictionary of medical and radiological terms and synonyms. If the imagery database is digital, the imagery which the search discovers is easily retrieved from the computer archive. We describe our database search user interface, with examples, and compare the efficacy of computer assisted imagery searches from a clinical text database with manual searches. Our initial work on direct feature searches of digital medical imagery is outlined.
Federal Register 2010, 2011, 2012, 2013, 2014
2012-05-25
... multiple nondramatic musical works may be submitted electronically as XML files. Electronically submitted Notices will be maintained in a database that can be searched using any of the included fields of... the Licensing Division for a search of the database during the interim period. As such, the Office...
Deciu, Cosmin; Sun, Jun; Wall, Mark A
2007-09-01
We discuss several aspects related to load balancing of database search jobs in a distributed computing environment, such as a Linux cluster. Load balancing is a technique for making the most of multiple computational resources, which is particularly relevant in environments in which the usage of such resources is very high. The particular case of the Sequest program is considered here, but the general methodology should apply to any similar database search program. We show how the runtimes for Sequest searches of tandem mass spectral data can be predicted from profiles of previous representative searches, and how this information can be used for better load balancing of novel data. A well-known heuristic load balancing method is shown to be applicable to this problem, and its performance is analyzed for a variety of search parameters.
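The abstract does not name the heuristic, so as an assumption the sketch below uses one common choice, Longest Processing Time first: jobs are sorted by predicted runtime and each is assigned to the currently least-loaded node. The runtimes are invented.

import heapq

def lpt_schedule(predicted_runtimes, n_nodes):
    """Assign jobs (name -> predicted seconds) to n_nodes with the Longest
    Processing Time first heuristic; returns (load, node_id, jobs) tuples."""
    nodes = [(0.0, i, []) for i in range(n_nodes)]
    heapq.heapify(nodes)
    for job, runtime in sorted(predicted_runtimes.items(),
                               key=lambda kv: kv[1], reverse=True):
        load, node_id, jobs = heapq.heappop(nodes)   # least-loaded node
        jobs.append(job)
        heapq.heappush(nodes, (load + runtime, node_id, jobs))
    return sorted(nodes, key=lambda n: n[1])

runtimes = {"run_01": 340, "run_02": 120, "run_03": 95,
            "run_04": 600, "run_05": 210}            # invented predictions
for load, node_id, jobs in lpt_schedule(runtimes, n_nodes=2):
    print(f"node {node_id}: {load:.0f} s predicted, jobs {jobs}")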
Searching Across the International Space Station Databases
NASA Technical Reports Server (NTRS)
Maluf, David A.; McDermott, William J.; Smith, Ernest E.; Bell, David G.; Gurram, Mohana
2007-01-01
Data access in the enterprise generally requires us to combine data from different sources and different formats. It is advantageous, therefore, to focus on the intersection of the knowledge across sources and domains; keeping irrelevant knowledge around only serves to make the integration more unwieldy and more complicated than necessary. A context search over multiple domains is proposed in this paper, using context-sensitive queries to support disciplined manipulation of domain knowledge resources. The objective of a context search is to provide the capability for interrogating many domain knowledge resources, which are largely semantically disjoint. The search formally supports the tasks of selecting, combining, extending, specializing, and modifying components from a diverse set of domains. This paper demonstrates a new paradigm in composition of information for enterprise applications. In particular, it discusses an approach to achieving data integration across multiple sources in a manner that does not require heavy investment in database and middleware maintenance. This lean approach to integration leads to cost-effectiveness and scalability of data integration with an underlying schemaless object-relational database management system. The resulting highly scalable, information-on-demand framework, called NX-Search, is an implementation of an information system built on NETMARK. NETMARK is a flexible, high-throughput open database integration framework for managing, storing, and searching unstructured or semi-structured arbitrary XML and HTML that is used widely at the National Aeronautics and Space Administration (NASA) and in industry.
Sagace: A web-based search engine for biomedical databases in Japan
2012-01-01
Background In the big data era, biomedical research continues to generate a large amount of data, and the generated information is often stored in a database and made publicly available. Although combining data from multiple databases should accelerate further studies, the current number of life sciences databases is too large to grasp features and contents of each database. Findings We have developed Sagace, a web-based search engine that enables users to retrieve information from a range of biological databases (such as gene expression profiles and proteomics data) and biological resource banks (such as mouse models of disease and cell lines). With Sagace, users can search more than 300 databases in Japan. Sagace offers features tailored to biomedical research, including manually tuned ranking, a faceted navigation to refine search results, and rich snippets constructed with retrieved metadata for each database entry. Conclusions Sagace will be valuable for experts who are involved in biomedical research and drug development in both academia and industry. Sagace is freely available at http://sagace.nibio.go.jp/en/. PMID:23110816
OrChem - An open source chemistry search engine for Oracle(R).
Rijnbeek, Mark; Steinbeck, Christoph
2009-10-22
Registration, indexing and searching of chemical structures in relational databases is one of the core areas of cheminformatics. However, little detail has been published on the inner workings of search engines and their development has been mostly closed-source. We decided to develop an open source chemistry extension for Oracle, the de facto database platform in the commercial world. Here we present OrChem, an extension for the Oracle 11G database that adds registration and indexing of chemical structures to support fast substructure and similarity searching. The cheminformatics functionality is provided by the Chemistry Development Kit. OrChem provides similarity searching with response times in the order of seconds for databases with millions of compounds, depending on a given similarity cut-off. For substructure searching, it can make use of multiple processor cores on today's powerful database servers to provide fast response times in equally large data sets. OrChem is free software and can be redistributed and/or modified under the terms of the GNU Lesser General Public License as published by the Free Software Foundation. All software is available via http://orchem.sourceforge.net.
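OrChem itself provides this functionality through the Chemistry Development Kit inside Oracle; purely as an illustration of the similarity-search idea, the sketch below ranks a few structures against a query by Tanimoto score using RDKit fingerprints in Python, with an invented toy "database".

from rdkit import Chem, DataStructs

database_smiles = {                                  # invented toy database
    "aspirin":    "CC(=O)Oc1ccccc1C(=O)O",
    "salicylate": "OC(=O)c1ccccc1O",
    "caffeine":   "Cn1cnc2c1c(=O)n(C)c(=O)n2C",
}

query_fp = Chem.RDKFingerprint(Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O"))
cutoff = 0.5                                         # similarity cut-off

for name, smiles in database_smiles.items():
    fp = Chem.RDKFingerprint(Chem.MolFromSmiles(smiles))
    score = DataStructs.TanimotoSimilarity(query_fp, fp)
    if score >= cutoff:
        print(name, round(score, 2))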
Detection of alternative splice variants at the proteome level in Aspergillus flavus.
Chang, Kung-Yen; Georgianna, D Ryan; Heber, Steffen; Payne, Gary A; Muddiman, David C
2010-03-05
Identification of proteins from proteolytic peptides or intact proteins plays an essential role in proteomics. Researchers use search engines to match the acquired peptide sequences to the target proteins. However, search engines depend on protein databases to provide candidates for consideration. Alternative splicing (AS), the mechanism by which the exons of pre-mRNAs can be spliced and rearranged to generate distinct mRNA and therefore protein variants, enables higher eukaryotic organisms, with only a limited number of genes, to have the requisite complexity and diversity at the proteome level. Multiple alternative isoforms from one gene often share common segments of sequence. However, many protein databases include only a limited number of isoforms to keep redundancy minimal. As a result, a database search might not identify a target protein even with high-quality tandem MS data and an accurate intact precursor ion mass. We computationally predicted an exhaustive list of putative isoforms of Aspergillus flavus proteins from 20 371 expressed sequence tags to investigate whether an alternative splicing protein database can assign a greater proportion of mass spectrometry data. The newly constructed AS database provided 9807 new alternatively spliced variants in addition to 12 832 previously annotated proteins. Searches of the existing tandem MS spectra data set using the AS database identified 29 new proteins encoded by 26 genes. Nine fungal genes appeared to have multiple protein isoforms. In addition to the discovery of splice variants, the AS database also showed potential to improve genome annotation. In summary, the introduction of an alternative splicing database helps identify more proteins and unveils more information about a proteome.
ASGARD: an open-access database of annotated transcriptomes for emerging model arthropod species.
Zeng, Victor; Extavour, Cassandra G
2012-01-01
The increased throughput and decreased cost of next-generation sequencing (NGS) have shifted the bottleneck in genomic research from sequencing to annotation, analysis and accessibility. This is particularly challenging for research communities working on organisms that lack the basic infrastructure of a sequenced genome, or an efficient way to utilize whatever sequence data may be available. Here we present a new database, the Assembled Searchable Giant Arthropod Read Database (ASGARD). This database is a repository and search engine for transcriptomic data from arthropods that are of high interest to multiple research communities but currently lack sequenced genomes. We demonstrate the functionality and utility of ASGARD using de novo assembled transcriptomes from the milkweed bug Oncopeltus fasciatus, the cricket Gryllus bimaculatus and the amphipod crustacean Parhyale hawaiensis. We have annotated these transcriptomes to assign putative orthology, coding region determination, protein domain identification and Gene Ontology (GO) term annotation to all possible assembly products. ASGARD allows users to search all assemblies by orthology annotation, GO term annotation or Basic Local Alignment Search Tool. User-friendly features of ASGARD include search term auto-completion suggestions based on database content, the ability to download assembly product sequences in FASTA format, direct links to NCBI data for predicted orthologs and graphical representation of the location of protein domains and matches to similar sequences from the NCBI non-redundant database. ASGARD will be a useful repository for transcriptome data from future NGS studies on these and other emerging model arthropods, regardless of sequencing platform, assembly or annotation status. This database thus provides easy, one-stop access to multi-species annotated transcriptome information. We anticipate that this database will be useful for members of multiple research communities, including developmental biology, physiology, evolutionary biology, ecology, comparative genomics and phylogenomics. Database URL: asgard.rc.fas.harvard.edu.
GWFASTA: server for FASTA search in eukaryotic and microbial genomes.
Issac, Biju; Raghava, G P S
2002-09-01
Similarity searches are a powerful method for solving important biological problems such as database scanning, evolutionary studies, gene prediction, and protein structure prediction. FASTA is a widely used sequence comparison tool for rapid database scanning. Here we describe the GWFASTA server, which was developed to assist the FASTA user in similarity searches against partially and/or completely sequenced genomes. GWFASTA consists of more than 60 microbial genomes, eight eukaryote genomes, and the proteomes of annotated genomes. In fact, it provides the maximum number of databases for similarity searching from a single platform. GWFASTA allows the submission of more than one sequence as a single query for a FASTA search. It also provides integrated post-processing of FASTA output, including compositional analysis of proteins, multiple sequence alignment, and phylogenetic analysis. Furthermore, it summarizes the search results organism-wise for prokaryotes and chromosome-wise for eukaryotes. Thus, the integration of different tools for sequence analyses makes GWFASTA a powerful tool for biologists.
Electronic Reference Library: Silverplatter's Database Networking Solution.
ERIC Educational Resources Information Center
Millea, Megan
Silverplatter's Electronic Reference Library (ERL) provides wide area network access to its databases using TCP/IP communications and client-server architecture. ERL has two main components: The ERL clients (retrieval interface) and the ERL server (search engines). ERL clients provide patrons with seamless access to multiple databases on multiple…
Huang, Wenjie; Feng, Wei; Li, Yang; Chen, Yu
2014-11-01
To explore the prognostic influence of multiple lung lobe lesions in hospitalized elderly patients with acquired pneumonia through a meta-analysis, we collected all studies that investigated the correlation between multiple lung lobe lesions and prognosis in acquired pneumonia by searching China National Knowledge Infrastructure, Wanfang Database, Chinese Science and Technology Periodical Database, Chinese Biological Medical Literature Database, PubMed, and EMBase in accordance with the inclusion and exclusion criteria. The retrieval period of the searches was from database establishment to July 2014. The meta-analysis was performed using RevMan5.2 software. We calculated the odds ratio (OR) and 95% confidence interval (95% CI) and tested for heterogeneity. Publication bias was assessed by Egger's test and funnel plot, and a sensitivity analysis was performed. Ten studies involving 1 836 patients were finally included, with 487 cases (the dead group) and 1 349 controls (the survival group). The meta-analysis demonstrated that multiple lung lobe lesions were strongly associated with prognosis in elderly patients with acquired pneumonia (OR=3.22, 95% CI 1.84 to 5.63). Multiple lung lobe lesions increase the risk of death in elderly patients with acquired pneumonia.
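The pooled estimate above comes from RevMan, but the building block for each study is just an odds ratio and its Wald confidence interval from a 2x2 table, as sketched below with invented counts.

import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI from a 2x2 table: a/b exposed/unexposed
    among the dead group, c/d among the survival group."""
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Invented counts: multi-lobe involvement among deaths versus survivors.
print(odds_ratio_ci(a=60, b=40, c=120, d=250))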
SHIFT: Shared Information Framework and Technology Concept
2009-02-01
easy to get an overview of all the tasks and what is happening around each mission. Federated Search allows actors to search multiple data sources...is sent to every individual database in the portal or federated search list. 4.7. SHIFT and the world around it The idea in SHIFT is to enable
OrChem - An open source chemistry search engine for Oracle®
2009-01-01
Background Registration, indexing and searching of chemical structures in relational databases is one of the core areas of cheminformatics. However, little detail has been published on the inner workings of search engines and their development has been mostly closed-source. We decided to develop an open source chemistry extension for Oracle, the de facto database platform in the commercial world. Results Here we present OrChem, an extension for the Oracle 11G database that adds registration and indexing of chemical structures to support fast substructure and similarity searching. The cheminformatics functionality is provided by the Chemistry Development Kit. OrChem provides similarity searching with response times in the order of seconds for databases with millions of compounds, depending on a given similarity cut-off. For substructure searching, it can make use of multiple processor cores on today's powerful database servers to provide fast response times in equally large data sets. Availability OrChem is free software and can be redistributed and/or modified under the terms of the GNU Lesser General Public License as published by the Free Software Foundation. All software is available via http://orchem.sourceforge.net. PMID:20298521
Manríquez, Juan J
2008-04-01
Systematic reviews should include as many relevant articles as possible. However, many systematic reviews use only databases with high English-language content as sources of trials. Literatura Latino Americana e do Caribe em Ciências da Saúde (LILACS) is an underused source of trials, and there is no validated strategy for searching for clinical trials in this database. The objective of this study was to develop a sensitive search strategy for clinical trials in LILACS. An analytical survey was performed. Several single- and multiple-term search strategies were tested for their ability to retrieve clinical trials in LILACS. Sensitivity, specificity, and accuracy of each single- and multiple-term strategy were calculated using the results of a hand-search of 44 Chilean journals as the gold standard. After combining the most sensitive, specific, and accurate single- and multiple-term search strategies, a strategy with a sensitivity of 97.75% (95% confidence interval [CI]=95.98-99.53) and a specificity of 61.85% (95% CI=61.19-62.51) was obtained. LILACS is a source of trials that could improve systematic reviews. A new, highly sensitive search strategy for clinical trials in LILACS has been developed. It is hoped this search strategy will improve and increase the utilization of LILACS in future systematic reviews.
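The reported figures are computed against the hand-search gold standard; the underlying arithmetic is the usual sensitivity, specificity, and accuracy from a 2x2 confusion table, sketched here with invented record counts.

def search_strategy_measures(tp, fp, fn, tn):
    """Sensitivity, specificity and accuracy of a search strategy judged
    against a hand-search gold standard (tp = trials retrieved, etc.)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return sensitivity, specificity, accuracy

# Invented record counts, for illustration only.
sens, spec, acc = search_strategy_measures(tp=435, fp=3100, fn=10, tn=5020)
print(f"sensitivity {sens:.2%}, specificity {spec:.2%}, accuracy {acc:.2%}")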
LymPHOS 2.0: an update of a phosphosite database of primary human T cells
Nguyen, Tien Dung; Vidal-Cortes, Oriol; Gallardo, Oscar; Abian, Joaquin; Carrascal, Montserrat
2015-01-01
LymPHOS is a web-oriented database containing peptide and protein sequences and spectrometric information on the phosphoproteome of primary human T-Lymphocytes. Current release 2.0 contains 15 566 phosphorylation sites from 8273 unique phosphopeptides and 4937 proteins, which correspond to a 45-fold increase over the original database description. It now includes quantitative data on phosphorylation changes after time-dependent treatment with activators of the TCR-mediated signal transduction pathway. Sequence data quality has also been improved with the use of multiple search engines for database searching. LymPHOS can be publicly accessed at http://www.lymphos.org. Database URL: http://www.lymphos.org. PMID:26708986
Phenotip - a web-based instrument to help diagnosing fetal syndromes antenatally.
Porat, Shay; de Rham, Maud; Giamboni, Davide; Van Mieghem, Tim; Baud, David
2014-12-10
Prenatal ultrasound can often reliably distinguish fetal anatomic anomalies, particularly in the hands of an experienced ultrasonographer. Given the large number of existing syndromes and the significant overlap in prenatal findings, antenatal differentiation for syndrome diagnosis is difficult. We constructed a hierarchic tree of 1140 sonographic markers and submarkers, organized per organ system. Subsequently, a database of prenatally diagnosable syndromes was built. An internet-based search engine was then designed to search the syndrome database based on a single or multiple sonographic markers. Future developments will include a database with magnetic resonance imaging findings as well as further refinements in the search engine to allow prioritization based on incidence of syndromes and markers.
Cogo, Elise; Sampson, Margaret; Ajiferuke, Isola; Manheimer, Eric; Campbell, Kaitryn; Daniel, Raymond; Moher, David
2011-01-01
This project aims to assess the utility of bibliographic databases beyond the three major ones (MEDLINE, EMBASE and Cochrane CENTRAL) for finding controlled trials of complementary and alternative medicine (CAM). Fifteen databases were searched to identify controlled clinical trials (CCTs) of CAM not also indexed in MEDLINE. Searches were conducted in May 2006 using the revised Cochrane highly sensitive search strategy (HSSS) and the PubMed CAM Subset. Yield of CAM trials per 100 records was determined, and databases were compared over a standardized period (2005). The Acudoc2 RCT, Acubriefs, Index to Chiropractic Literature (ICL) and Hom-Inform databases had the highest concentrations of non-MEDLINE records, with more than 100 non-MEDLINE records per 500. Other productive databases had ratios between 500 and 1500 records to 100 non-MEDLINE records—these were AMED, MANTIS, PsycINFO, CINAHL, Global Health and Alt HealthWatch. Five databases were found to be unproductive: AGRICOLA, CAIRSS, Datadiwan, Herb Research Foundation and IBIDS. Acudoc2 RCT yielded 100 CAM trials in the most recent 100 records screened. Acubriefs, AMED, Hom-Inform, MANTIS, PsycINFO and CINAHL had more than 25 CAM trials per 100 records screened. Global Health, ICL and Alt HealthWatch were below 25 in yield. There were 255 non-MEDLINE trials from eight databases in 2005, with only 10% indexed in more than one database. Yield varied greatly between databases; the most productive databases from both sampling methods were Acubriefs, Acudoc2 RCT, AMED and CINAHL. Low overlap between databases indicates comprehensive CAM literature searches will require multiple databases. PMID:19468052
Cogo, Elise; Sampson, Margaret; Ajiferuke, Isola; Manheimer, Eric; Campbell, Kaitryn; Daniel, Raymond; Moher, David
2011-01-01
This project aims to assess the utility of bibliographic databases beyond the three major ones (MEDLINE, EMBASE and Cochrane CENTRAL) for finding controlled trials of complementary and alternative medicine (CAM). Fifteen databases were searched to identify controlled clinical trials (CCTs) of CAM not also indexed in MEDLINE. Searches were conducted in May 2006 using the revised Cochrane highly sensitive search strategy (HSSS) and the PubMed CAM Subset. Yield of CAM trials per 100 records was determined, and databases were compared over a standardized period (2005). The Acudoc2 RCT, Acubriefs, Index to Chiropractic Literature (ICL) and Hom-Inform databases had the highest concentrations of non-MEDLINE records, with more than 100 non-MEDLINE records per 500. Other productive databases had ratios between 500 and 1500 records to 100 non-MEDLINE records-these were AMED, MANTIS, PsycINFO, CINAHL, Global Health and Alt HealthWatch. Five databases were found to be unproductive: AGRICOLA, CAIRSS, Datadiwan, Herb Research Foundation and IBIDS. Acudoc2 RCT yielded 100 CAM trials in the most recent 100 records screened. Acubriefs, AMED, Hom-Inform, MANTIS, PsycINFO and CINAHL had more than 25 CAM trials per 100 records screened. Global Health, ICL and Alt HealthWatch were below 25 in yield. There were 255 non-MEDLINE trials from eight databases in 2005, with only 10% indexed in more than one database. Yield varied greatly between databases; the most productive databases from both sampling methods were Acubriefs, Acudoc2 RCT, AMED and CINAHL. Low overlap between databases indicates comprehensive CAM literature searches will require multiple databases.
Embedding strategies for effective use of information from multiple sequence alignments.
Henikoff, S.; Henikoff, J. G.
1997-01-01
We describe a new strategy for utilizing multiple sequence alignment information to detect distant relationships in searches of sequence databases. A single sequence representing a protein family is enriched by replacing conserved regions with position-specific scoring matrices (PSSMs) or consensus residues derived from multiple alignments of family members. In comprehensive tests of these and other family representations, PSSM-embedded queries produced the best results overall when used with a special version of the Smith-Waterman searching algorithm. Moreover, embedding consensus residues instead of PSSMs improved performance with readily available single sequence query searching programs, such as BLAST and FASTA. Embedding PSSMs or consensus residues into a representative sequence improves searching performance by extracting multiple alignment information from motif regions while retaining single sequence information where alignment is uncertain. PMID:9070452
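A toy version of the consensus-residue idea: for each aligned column of the family members, keep the most frequent residue where the column is conserved and leave unconserved columns to the single query sequence. This is only a sketch of the concept, not the authors' procedure; the alignment is invented.

from collections import Counter

def column_consensus(alignment, min_fraction=0.6):
    """Per column, return the most frequent residue if it reaches min_fraction
    of the non-gap characters, else None (column treated as unconserved)."""
    consensus = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            consensus.append(None)
            continue
        residue, count = Counter(residues).most_common(1)[0]
        consensus.append(residue if count / len(residues) >= min_fraction else None)
    return consensus

msa = ["MKTLAG-", "MKSLAGT", "MKTLSGT", "MKTMAG-"]   # invented toy alignment
print(column_consensus(msa))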
Colangelo, Christopher M.; Shifman, Mark; Cheung, Kei-Hoi; Stone, Kathryn L.; Carriero, Nicholas J.; Gulcicek, Erol E.; Lam, TuKiet T.; Wu, Terence; Bjornson, Robert D.; Bruce, Can; Nairn, Angus C.; Rinehart, Jesse; Miller, Perry L.; Williams, Kenneth R.
2015-01-01
We report a significantly-enhanced bioinformatics suite and database for proteomics research called Yale Protein Expression Database (YPED) that is used by investigators at more than 300 institutions worldwide. YPED meets the data management, archival, and analysis needs of a high-throughput mass spectrometry-based proteomics research ranging from a single laboratory, group of laboratories within and beyond an institution, to the entire proteomics community. The current version is a significant improvement over the first version in that it contains new modules for liquid chromatography–tandem mass spectrometry (LC–MS/MS) database search results, label and label-free quantitative proteomic analysis, and several scoring outputs for phosphopeptide site localization. In addition, we have added both peptide and protein comparative analysis tools to enable pairwise analysis of distinct peptides/proteins in each sample and of overlapping peptides/proteins between all samples in multiple datasets. We have also implemented a targeted proteomics module for automated multiple reaction monitoring (MRM)/selective reaction monitoring (SRM) assay development. We have linked YPED’s database search results and both label-based and label-free fold-change analysis to the Skyline Panorama repository for online spectra visualization. In addition, we have built enhanced functionality to curate peptide identifications into an MS/MS peptide spectral library for all of our protein database search identification results. PMID:25712262
Colangelo, Christopher M; Shifman, Mark; Cheung, Kei-Hoi; Stone, Kathryn L; Carriero, Nicholas J; Gulcicek, Erol E; Lam, TuKiet T; Wu, Terence; Bjornson, Robert D; Bruce, Can; Nairn, Angus C; Rinehart, Jesse; Miller, Perry L; Williams, Kenneth R
2015-02-01
We report a significantly-enhanced bioinformatics suite and database for proteomics research called Yale Protein Expression Database (YPED) that is used by investigators at more than 300 institutions worldwide. YPED meets the data management, archival, and analysis needs of a high-throughput mass spectrometry-based proteomics research ranging from a single laboratory, group of laboratories within and beyond an institution, to the entire proteomics community. The current version is a significant improvement over the first version in that it contains new modules for liquid chromatography-tandem mass spectrometry (LC-MS/MS) database search results, label and label-free quantitative proteomic analysis, and several scoring outputs for phosphopeptide site localization. In addition, we have added both peptide and protein comparative analysis tools to enable pairwise analysis of distinct peptides/proteins in each sample and of overlapping peptides/proteins between all samples in multiple datasets. We have also implemented a targeted proteomics module for automated multiple reaction monitoring (MRM)/selective reaction monitoring (SRM) assay development. We have linked YPED's database search results and both label-based and label-free fold-change analysis to the Skyline Panorama repository for online spectra visualization. In addition, we have built enhanced functionality to curate peptide identifications into an MS/MS peptide spectral library for all of our protein database search identification results.
Clinical and research searching on the wild side: exploring the veterinary literature
Alpi, Kristine M.; Stringer, Elizabeth; DeVoe, Ryan S.; Stoskopf, Michael
2009-01-01
Zoological medicine furthers the health and well-being of captive and free-ranging wild animals. Effective information retrieval of the zoological medicine literature demands searching multiple databases, conference proceedings, and organization websites using a wide variety of keywords and controlled vocabulary. Veterinarians, residents, students, and the librarians who serve them must have patience for multiple search iterations to capture the majority of the available knowledge. The complexities of thorough literature searches are more difficult for nondomestic animal clinical cases and research reviews as demonstrated by three search requests involving poisonous snakes, a gorilla, and spiders. Expanding and better disseminating the knowledgebase of zoological medicine will make veterinary searching easier. PMID:19626142
CRAVE: a database, middleware and visualization system for phenotype ontologies.
Gkoutos, Georgios V; Green, Eain C J; Greenaway, Simon; Blake, Andrew; Mallon, Ann-Marie; Hancock, John M
2005-04-01
A major challenge in modern biology is to link genome sequence information to organismal function. In many organisms this is being done by characterizing phenotypes resulting from mutations. Efficiently expressing phenotypic information requires combinatorial use of ontologies. However tools are not currently available to visualize combinations of ontologies. Here we describe CRAVE (Concept Relation Assay Value Explorer), a package allowing storage, active updating and visualization of multiple ontologies. CRAVE is a web-accessible JAVA application that accesses an underlying MySQL database of ontologies via a JAVA persistent middleware layer (Chameleon). This maps the database tables into discrete JAVA classes and creates memory resident, interlinked objects corresponding to the ontology data. These JAVA objects are accessed via calls through the middleware's application programming interface. CRAVE allows simultaneous display and linking of multiple ontologies and searching using Boolean and advanced searches.
Task-Driven Dynamic Text Summarization
ERIC Educational Resources Information Center
Workman, Terri Elizabeth
2011-01-01
The objective of this work is to examine the efficacy of natural language processing (NLP) in summarizing bibliographic text for multiple purposes. Researchers have noted the accelerating growth of bibliographic databases. Information seekers using traditional information retrieval techniques when searching large bibliographic databases are often…
Analysis on the Correlation of Traffic Flow in Hainan Province Based on Baidu Search
NASA Astrophysics Data System (ADS)
Chen, Caixia; Shi, Chun
2018-03-01
Internet search data record users' search attention and consumer demand, providing a necessary database for a Hainan traffic flow model. Based on the Baidu Index, and taking Hainan traffic flow as an example, this paper conducts both qualitative and quantitative analyses of the relationship between search keywords from the Baidu Index and actual Hainan tourist traffic flow, and builds a multiple regression model in SPSS.
Combined semantic and similarity search in medical image databases
NASA Astrophysics Data System (ADS)
Seifert, Sascha; Thoma, Marisa; Stegmaier, Florian; Hammon, Matthias; Kramer, Martin; Huber, Martin; Kriegel, Hans-Peter; Cavallaro, Alexander; Comaniciu, Dorin
2011-03-01
The current diagnostic process at hospitals is mainly based on reviewing and comparing images coming from multiple time points and modalities in order to monitor disease progression over a period of time. However, for ambiguous cases the radiologist relies heavily on reference literature or a second opinion. Although there is a vast amount of acquired images stored in PACS systems that could be reused for decision support, these data sets suffer from weak search capabilities. Thus, we present a search methodology that enables the physician to fulfill intelligent search scenarios on medical image databases by combining ontology-based semantic search and appearance-based similarity search. It enabled the elimination of 12% of the top ten hits that would arise without taking the semantic context into account.
The Human Transcript Database: A Catalogue of Full Length cDNA Inserts
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bouck, John; McLeod, Michael; Worley, Kim
1999-09-10
The BCM Search Launcher provided improved access to web-based sequence analysis services during the granting period and beyond. The Search Launcher web site grouped analysis procedures by function and provided default parameters that provided reasonable search results for most applications. For instance, most queries were automatically masked for repeat sequences prior to sequence database searches to avoid spurious matches. In addition to the web-based access and arrangements that made using the functions easier, the BCM Search Launcher provided unique value-added applications like the BEAUTY sequence database search tool that combined information about protein domains and sequence database search results to give an enhanced, more complete picture of the reliability and relative value of the information reported. This enhanced search tool made evaluating search results more straightforward and consistent. Some of the favorite features of the web site are the sequence utilities and the batch client functionality that allows processing of multiple samples from the command line interface. One measure of the success of the BCM Search Launcher is the number of sites that have adopted the models first developed on the site. The graphic display on the BLAST search from the NCBI web site is one such outgrowth, as is the display of protein domain search results within BLAST search results, and the design of the Biology Workbench application. The logs of usage and comments from users confirm the great utility of this resource.
Structator: fast index-based search for RNA sequence-structure patterns
2011-01-01
Background The secondary structure of RNA molecules is intimately related to their function and often more conserved than the sequence. Hence, the important task of searching databases for RNAs requires to match sequence-structure patterns. Unfortunately, current tools for this task have, in the best case, a running time that is only linear in the size of sequence databases. Furthermore, established index data structures for fast sequence matching, like suffix trees or arrays, cannot benefit from the complementarity constraints introduced by the secondary structure of RNAs. Results We present a novel method and readily applicable software for time efficient matching of RNA sequence-structure patterns in sequence databases. Our approach is based on affix arrays, a recently introduced index data structure, preprocessed from the target database. Affix arrays support bidirectional pattern search, which is required for efficiently handling the structural constraints of the pattern. Structural patterns like stem-loops can be matched inside out, such that the loop region is matched first and then the pairing bases on the boundaries are matched consecutively. This allows to exploit base pairing information for search space reduction and leads to an expected running time that is sublinear in the size of the sequence database. The incorporation of a new chaining approach in the search of RNA sequence-structure patterns enables the description of molecules folding into complex secondary structures with multiple ordered patterns. The chaining approach removes spurious matches from the set of intermediate results, in particular of patterns with little specificity. In benchmark experiments on the Rfam database, our method runs up to two orders of magnitude faster than previous methods. Conclusions The presented method's sublinear expected running time makes it well suited for RNA sequence-structure pattern matching in large sequence databases. RNA molecules containing several stem-loop substructures can be described by multiple sequence-structure patterns and their matches are efficiently handled by a novel chaining method. Beyond our algorithmic contributions, we provide with Structator a complete and robust open-source software solution for index-based search of RNA sequence-structure patterns. The Structator software is available at http://www.zbh.uni-hamburg.de/Structator. PMID:21619640
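To make the "inside out" matching concrete, the naive sketch below (no affix arrays) finds a stem-loop by locating the loop sequence first and then checking that the flanking stem positions can base-pair; real tools such as Structator do this far more efficiently. The example sequence is invented.

# Canonical and wobble base pairs allowed in the stem.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def find_stem_loops(seq, loop, stem_len):
    """Report start positions of stem-loops: the loop sequence is matched
    first, then the flanking stem positions are checked for complementarity."""
    hits = []
    start = seq.find(loop)
    while start != -1:
        left, right = start - stem_len, start + len(loop)
        if left >= 0 and right + stem_len <= len(seq):
            stems = zip(seq[left:start], reversed(seq[right:right + stem_len]))
            if all(pair in PAIRS for pair in stems):
                hits.append(left)
        start = seq.find(loop, start + 1)
    return hits

print(find_stem_loops("AAGGCGAUUUCGCCAA", loop="AUUU", stem_len=4))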
BLAST and FASTA similarity searching for multiple sequence alignment.
Pearson, William R
2014-01-01
BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry, i.e., homology. The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look-back times from 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies targeted for distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.
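As a small illustration of the practice recommended above (inferring homology from expectation values rather than percent identity), the sketch below filters BLAST+ tabular output (-outfmt 6) by an assumed E-value cutoff; the example hit line is invented.

```python
# Minimal sketch: rank candidate homologs by expectation value (E-value)
# rather than percent identity. Assumes BLAST+ tabular output (-outfmt 6),
# whose default columns put percent identity in column 3 and E-value in
# column 11 (1-based).

def likely_homologs(blast_tabular_lines, evalue_cutoff=1e-6):
    hits = []
    for line in blast_tabular_lines:
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        pct_identity = float(fields[2])
        evalue = float(fields[10])
        if evalue <= evalue_cutoff:          # homology inferred from E-value,
            hits.append((query, subject,     # not from percent identity
                         pct_identity, evalue))
    return sorted(hits, key=lambda h: h[3])


example = ["q1\tsp|P12345\t27.5\t210\t140\t6\t1\t205\t3\t208\t2e-12\t75.1"]
print(likely_homologs(example))
```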
Reneker, Jeff; Shyu, Chi-Ren; Zeng, Peiyu; Polacco, Joseph C.; Gassmann, Walter
2004-01-01
We have developed a web server for the life sciences community to use to search for short repeats of DNA sequence of length between 3 and 10 000 bases within multiple species. This search employs a unique and fast hash function approach. Our system also applies information retrieval algorithms to discover knowledge of cross-species conservation of repeat sequences. Furthermore, we have incorporated a part of the Gene Ontology database into our information retrieval algorithms to broaden the coverage of the search. Our web server and tutorial can be found at http://acmes.rnet.missouri.edu. PMID:15215469
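The hash-table idea behind such repeat searches can be sketched in a few lines: index every k-mer of the sequence and keep those seen more than once. This is a simplified stand-in, not the authors' actual hash function or cross-species retrieval logic.

```python
# Sketch of hash-based repeat detection: index every k-mer of the sequence in
# a dictionary (a hash table) and report those occurring more than once.

from collections import defaultdict


def repeated_kmers(sequence, k):
    """Return {kmer: [start positions]} for k-mers seen at least twice."""
    positions = defaultdict(list)
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


print(repeated_kmers("ACGTACGTTTACGT", 4))
# {'ACGT': [0, 4, 10], 'TACG': [3, 9]} -- only repeated k-mers are kept
```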
ERIC Educational Resources Information Center
Quin, Daniel
2017-01-01
This systematic review examined multiple indicators of adolescent students' engagement in school, and the indicators' associations with teacher-student relationships (TSRs). Seven psychology, education, and social sciences databases were systematically searched. From this search, 46 published studies (13 longitudinal) were included for detailed…
Self-Report Measures of Juvenile Psychopathic Personality Traits: A Comparative Review
ERIC Educational Resources Information Center
Vaughn, Michael G.; Howard, Matthew O.
2005-01-01
The authors evaluated self-report instruments currently being used to assess children and adolescents with psychopathic personality traits with respect to their reliability, validity, and research utility. Comprehensive searches across multiple computerized bibliographic databases were conducted and supplemented with manual searches. A total of 30…
A Taxonomic Search Engine: Federating taxonomic databases using web services
Page, Roderic DM
2005-01-01
Background The taxonomic name of an organism is a key link between different databases that store information on that organism. However, in the absence of a single, comprehensive database of organism names, individual databases lack an easy means of checking the correctness of a name. Furthermore, the same organism may have more than one name, and the same name may apply to more than one organism. Results The Taxonomic Search Engine (TSE) is a web application written in PHP that queries multiple taxonomic databases (ITIS, Index Fungorum, IPNI, NCBI, and uBIO) and summarises the results in a consistent format. It supports "drill-down" queries to retrieve a specific record. The TSE can optionally suggest alternative spellings the user can try. It also acts as a Life Science Identifier (LSID) authority for the source taxonomic databases, providing globally unique identifiers (and associated metadata) for each name. Conclusion The Taxonomic Search Engine is available at and provides a simple demonstration of the potential of the federated approach to providing access to taxonomic names. PMID:15757517
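The federated pattern the TSE implements can be summarised as: query each source, map each source-specific record onto one shared schema, and tolerate failures of individual sources. The sketch below uses hypothetical stand-in functions in place of the real ITIS/IPNI/NCBI/uBio web-service calls.

```python
# Illustration of the federation pattern: query several sources for a
# taxonomic name and merge the answers into one consistent record format.
# The per-source functions are stand-ins, not actual web-service clients.

def search_source_a(name):  # placeholder for one remote database
    return [{"name": name, "id": "A:17", "rank": "species"}]


def search_source_b(name):  # placeholder for another remote database
    return [{"scientificName": name, "lsid": "urn:lsid:example:B:9"}]


def normalise(source, record):
    """Map a source-specific record onto one shared schema."""
    return {
        "source": source,
        "name": record.get("name") or record.get("scientificName"),
        "identifier": record.get("id") or record.get("lsid"),
        "rank": record.get("rank", "unknown"),
    }


def federated_search(name):
    results = []
    for source, fn in (("SourceA", search_source_a), ("SourceB", search_source_b)):
        try:
            results.extend(normalise(source, r) for r in fn(name))
        except Exception as err:  # one unreachable source should not kill the query
            results.append({"source": source, "error": str(err)})
    return results


print(federated_search("Homo sapiens"))
```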
Brown, Elliot G
2003-01-01
The Medical Dictionary for Regulatory Activities (MedDRA) is a unified standard terminology for recording and reporting adverse drug event data. Its introduction is widely seen as a significant improvement on the previous situation, where a multitude of terminologies of widely varying scope and quality were in use. However, there are some complexities that may cause difficulties, and these will form the focus for this paper. Two methods of searching MedDRA-coded databases are described: searching based on term selection from all of MedDRA and searching based on terms in the safety database. There are several potential traps for the unwary in safety searches. There may be multiple locations of relevant terms within a system organ class (SOC) and lack of recognition of appropriate group terms; the user may think that group terms are more inclusive than is the case. MedDRA may distribute terms relevant to one medical condition across several primary SOCs. If the database supports the MedDRA model, it is possible to perform multiaxial searching: while this may help find terms that might have been missed, it is still necessary to consider the entire contents of the SOCs to find all relevant terms and there are many instances of incomplete secondary linkages. It is important to adjust for multiaxiality if data are presented using primary and secondary locations. Other sources for errors in searching are non-intuitive placement and the selection of terms as preferred terms (PTs) that may not be widely recognised. Some MedDRA rules could also result in errors in data retrieval if the individual is unaware of these: in particular, the lack of multiaxial linkages for the Investigations SOC, Social circumstances SOC and Surgical and medical procedures SOC and the requirement that a PT may only be present under one High Level Term (HLT) and one High Level Group Term (HLGT) within any single SOC. Special Search Categories (collections of PTs assembled from various SOCs by searching all of MedDRA) are limited by the small number available and by lack of clarity about criteria applied in their construction. Difficulties in database searching may be addressed by suitable user training and experience, and by central reporting of detected deficiencies in MedDRA. Other remedies may include regulatory guidance on implementation and use of MedDRA. Further systematic review of MedDRA is needed and generation of standardised searches that may be used 'off the shelf' will help, particularly where the same search is performed repeatedly on multiple data sets. Until these enhancements are widely available, MedDRA users should take great care when searching a safety database to ensure that cases are not inadvertently missed.
SW#db: GPU-Accelerated Exact Sequence Similarity Database Search.
Korpar, Matija; Šošić, Martin; Blažeka, Dino; Šikić, Mile
2015-01-01
In recent years we have witnessed a growth in sequencing yield, in the number of samples sequenced, and, as a result, in the size of publicly maintained sequence databases. This increase in available data places heavy demands on protein similarity search algorithms, which face two opposing goals: keeping running times acceptable while maintaining a high-enough level of sensitivity. The most time-consuming step of a similarity search is the local alignment between query and database sequences. This step is usually performed using exact local alignment algorithms such as Smith-Waterman. Due to its quadratic time complexity, aligning a query to the whole database is usually too slow. Therefore, the majority of protein similarity search methods apply heuristics, prior to the exact local alignment, to reduce the number of candidate sequences in the database. However, there is still a need for the alignment of a query sequence to a reduced database. In this paper we present the SW#db tool and a library for fast exact similarity search. Although its running times, as a standalone tool, are comparable to the running times of BLAST, it is primarily intended to be used for the exact local alignment phase in which the database of sequences has already been reduced. It uses both GPU and CPU parallelization and was 4-5 times faster than SSEARCH, 6-25 times faster than CUDASW++ and more than 20 times faster than SSW at the time of writing, using multiple queries on Swiss-prot and Uniref90 databases.
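For context, the exact local alignment step that SW#db accelerates is the classic Smith-Waterman dynamic program; a plain, unoptimised scoring version is sketched below to show the quadratic work per sequence pair (the library's GPU/CPU parallelisation is not reproduced).

```python
# Plain quadratic-time Smith-Waterman local-alignment scoring. Returns only
# the best local score; scoring parameters here are arbitrary defaults.

def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    rows, cols = len(a) + 1, len(b) + 1
    prev = [0] * cols
    best = 0
    for i in range(1, rows):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("HEAGAWGHEE", "PAWHEAE"))
```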
DOE Office of Scientific and Technical Information (OSTI.GOV)
Webb-Robertson, Bobbie-Jo M.
Accurate identification of peptides is a current challenge in mass spectrometry (MS) based proteomics. The standard approach uses a search routine to compare tandem mass spectra to a database of peptides associated with the target organism. These database search routines yield multiple metrics associated with the quality of the mapping of the experimental spectrum to the theoretical spectrum of a peptide. The structure of these results makes separating correct from false identifications difficult and has created a false-identification problem. Statistical confidence scores are an approach to combat this false-positive problem that has led to significant improvements in peptide identification. We have shown that machine learning, specifically the support vector machine (SVM), is an effective approach to separating true peptide identifications from false ones. The SVM-based peptide statistical scoring method transforms a peptide into a vector representation based on database search metrics to train and validate the SVM. In practice, following the database search routine, a peptide is converted into its vector representation and the SVM generates a single statistical score that is then used to classify its presence or absence in the sample.
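A minimal sketch of this SVM approach, assuming scikit-learn and using invented feature values (primary score, delta score, fraction of ions matched) in place of real search-engine metrics:

```python
# Represent each peptide-spectrum match as a vector of database-search metrics
# and train a binary classifier to separate correct from false identifications.
# Feature values below are invented for illustration.

import numpy as np
from sklearn.svm import SVC

# columns: primary score, delta to the next-best score, fraction of ions matched
X_train = np.array([
    [4.2, 0.45, 0.71],   # labelled correct (1)
    [3.8, 0.38, 0.65],
    [4.9, 0.52, 0.80],
    [1.1, 0.04, 0.22],   # labelled incorrect (0)
    [0.9, 0.02, 0.18],
    [1.6, 0.07, 0.30],
])
y_train = np.array([1, 1, 1, 0, 0, 0])

clf = SVC(kernel="rbf").fit(X_train, y_train)

candidate = np.array([[3.5, 0.30, 0.60]])
print(clf.predict(candidate), clf.decision_function(candidate))
```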
MEDLINE versus EMBASE and CINAHL for telemedicine searches.
Bahaadinbeigy, Kambiz; Yogesan, Kanagasingam; Wootton, Richard
2010-10-01
Researchers in the domain of telemedicine throughout the world tend to search multiple bibliographic databases to retrieve the highest possible number of publications when conducting review projects. Medical Literature Analysis and Retrieval System Online (MEDLINE), Excerpta Medica Database (EMBASE), and Cumulative Index to Nursing and Allied Health Literature (CINAHL) are three popular databases in the discipline of biomedicine that are used for conducting reviews. Access to the MEDLINE database is free and easy, whereas EMBASE and CINAHL are not free and sometimes not easy to access for researchers in small research centers. This project sought to compare MEDLINE with EMBASE and CINAHL to estimate what proportion of potentially relevant publications would be missed when only MEDLINE is used in a review project, in comparison to when EMBASE and CINAHL are also used. Twelve simple keywords relevant to 12 different telemedicine applications were searched using all three databases, and the results were compared. About 9%-18% of potentially relevant articles would have been missed if MEDLINE had been the only database used. It is preferable if all three or more databases are used when conducting a review in telemedicine. Researchers from developing countries or small research institutions could rely on only MEDLINE, but they would lose 9%-18% of the potentially relevant publications. Searching MEDLINE alone is not ideal, but in a resource-constrained situation, it is definitely better than nothing.
An Investigation of Graduate Student Knowledge and Usage of Open-Access Journals
ERIC Educational Resources Information Center
Beard, Regina M.
2016-01-01
Graduate students lament the need to achieve the proficiency necessary to competently search multiple databases for their research assignments, regularly eschewing these sources in favor of Google Scholar or some other search engine. The author conducted an anonymous survey investigating graduate student knowledge or awareness of the open-access…
Data-Base Software For Tracking Technological Developments
NASA Technical Reports Server (NTRS)
Aliberti, James A.; Wright, Simon; Monteith, Steve K.
1996-01-01
Technology Tracking System (TechTracS) computer program developed for use in storing and retrieving information on technology and related patent information developed under auspices of NASA Headquarters and NASA's field centers. Contents of database include multiple scanned still images and QuickTime movies as well as text. TechTracS includes word-processing, report-editing, chart-and-graph-editing, and search-editing subprograms. Extensive keyword searching capabilities enable rapid location of technologies, innovators, and companies. System performs routine functions automatically and serves multiple users.
ERIC Educational Resources Information Center
Nijs, Sara; Maes, Bea
2014-01-01
Social interactions may positively influence developmental and quality of life outcomes. Research in persons with profound intellectual and multiple disabilities (PIMD) mostly investigated interactions with caregivers. This literature review focuses on peer interactions of persons with PIMD. A computerized literature search of three databases was…
Adding Hierarchical Objects to Relational Database General-Purpose XML-Based Information Managements
NASA Technical Reports Server (NTRS)
Lin, Shu-Chun; Knight, Chris; La, Tracy; Maluf, David; Bell, David; Tran, Khai Peter; Gawdiak, Yuri
2006-01-01
NETMARK is a flexible, high-throughput software system for managing, storing, and rapidly searching unstructured and semi-structured documents. NETMARK transforms such documents from their original highly complex, constantly changing, heterogeneous data formats into well-structured, common data formats using Hypertext Markup Language (HTML) and/or Extensible Markup Language (XML). The software implements an object-relational database system that combines the best practices of the relational model utilizing Structured Query Language (SQL) with those of the object-oriented, semantic database model for creating complex data. In particular, NETMARK takes advantage of the Oracle 8i object-relational database model using physical-address data types for very efficient keyword searches of records across both context and content. NETMARK also supports multiple international standards such as WEBDAV for drag-and-drop file management and SOAP for integrated information management using Web services. The document-organization and -searching capabilities afforded by NETMARK are likely to make this software attractive for use in disciplines as diverse as science, auditing, and law enforcement.
Beyond MEDLINE for literature searches.
Conn, Vicki S; Isaramalai, Sang-arun; Rath, Sabyasachi; Jantarakupt, Peeranuch; Wadhawan, Rohini; Dash, Yashodhara
2003-01-01
To describe strategies for a comprehensive literature search. MEDLINE searches result in limited numbers of studies that are often biased toward statistically significant findings. Diversified search strategies are needed. Empirical evidence about the recall and precision of diverse search strategies is presented. Challenges and strengths of each search strategy are identified. Search strategies vary in recall and precision. Often sensitivity and specificity are inversely related. Valuable search strategies include examination of multiple diverse computerized databases, ancestry searches, citation index searches, examination of research registries, journal hand searching, contact with the "invisible college," examination of abstracts, Internet searches, and contact with sources of synthesized information. Extending searches beyond MEDLINE enables researchers to conduct more systematic comprehensive searches.
Search and selection methodology of systematic reviews in orthodontics (2000-2004).
Flores-Mir, Carlos; Major, Michael P; Major, Paul W
2006-08-01
More systematic reviews related to orthodontic topics are published each year, although little has been done to evaluate their search and selection methodologies. Systematic reviews related to orthodontics published between January 1, 2000, and December 31, 2004, were searched for their use of multiple electronic databases and secondary searches. The search and selection methods of identified systematic reviews were evaluated against the Cochrane Handbook's guidelines. Sixteen orthodontic systematic reviews were identified in this period. The percentage of reviews documenting and using each criterion of article searching has changed over the last 5 years, with no recognizable directional trend. On average, most systematic reviews documented their electronic search terms (88%) and inclusion-exclusion criteria (100%), and used secondary searching (75%). Many still failed to search more than MEDLINE (56%), failed to document the database names and search dates (37%), failed to document the search strategy (62%), did not use several reviewers for selecting studies (75%), and did not include all languages (81%). The methodology of systematic reviews in orthodontics is still limited, with key methodological components frequently absent or not appropriately described.
Computer-based Interactive Literature Searching for CSU-Chico Chemistry Students.
ERIC Educational Resources Information Center
Cooke, Ron C.; And Others
The intent of this instructional manual, which is aimed at exploring the literature of a discipline and presented in a self-paced, course segment format applicable to any course content, is to enable college students to conduct computer-based interactive searches through multiple databases. The manual is divided into 10 chapters: (1) Introduction,…
Information Technology and the Evolution of the Library
2009-03-01
Resource Commons / Repository / Federated Search ILS (GLADIS/Pathfinder - Millenium) / Catalog / Circulation / Acquisitions / Digital Object Content ... content management services to help centralize and distribute digital content from across the institution, software to allow for seamless federated searching across multiple databases, and imaging software to allow for daily reimaging of terminals to reduce security concerns that otherwise ...
GenderMedDB: an interactive database of sex and gender-specific medical literature.
Oertelt-Prigione, Sabine; Gohlke, Björn-Oliver; Dunkel, Mathias; Preissner, Robert; Regitz-Zagrosek, Vera
2014-01-01
Searches for sex and gender-specific publications are complicated by the absence of a specific algorithm within search engines and by the lack of adequate archives to collect the retrieved results. We previously addressed this issue by initiating the first systematic archive of medical literature containing sex and/or gender-specific analyses. This initial collection has now been greatly enlarged and re-organized as a free user-friendly database with multiple functions: GenderMedDB (http://gendermeddb.charite.de). GenderMedDB retrieves the included publications from the PubMed database. Manuscripts containing sex and/or gender-specific analysis are continuously screened and the relevant findings organized systematically into disciplines and diseases. Publications are furthermore classified by research type, subject and participant numbers. More than 11,000 abstracts are currently included in the database, after screening more than 40,000 publications. The main functions of the database include searches by publication data or content analysis based on pre-defined classifications. In addition, registrants are enabled to upload relevant publications, access descriptive publication statistics and interact in an open user forum. Overall, GenderMedDB offers the advantages of a discipline-specific search engine as well as the functions of a participative tool for the gender medicine community.
The CTBTO Link to the database of the International Seismological Centre (ISC)
NASA Astrophysics Data System (ADS)
Bondar, I.; Storchak, D. A.; Dando, B.; Harris, J.; Di Giacomo, D.
2011-12-01
The CTBTO Link to the database of the International Seismological Centre (ISC) is a project to provide access to seismological data sets maintained by the ISC using specially designed interactive tools. The Link is open to National Data Centres and to the CTBTO. By means of graphical interfaces and database queries tailored to the needs of the monitoring community, the users are given access to a multitude of products. These include the ISC and ISS bulletins, covering the seismicity of the Earth since 1904; nuclear and chemical explosions; the EHB bulletin; the IASPEI Reference Event list (ground truth database); and the IDC Reviewed Event Bulletin. The searches are divided into three main categories: The Area Based Search (a spatio-temporal search based on the ISC Bulletin), the REB search (a spatio-temporal search based on specific events in the REB) and the IMS Station Based Search (a search for historical patterns in the reports of seismic stations close to a particular IMS seismic station). The outputs are HTML based web-pages with a simplified version of the ISC Bulletin showing the most relevant parameters with access to ISC, GT, EHB and REB Bulletins in IMS1.0 format for single or multiple events. The CTBTO Link offers a tool to view REB events in context within the historical seismicity, look at observations reported by non-IMS networks, and investigate station histories and residual patterns for stations registered in the International Seismographic Station Registry.
CHERNOLITTM. Chernobyl Bibliographic Search System
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carr, F., Jr.; Kennedy, R.A.; Mahaffey, J.A.
1992-03-02
The Chernobyl Bibliographic Search System (Chernolit TM) provides bibliographic data in a usable format for research studies relating to the Chernobyl nuclear accident that occurred in the former Ukrainian Republic of the USSR in 1986. Chernolit TM is a portable and easy to use product. The bibliographic data is provided under the control of a graphical user interface so that the user may quickly and easily retrieve pertinent information from the large database. The user may search the database for occurrences of words, names, or phrases; view bibliographic references on screen; and obtain reports of selected references. Reports may be viewed on the screen, printed, or accumulated in a folder that is written to a disk file when the user exits the software. Chernolit TM provides a cost-effective alternative to multiple, independent literature searches. Forty-five hundred references concerning the accident, including abstracts, are distributed with Chernolit TM. The data contained in the database were obtained from electronic literature searches and from requested donations from individuals and organizations. These literature searches interrogated the Energy Science and Technology database (formerly DOE ENERGY) of the DIALOG Information Retrieval Service. Energy Science and Technology, provided by the U.S. DOE, Washington, D.C., is a multi-disciplinary database containing references to the world's scientific and technical literature on energy. All unclassified information processed at the Office of Scientific and Technical Information (OSTI) of the U.S. DOE is included in the database. In addition, information on many documents has been manually added to Chernolit TM. Most of this information was obtained in response to requests for data sent to people and/or organizations throughout the world.
Characteristics and correlates of coping with multiple sclerosis: a systematic review.
Keramat Kar, Maryam; Whitehead, Lisa; Smith, Catherine M
2017-10-10
The purpose of this systematic review was to examine coping strategies that people with multiple sclerosis use, and to identify factors that influence their coping pattern. This systematic review followed the Joanna Briggs Institute guidelines for synthesizing descriptive quantitative research. The following databases were searched from the inception of databases until December 2016: Ovid (Medline, Embase, CINAHL, and PsycINFO), Science Direct, Web of Science, and Scopus. Manual search was also conducted from the reference lists of retrieved articles. Findings related to the patterns of coping with multiple sclerosis and factors influencing coping with multiple sclerosis were extracted and synthesized. The search of the database yielded 455 articles. After excluding duplicates (n = 341) and studies that did not meet the inclusion criteria (n = 27), 71 studies were included in the full-text review. Following the full-text, a further 21 studies were excluded. Quality appraisal of 50 studies was completed, and 38 studies were included in the review. Synthesis of findings indicated that people with multiple sclerosis use emotional and avoidance coping strategies more than other types of coping, particularly in the early stages of the disease. In comparison to the general population, people with multiple sclerosis were less likely to use active coping strategies and used more avoidance and emotional coping strategies. The pattern of coping with multiple sclerosis was associated with individual, clinical and psychological factors including gender, educational level, clinical course, mood and mental status, attitude, personality traits, and religious beliefs. The findings of this review suggest that considering individual or disease-related factors could help healthcare professionals in identifying those less likely to adapt to multiple sclerosis. This information could also be used to provide client-centered rehabilitation for people living with multiple sclerosis based on their individual responses and perceptions for coping. Implications for rehabilitation Engagement in coping with multiple sclerosis has been associated with individual factors and neuropsychological functions. Considering individual and disease-related factors would allow healthcare professionals to provide more tailored interventions to maintain and master coping with multiple sclerosis. People living with multiple sclerosis should be empowered to appraise and manage ability to cope based on the contextual evidence (individual and clinical condition). Rehabilitation services should move beyond physical management incorporating behavioral aspects for better functioning in living with multiple sclerosis.
Searching for patterns in remote sensing image databases using neural networks
NASA Technical Reports Server (NTRS)
Paola, Justin D.; Schowengerdt, Robert A.
1995-01-01
We have investigated a method, based on a successful neural network multispectral image classification system, of searching for single patterns in remote sensing databases. While defining the pattern to search for and the feature to be used for that search (spectral, spatial, temporal, etc.) is challenging, a more difficult task is selecting competing patterns to train against the desired pattern. Schemes for competing pattern selection, including random selection and human interpreted selection, are discussed in the context of an example detection of dense urban areas in Landsat Thematic Mapper imagery. When applying the search to multiple images, a simple normalization method can alleviate the problem of inconsistent image calibration. Another potential problem, that of highly compressed data, was found to have a minimal effect on the ability to detect the desired pattern. The neural network algorithm has been implemented using the PVM (Parallel Virtual Machine) library and nearly-optimal speedups have been obtained that help alleviate the long process of searching through imagery.
Tiered Human Integrated Sequence Search Databases for Shotgun Proteomics.
Deutsch, Eric W; Sun, Zhi; Campbell, David S; Binz, Pierre-Alain; Farrah, Terry; Shteynberg, David; Mendoza, Luis; Omenn, Gilbert S; Moritz, Robert L
2016-11-04
The results of analysis of shotgun proteomics mass spectrometry data can be greatly affected by the selection of the reference protein sequence database against which the spectra are matched. For many species there are multiple sources from which somewhat different sequence sets can be obtained. This can lead to confusion about which database is best in which circumstances-a problem especially acute in human sample analysis. All sequence databases are genome-based, with sequences for the predicted gene and their protein translation products compiled. Our goal is to create a set of primary sequence databases that comprise the union of sequences from many of the different available sources and make the result easily available to the community. We have compiled a set of four sequence databases of varying sizes, from a small database consisting of only the ∼20,000 primary isoforms plus contaminants to a very large database that includes almost all nonredundant protein sequences from several sources. This set of tiered, increasingly complete human protein sequence databases suitable for mass spectrometry proteomics sequence database searching is called the Tiered Human Integrated Search Proteome set. In order to evaluate the utility of these databases, we have analyzed two different data sets, one from the HeLa cell line and the other from normal human liver tissue, with each of the four tiers of database complexity. The result is that approximately 0.8%, 1.1%, and 1.5% additional peptides can be identified for Tiers 2, 3, and 4, respectively, as compared with the Tier 1 database, at substantially increasing computational cost. This increase in computational cost may be worth bearing if the identification of sequence variants or the discovery of sequences that are not present in the reviewed knowledge base entries is an important goal of the study. We find that it is useful to search a data set against a simpler database, and then check the uniqueness of the discovered peptides against a more complex database. We have set up an automated system that downloads all the source databases on the first of each month and automatically generates a new set of search databases and makes them available for download at http://www.peptideatlas.org/thisp/ .
ERIC Educational Resources Information Center
Noesgaard, Signe Schack; Ørngreen, Rikke
2015-01-01
A structured search of library databases revealed that research examining the effectiveness of e-Learning has heavily increased within the last five years. After taking a closer look at the search results, the authors discovered that previous researchers defined and investigated effectiveness in multiple ways. At the same time, learning and…
Mobile object retrieval in server-based image databases
NASA Astrophysics Data System (ADS)
Manger, D.; Pagel, F.; Widak, H.
2013-05-01
The increasing number of mobile phones equipped with powerful cameras leads to huge collections of user-generated images. To utilize the information of the images on site, image retrieval systems are becoming more and more popular for searching for similar objects in a user's own image database. As the computational performance and the memory capacity of mobile devices are constantly increasing, this search can often be performed on the device itself. This is feasible, for example, if the images are represented with global image features or if the search is done using EXIF or textual metadata. However, for larger image databases, if multiple users are meant to contribute to a growing image database or if powerful content-based image retrieval methods with local features are required, a server-based image retrieval backend is needed. In this work, we present a content-based image retrieval system with a client-server architecture working with local features. On the server side, the scalability to large image databases is addressed with the popular bag-of-words model with state-of-the-art extensions. The client end of the system focuses on a lightweight user interface presenting the most similar images of the database, highlighting the visual information which is common with the query image. Additionally, new images can be added to the database, making it a powerful and interactive tool for mobile content-based image retrieval.
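The server-side bag-of-words model mentioned above can be illustrated compactly: local descriptors are quantised against a k-means visual vocabulary, each image becomes a normalised word histogram, and images are ranked by dot-product similarity. In the sketch below, random vectors stand in for real ORB/SIFT descriptors and all sizes are arbitrary.

```python
# Sketch of bag-of-visual-words retrieval: cluster local descriptors into a
# visual vocabulary, represent each image as a word histogram, and compare
# query and database histograms by cosine-like similarity.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
descriptor_dim, vocab_size = 32, 8

# Pretend these are local feature descriptors extracted from database images.
database_descriptors = {f"img_{i}": rng.normal(size=(60, descriptor_dim))
                        for i in range(5)}

vocab = KMeans(n_clusters=vocab_size, n_init=10, random_state=0).fit(
    np.vstack(list(database_descriptors.values())))


def bow_histogram(descriptors):
    words = vocab.predict(descriptors)
    hist = np.bincount(words, minlength=vocab_size).astype(float)
    return hist / (np.linalg.norm(hist) or 1.0)


index = {name: bow_histogram(d) for name, d in database_descriptors.items()}

query = rng.normal(size=(45, descriptor_dim))  # descriptors of a query image
q_hist = bow_histogram(query)
ranking = sorted(index.items(), key=lambda kv: -float(q_hist @ kv[1]))
print([name for name, _ in ranking])           # most similar first
```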
Practical and Efficient Searching in Proteomics: A Cross Engine Comparison
Paulo, Joao A.
2014-01-01
Background Analysis of large datasets produced by mass spectrometry-based proteomics relies on database search algorithms to sequence peptides and identify proteins. Several such scoring methods are available, each based on different statistical foundations and thereby not producing identical results. Here, the aim is to compare peptide and protein identifications using multiple search engines and examine the additional proteins gained by increasing the number of technical replicate analyses. Methods A HeLa whole cell lysate was analyzed on an Orbitrap mass spectrometer for 10 technical replicates. The data were combined and searched using Mascot, SEQUEST, and Andromeda. Comparisons were made of peptide and protein identifications among the search engines. In addition, searches using each engine were performed with incrementing number of technical replicates. Results The number and identity of peptides and proteins differed across search engines. For all three search engines, the differences in proteins identifications were greater than the differences in peptide identifications indicating that the major source of the disparity may be at the protein inference grouping level. The data also revealed that analysis of 2 technical replicates can increase protein identifications by up to 10-15%, while a third replicate results in an additional 4-5%. Conclusions The data emphasize two practical methods of increasing the robustness of mass spectrometry data analysis. The data show that 1) using multiple search engines can expand the number of identified proteins (union) and validate protein identifications (intersection), and 2) analysis of 2 or 3 technical replicates can substantially expand protein identifications. Moreover, information can be extracted from a dataset by performing database searching with different engines and performing technical repeats, which requires no additional sample preparation and effectively utilizes research time and effort. PMID:25346847
Memory-based attention capture when multiple items are maintained in visual working memory.
Hollingworth, Andrew; Beck, Valerie M
2016-07-01
Efficient visual search requires that attention is guided strategically to relevant objects, and most theories of visual search implement this function by means of a target template maintained in visual working memory (VWM). However, there is currently debate over the architecture of VWM-based attentional guidance. We contrasted a single-item-template hypothesis with a multiple-item-template hypothesis, which differ in their claims about structural limits on the interaction between VWM representations and perceptual selection. Recent evidence from van Moorselaar, Theeuwes, and Olivers (2014) indicated that memory-based capture during search, an index of VWM guidance, is not observed when memory set size is increased beyond a single item, suggesting that multiple items in VWM do not guide attention. In the present study, we maximized the overlap between multiple colors held in VWM and the colors of distractors in a search array. Reliable capture was observed when 2 colors were held in VWM and both colors were present as distractors, using both the original van Moorselaar et al. singleton-shape search task and a search task that required focal attention to array elements (gap location in outline square stimuli). In the latter task, memory-based capture was consistent with the simultaneous guidance of attention by multiple VWM representations. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Recruitment strategies for an osteoporosis clinical trial: analysis of effectiveness.
Heard, Allison; March, Rachel; Maguire, Patricia; Reilly, Penny; Helmore, Joy; Cameron, Sheryl; Frampton, Christopher; Nicholls, Gary; Gilchrist, Nigel
2012-09-01
To examine the effectiveness of a planned rapid recruitment strategy in an osteoporosis clinical trial. Multiple recruitment methods were explored, including media advertising, searching bone density scan and X-ray results in specialist and primary practice databases, community initiatives, and generation of research centre and study-specific pamphlets. Of 246 women screened, 41 consenting to the study, only 14 were randomised. Thus, 232 (94%) volunteers were screen failures, ineligible or declined to participate. With regard to the cost-effectiveness of all recruitment strategies, searching the research centre database was the most successful, with four women randomised at a cost of approximately NZ$302 per volunteer. Other strategies were less cost-effective. Obtaining a specific study cohort can be achieved by a comprehensive, targeted, rapid recruitment program. A research centre database search was the most successful and cost-effective recruitment modality in this small study. © 2012 Canterbury Geriatric Medical Research Trust. Australasian Journal on Ageing © 2012 ACOTA.
Design and implementation of the NPOI database and website
NASA Astrophysics Data System (ADS)
Newman, K.; Jorgensen, A. M.; Landavazo, M.; Sun, B.; Hutter, D. J.; Armstrong, J. T.; Mozurkewich, David; Elias, N.; van Belle, G. T.; Schmitt, H. R.; Baines, E. K.
2014-07-01
The Navy Precision Optical Interferometer (NPOI) has been recording astronomical observations for nearly two decades, at this point with hundreds of thousands of individual observations recorded to date for a total data volume of many terabytes. To make maximum use of the NPOI data it is necessary to organize them in an easily searchable manner and be able to extract essential diagnostic information from the data to allow users to quickly gauge data quality and suitability for a specific science investigation. This sets the motivation for creating a comprehensive database of observation metadata as well as, at least, reduced data products. The NPOI database is implemented in MySQL using standard database tools and interfaces. The use of standard database tools allows us to focus on top-level database and interface implementation and take advantage of standard features such as backup, remote access, mirroring, and complex queries which would otherwise be time-consuming to implement. A website was created in order to give scientists a user friendly interface for searching the database. It allows the user to select various metadata to search for and also allows them to decide how and what results are displayed. This streamlines the searches, making it easier and quicker for scientists to find the information they are looking for. The website has multiple browser and device support. In this paper we present the design of the NPOI database and website, and give examples of its use.
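As a toy illustration of the kind of observation-metadata store and search the site exposes, the sketch below uses SQLite in place of the NPOI's MySQL backend; the table, columns, and values are invented for illustration.

```python
# Toy observation-metadata store and search, standing in for the MySQL-backed
# database described above. Schema and data are hypothetical.

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE observation (
                   id INTEGER PRIMARY KEY,
                   target TEXT,
                   obs_date TEXT,          -- ISO date
                   n_scans INTEGER,
                   mean_visibility REAL)""")
con.executemany(
    "INSERT INTO observation (target, obs_date, n_scans, mean_visibility) "
    "VALUES (?, ?, ?, ?)",
    [("FKV0193", "2013-05-11", 42, 0.61),
     ("FKV0622", "2013-05-12", 17, 0.23),
     ("FKV0193", "2014-02-03", 55, 0.58)])

# A search like the website's: by target name, date range, and a quality cut.
rows = con.execute(
    "SELECT target, obs_date, n_scans FROM observation "
    "WHERE target = ? AND obs_date BETWEEN ? AND ? AND mean_visibility > ? "
    "ORDER BY obs_date",
    ("FKV0193", "2013-01-01", "2014-12-31", 0.5)).fetchall()
print(rows)
```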
An image database management system for conducting CAD research
NASA Astrophysics Data System (ADS)
Gruszauskas, Nicholas; Drukker, Karen; Giger, Maryellen L.
2007-03-01
The development of image databases for CAD research is not a trivial task. The collection and management of images and their related metadata from multiple sources is a time-consuming but necessary process. By standardizing and centralizing the methods by which these data are maintained, one can generate subsets of a larger database that match the specific criteria needed for a particular research project in a quick and efficient manner. A research-oriented management system of this type is highly desirable in a multi-modality CAD research environment. An online, web-based database system for the storage and management of research-specific medical image metadata was designed for use with four modalities of breast imaging: screen-film mammography, full-field digital mammography, breast ultrasound and breast MRI. The system was designed to consolidate data from multiple clinical sources and provide the user with the ability to anonymize the data. Input concerning the type of data to be stored as well as desired searchable parameters was solicited from researchers in each modality. The backbone of the database was created using MySQL. A robust and easy-to-use interface for entering, removing, modifying and searching information in the database was created using HTML and PHP. This standardized system can be accessed using any modern web-browsing software and is fundamental for our various research projects on computer-aided detection, diagnosis, cancer risk assessment, multimodality lesion assessment, and prognosis. Our CAD database system stores large amounts of research-related metadata and successfully generates subsets of cases that match the user's desired search criteria.
Jones, Andrew R; Siepen, Jennifer A; Hubbard, Simon J; Paton, Norman W
2009-03-01
LC-MS experiments can generate large quantities of data, for which a variety of database search engines are available to make peptide and protein identifications. Decoy databases are becoming widely used to place statistical confidence in result sets, allowing the false discovery rate (FDR) to be estimated. Different search engines produce different identification sets so employing more than one search engine could result in an increased number of peptides (and proteins) being identified, if an appropriate mechanism for combining data can be defined. We have developed a search engine independent score, based on FDR, which allows peptide identifications from different search engines to be combined, called the FDR Score. The results demonstrate that the observed FDR is significantly different when analysing the set of identifications made by all three search engines, by each pair of search engines or by a single search engine. Our algorithm assigns identifications to groups according to the set of search engines that have made the identification, and re-assigns the score (combined FDR Score). The combined FDR Score can differentiate between correct and incorrect peptide identifications with high accuracy, allowing on average 35% more peptide identifications to be made at a fixed FDR than using a single search engine.
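The decoy-based FDR estimate that the FDR Score builds on can be sketched as a q-value-style computation: sort identifications by score and track the decoy/target ratio at each threshold. The grouping by search-engine agreement and the combined FDR Score from the paper are not reproduced in this simplified sketch.

```python
# Simplified decoy-based FDR estimation: sort peptide-spectrum matches by
# score, estimate FDR at each threshold as decoys/targets, then enforce
# monotonicity (a q-value-like quantity).

def fdr_scores(psms):
    """psms: list of (score, is_decoy). Returns (score, fdr) pairs ordered
    from best score to worst."""
    ordered = sorted(psms, key=lambda p: -p[0])
    targets = decoys = 0
    raw = []
    for score, is_decoy in ordered:
        decoys += is_decoy
        targets += not is_decoy
        raw.append((score, decoys / max(targets, 1)))
    # make FDR non-increasing when moving toward better scores
    out, running_min = [], float("inf")
    for score, fdr in reversed(raw):
        running_min = min(running_min, fdr)
        out.append((score, running_min))
    return list(reversed(out))


example = [(9.1, False), (8.7, False), (8.2, True), (7.9, False), (7.5, True)]
print(fdr_scores(example))
```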
Nursing Reference Center: a point-of-care resource.
Vardell, Emily; Paulaitis, Gediminas Geddy
2012-01-01
Nursing Reference Center is a point-of-care resource designed for the practicing nurse, as well as nursing administrators, nursing faculty, and librarians. Users can search across multiple resources, including topical Quick Lessons, evidence-based care sheets, patient education materials, practice guidelines, and more. Additional features include continuing education modules, e-books, and a new iPhone application. A sample search and comparison with similar databases were conducted.
Smith, Constance M; Finger, Jacqueline H; Kadin, James A; Richardson, Joel E; Ringwald, Martin
2014-10-01
Because molecular mechanisms of development are extraordinarily complex, the understanding of these processes requires the integration of pertinent research data. Using the Gene Expression Database for Mouse Development (GXD) as an example, we illustrate the progress made toward this goal, and discuss relevant issues that apply to developmental databases and developmental research in general. Since its first release in 1998, GXD has served the scientific community by integrating multiple types of expression data from publications and electronic submissions and by making these data freely and widely available. Focusing on endogenous gene expression in wild-type and mutant mice and covering data from RNA in situ hybridization, in situ reporter (knock-in), immunohistochemistry, reverse transcriptase-polymerase chain reaction, Northern blot, and Western blot experiments, the database has grown tremendously over the years in terms of data content and search utilities. Currently, GXD includes over 1.4 million annotated expression results and over 260,000 images. All these data and images are readily accessible to many types of database searches. Here we describe the data and search tools of GXD; explain how to use the database most effectively; discuss how we acquire, curate, and integrate developmental expression information; and describe how the research community can help in this process. Copyright © 2014 The Authors Developmental Dynamics published by Wiley Periodicals, Inc. on behalf of American Association of Anatomists.
Note on online books and articles about the history of dissociation.
Alvarado, Carlos S
2008-01-01
Students of the history of dissociation will be interested in the materials on the subject available in the digital document database Google Book Search. This includes a variety of books and journals covering automatic writing, hypnosis, mediumship, multiple personality, trance, somnambulism, and other topics. Among the authors represented in the database are: Eugène Azam, Alfred Binet, James Braid, Jean-Martin Charcot, Pierre Janet, Frederic W.H. Myers, Morton Prince, and Boris Sidis, among others. The database includes examples of case reports, conceptual discussions, and psychiatric and psychological textbook literature.
Memory for found targets interferes with subsequent performance in multiple-target visual search.
Cain, Matthew S; Mitroff, Stephen R
2013-10-01
Multiple-target visual searches--when more than 1 target can appear in a given search display--are commonplace in radiology, airport security screening, and the military. Whereas 1 target is often found accurately, additional targets are more likely to be missed in multiple-target searches. To better understand this decrement in 2nd-target detection, here we examined 2 potential forms of interference that can arise from finding a 1st target: interference from the perceptual salience of the 1st target (a now highly relevant distractor in a known location) and interference from a newly created memory representation for the 1st target. Here, we found that removing found targets from the display or making them salient and easily segregated color singletons improved subsequent search accuracy. However, replacing found targets with random distractor items did not improve subsequent search accuracy. Removing and highlighting found targets likely reduced both a target's visual salience and its memory load, whereas replacing a target removed its visual salience but not its representation in memory. Collectively, the current experiments suggest that the working memory load of a found target has a larger effect on subsequent search accuracy than does its perceptual salience. PsycINFO Database Record (c) 2013 APA, all rights reserved.
MassSieve: Panning MS/MS peptide data for proteins
Slotta, Douglas J.; McFarland, Melinda A.; Markey, Sanford P.
2010-01-01
We present MassSieve, a Java-based platform for visualization and parsimony analysis of single and comparative LC-MS/MS database search engine results. The success of mass spectrometric peptide sequence assignment algorithms has led to the need for a tool to merge and evaluate the increasing data set sizes that result from LC-MS/MS-based shotgun proteomic experiments. MassSieve supports reports from multiple search engines with differing search characteristics, which can increase peptide sequence coverage and/or identify conflicting or ambiguous spectral assignments. PMID:20564260
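The parsimony step such tools perform when merging peptide evidence can be approximated by a greedy set cover over proteins, as in the sketch below; real parsimony analysis also handles subset and equivalent protein groups, which are omitted here.

```python
# Toy parsimony: choose a small set of proteins that explains all observed
# peptides via a greedy set cover. Input maps protein -> set of peptides.

def parsimonious_proteins(protein_to_peptides):
    unexplained = set().union(*protein_to_peptides.values())
    chosen = []
    while unexplained:
        # pick the protein explaining the most still-unexplained peptides
        best = max(protein_to_peptides,
                   key=lambda p: len(protein_to_peptides[p] & unexplained))
        gained = protein_to_peptides[best] & unexplained
        if not gained:
            break
        chosen.append(best)
        unexplained -= gained
    return chosen


evidence = {
    "P1": {"pepA", "pepB", "pepC"},
    "P2": {"pepB"},            # subsumed by P1
    "P3": {"pepD", "pepE"},
}
print(parsimonious_proteins(evidence))  # -> ['P1', 'P3']
```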
Transition From Clinical to Educator Roles in Nursing: An Integrative Review.
Fritz, Elizabeth
This review identified barriers to and facilitators of nurses' transition from clinical positions into nursing professional development and other nurse educator roles. The author conducted literature searches using multiple databases. Twenty-one articles met search criteria, representing a variety of practice settings. The findings, both barriers and facilitators, were remarkably consistent across practice settings. Four practice recommendations were drawn from the literature to promote nurses' successful transition to nursing professional development roles.
Cho, Jin-Young; Lee, Hyoung-Joo; Jeong, Seul-Ki; Paik, Young-Ki
2017-12-01
Mass spectrometry (MS) is a widely used proteome analysis tool for biomedical science. In an MS-based bottom-up proteomic approach to protein identification, sequence database (DB) searching has been routinely used because of its simplicity and convenience. However, searching a sequence DB with multiple variable modification options can increase processing time and false-positive errors in large and complicated MS data sets. Spectral library searching is an alternative solution, avoiding the limitations of sequence DB searching and allowing the detection of more peptides with high sensitivity. Unfortunately, this technique has less proteome coverage, resulting in limitations in the detection of novel and whole peptide sequences in biological samples. To solve these problems, we previously developed the "Combo-Spec Search" method, which manually uses multiple reference and simulated spectral libraries for searching to analyze whole proteomes in a biological sample. In this study, we have developed a new analytical interface tool called "Epsilon-Q" to enhance the functions of both the Combo-Spec Search method and label-free protein quantification. Epsilon-Q automatically performs multiple spectral library searches, class-specific false-discovery rate control, and result integration. It has a user-friendly graphical interface and demonstrates good performance in identifying and quantifying proteins by supporting standard MS data formats and spectrum-to-spectrum matching powered by SpectraST. Furthermore, when the Epsilon-Q interface is combined with the Combo-Spec search method, called the Epsilon-Q system, it shows a synergistic function by outperforming other sequence DB search engines for identifying and quantifying low-abundance proteins in biological samples. The Epsilon-Q system can be a versatile tool for comparative proteome analysis based on multiple spectral libraries and label-free quantification.
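The spectrum-to-spectrum matching at the heart of spectral library searching is commonly scored with a normalised dot product over binned peaks; the sketch below illustrates that scoring with invented peak lists and an assumed 1 m/z bin width (it is not the SpectraST implementation).

```python
# Normalised dot-product comparison of a query spectrum against a library
# spectrum after binning peaks by m/z. Bin width and peak lists are invented.

import math
from collections import defaultdict


def binned_vector(peaks, bin_width=1.0):
    """peaks: list of (m/z, intensity). Returns {bin: normalised intensity}."""
    vec = defaultdict(float)
    for mz, intensity in peaks:
        vec[int(mz / bin_width)] += intensity
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {b: v / norm for b, v in vec.items()}


def dot_product(query_peaks, library_peaks):
    q, l = binned_vector(query_peaks), binned_vector(library_peaks)
    return sum(q[b] * l.get(b, 0.0) for b in q)


query = [(175.1, 30.0), (276.2, 100.0), (389.2, 55.0)]
library_entry = [(175.1, 25.0), (276.1, 90.0), (389.3, 60.0)]
print(round(dot_product(query, library_entry), 3))  # close to 1 => good match
```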
Blin, Kai; Medema, Marnix H; Kottmann, Renzo; Lee, Sang Yup; Weber, Tilmann
2017-01-04
Secondary metabolites produced by microorganisms are the main source of bioactive compounds that are in use as antimicrobial and anticancer drugs, fungicides, herbicides and pesticides. In the last decade, the increasing availability of microbial genomes has established genome mining as a very important method for the identification of their biosynthetic gene clusters (BGCs). One of the most popular tools for this task is antiSMASH. However, so far, antiSMASH is limited to de novo computing results for user-submitted genomes and only partially connects these with BGCs from other organisms. Therefore, we developed the antiSMASH database, a simple but highly useful new resource for browsing antiSMASH-annotated BGCs in the 3907 bacterial genomes currently in the database and for performing advanced search queries that combine multiple search criteria. antiSMASH-DB is available at http://antismash-db.secondarymetabolites.org/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Grubert, Anna; Eimer, Martin
2013-10-01
To find out whether attentional target selection can be effectively guided by top-down task sets for multiple colors, we measured behavioral and ERP markers of attentional target selection in an experiment where participants had to identify color-defined target digits that were accompanied by a single gray distractor object in the opposite visual field. In the One Color task, target color was constant. In the Two Color task, targets could have one of two equally likely colors. Color-guided target selection was less efficient during multiple-color relative to single-color search, and this was reflected by slower response times and delayed N2pc components. Nontarget-color items that were presented in half of all trials captured attention and gained access to working memory when participants searched for two colors, but were excluded from attentional processing in the One Color task. Results demonstrate qualitative differences in the guidance of attentional target selection between single-color and multiple-color visual search. They suggest that top-down attentional control can be applied much more effectively when it is based on a single feature-specific attentional template. PsycINFO Database Record (c) 2013 APA, all rights reserved.
Quality of search strategies reported in systematic reviews published in stereotactic radiosurgery.
Faggion, Clovis M; Wu, Yun-Chun; Tu, Yu-Kang; Wasiak, Jason
2016-06-01
Systematic reviews require comprehensive literature search strategies to avoid publication bias. This study aimed to assess and evaluate the reporting quality of search strategies within systematic reviews published in the field of stereotactic radiosurgery (SRS). Three electronic databases (Ovid MEDLINE(®), Ovid EMBASE(®) and the Cochrane Library) were searched to identify systematic reviews addressing SRS interventions, with the last search performed in October 2014. Manual searches of the reference lists of included systematic reviews were conducted. The search strategies of the included systematic reviews were assessed using a standardized nine-question form based on the Cochrane Collaboration guidelines and Assessment of Multiple Systematic Reviews checklist. Multiple linear regression analyses were performed to identify the important predictors of search quality. A total of 85 systematic reviews were included. The median quality score of search strategies was 2 (interquartile range = 2). Whilst 89% of systematic reviews reported the use of search terms, only 14% of systematic reviews reported searching the grey literature. Multiple linear regression analyses identified publication year (continuous variable), meta-analysis performance and journal impact factor (continuous variable) as predictors of higher mean quality scores. This study identified the urgent need to improve the quality of search strategies within systematic reviews published in the field of SRS. This study is the first to address how authors performed searches to select clinical studies for inclusion in their systematic reviews. Comprehensive and well-implemented search strategies are pivotal to reduce the chance of publication bias and consequently generate more reliable systematic review findings.
Rathbone, John; Carter, Matt; Hoffmann, Tammy; Glasziou, Paul
2016-02-09
Bibliographic databases are the primary resource for identifying systematic reviews of health care interventions. Reliable retrieval of systematic reviews depends on the scope of indexing used by database providers. Therefore, searching one database may be insufficient, but it is unclear how many need to be searched. We sought to evaluate the performance of seven major bibliographic databases for the identification of systematic reviews for hypertension. We searched seven databases (Cochrane library, Database of Abstracts of Reviews of Effects (DARE), Excerpta Medica Database (EMBASE), Epistemonikos, Medical Literature Analysis and Retrieval System Online (MEDLINE), PubMed Health and Turning Research Into Practice (TRIP)) from 2003 to 2015 for systematic reviews of any intervention for hypertension. Citations retrieved were screened for relevance, coded and checked for screening consistency using a fuzzy text matching query. The performance of each database was assessed by calculating its sensitivity, precision, the number of missed reviews and the number of unique records retrieved. Four hundred systematic reviews were identified for inclusion from 11,381 citations retrieved from seven databases. No single database identified all the retrieved systematic reviews for hypertension. EMBASE identified the most reviews (sensitivity 69 %) but also retrieved the most irrelevant citations with 7.2 % precision (Pr). The sensitivity of the Cochrane library was 60 %, DARE 57 %, MEDLINE 57 %, PubMed Health 53 %, Epistemonikos 49 % and TRIP 33 %. EMBASE contained the highest number of unique records (n = 43). The Cochrane library identified seven unique records and had the highest precision (Pr = 30 %), followed by Epistemonikos (n = 2, Pr = 19 %). No unique records were found in PubMed Health (Pr = 24 %) DARE (Pr = 21 %), TRIP (Pr = 10 %) or MEDLINE (Pr = 10 %). Searching EMBASE and the Cochrane library identified 88 % of all systematic reviews in the reference set, and searching the freely available databases (Cochrane, Epistemonikos, MEDLINE) identified 83 % of all the reviews. The databases were re-analysed after systematic reviews of non-conventional interventions (e.g. yoga, acupuncture) were removed. Similarly, no database identified all the retrieved systematic reviews. EMBASE identified the most relevant systematic reviews (sensitivity 73 %) but also retrieved the most irrelevant citations with Pr = 5 %. The sensitivity of the Cochrane database was 62 %, followed by MEDLINE (60 %), DARE (55 %), PubMed Health (54 %), Epistemonikos (50 %) and TRIP (31 %). The precision of the Cochrane library was the highest (20 %), followed by PubMed Health (Pr = 16 %), DARE (Pr = 13 %), Epistemonikos (Pr = 12 %), MEDLINE (Pr = 6 %), TRIP (Pr = 6 %) and EMBASE (Pr = 5 %). EMBASE contained the most unique records (n = 34). The Cochrane library identified seven unique records. The other databases held no unique records. The coverage of bibliographic databases varies considerably due to differences in their scope and content. Researchers wishing to identify systematic reviews should not rely on one database but search multiple databases.
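The sensitivity and precision figures reported above reduce to two ratios over the screening counts. A minimal sketch, with illustrative numbers rather than the study's raw counts:

```python
# Sensitivity = relevant reviews retrieved / all relevant reviews in the reference set.
# Precision   = relevant reviews retrieved / all citations retrieved by that database.

def database_performance(retrieved_relevant: int, total_relevant: int, total_retrieved: int):
    sensitivity = retrieved_relevant / total_relevant
    precision = retrieved_relevant / total_retrieved
    return sensitivity, precision

# Illustrative counts only: a database that finds 276 of the 400 reference reviews
# among 3,800 retrieved citations.
sens, prec = database_performance(276, 400, 3800)
print(f"sensitivity = {sens:.0%}, precision = {prec:.1%}")
```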
Biometric Fusion Demonstration System Scientific Report
2004-03-01
verification and facial recognition, searching watchlist databases comprised of full or partial facial images or voice recordings. Multiple-biometric... Fingerprint and Facial Recognition... Iris Recognition and Facial Recognition... (DRDC Ottawa CR 2004-056)
An Imaging Sensor-Aided Vision Navigation Approach that Uses a Geo-Referenced Image Database.
Li, Yan; Hu, Qingwu; Wu, Meng; Gao, Yang
2016-01-28
Vision navigation, which determines position and attitude through real-time processing of images collected from imaging sensors, can operate without a high-performance global positioning system (GPS) or inertial measurement unit (IMU). Vision navigation is widely used in indoor navigation, far space navigation, and multiple sensor-integrated mobile mapping. This paper proposes a novel vision navigation approach that is aided by imaging sensors and uses a high-accuracy geo-referenced image database (GRID) for high-precision navigation of multiple sensor platforms in environments with poor GPS. First, the framework of GRID-aided vision navigation is developed with sequence images from land-based mobile mapping systems that integrate multiple sensors. Second, a highly efficient GRID storage management model is established based on the linear index of a road segment for fast image searches and retrieval. Third, a robust image matching algorithm is presented to search and match a real-time image with the GRID. Subsequently, the image matched with the real-time scene is used to calculate the 3D navigation parameters of the multiple sensor platforms. Experimental results show that the proposed approach retrieves images efficiently and has navigation accuracies of 1.2 m in a plane and 1.8 m in height under GPS loss in 5 min and within 1500 m.
An Imaging Sensor-Aided Vision Navigation Approach that Uses a Geo-Referenced Image Database
Li, Yan; Hu, Qingwu; Wu, Meng; Gao, Yang
2016-01-01
Vision navigation, which determines position and attitude through real-time processing of images collected from imaging sensors, can operate without a high-performance global positioning system (GPS) or inertial measurement unit (IMU). Vision navigation is widely used in indoor navigation, far space navigation, and multiple sensor-integrated mobile mapping. This paper proposes a novel vision navigation approach that is aided by imaging sensors and uses a high-accuracy geo-referenced image database (GRID) for high-precision navigation of multiple sensor platforms in environments with poor GPS. First, the framework of GRID-aided vision navigation is developed with sequence images from land-based mobile mapping systems that integrate multiple sensors. Second, a highly efficient GRID storage management model is established based on the linear index of a road segment for fast image searches and retrieval. Third, a robust image matching algorithm is presented to search and match a real-time image with the GRID. Subsequently, the image matched with the real-time scene is used to calculate the 3D navigation parameters of the multiple sensor platforms. Experimental results show that the proposed approach retrieves images efficiently and has navigation accuracies of 1.2 m in a plane and 1.8 m in height under GPS loss in 5 min and within 1500 m. PMID:26828496
Efficient RNA structure comparison algorithms.
Arslan, Abdullah N; Anandan, Jithendar; Fry, Eric; Monschke, Keith; Ganneboina, Nitin; Bowerman, Jason
2017-12-01
The recently proposed relative addressing-based ([Formula: see text]) RNA secondary structure representation has important features that allow an RNA structure database to be stored in a suffix array. A fast substructure search algorithm has been proposed based on binary search on this suffix array. Using this substructure search algorithm, we present a fast algorithm that finds the largest common substructure of given multiple RNA structures in [Formula: see text] format. The multiple RNA structure comparison problem is NP-hard in its general formulation. We introduce a new problem for comparing multiple RNA structures, with a stricter similarity definition and objective, and we propose an algorithm that solves it efficiently. We also develop another comparison algorithm that iteratively calls this algorithm to locate nonoverlapping large common substructures in the compared RNAs. With the new resulting tools, we improved the RNASSAC website (linked from http://faculty.tamuc.edu/aarslan). This website now also includes two drawing tools: one specialized for preparing RNA substructures that can be used as input by the search tool, and another for automatically drawing the entire RNA structure from a given structure sequence.
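A hedged sketch of the suffix-array idea the abstract describes: once the structure strings are indexed in a suffix array, a substructure query becomes a binary search for the range of suffixes beginning with the query. The relative-addressing encoding itself is not reproduced; plain strings stand in for it.

```python
def build_suffix_array(text: str):
    # O(n^2 log n) construction; fine for a sketch.
    return sorted(range(len(text)), key=lambda i: text[i:])

def search_range(text: str, sa: list, query: str):
    # Binary search for all suffixes that start with `query`.
    def lower(q):
        lo, hi = 0, len(sa)
        while lo < hi:
            mid = (lo + hi) // 2
            if text[sa[mid]:sa[mid] + len(q)] < q:
                lo = mid + 1
            else:
                hi = mid
        return lo
    start = lower(query)
    end = lower(query + "\xff")   # sentinel larger than any character used here
    return [sa[i] for i in range(start, end)]

structures = "((..))$(..(..))$((...))$"       # toy stand-in for encoded structures
sa = build_suffix_array(structures)
print(search_range(structures, sa, "(..)"))   # start positions of matching suffixes
```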
The immune response to anesthesia: part 2 sedatives, opioids, and injectable anesthetic agents.
Anderson, Stacy L; Duke-Novakovski, Tanya; Singh, Baljit
2014-11-01
To review the immune response to injectable anesthetics and sedatives and to compare the immunomodulatory properties between inhalation and injectable anesthetic protocols. Review. Multiple literature searches were performed using PubMed and Google Scholar from March 2012 through November 2013. Relevant anesthetic and immune terms were used to search databases without year published or species constraints. The online database for Veterinary Anaesthesia and Analgesia and the Journal of Veterinary Emergency and Critical Care were searched by issue starting in 2000 for relevant articles. Sedatives, injectable anesthetics, opioids, and local anesthetics have immunomodulatory effects that may have positive or negative consequences on disease processes such as endotoxemia, generalized sepsis, tumor growth and metastasis, and ischemia-reperfusion injury. Therefore, anesthetists should consider the immunomodulatory effects of anesthetic drugs when designing anesthetic protocols for their patients. © 2014 Association of Veterinary Anaesthetists and the American College of Veterinary Anesthesia and Analgesia.
75 FR 53262 - Privacy Act of 1974; System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2010-08-31
... a new Privacy Act system of records, JUSTICE/FBI- 021, the Data Integration and Visualization System... Act system of records, the Data Integration and Visualization System (DIVS), Justice/FBI-021. The... investigative mission by enabling access, search, integration, and analytics across multiple existing databases...
Emotional Intelligence Research within Human Resource Development Scholarship
ERIC Educational Resources Information Center
Farnia, Forouzan; Nafukho, Fredrick Muyia
2016-01-01
Purpose: The purpose of this study is to review and synthesize pertinent emotional intelligence (EI) research within the human resource development (HRD) scholarship. Design/methodology/approach: An integrative review of literature was conducted and multiple electronic databases were searched to find the relevant resources. Using the content…
Chen, Yao-Yi; Dasari, Surendra; Ma, Ze-Qiang; Vega-Montoto, Lorenzo J.; Li, Ming
2013-01-01
Spectral counting has become a widely used approach for measuring and comparing protein abundance in label-free shotgun proteomics. However, when analyzing complex samples, the ambiguity of matching between peptides and proteins greatly affects the assessment of peptide and protein inventories, differentiation, and quantification. Meanwhile, the configuration of database searching algorithms that assign peptides to MS/MS spectra may produce different results in comparative proteomic analysis. Here, we present three strategies to improve comparative proteomics through spectral counting. We show that comparing spectral counts for peptide groups rather than for protein groups forestalls problems introduced by shared peptides. We demonstrate the advantage and flexibility of this new method in two datasets. We present four models to combine four popular search engines that lead to significant gains in spectral counting differentiation. Among these models, we demonstrate a powerful vote counting model that scales well for multiple search engines. We also show that semi-tryptic searching outperforms tryptic searching for comparative proteomics. Overall, these techniques considerably improve protein differentiation on the basis of spectral count tables. PMID:22552787
Chen, Yao-Yi; Dasari, Surendra; Ma, Ze-Qiang; Vega-Montoto, Lorenzo J; Li, Ming; Tabb, David L
2012-09-01
Spectral counting has become a widely used approach for measuring and comparing protein abundance in label-free shotgun proteomics. However, when analyzing complex samples, the ambiguity of matching between peptides and proteins greatly affects the assessment of peptide and protein inventories, differentiation, and quantification. Meanwhile, the configuration of database searching algorithms that assign peptides to MS/MS spectra may produce different results in comparative proteomic analysis. Here, we present three strategies to improve comparative proteomics through spectral counting. We show that comparing spectral counts for peptide groups rather than for protein groups forestalls problems introduced by shared peptides. We demonstrate the advantage and flexibility of this new method in two datasets. We present four models to combine four popular search engines that lead to significant gains in spectral counting differentiation. Among these models, we demonstrate a powerful vote counting model that scales well for multiple search engines. We also show that semi-tryptic searching outperforms tryptic searching for comparative proteomics. Overall, these techniques considerably improve protein differentiation on the basis of spectral count tables.
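As an illustration of the kind of multi-engine combination described in these two records, the sketch below implements a simple vote-counting scheme over per-peptide-group spectral counts. The engine names and counts are hypothetical, and this is not the authors' exact model.

```python
from collections import defaultdict

def combine_by_votes(engine_counts: dict, min_votes: int = 2):
    # A peptide group is retained when at least `min_votes` engines report it;
    # the retained spectral counts are summed across engines.
    votes, totals = defaultdict(int), defaultdict(int)
    for counts in engine_counts.values():
        for peptide_group, n in counts.items():
            votes[peptide_group] += 1
            totals[peptide_group] += n
    return {pg: totals[pg] for pg in totals if votes[pg] >= min_votes}

# Hypothetical engine outputs (peptide group -> spectral count)
engine_counts = {
    "EngineA": {"PEPTIDEK": 12, "SAMPLEK": 3},
    "EngineB": {"PEPTIDEK": 10, "OTHERR": 5},
    "EngineC": {"PEPTIDEK": 11, "SAMPLEK": 2},
}
print(combine_by_votes(engine_counts))   # {'PEPTIDEK': 33, 'SAMPLEK': 5}
```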
Cloud parallel processing of tandem mass spectrometry based proteomics data.
Mohammed, Yassene; Mostovenko, Ekaterina; Henneman, Alex A; Marissen, Rob J; Deelder, André M; Palmblad, Magnus
2012-10-05
Data analysis in mass spectrometry based proteomics struggles to keep pace with the advances in instrumentation and the increasing rate of data acquisition. Analyzing this data involves multiple steps requiring diverse software, using different algorithms and data formats. Speed and performance of the mass spectral search engines are continuously improving, although not necessarily as needed to face the challenges of acquired big data. Improving and parallelizing the search algorithms is one possibility; data decomposition presents another, simpler strategy for introducing parallelism. We describe a general method for parallelizing identification of tandem mass spectra using data decomposition that keeps the search engine intact and wraps the parallelization around it. We introduce two algorithms for decomposing mzXML files and recomposing resulting pepXML files. This makes the approach applicable to different search engines, including those relying on sequence databases and those searching spectral libraries. We use cloud computing to deliver the computational power and scientific workflow engines to interface and automate the different processing steps. We show how to leverage these technologies to achieve faster data analysis in proteomics and present three scientific workflows for parallel database as well as spectral library search using our data decomposition programs, X!Tandem and SpectraST.
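A minimal sketch of the data-decomposition strategy described above, with plain Python lists standing in for mzXML/pepXML files and a placeholder in place of the unmodified search engine:

```python
from multiprocessing import Pool

def decompose(spectra, n_chunks):
    # Split the spectra into n_chunks independent work units.
    return [spectra[i::n_chunks] for i in range(n_chunks)]

def search_chunk(chunk):
    # Placeholder for invoking an unmodified search engine on one chunk.
    return [f"PSM for {spectrum}" for spectrum in chunk]

def recompose(chunk_results):
    # Merge the per-chunk results back into a single result set.
    return [psm for chunk in chunk_results for psm in chunk]

if __name__ == "__main__":
    spectra = [f"scan_{i}" for i in range(1, 13)]
    with Pool(4) as pool:
        results = recompose(pool.map(search_chunk, decompose(spectra, 4)))
    print(len(results), "assignments")
```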
Li, Guo-Zhong; Vissers, Johannes P C; Silva, Jeffrey C; Golick, Dan; Gorenstein, Marc V; Geromanos, Scott J
2009-03-01
A novel database search algorithm is presented for the qualitative identification of proteins over a wide dynamic range, both in simple and complex biological samples. The algorithm has been designed for the analysis of data originating from data independent acquisitions, whereby multiple precursor ions are fragmented simultaneously. Measurements used by the algorithm include retention time, ion intensities, charge state, and accurate masses on both precursor and product ions from LC-MS data. The search algorithm uses an iterative process whereby each iteration incrementally increases the selectivity, specificity, and sensitivity of the overall strategy. Increased specificity is obtained by utilizing a subset database search approach, whereby for each subsequent stage of the search, only those peptides from securely identified proteins are queried. Tentative peptide and protein identifications are ranked and scored by their relative correlation to a number of models of known and empirically derived physicochemical attributes of proteins and peptides. In addition, the algorithm utilizes decoy database techniques for automatically determining the false positive identification rates. The search algorithm has been tested by comparing the search results from a four-protein mixture, the same four-protein mixture spiked into a complex biological background, and a variety of other "system" type protein digest mixtures. The method was validated independently by data dependent methods, while concurrently relying on replication and selectivity. Comparisons were also performed with other commercially and publicly available peptide fragmentation search algorithms. The presented results demonstrate the ability to correctly identify peptides and proteins from data independent acquisition strategies with high sensitivity and specificity. They also illustrate a more comprehensive analysis of the samples studied; providing approximately 20% more protein identifications, compared to a more conventional data directed approach using the same identification criteria, with a concurrent increase in both sequence coverage and the number of modified peptides.
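The decoy-database technique mentioned above can be illustrated with a short sketch (not the authors' exact scoring): identifications from a concatenated target+decoy search are ranked by score, and the false-positive rate at each threshold is estimated from the decoy hits above it.

```python
def fdr_curve(psms):
    """psms: list of (score, is_decoy) tuples from a target+decoy search."""
    targets = decoys = 0
    curve = []
    for score, is_decoy in sorted(psms, key=lambda p: p[0], reverse=True):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated false discovery rate among targets at this score threshold.
        curve.append((score, decoys / max(targets, 1)))
    return curve

# Illustrative scores only.
psms = [(95, False), (90, False), (88, True), (85, False), (80, True), (75, False)]
for score, fdr in fdr_curve(psms):
    print(f"score >= {score}: estimated FDR = {fdr:.2f}")
```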
Chen, Chou-Cheng; Ho, Chung-Liang
2014-01-01
While a huge amount of information about biological literature can be obtained by searching the PubMed database, reading through all the titles and abstracts resulting from such a search for useful information is inefficient. Text mining makes it possible to increase this efficiency. Some websites use text mining to gather information from the PubMed database; however, they are database-oriented, using pre-defined search keywords while lacking a query interface for user-defined search inputs. We present the PubMed Abstract Reading Helper (PubstractHelper) website which combines text mining and reading assistance for an efficient PubMed search. PubstractHelper can accept a maximum of ten groups of keywords, within each group containing up to ten keywords. The principle behind the text-mining function of PubstractHelper is that keywords contained in the same sentence are likely to be related. PubstractHelper highlights sentences with co-occurring keywords in different colors. The user can download the PMID and the abstracts with color markings to be reviewed later. The PubstractHelper website can help users to identify relevant publications based on the presence of related keywords, which should be a handy tool for their research. http://bio.yungyun.com.tw/ATM/PubstractHelper.aspx and http://holab.med.ncku.edu.tw/ATM/PubstractHelper.aspx.
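A hedged sketch of the co-occurrence principle behind PubstractHelper: sentences containing keywords from at least two keyword groups are flagged for the reader. Sentence splitting and matching are deliberately naive here, and the example abstract is invented.

```python
import re

def flag_sentences(abstract: str, keyword_groups: list):
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", abstract):
        hits = [g for g, words in enumerate(keyword_groups)
                if any(w.lower() in sentence.lower() for w in words)]
        if len(hits) >= 2:                 # keywords from >= 2 groups co-occur
            flagged.append((hits, sentence))
    return flagged

abstract = ("Psoriasis is a chronic skin disease. Methotrexate reduced lesion "
            "severity in psoriasis patients. The weather was mild.")
groups = [["psoriasis", "skin"], ["methotrexate"]]
for hits, sentence in flag_sentences(abstract, groups):
    print(hits, sentence)
```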
PIRIA: a general tool for indexing, search, and retrieval of multimedia content
NASA Astrophysics Data System (ADS)
Joint, Magali; Moellic, Pierre-Alain; Hede, P.; Adam, P.
2004-05-01
The Internet is a continuously expanding source of multimedia content and information. There are many products in development to search, retrieve, and understand multimedia content. However, most current image search/retrieval engines rely on an image database manually pre-indexed with keywords. Computers are still powerless to understand the semantic meaning of still or animated image content. Piria (Program for the Indexing and Research of Images by Affinity), the search engine we have developed, brings this possibility closer to reality. Piria is a novel search engine that uses the query by example method. A user query is submitted to the system, which then returns a list of images ranked by similarity, obtained by a metric distance that operates on every indexed image signature. These indexed images are compared according to several different classifiers, not only Keywords, but also Form, Color and Texture, taking into account geometric transformations and variations such as rotation, symmetry, mirroring, etc. Form - Edges extracted by an efficient segmentation algorithm. Color - Histogram, semantic color segmentation and spatial color relationship. Texture - Texture wavelets and local edge patterns. If required, Piria is also able to fuse results from multiple classifiers with a new classification of index categories: Single Indexer Single Call (SISC), Single Indexer Multiple Call (SIMC), Multiple Indexers Single Call (MISC) or Multiple Indexers Multiple Call (MIMC). Commercial and industrial applications will be explored and discussed as well as current and future development.
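The query-by-example ranking Piria performs can be sketched as a nearest-neighbour search over fixed-length image signatures. The signatures and file names below are hypothetical; Piria's actual features (form, colour, texture) and fusion strategies are not reproduced.

```python
import math

def distance(a, b):
    # Euclidean metric distance between two image signatures.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_by_similarity(query_sig, indexed: dict, top_k: int = 3):
    return sorted(indexed, key=lambda name: distance(query_sig, indexed[name]))[:top_k]

indexed = {                      # hypothetical signatures (e.g. colour histograms)
    "sunset.jpg": [0.8, 0.1, 0.1],
    "forest.jpg": [0.1, 0.7, 0.2],
    "beach.jpg":  [0.6, 0.2, 0.2],
}
print(rank_by_similarity([0.7, 0.15, 0.15], indexed))
```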
Creating a FIESTA (Framework for Integrated Earth Science and Technology Applications) with MagIC
NASA Astrophysics Data System (ADS)
Minnett, R.; Koppers, A. A. P.; Jarboe, N.; Tauxe, L.; Constable, C.
2017-12-01
The Magnetics Information Consortium (https://earthref.org/MagIC) has recently developed a containerized web application to considerably reduce the friction in contributing, exploring and combining valuable and complex datasets for the paleo-, geo- and rock magnetic scientific community. The data produced in this scientific domain are inherently hierarchical, and the community's evolving approaches to this scientific workflow, from sampling to measurement to multiple levels of interpretation, require a large and flexible data model to adequately annotate the results and ensure reproducibility. Historically, contributing such detail in a consistent format has been prohibitively time-consuming and often resulted in only publishing the highly derived interpretations. The new open-source (https://github.com/earthref/MagIC) application provides a flexible upload tool integrated with the data model to easily create a validated contribution and a powerful search interface for discovering datasets and combining them to enable transformative science. MagIC is hosted at EarthRef.org along with several interdisciplinary geoscience databases. A FIESTA (Framework for Integrated Earth Science and Technology Applications) is being created by generalizing MagIC's web application for reuse in other domains. The application relies on a single configuration document that describes the routing, data model, component settings and external service integrations. The container hosts an isomorphic Meteor JavaScript application, MongoDB database and ElasticSearch search engine. Multiple containers can be configured as microservices to serve portions of the application or rely on externally hosted MongoDB, ElasticSearch, or third-party services to efficiently scale computational demands. FIESTA is particularly well suited for many Earth Science disciplines with its flexible data model, mapping, account management, upload tool to private workspaces, reference metadata, image galleries, full text searches and detailed filters. EarthRef's Seamount Catalog of bathymetry and morphology data, EarthRef's Geochemical Earth Reference Model (GERM) databases, and Oregon State University's Marine and Geology Repository (http://osu-mgr.org) will benefit from custom adaptations of FIESTA.
OASIS: Prototyping Graphical Interfaces to Networked Information.
ERIC Educational Resources Information Center
Buckland, Michael K.; And Others
1993-01-01
Describes the latest modifications being made to OASIS, a front-end enhancement to the University of California's MELVYL online union catalog. Highlights include the X Windows interface; multiple database searching to act as an information network; Lisp implementation for flexible data representation; and OASIS commands and features to help…
A User Interface for Multiple Retrieval Systems.
ERIC Educational Resources Information Center
Teskey, Niall; And Others
1987-01-01
Reviews current systems designed to help end-users search online databases without the assistance of an intermediary and describes a prototype system which emulates the Deco (the text storage and retrieval system used by Unilever) interface on Dialog and Data-Star. Initial trials of the prototype system are reported. (15 references) (MES)
Using dBASE II for Bibliographic Files.
ERIC Educational Resources Information Center
Sullivan, Jeanette
1985-01-01
Describes use of a database management system (dBASE II, produced by Ashton-Tate), noting best features and disadvantages. Highlights include data entry, multiple access points available, training requirements, use of dBASE for a bibliographic application, auxiliary software, and dBASE updates. Sample searches, auxiliary programs, and requirements…
Floden, Evan W; Tommaso, Paolo D; Chatzou, Maria; Magis, Cedrik; Notredame, Cedric; Chang, Jia-Ming
2016-07-08
The PSI/TM-Coffee web server performs multiple sequence alignment (MSA) of proteins by combining homology extension with a consistency-based alignment approach. Homology extension is performed with Position Specific Iterative (PSI) BLAST searches against a choice of redundant and non-redundant databases. The main novelty of this server is to allow databases of reduced complexity to rapidly perform homology extension. This server also offers the option of using transmembrane protein (TMP) reference databases to allow even faster homology extension for this important category of proteins. Aside from an MSA, the server also outputs topological predictions for TMPs using the HMMTOP algorithm. Previous benchmarking of the method has shown this approach outperforms the most accurate alignment methods such as MSAProbs, Kalign, PROMALS, MAFFT, ProbCons and PRALINE™. The web server is available at http://tcoffee.crg.cat/tmcoffee. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F
2012-01-01
Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings: a genome-wide catalog of Drosophila conserved DNA sequence clusters; cis-Decoder discovers functionally related enhancers; functionally related enhancers share balanced sequence element copy numbers; many enhancers function during multiple phases of development. PMID:22174086
BIOZON: a system for unification, management and analysis of heterogeneous biological data.
Birkland, Aaron; Yona, Golan
2006-02-15
Integration of heterogeneous data types is a challenging problem, especially in biology, where the number of databases and data types increase rapidly. Amongst the problems that one has to face are integrity, consistency, redundancy, connectivity, expressiveness and updatability. Here we present a system (Biozon) that addresses these problems, and offers biologists a new knowledge resource to navigate through and explore. Biozon unifies multiple biological databases consisting of a variety of data types (such as DNA sequences, proteins, interactions and cellular pathways). It is fundamentally different from previous efforts as it uses a single extensive and tightly connected graph schema wrapped with hierarchical ontology of documents and relations. Beyond warehousing existing data, Biozon computes and stores novel derived data, such as similarity relationships and functional predictions. The integration of similarity data allows propagation of knowledge through inference and fuzzy searches. Sophisticated methods of query that span multiple data types were implemented and first-of-a-kind biological ranking systems were explored and integrated. The Biozon system is an extensive knowledge resource of heterogeneous biological data. Currently, it holds more than 100 million biological documents and 6.5 billion relations between them. The database is accessible through an advanced web interface that supports complex queries, "fuzzy" searches, data materialization and more, online at http://biozon.org.
The use of intelligent database systems in acute pancreatitis--a systematic review.
van den Heever, Marc; Mittal, Anubhav; Haydock, Matthew; Windsor, John
2014-01-01
Acute pancreatitis (AP) is a complex disease with multiple aetiological factors, wide-ranging severity, and multiple challenges to effective triage and management. Databases, data mining and machine learning algorithms (MLAs), including artificial neural networks (ANNs), may assist by storing and interpreting data from multiple sources, potentially improving clinical decision-making. The aims of this review were to: 1) identify database technologies used to store AP data, 2) collate and categorise variables stored in AP databases, 3) identify the MLA technologies, including ANNs, used to analyse AP data, and 4) identify clinical and non-clinical benefits and obstacles in establishing a national or international AP database. Comprehensive systematic search of online reference databases. The predetermined inclusion criteria were all papers discussing 1) databases, 2) data mining or 3) MLAs, pertaining to AP, independently assessed by two reviewers with conflicts resolved by a third author. Forty-three papers were included. Three data mining technologies and five ANN methodologies were reported in the literature. There were 187 collected variables identified. ANNs increase the accuracy of severity prediction; one study showed ANNs had a sensitivity of 0.89 and specificity of 0.96 six hours after admission, compared with 0.80 and 0.85, respectively, for APACHE II (cutoff score ≥8). Problems with databases were incomplete data, lack of clinical data, diagnostic reliability and missing clinical data. This is the first systematic review examining the use of databases, MLAs and ANNs in the management of AP. The clinical benefits these technologies have over current systems and other advantages to adopting them are identified. Copyright © 2013 IAP and EPC. Published by Elsevier B.V. All rights reserved.
ARMOUR - A Rice miRNA: mRNA Interaction Resource.
Sanan-Mishra, Neeti; Tripathi, Anita; Goswami, Kavita; Shukla, Rohit N; Vasudevan, Madavan; Goswami, Hitesh
2018-01-01
ARMOUR was developed as A Rice miRNA:mRNA interaction resource. This informative and interactive database includes the experimentally validated expression profiles of miRNAs under different developmental and abiotic stress conditions across seven Indian rice cultivars. This comprehensive database covers 689 known and 1664 predicted novel miRNAs and their expression profiles in more than 38 different tissues or conditions along with their predicted/known target transcripts. Understanding of the miRNA:mRNA interactome in the regulation of functional cellular machinery is supported by sequence information for the mature and hairpin structures. ARMOUR gives users the flexibility to query the database in multiple ways, such as by known gene identifiers, gene ontology identifiers and KEGG identifiers, and also allows on-the-fly fold-change analysis and sequence search queries with an inbuilt BLAST algorithm. The ARMOUR database provides a cohesive platform for novel and mature miRNAs and their expression in different experimental conditions and allows searching for their interacting mRNA targets, GO annotations and their involvement in various biological pathways. The ARMOUR database includes a provision for adding more experimental data from users, with an aim to develop it as a platform for sharing and comparing experimental data contributed by research groups working on rice.
Kremer, Lukas P M; Leufken, Johannes; Oyunchimeg, Purevdulam; Schulze, Stefan; Fufezan, Christian
2016-03-04
Proteomics data integration has become a broad field with a variety of programs offering innovative algorithms to analyze increasing amounts of data. Unfortunately, this software diversity leads to many problems as soon as the data is analyzed using more than one algorithm for the same task. Although it was shown that the combination of multiple peptide identification algorithms yields more robust results, only recently have unified approaches begun to emerge; however, workflows that, for example, aim to optimize search parameters or employ cascaded-style searches can only be made accessible if data analysis becomes not only unified but also, most importantly, scriptable. Here we introduce Ursgal, a Python interface to many commonly used bottom-up proteomics tools and to additional auxiliary programs. Complex workflows can thus be composed in the Python scripting language with a few lines of code. Ursgal is easily extensible, and we have made several database search engines (X!Tandem, OMSSA, MS-GF+, Myrimatch, MS Amanda), statistical postprocessing algorithms (qvality, Percolator), and one algorithm that combines statistically postprocessed outputs from multiple search engines ("combined FDR") accessible as an interface in Python. Furthermore, we have implemented a new algorithm ("combined PEP") that combines multiple search engines employing elements of "combined FDR", PeptideShaker, and Bayes' theorem.
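Ursgal's "combined PEP" algorithm is not reproduced here, but the Bayes-style intuition behind combining engines can be sketched as follows: posterior error probabilities (PEPs) reported by several engines for the same peptide-spectrum match are merged under a naive independence assumption.

```python
def combined_pep(peps):
    """peps: PEP values reported by different engines for the same PSM.
    Naive Bayes-style combination assuming the engines err independently;
    this is an illustration, not Ursgal's actual implementation."""
    p_wrong = 1.0
    p_right = 1.0
    for pep in peps:
        p_wrong *= pep          # probability all engines are wrong together
        p_right *= (1.0 - pep)  # probability all engines are right together
    return p_wrong / (p_wrong + p_right)

# Three engines agree that a PSM is likely correct:
print(combined_pep([0.05, 0.10, 0.02]))   # much smaller than any single PEP
```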
Databases and coordinated research projects at the IAEA on atomic processes in plasmas
NASA Astrophysics Data System (ADS)
Braams, Bastiaan J.; Chung, Hyun-Kyung
2012-05-01
The Atomic and Molecular Data Unit at the IAEA works with a network of national data centres to encourage and coordinate production and dissemination of fundamental data for atomic, molecular and plasma-material interaction (A+M/PMI) processes that are relevant to the realization of fusion energy. The Unit maintains numerical and bibliographical databases and has started a Wiki-style knowledge base. The Unit also contributes to A+M database interface standards and provides a search engine that offers a common interface to multiple numerical A+M/PMI databases. Coordinated Research Projects (CRPs) bring together fusion energy researchers and atomic, molecular and surface physicists for joint work towards the development of new data and new methods. The databases and current CRPs on A+M/PMI processes are briefly described here.
Sheynkman, Gloria M.; Shortreed, Michael R.; Frey, Brian L.; Scalf, Mark; Smith, Lloyd M.
2013-01-01
Each individual carries thousands of non-synonymous single nucleotide variants (nsSNVs) in their genome, each corresponding to a single amino acid polymorphism (SAP) in the encoded proteins. It is important to be able to directly detect and quantify these variations at the protein level in order to study post-transcriptional regulation, differential allelic expression, and other important biological processes. However, such variant peptides are not generally detected in standard proteomic analyses, due to their absence from the generic databases that are employed for mass spectrometry searching. Here, we extend previous work that demonstrated the use of customized SAP databases constructed from sample-matched RNA-Seq data. We collected deep coverage RNA-Seq data from the Jurkat cell line, compiled the set of nsSNVs that are expressed, used this information to construct a customized SAP database, and searched it against deep coverage shotgun MS data obtained from the same sample. This approach enabled detection of 421 SAP peptides mapping to 395 nsSNVs. We compared these peptides to peptides identified from a large generic search database containing all known nsSNVs (dbSNP) and found that more than 70% of the SAP peptides from this dbSNP-derived search were not supported by the RNA-Seq data, and thus are likely false positives. Next, we increased the SAP coverage from the RNA-Seq derived database by utilizing multiple protease digestions, thereby increasing variant detection to 695 SAP peptides mapping to 504 nsSNV sites. These detected SAP peptides corresponded to moderate to high abundance transcripts (30+ transcripts per million, TPM). The SAP peptides included 192 allelic pairs; the relative expression levels of the two alleles were evaluated for 51 of those pairs, and found to be comparable in all cases. PMID:24175627
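A sketch of the customized-database idea, not the authors' pipeline: each expressed nsSNV is applied to the reference protein sequence and the resulting variant is emitted as an extra FASTA entry for the search engine. The protein and variant below are hypothetical.

```python
def apply_sap(reference: str, position: int, alt_residue: str) -> str:
    """Substitute one amino acid; position is 1-based within the protein."""
    return reference[:position - 1] + alt_residue + reference[position:]

def variant_fasta(proteins: dict, snvs: list) -> str:
    entries = []
    for protein_id, pos, alt in snvs:                 # expressed nsSNVs from RNA-Seq
        variant = apply_sap(proteins[protein_id], pos, alt)
        entries.append(f">{protein_id}_SAP_{pos}{alt}\n{variant}")
    return "\n".join(entries)

proteins = {"PROT1": "MKTAYIAKQR"}                    # hypothetical reference sequence
print(variant_fasta(proteins, [("PROT1", 4, "V")]))   # Ala4 -> Val variant entry
```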
AIM: a comprehensive Arabidopsis interactome module database and related interologs in plants.
Wang, Yi; Thilmony, Roger; Zhao, Yunjun; Chen, Guoping; Gu, Yong Q
2014-01-01
Systems biology analysis of protein modules is important for understanding the functional relationships between proteins in the interactome. Here, we present a comprehensive database named AIM for Arabidopsis (Arabidopsis thaliana) interactome modules. The database contains almost 250,000 modules that were generated using multiple analysis methods and integration of microarray expression data. All the modules in AIM are well annotated using multiple gene function knowledge databases. AIM provides a user-friendly interface for different types of searches and offers a powerful graphical viewer for displaying module networks linked to the enrichment annotation terms. Both interactive Venn diagram and power graph viewer are integrated into the database for easy comparison of modules. In addition, predicted interologs from other plant species (homologous proteins from different species that share a conserved interaction module) are available for each Arabidopsis module. AIM is a powerful systems biology platform for obtaining valuable insights into the function of proteins in Arabidopsis and other plants using the modules of the Arabidopsis interactome. Database URL:http://probes.pw.usda.gov/AIM Published by Oxford University Press 2014. This work is written by US Government employees and is in the public domain in the US.
Bogdán, István A.; Rivers, Jenny; Beynon, Robert J.; Coca, Daniel
2008-01-01
Motivation: Peptide mass fingerprinting (PMF) is a method for protein identification in which a protein is fragmented by a defined cleavage protocol (usually proteolysis with trypsin), and the masses of these products constitute a ‘fingerprint’ that can be searched against theoretical fingerprints of all known proteins. In the first stage of PMF, the raw mass spectrometric data are processed to generate a peptide mass list. In the second stage this protein fingerprint is used to search a database of known proteins for the best protein match. Although current software solutions can typically deliver a match in a relatively short time, a system that can find a match in real time could change the way in which PMF is deployed and presented. In a paper published earlier we presented a hardware design of a raw mass spectra processor that, when implemented in Field Programmable Gate Array (FPGA) hardware, achieves almost 170-fold speed gain relative to a conventional software implementation running on a dual processor server. In this article we present a complementary hardware realization of a parallel database search engine that, when running on a Xilinx Virtex 2 FPGA at 100 MHz, delivers 1800-fold speed-up compared with an equivalent C software routine, running on a 3.06 GHz Xeon workstation. The inherent scalability of the design means that processing speed can be multiplied by deploying the design on multiple FPGAs. The database search processor and the mass spectra processor, running on a reconfigurable computing platform, provide a complete real-time PMF protein identification solution. Contact: d.coca@sheffield.ac.uk PMID:18453553
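The PMF matching step that the FPGA design parallelizes can be sketched serially: each protein's theoretical peptide masses are compared with the observed mass list, and proteins are ranked by the number of masses matched within a tolerance. The fingerprints below are illustrative only.

```python
def match_score(observed, theoretical, tol=0.2):
    # Count observed masses that match any theoretical mass within `tol` Da.
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)

def best_match(observed, database: dict):
    return max(database, key=lambda protein: match_score(observed, database[protein]))

database = {                               # hypothetical theoretical fingerprints (Da)
    "ProteinA": [842.5, 1045.6, 1234.7, 2211.1],
    "ProteinB": [830.4, 1100.2, 1500.9],
}
observed = [842.6, 1234.6, 2211.2]
print(best_match(observed, database))      # -> ProteinA
```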
RPG: the Ribosomal Protein Gene database.
Nakao, Akihiro; Yoshihama, Maki; Kenmochi, Naoya
2004-01-01
RPG (http://ribosome.miyazaki-med.ac.jp/) is a new database that provides detailed information about ribosomal protein (RP) genes. It contains data from humans and other organisms, including Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Methanococcus jannaschii and Escherichia coli. Users can search the database by gene name and organism. Each record includes sequences (genomic, cDNA and amino acid sequences), intron/exon structures, genomic locations and information about orthologs. In addition, users can view and compare the gene structures of the above organisms and make multiple amino acid sequence alignments. RPG also provides information on small nucleolar RNAs (snoRNAs) that are encoded in the introns of RP genes.
RPG: the Ribosomal Protein Gene database
Nakao, Akihiro; Yoshihama, Maki; Kenmochi, Naoya
2004-01-01
RPG (http://ribosome.miyazaki-med.ac.jp/) is a new database that provides detailed information about ribosomal protein (RP) genes. It contains data from humans and other organisms, including Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Methanococcus jannaschii and Escherichia coli. Users can search the database by gene name and organism. Each record includes sequences (genomic, cDNA and amino acid sequences), intron/exon structures, genomic locations and information about orthologs. In addition, users can view and compare the gene structures of the above organisms and make multiple amino acid sequence alignments. RPG also provides information on small nucleolar RNAs (snoRNAs) that are encoded in the introns of RP genes. PMID:14681386
UniVIO: A Multiple Omics Database with Hormonome and Transcriptome Data from Rice
Sakurai, Tetsuya; Sakakibara, Hitoshi
2013-01-01
Plant hormones play important roles as signaling molecules in the regulation of growth and development by controlling the expression of downstream genes. Since the hormone signaling system represents a complex network involving functional cross-talk through the mutual regulation of signaling and metabolism, a comprehensive and integrative analysis of plant hormone concentrations and gene expression is important for a deeper understanding of hormone actions. We have developed a database named Uniformed Viewer for Integrated Omics (UniVIO: http://univio.psc.riken.jp/), which displays hormone-metabolome (hormonome) and transcriptome data in a single formatted (uniformed) heat map. At the present time, hormonome and transcriptome data obtained from 14 organ parts of rice plants at the reproductive stage and seedling shoots of three gibberellin signaling mutants are included in the database. The hormone concentration and gene expression data can be searched by substance name, probe ID, gene locus ID or gene description. A correlation search function has been implemented to enable users to obtain information of correlated substance accumulation and gene expression. In the correlation search, calculation method, range of correlation coefficient and plant samples can be selected freely. PMID:23314752
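A hedged sketch of the correlation-search idea (requires Python 3.10+ for statistics.correlation): given a hormone's concentration profile across organ parts and gene expression profiles over the same samples, genes whose Pearson correlation exceeds a threshold are returned. All values and gene identifiers below are hypothetical.

```python
from statistics import correlation   # Pearson correlation, Python 3.10+

def correlated_genes(hormone_profile, expression: dict, threshold=0.9):
    hits = {g: correlation(hormone_profile, profile)
            for g, profile in expression.items()}
    return {g: r for g, r in hits.items() if abs(r) >= threshold}

hormone = [1.0, 3.2, 2.1, 5.4]                    # hypothetical concentrations, 4 organs
expression = {
    "Os01g0100100": [0.9, 3.0, 2.2, 5.1],         # tracks the hormone closely
    "Os02g0200200": [4.0, 1.0, 3.5, 0.5],         # unrelated profile
}
print(correlated_genes(hormone, expression))
```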
Literature searches on Ayurveda: An update.
Aggithaya, Madhur G; Narahari, Saravu R
2015-01-01
Journals that publish on Ayurveda have been increasingly indexed by popular medical databases in recent years. However, many Eastern journals are not indexed in biomedical journal databases such as PubMed. Literature searches for Ayurveda continue to be challenging due to the nonavailability of active, unbiased dedicated databases for Ayurvedic literature. In 2010, the authors identified 46 databases that can be used for systematic searches of Ayurvedic papers and theses. This update reviewed our previous recommendation and identified current and relevant databases. The aim was to provide an update on Ayurveda literature searching and strategies for retrieving the maximum number of publications. The authors used psoriasis as an example to search the previously listed databases and to identify new ones. The population, intervention, control, and outcome table included keywords related to psoriasis and Ayurvedic terminologies for skin diseases. Current citation update status, search results, and search options of the previous databases were assessed. Eight search strategies were developed. One hundred and five journals, both biomedical and Ayurvedic, that publish on Ayurveda were identified. Variability in databases was explored to identify bias in journal citation. Five among the 46 databases are now relevant - AYUSH research portal, Annotated Bibliography of Indian Medicine, Digital Helpline for Ayurveda Research Articles (DHARA), PubMed, and Directory of Open Access Journals. Search options in these databases are not uniform, and only PubMed allows a complex search strategy. "The Researches in Ayurveda" and "Ayurvedic Research Database" (ARD) are important grey resources for hand searching. About 44/105 (41.5%) journals publishing Ayurvedic studies are not indexed in any database. Only 11/105 (10.4%) exclusive Ayurveda journals are indexed in PubMed. AYUSH research portal and DHARA are two major portals after 2010. It is mandatory to search PubMed and the four other databases because all five carry citations from different groups of journals. Hand searching is important to identify Ayurveda publications that are not indexed elsewhere. Availability information for citations in Ayurveda libraries from the National Union Catalogue of Scientific Serials in India, if regularly updated, will improve the efficacy of hand searching. A grey database (ARD) contains unpublished PG/Ph.D. theses. The AYUSH portal, DHARA (funded by the Ministry of AYUSH), and ARD should be merged to form a single larger database to simplify Ayurveda literature searches.
A Replication of Failure, Not a Failure to Replicate
ERIC Educational Resources Information Center
Holden, Gary; Barker, Kathleen; Kuppens, Sofie; Rosenberg, Gary; LeBreton, Jonathan
2015-01-01
Purpose: The increasing role of systematic reviews in knowledge production demands greater rigor in the literature search process. The performance of the Social Work Abstracts (SWA) database has been examined multiple times over the past three decades. The current study is a replication within this line of research. Method: Issue-level coverage…
A Critical Review of Research on the Alert Program®
ERIC Educational Resources Information Center
Gill, Kamaldeep; Thompson-Hodgetts, Sandra; Rasmussen, Carmen
2018-01-01
To evaluate the strength of evidence for the effectiveness, feasibility, and appropriateness of the Alert Program®. Multiple databases were systematically searched for peer-reviewed, English-language articles that evaluated the Alert Program®. Six articles met the inclusion criteria. The strength of evidence ranged from weak to moderate using the…
Adjacency and Proximity Searching in the Science Citation Index and Google
2005-01-01
major database search engines, including commercial S&T database search engines (e.g., Science Citation Index (SCI), Engineering Compendex (EC...PubMed, OVID), Federal agency award database search engines (e.g., NSF, NIH, DOE, EPA, as accessed in Federal R&D Project Summaries), Web search engines (e.g...searching. Some database search engines allow strict constrained co-occurrence searching as a user option (e.g., OVID, EC), while others do not (e.g., SCI
The Role of Data Archives in Synoptic Solar Physics
NASA Astrophysics Data System (ADS)
Reardon, Kevin
The detailed study of solar cycle variations requires analysis of recorded datasets spanning many years of observations, that is, a data archive. The use of digital data, combined with powerful database server software, gives such archives new capabilities to provide, quickly and flexibly, selected pieces of information to scientists. Use of standardized protocols will allow multiple databases, independently maintained, to be seamlessly joined, allowing complex searches spanning multiple archives. These data archives also benefit from being developed in parallel with the telescope itself, which helps to assure data integrity and to provide close integration between the telescope and archive. Development of archives that can guarantee long-term data availability and strong compatibility with other projects makes solar-cycle studies easier to plan and realize.
Comet: an open-source MS/MS sequence database search tool.
Eng, Jimmy K; Jahan, Tahmina A; Hoopmann, Michael R
2013-01-01
Proteomics research routinely involves identifying peptides and proteins via MS/MS sequence database search. Thus the database search engine is an integral tool in many proteomics research groups. Here, we introduce the Comet search engine to the existing landscape of commercial and open-source database search tools. Comet is open source, freely available, and based on one of the original sequence database search tools that has been widely used for many years. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Database Search Engines: Paradigms, Challenges and Solutions.
Verheggen, Kenneth; Martens, Lennart; Berven, Frode S; Barsnes, Harald; Vaudel, Marc
2016-01-01
The first step in identifying proteins from mass spectrometry based shotgun proteomics data is to infer peptides from tandem mass spectra, a task generally achieved using database search engines. In this chapter, the basic principles of database search engines are introduced with a focus on open source software, and the use of database search engines is demonstrated using the freely available SearchGUI interface. This chapter also discusses how to tackle general issues related to sequence database searching and shows how to minimize their impact.
Sandhu, Maninder; Sureshkumar, V; Prakash, Chandra; Dixit, Rekha; Solanke, Amolkumar U; Sharma, Tilak Raj; Mohapatra, Trilochan; S V, Amitha Mithra
2017-09-30
Genome-wide microarray has enabled the development of robust databases for functional genomics studies in rice. However, such databases do not directly cater to the needs of breeders. Here, we have attempted to develop a web interface which combines the information from functional genomic studies across different genetic backgrounds with DNA markers so that they can be readily deployed in crop improvement. In the current version of the database, we have included drought and salinity stress studies since these two are the major abiotic stresses in rice. RiceMetaSys, a user-friendly and freely available web interface, provides comprehensive information on salt responsive genes (SRGs) and drought responsive genes (DRGs) across genotypes, crop development stages and tissues, identified from multiple microarray datasets. 'Physical position search' is an attractive tool for those using a QTL-based approach for dissecting tolerance to salt and drought stress, since it can provide the list of SRGs and DRGs in any physical interval. To identify robust candidate genes for use in crop improvement, the 'common genes across varieties' search tool is useful. Graphical visualization of expression profiles across genes and rice genotypes has been enabled to help users make more meaningful comparisons. Simple Sequence Repeat (SSR) search in the SRGs and DRGs is a valuable tool for fine mapping and marker-assisted selection, since it provides primers for surveying polymorphism. An external link to intron-specific markers is also provided for this purpose. Bulk retrieval of data without any limit has been enabled for locus and SSR searches. The aim of this database is to provide users with simple and straightforward search options for identifying robust candidate genes from among thousands of SRGs and DRGs, so as to facilitate linking variation in expression profiles to variation in phenotype. Database URL: http://14.139.229.201.
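A minimal sketch of what a 'physical position search' amounts to: filtering stress-responsive genes by chromosome and physical interval. The gene coordinates below are invented for illustration.

```python
def genes_in_interval(genes: list, chrom: str, start: int, end: int):
    # Return gene IDs whose coordinates fall entirely inside the queried interval.
    return [g["id"] for g in genes
            if g["chrom"] == chrom and start <= g["start"] and g["end"] <= end]

genes = [   # hypothetical drought-responsive gene coordinates
    {"id": "LOC_Os01g01010", "chrom": "chr1", "start": 120_000, "end": 125_000},
    {"id": "LOC_Os01g05000", "chrom": "chr1", "start": 980_000, "end": 987_500},
    {"id": "LOC_Os02g02000", "chrom": "chr2", "start": 300_000, "end": 305_000},
]
print(genes_in_interval(genes, "chr1", 100_000, 500_000))   # -> ['LOC_Os01g01010']
```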
Fast parallel tandem mass spectral library searching using GPU hardware acceleration.
Baumgardner, Lydia Ashleigh; Shanmugam, Avinash Kumar; Lam, Henry; Eng, Jimmy K; Martin, Daniel B
2011-06-03
Mass spectrometry-based proteomics is a maturing discipline of biologic research that is experiencing substantial growth. Instrumentation has steadily improved over time with the advent of faster and more sensitive instruments collecting ever larger data files. Consequently, the computational process of matching a peptide fragmentation pattern to its sequence, traditionally accomplished by sequence database searching and more recently also by spectral library searching, has become a bottleneck in many mass spectrometry experiments. In both of these methods, the main rate-limiting step is the comparison of an acquired spectrum with all potential matches from a spectral library or sequence database. This is a highly parallelizable process because the core computational element can be represented as a simple but arithmetically intense multiplication of two vectors. In this paper, we present a proof of concept project taking advantage of the massively parallel computing available on graphics processing units (GPUs) to distribute and accelerate the process of spectral assignment using spectral library searching. This program, which we have named FastPaSS (for Fast Parallelized Spectral Searching), is implemented in CUDA (Compute Unified Device Architecture) from NVIDIA, which allows direct access to the processors in an NVIDIA GPU. Our efforts demonstrate the feasibility of GPU computing for spectral assignment, through implementation of the validated spectral searching algorithm SpectraST in the CUDA environment.
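The core operation described above, comparing one acquired spectrum against many library spectra, reduces to dot products between binned intensity vectors. The NumPy sketch below shows the vectorized form on the CPU; FastPaSS performs the same kind of arithmetic on the GPU across the whole library at once.

```python
import numpy as np

def best_library_match(query: np.ndarray, library: np.ndarray):
    # Normalize, then compute all dot products in one matrix-vector multiplication.
    q = query / np.linalg.norm(query)
    lib = library / np.linalg.norm(library, axis=1, keepdims=True)
    scores = lib @ q
    return int(np.argmax(scores)), float(np.max(scores))

rng = np.random.default_rng(0)
library = rng.random((1000, 500))              # 1000 hypothetical binned library spectra
query = library[42] + 0.01 * rng.random(500)   # a noisy copy of entry 42
print(best_library_match(query, library))      # -> (42, ~1.0)
```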
WheatGenome.info: an integrated database and portal for wheat genome information.
Lai, Kaitao; Berkman, Paul J; Lorenc, Michal Tadeusz; Duran, Chris; Smits, Lars; Manoli, Sahana; Stiller, Jiri; Edwards, David
2012-02-01
Bread wheat (Triticum aestivum) is one of the most important crop plants, globally providing staple food for a large proportion of the human population. However, improvement of this crop has been limited due to its large and complex genome. Advances in genomics are supporting wheat crop improvement. We provide a variety of web-based systems hosting wheat genome and genomic data to support wheat research and crop improvement. WheatGenome.info is an integrated database resource which includes multiple web-based applications. These include a GBrowse2-based wheat genome viewer with BLAST search portal, TAGdb for searching wheat second-generation genome sequence data, wheat autoSNPdb, links to wheat genetic maps using CMap and CMap3D, and a wheat genome Wiki to allow interaction between diverse wheat genome sequencing activities. This system includes links to a variety of wheat genome resources hosted at other research organizations. This integrated database aims to accelerate wheat genome research and is freely accessible via the web interface at http://www.wheatgenome.info/.
An approach in building a chemical compound search engine in Oracle database.
Wang, H; Volarath, P; Harrison, R
2005-01-01
Searching for and identifying chemical compounds is an important process in drug design and in chemistry research. An efficient search engine involves a close coupling of the search algorithm and the database implementation. The database must process chemical structures, which demands approaches to represent, store, and retrieve structures in a database system. In this paper, a general database framework for working as a chemical compound search engine in an Oracle database is described. The framework is designed to eliminate data type constraints for potential search algorithms, which is a crucial step toward building a domain-specific query language on top of SQL. A search engine implementation based on the database framework is also demonstrated. The convenience of the implementation emphasizes the efficiency and simplicity of the framework.
Content-based video indexing and searching with wavelet transformation
NASA Astrophysics Data System (ADS)
Stumpf, Florian; Al-Jawad, Naseer; Du, Hongbo; Jassim, Sabah
2006-05-01
Biometric databases form an essential tool in the fight against international terrorism, organised crime and fraud. Various government and law enforcement agencies have their own biometric databases consisting of combinations of fingerprints, iris codes, face images/videos and speech records for an increasing number of persons. In many cases personal data linked to biometric records are incomplete and/or inaccurate. Besides, biometric data in different databases for the same individual may be recorded with different personal details. Following the recent terrorist atrocities, law enforcement agencies collaborate more than before and have greater reliance on database sharing. In such an environment, reliable biometric-based identification must not only determine who you are but also who else you are. In this paper we propose a compact content-based video signature and indexing scheme that can facilitate retrieval of multiple records in face biometric databases that belong to the same person even if their associated personal data are inconsistent. We shall assess the performance of our system using a benchmark audio-visual face biometric database that has multiple videos for each subject but with different identity claims. We shall demonstrate that retrieval of a relatively small number of videos that are nearest, in terms of the proposed index, to any video in the database yields a significant proportion of that individual's biometric data.
Past, present, and future of water data delivery from the U.S. Geological Survey
Hirsch, Robert M.; Fisher, Gary T.
2014-01-01
We present an overview of national water databases managed by the U.S. Geological Survey, including surface-water, groundwater, water-quality, and water-use data. These are readily accessible to users through web interfaces and data services. Multiple perspectives of data are provided, including search and retrieval of real-time data and historical data, on-demand current conditions and alert services, data compilations, spatial representations, analytical products, and availability of data across multiple agencies.
Corp, Nadia; Jordan, Joanne L; Hayden, Jill A; Irvin, Emma; Parker, Robin; Smith, Andrea; van der Windt, Danielle A
2017-04-20
Prognosis research is on the rise, its importance recognised because chronic health conditions and diseases are increasingly common and costly. Prognosis systematic reviews are needed to collate and synthesise these research findings, especially to help inform effective clinical decision-making and healthcare policy. A detailed, comprehensive search strategy is central to any systematic review. However, within prognosis research, this is challenging due to poor reporting and inconsistent use of available indexing terms in electronic databases. Whilst many published search filters exist for finding clinical trials, this is not the case for prognosis studies. This systematic review aims to identify and compare existing methodological filters developed and evaluated to identify prognosis studies of any of the three main types: overall prognosis, prognostic factors, and prognostic [risk prediction] models. Primary studies reporting the development and/or evaluation of methodological search filters to retrieve any type of prognosis study will be included in this systematic review. Multiple electronic bibliographic databases will be searched, grey literature will be sought from relevant organisations and websites, experts will be contacted, and citation tracking of key papers and reference list checking of all included papers will be undertaken. Titles will be screened by one person, and abstracts and full articles will be reviewed for inclusion independently by two reviewers. Data extraction and quality assessment will also be undertaken independently by two reviewers with disagreements resolved by discussion or by a third reviewer if necessary. Filters' characteristics and performance metrics reported in the included studies will be extracted and tabulated. To enable comparisons, filters will be grouped according to database, platform, type of prognosis study, and type of filter for which it was intended. This systematic review will identify all existing validated prognosis search filters and synthesise evidence about their applicability and performance. These findings will identify if current filters provide a proficient means of searching electronic bibliographic databases or if further prognosis filters are needed and can feasibly be developed for systematic searches of prognosis studies.
Unsworth, Nash
2016-01-01
The relation between working memory capacity (WMC) and recall from long-term memory (LTM) was examined in the current study. Participants performed multiple measures of delayed free recall varying in presentation duration and self-reported their strategy usage after each task. Participants also performed multiple measures of WMC. The results suggested that WMC and LTM recall were related, and part of this relation was due to effective strategy use. However, adaptive changes in strategy use and study time allocation were not related to WMC. Examining multiple variables with structural equation modeling suggested that the relation between WMC and LTM recall was due to variation in effective strategy use, search efficiency, and monitoring abilities. Furthermore, all variables were shown to account for individual differences in LTM recall. These results suggest that the relation between WMC and recall from LTM is due to multiple strategic factors operating at both encoding and retrieval. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Databases and coordinated research projects at the IAEA on atomic processes in plasmas
DOE Office of Scientific and Technical Information (OSTI.GOV)
Braams, Bastiaan J.; Chung, Hyun-Kyung
2012-05-25
The Atomic and Molecular Data Unit at the IAEA works with a network of national data centres to encourage and coordinate production and dissemination of fundamental data for atomic, molecular and plasma-material interaction (A+M/PMI) processes that are relevant to the realization of fusion energy. The Unit maintains numerical and bibliographical databases and has started a Wiki-style knowledge base. The Unit also contributes to A+M database interface standards and provides a search engine that offers a common interface to multiple numerical A+M/PMI databases. Coordinated Research Projects (CRPs) bring together fusion energy researchers and atomic, molecular and surface physicists for joint work towards the development of new data and new methods. The databases and current CRPs on A+M/PMI processes are briefly described here.
Proposal of Network-Based Multilingual Space Dictionary Database System
NASA Astrophysics Data System (ADS)
Yoshimitsu, T.; Hashimoto, T.; Ninomiya, K.
2002-01-01
The International Academy of Astronautics (IAA) is now constructing a multilingual dictionary database system of space-related terms. The database consists of a lexicon and dictionaries of multiple languages. The lexicon is a table which relates corresponding terminology in different languages. Each language has a dictionary which contains terms and their definitions. The database is designed for use over the internet: updating and searching of terms and definitions are conducted via the network, and the database is maintained through international cooperation. New words arise day by day, so the ability to easily add new words and their definitions to the database is required for the long-term success of the system. The main key of the database is an English term which is approved at meetings held once or twice with the working group members. Each language has at least one working group member who is responsible for assigning the corresponding term and definition in his/her native language. Inputting and updating terms and their definitions can be conducted via the internet from the office of each member, which may be located in his/her native country. The system is built on a freely distributed database server program running on the Linux operating system, which will be installed at the head office of IAA. Once it is installed, it will be open to all IAA members, who can search the terms via the internet. Currently the authors are constructing the prototype system, which is described in this paper.
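The described structure (a lexicon relating terms across languages plus one dictionary per language) maps naturally onto two relational tables. The sketch below, using Python's sqlite3, is illustrative only; the table and column names are assumptions, not the IAA system's actual design.

```python
import sqlite3

# Illustrative schema only; the actual IAA system is not described beyond
# "a lexicon relating terms plus one dictionary per language".
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE lexicon (term_id INTEGER PRIMARY KEY, english_term TEXT UNIQUE);
CREATE TABLE dictionary (
    term_id INTEGER REFERENCES lexicon(term_id),
    language TEXT, term TEXT, definition TEXT);
""")
db.execute("INSERT INTO lexicon VALUES (1, 'attitude control')")
db.execute("INSERT INTO dictionary VALUES "
           "(1, 'ja', '姿勢制御', 'Example definition in Japanese.')")

# Look up the corresponding term in another language via the shared lexicon key.
row = db.execute("""SELECT d.term FROM lexicon l
                    JOIN dictionary d ON d.term_id = l.term_id
                    WHERE l.english_term = ? AND d.language = ?""",
                 ("attitude control", "ja")).fetchone()
print(row[0])
```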
ERIC Educational Resources Information Center
Star Snyder, Marjorie
2010-01-01
A comprehensive search of multiple databases for references to the connection between families and schools yields a rich representation from family therapy, school counseling, school psychology, and education literature supporting the idea that schools must serve not only students, but students' families as well. One of the common themes emerging…
Literature searches on Ayurveda: An update
Aggithaya, Madhur G.; Narahari, Saravu R.
2015-01-01
Introduction: The journals that publish on Ayurveda have been increasingly indexed by popular medical databases in recent years. However, many Eastern journals are not indexed in biomedical journal databases such as PubMed. Literature searches for Ayurveda continue to be challenging due to the nonavailability of active, unbiased dedicated databases for Ayurvedic literature. In 2010, the authors identified 46 databases that can be used for systematic searches of Ayurvedic papers and theses. This update reviewed our previous recommendation and identified current and relevant databases. Aims: To provide an update on Ayurveda literature searching and a strategy to retrieve the maximum number of publications. Methods: The authors used psoriasis as an example to search the previously listed databases and to identify new ones. The population, intervention, control, and outcome table included keywords related to psoriasis and Ayurvedic terminologies for skin diseases. Current citation update status, search results, and search options of previous databases were assessed. Eight search strategies were developed. One hundred and five journals, both biomedical and Ayurveda, which publish on Ayurveda, were identified. Variability in databases was explored to identify bias in journal citation. Results: Five among the 46 databases are now relevant – AYUSH research portal, Annotated Bibliography of Indian Medicine, Digital Helpline for Ayurveda Research Articles (DHARA), PubMed, and Directory of Open Access Journals. Search options in these databases are not uniform, and only PubMed allows a complex search strategy. "The Researches in Ayurveda" and "Ayurvedic Research Database" (ARD) are important grey resources for hand searching. About 44/105 (41.5%) journals publishing Ayurvedic studies are not indexed in any database. Only 11/105 (10.4%) exclusive Ayurveda journals are indexed in PubMed. Conclusion: AYUSH research portal and DHARA are two major portals that have emerged after 2010. It is mandatory to search PubMed and the four other databases because all five carry citations from different groups of journals. Hand searching is important to identify Ayurveda publications that are not indexed elsewhere. Availability information on citations in Ayurveda libraries from the National Union Catalogue of Scientific Serials in India, if regularly updated, will improve the efficacy of hand searching. A grey database (ARD) contains unpublished PG/Ph.D. theses. The AYUSH portal, DHARA (funded by the Ministry of AYUSH), and ARD should be merged to form a single larger database to streamline Ayurveda literature searches. PMID:27313409
WheatGenome.info: A Resource for Wheat Genomics.
Lai, Kaitao
2016-01-01
An integrated database named WheatGenome.info, which hosts wheat genome and genomic data through a variety of Web-based systems, has been developed to support wheat research and crop improvement. The resource includes multiple Web-based applications: a GBrowse2-based wheat genome viewer with BLAST search portal, TAGdb for searching wheat second-generation genome sequence data, wheat autoSNPdb, links to wheat genetic maps using CMap and CMap3D, and a wheat genome Wiki to allow interaction between diverse wheat genome sequencing activities. This portal provides links to a variety of wheat genome resources hosted at other research organizations. This integrated database aims to accelerate wheat genome research and is freely accessible via the web interface at http://www.wheatgenome.info/.
ERIC Educational Resources Information Center
Fitzgibbons, Megan; Meert, Deborah
2010-01-01
The use of bibliographic management software and its internal search interfaces is now pervasive among researchers. This study compares the results between searches conducted in academic databases' search interfaces versus the EndNote search interface. The results show mixed search reliability, depending on the database and type of search…
Evaluating Open-Source Full-Text Search Engines for Matching ICD-10 Codes.
Jurcău, Daniel-Alexandru; Stoicu-Tivadar, Vasile
2016-01-01
This research presents the results of evaluating multiple free, open-source engines on matching ICD-10 diagnostic codes via full-text searches. The study investigates what it takes to get an accurate match when searching for a specific diagnostic code. For each code, the evaluation starts by extracting the words that make up its text and continues with building full-text search queries from the combinations of these words. The queries are then run against all the ICD-10 codes until a query ranks the code in question as the match with the highest relative score. This method identifies the minimum number of words that must be provided in order for the search engines to choose the desired entry. The engines analyzed include a popular Java-based full-text search engine, a lightweight engine written in JavaScript which can even execute in the user's browser, and two popular open-source relational database management systems.
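The evaluation procedure above can be paraphrased as: build queries from growing combinations of a code's description words and find the smallest combination for which the engine ranks the target code first. The pure-Python sketch below mimics that procedure with a crude word-overlap scorer and a three-code toy dictionary; the real study used full-text search engines and the complete ICD-10 list.

```python
from itertools import combinations

# Tiny stand-in for the ICD-10 dictionary; real evaluations use the full code list.
codes = {
    "J45": "asthma",
    "J44": "other chronic obstructive pulmonary disease",
    "I10": "essential primary hypertension",
}

def score(query_words, text):
    """Crude relevance score: fraction of query words present in the entry."""
    words = text.split()
    return sum(w in words for w in query_words) / len(query_words)

def min_words_to_match(target):
    """Smallest number of description words that ranks the target code first."""
    words = codes[target].split()
    for k in range(1, len(words) + 1):
        for combo in combinations(words, k):
            ranked = max(codes, key=lambda c: score(combo, codes[c]))
            if ranked == target and score(combo, codes[target]) > 0:
                return k, combo
    return None

print(min_words_to_match("I10"))
```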
Jones, Andrew R.; Siepen, Jennifer A.; Hubbard, Simon J.; Paton, Norman W.
2010-01-01
Tandem mass spectrometry, run in combination with liquid chromatography (LC-MS/MS), can generate large numbers of peptide and protein identifications, for which a variety of database search engines are available. Distinguishing correct identifications from false positives is far from trivial because all data sets are noisy and tend to be too large for manual inspection; therefore, probabilistic methods must be employed to balance the trade-off between sensitivity and specificity. Decoy databases are becoming widely used to place statistical confidence in results sets, allowing the false discovery rate (FDR) to be estimated. It has previously been demonstrated that different MS search engines produce different peptide identification sets, and as such, employing more than one search engine could result in an increased number of peptides being identified. However, such efforts are hindered by the lack of a single scoring framework employed by all search engines. We have developed a search engine independent scoring framework based on FDR which allows peptide identifications from different search engines to be combined, called the FDRScore. We observe that peptide identifications made by three search engines are infrequently false positives, and identifications made by only a single search engine, even with a strong score from the source search engine, are significantly more likely to be false positives. We have developed a second score based on the FDR within peptide identifications grouped according to the set of search engines that have made the identification, called the combined FDRScore. We demonstrate by searching large publicly available data sets that the combined FDRScore can differentiate between correct and incorrect peptide identifications with high accuracy, allowing on average 35% more peptide identifications to be made at a fixed FDR than using a single search engine. PMID:19253293
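FDR-based scores such as the FDRScore build on the basic target-decoy estimate: sort peptide spectrum matches by score and, at each threshold, approximate the FDR as the number of accepted decoys divided by the number of accepted targets. The sketch below shows only that underlying calculation; the paper's per-search-engine combination formulas are not reproduced here.

```python
# Minimal target-decoy FDR sketch: PSMs are (score, is_decoy) pairs sorted by
# score; at each threshold the FDR is estimated as (#decoys / #targets) above it.
# This illustrates the statistic that FDR-based scores build on, not the paper's
# exact FDRScore / combined FDRScore formulas.
def fdr_curve(psms):
    psms = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    out = []
    for score, is_decoy in psms:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        out.append((score, decoys / max(targets, 1)))
    return out

psms = [(9.1, False), (8.7, False), (8.2, True), (7.9, False), (7.5, True)]
for score, fdr in fdr_curve(psms):
    print(f"score>={score:.1f}  est. FDR={fdr:.2f}")
```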
Barta, Endre; Sebestyén, Endre; Pálfy, Tamás B.; Tóth, Gábor; Ortutay, Csaba P.; Patthy, László
2005-01-01
DoOP (http://doop.abc.hu/) is a database of eukaryotic promoter sequences (upstream regions) aiming to facilitate the recognition of regulatory sites conserved between species. The annotated first exons of human and Arabidopsis thaliana genes were used as queries in BLAST searches to collect the most closely related orthologous first exon sequences from Chordata and Viridiplantae species. Up to 3000 bp DNA segments upstream from these first exons constitute the clusters in the chordate and plant sections of the Database of Orthologous Promoters. Release 1.0 of DoOP contains 21 061 chordate clusters from 284 different species and 7548 plant clusters from 269 different species. The database can be used to find and retrieve promoter sequences of a given gene from various species and it is also suitable to see the most trivial conserved sequence blocks in the orthologous upstream regions. Users can search DoOP with either sequence or text (annotation) to find promoter clusters of various genes. In addition to the sequence data, the positions of the conserved sequence blocks derived from multiple alignments, the positions of repetitive elements and the positions of transcription start sites known from the Eukaryotic Promoter Database (EPD) can be viewed graphically. PMID:15608291
Barta, Endre; Sebestyén, Endre; Pálfy, Tamás B; Tóth, Gábor; Ortutay, Csaba P; Patthy, László
2005-01-01
DoOP (http://doop.abc.hu/) is a database of eukaryotic promoter sequences (upstream regions) aiming to facilitate the recognition of regulatory sites conserved between species. The annotated first exons of human and Arabidopsis thaliana genes were used as queries in BLAST searches to collect the most closely related orthologous first exon sequences from Chordata and Viridiplantae species. Up to 3000 bp DNA segments upstream from these first exons constitute the clusters in the chordate and plant sections of the Database of Orthologous Promoters. Release 1.0 of DoOP contains 21,061 chordate clusters from 284 different species and 7548 plant clusters from 269 different species. The database can be used to find and retrieve promoter sequences of a given gene from various species and it is also suitable to see the most trivial conserved sequence blocks in the orthologous upstream regions. Users can search DoOP with either sequence or text (annotation) to find promoter clusters of various genes. In addition to the sequence data, the positions of the conserved sequence blocks derived from multiple alignments, the positions of repetitive elements and the positions of transcription start sites known from the Eukaryotic Promoter Database (EPD) can be viewed graphically.
The Open Spectral Database: an open platform for sharing and searching spectral data.
Chalk, Stuart J
2016-01-01
A number of websites make spectral data available for download (typically as JCAMP-DX text files), and one (ChemSpider) also allows users to contribute spectral files. As a result, searching and retrieving such spectral data can be time consuming, and the data can be difficult to reuse if it is compressed in the JCAMP-DX file. What is needed is a single resource that allows submission of JCAMP-DX files, export of the raw data in multiple formats, searching based on multiple chemical identifiers, and is open in terms of license and access. To address these issues a new online resource called the Open Spectral Database (OSDB) http://osdb.info/ has been developed and is now available. Built using open source tools, using open code (hosted on GitHub), providing open data, and open to community input about design and functionality, the OSDB is available for anyone to submit spectral data, making it searchable and available to the scientific community. This paper details the concept and coding, internal architecture, export formats, Representational State Transfer (REST) Application Programming Interface and options for submission of data. The OSDB website went live in November 2015. Concurrently, the GitHub repository was made available at https://github.com/stuchalk/OSDB/, and is open for collaborators to join the project, submit issues, and contribute code. The combination of a scripting environment (PHPStorm), a PHP framework (CakePHP), a relational database (MySQL) and a code repository (GitHub) provides all the capabilities needed to easily develop REST-based websites for ingestion, curation and exposure of open chemical data to the community at all levels. It is hoped this software stack (or equivalent ones in other scripting languages) will be leveraged to make more chemical data available for both humans and computers.
Park, Gun Wook; Hwang, Heeyoun; Kim, Kwang Hoe; Lee, Ju Yeon; Lee, Hyun Kyoung; Park, Ji Yeong; Ji, Eun Sun; Park, Sung-Kyu Robin; Yates, John R; Kwon, Kyung-Hoon; Park, Young Mok; Lee, Hyoung-Joo; Paik, Young-Ki; Kim, Jin Young; Yoo, Jong Shin
2016-11-04
In the Chromosome-Centric Human Proteome Project (C-HPP), false-positive identification by peptide spectrum matches (PSMs) after database searches is a major issue for proteogenomic studies using liquid-chromatography and mass-spectrometry-based large-scale proteomic profiling. Here we developed a simple strategy for protein identification, with a controlled false discovery rate (FDR) at the protein level, using an integrated proteomic pipeline (IPP) that consists of the following four steps. First, using three different search engines, SEQUEST, MASCOT, and MS-GF+, individual proteomic searches were performed against the neXtProt database. Second, the search results from the PSMs were combined using statistical evaluation tools including DTASelect and Percolator. Third, the peptide search scores were converted into E-scores normalized using an in-house program. Last, ProteinInferencer was used to filter the proteins containing two or more peptides with a controlled FDR of 1.0% at the protein level. Finally, we compared the performance of the IPP to a conventional proteomic pipeline (CPP) for protein identification using a controlled FDR of <1% at the protein level. Using the IPP, a total of 5756 proteins (vs 4453 using the CPP) including 477 alternative splicing variants (vs 182 using the CPP) were identified from human hippocampal tissue. In addition, a total of 10 missing proteins (vs 7 using the CPP) were identified with two or more unique peptides, and their tryptic peptides were validated using MS/MS spectral patterns from a repository database or their corresponding synthetic peptides. This study shows that the IPP effectively improved the identification of proteins, including alternative splicing variants and missing proteins, in human hippocampal tissues for the C-HPP. All RAW files used in this study were deposited in ProteomeXchange (PXD000395).
Analysis of human serum phosphopeptidome by a focused database searching strategy.
Zhu, Jun; Wang, Fangjun; Cheng, Kai; Song, Chunxia; Qin, Hongqiang; Hu, Lianghai; Figeys, Daniel; Ye, Mingliang; Zou, Hanfa
2013-01-14
As human serum is an important source for early diagnosis of many serious diseases, analysis of the serum proteome and peptidome has been extensively performed. However, the serum phosphopeptidome has been less explored, probably because an effective method for database searching has been lacking. The conventional database searching strategy uses the whole proteome database, which is very time-consuming for phosphopeptidome searches due to the huge search space resulting from the high redundancy of the database and the setting of dynamic modifications during searching. In this work, a focused database searching strategy using an in-house collected human serum pro-peptidome target/decoy database (HuSPep) was established. It was found that the searching time was significantly decreased without compromising identification sensitivity. By combining size-selective Ti (IV)-MCM-41 enrichment, RP-RP off-line separation, and complementary CID and ETD fragmentation with the new searching strategy, 143 unique endogenous phosphopeptides and 133 phosphorylation sites (109 novel sites) were identified from human serum with high reliability. Copyright © 2012 Elsevier B.V. All rights reserved.
Kelley, George A.; Kelley, Kristi S.
2012-01-01
Objective. The purpose of this study was to determine the database indexing of randomized controlled trials (RCTs) for a meta-analysis addressing the effects of exercise on pain and physical function in adults with arthritis and other rheumatic diseases (AORD). Methods. The number, percentage, and 95% confidence intervals (CIs) for included articles at initial and follow-up periods were calculated from PubMed, EMBASE, CENTRAL, CINAHL, SPORTDiscus, and DAO databases. The number needed to review (NNR) was also calculated along with the number of articles retrieved by expert review. Cross-referencing from reviews and included articles also occurred. Results. Thirty-four of 36 articles (94.4%, 95% CI, 81.3–99.3) were located by database searching. PubMed and CENTRAL yielded 32 of 36 articles (88.9%, 73.9–96.9). Two articles not identified in any of the other databases were found in either CINAHL or SPORTDiscus. Two other articles were located by scanning the reference lists of review articles. The NNR ranged from 2 (CINAHL) to 118 (SPORTDiscus). More articles were identified in EMBASE at follow-up (36%, 12.1–42.2 versus 86.1%, 70.5–95.3). Conclusions. Searching multiple databases and cross-referencing from reviews was important for identifying RCTs addressing the effects of exercise on pain and physical function in adults with AORD. PMID:22924128
Kelley, George A; Kelley, Kristi S
2012-01-01
Objective. The purpose of this study was to determine the database indexing of randomized controlled trials (RCTs) for a meta-analysis addressing the effects of exercise on pain and physical function in adults with arthritis and other rheumatic diseases (AORD). Methods. The number, percentage, and 95% confidence intervals (CIs) for included articles at initial and follow-up periods were calculated from PubMed, EMBASE, CENTRAL, CINAHL, SPORTDiscus, and DAO databases. The number needed to review (NNR) was also calculated along with the number of articles retrieved by expert review. Cross-referencing from reviews and included articles also occurred. Results. Thirty-four of 36 articles (94.4%, 95% CI, 81.3-99.3) were located by database searching. PubMed and CENTRAL yielded 32 of 36 articles (88.9%, 73.9-96.9). Two articles not identified in any of the other databases were found in either CINAHL or SPORTDiscus. Two other articles were located by scanning the reference lists of review articles. The NNR ranged from 2 (CINAHL) to 118 (SPORTDiscus). More articles were identified in EMBASE at follow-up (36%, 12.1-42.2 versus 86.1%, 70.5-95.3). Conclusions. Searching multiple databases and cross-referencing from reviews was important for identifying RCTs addressing the effects of exercise on pain and physical function in adults with AORD.
A design for a reusable Ada library
NASA Technical Reports Server (NTRS)
Litke, John D.
1986-01-01
A goal of the Ada language standardization effort is to promote reuse of software, implying the existence of substantial software libraries and the storage/retrieval mechanisms to support them. A searching/cataloging mechanism is proposed that permits full or partial distribution of the database, adapts to a variety of searching mechanisms, permits a changing taxonomy with minimal disruption, and minimizes the requirement for specialized cataloger/indexer skills. The important observation is that key words serve not only as an indexing mechanism, but also as an identification mechanism, especially via concatenation, and as support for a searching mechanism. By deliberately separating these multiple uses, the modifiability and ease of growth that current libraries require is achieved.
Li, Honglan; Joh, Yoon Sung; Kim, Hyunwoo; Paek, Eunok; Lee, Sang-Won; Hwang, Kyu-Baek
2016-12-22
Proteogenomics is a promising approach for various tasks ranging from gene annotation to cancer research. Databases for proteogenomic searches are often constructed by adding peptide sequences inferred from genomic or transcriptomic evidence to reference protein sequences. Such inflation of databases has the potential to identify novel peptides. However, it also raises concerns about sensitive and reliable peptide identification. Spurious peptides included in target databases may result in an underestimated false discovery rate (FDR). On the other hand, inflation of decoy databases could decrease the sensitivity of peptide identification due to the increased number of high-scoring random hits. Although several studies have addressed these issues, widely applicable guidelines for sensitive and reliable proteogenomic search have hardly been available. To systematically evaluate the effect of database inflation in proteogenomic searches, we constructed a variety of real and simulated proteogenomic databases for yeast and human tandem mass spectrometry (MS/MS) data, respectively. Against these databases, we tested two popular database search tools with various approaches to search result validation: the target-decoy search strategy (with and without a refined scoring-metric) and a mixture model-based method. The effect of separate filtering of known and novel peptides was also examined. The results from real and simulated proteogenomic searches confirmed that separate filtering increases the sensitivity and reliability in proteogenomic search. However, no one method consistently identified the largest (or the smallest) number of novel peptides from real proteogenomic searches. We propose to use a set of search result validation methods with separate filtering for sensitive and reliable identification of peptides in proteogenomic search.
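The 'separate filtering' examined above means estimating and thresholding the FDR within the known-peptide and novel-peptide classes independently, so that abundant known identifications cannot hide an inflated error rate among the rarer novel ones. A minimal target-decoy sketch of class-specific filtering follows; it is an illustration of the idea, not one of the validation tools compared in the study.

```python
# Class-specific FDR filtering sketch: known (reference) and novel peptides are
# thresholded separately, so abundant known PSMs cannot mask an inflated error
# rate among the rarer novel PSMs. PSMs are (score, is_decoy, is_novel) tuples.
def filter_at_fdr(psms, max_fdr=0.01):
    psms = sorted(psms, key=lambda p: p[0], reverse=True)
    accepted, targets, decoys = [], 0, 0
    for score, is_decoy, *_ in psms:
        decoys += is_decoy
        targets += not is_decoy
        if decoys / max(targets, 1) > max_fdr:
            break
        if not is_decoy:
            accepted.append(score)
    return accepted

def separate_filtering(psms, max_fdr=0.01):
    known = [p for p in psms if not p[2]]
    novel = [p for p in psms if p[2]]
    return filter_at_fdr(known, max_fdr), filter_at_fdr(novel, max_fdr)

psms = [(10.2, False, False), (9.8, False, True), (9.1, True, True), (8.9, False, False)]
print(separate_filtering(psms, max_fdr=0.5))
```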
Wedge, David C; Krishna, Ritesh; Blackhurst, Paul; Siepen, Jennifer A; Jones, Andrew R; Hubbard, Simon J
2011-04-01
Confident identification of peptides via tandem mass spectrometry underpins modern high-throughput proteomics. This has motivated considerable recent interest in the postprocessing of search engine results to increase confidence and calculate robust statistical measures, for example through the use of decoy databases to calculate false discovery rates (FDR). FDR-based analyses allow for multiple testing and can assign a single confidence value for both sets and individual peptide spectrum matches (PSMs). We recently developed an algorithm for combining the results from multiple search engines, integrating FDRs for sets of PSMs made by different search engine combinations. Here we describe a web-server and a downloadable application that makes this routinely available to the proteomics community. The web server offers a range of outputs including informative graphics to assess the confidence of the PSMs and any potential biases. The underlying pipeline also provides a basic protein inference step, integrating PSMs into protein ambiguity groups where peptides can be matched to more than one protein. Importantly, we have also implemented full support for the mzIdentML data standard, recently released by the Proteomics Standards Initiative, providing users with the ability to convert native formats to mzIdentML files, which are available to download.
Wedge, David C; Krishna, Ritesh; Blackhurst, Paul; Siepen, Jennifer A; Jones, Andrew R.; Hubbard, Simon J.
2013-01-01
Confident identification of peptides via tandem mass spectrometry underpins modern high-throughput proteomics. This has motivated considerable recent interest in the post-processing of search engine results to increase confidence and calculate robust statistical measures, for example through the use of decoy databases to calculate false discovery rates (FDR). FDR-based analyses allow for multiple testing and can assign a single confidence value for both sets and individual peptide spectrum matches (PSMs). We recently developed an algorithm for combining the results from multiple search engines, integrating FDRs for sets of PSMs made by different search engine combinations. Here we describe a web-server, and a downloadable application, which makes this routinely available to the proteomics community. The web server offers a range of outputs including informative graphics to assess the confidence of the PSMs and any potential biases. The underlying pipeline provides a basic protein inference step, integrating PSMs into protein ambiguity groups where peptides can be matched to more than one protein. Importantly, we have also implemented full support for the mzIdentML data standard, recently released by the Proteomics Standards Initiative, providing users with the ability to convert native formats to mzIdentML files, which are available to download. PMID:21222473
RAId_DbS: Peptide Identification using Database Searches with Realistic Statistics
Alves, Gelio; Ogurtsov, Aleksey Y; Yu, Yi-Kuo
2007-01-01
Background: The key to mass-spectrometry-based proteomics is peptide identification. A major challenge in peptide identification is to obtain realistic E-values when assigning statistical significance to candidate peptides. Results: Using a simple scoring scheme, we propose a database search method with theoretically characterized statistics. Taking into account possible skewness in the random variable distribution and the effect of finite sampling, we provide a theoretical derivation for the tail of the score distribution. For every experimental spectrum examined, we collect the scores of peptides in the database, and find good agreement between the collected score statistics and our theoretical distribution. Using Student's t-tests, we quantify the degree of agreement between the theoretical distribution and the score statistics collected. These t-tests may be used to measure the reliability of the reported statistics. When combined with the P-value reported for a peptide hit using a score distribution model, this new measure prevents exaggerated statistics. Another feature of RAId_DbS is its capability of detecting multiple co-eluted peptides. The peptide identification performance and statistical accuracy of RAId_DbS are assessed and compared with several other search tools. The executables and data related to RAId_DbS are freely available upon request. PMID:17961253
Fast parallel tandem mass spectral library searching using GPU hardware acceleration
Baumgardner, Lydia Ashleigh; Shanmugam, Avinash Kumar; Lam, Henry; Eng, Jimmy K.; Martin, Daniel B.
2011-01-01
Mass spectrometry-based proteomics is a maturing discipline of biologic research that is experiencing substantial growth. Instrumentation has steadily improved over time with the advent of faster and more sensitive instruments collecting ever larger data files. Consequently, the computational process of matching a peptide fragmentation pattern to its sequence, traditionally accomplished by sequence database searching and more recently also by spectral library searching, has become a bottleneck in many mass spectrometry experiments. In both of these methods, the main rate limiting step is the comparison of an acquired spectrum with all potential matches from a spectral library or sequence database. This is a highly parallelizable process because the core computational element can be represented as a simple but arithmetically intense multiplication of two vectors. In this paper we present a proof of concept project taking advantage of the massively parallel computing available on graphics processing units (GPUs) to distribute and accelerate the process of spectral assignment using spectral library searching. This program, which we have named FastPaSS (for Fast Parallelized Spectral Searching) is implemented in CUDA (Compute Unified Device Architecture) from NVIDIA which allows direct access to the processors in an NVIDIA GPU. Our efforts demonstrate the feasibility of GPU computing for spectral assignment, through implementation of the validated spectral searching algorithm SpectraST in the CUDA environment. PMID:21545112
DynGO: a tool for visualizing and mining of Gene Ontology and its associations
Liu, Hongfang; Hu, Zhang-Zhi; Wu, Cathy H
2005-01-01
Background: A large volume of data and information about genes and gene products has been stored in various molecular biology databases. A major challenge for knowledge discovery using these databases is to identify related genes and gene products in disparate databases. The development of Gene Ontology (GO) as a common vocabulary for annotation allows integrated queries across multiple databases and identification of semantically related genes and gene products (i.e., genes and gene products that have similar GO annotations). Meanwhile, dozens of tools have been developed for browsing, mining or editing GO terms, their hierarchical relationships, or their "associated" genes and gene products (i.e., genes and gene products annotated with GO terms). Tools that allow users to directly search and inspect relations among all GO terms and their associated genes and gene products from multiple databases are needed. Results: We present a standalone package called DynGO, which provides several advanced functionalities in addition to the standard browsing capability of the official GO browsing tool (AmiGO). DynGO allows users to conduct batch retrieval of GO annotations for a list of genes and gene products, and semantic retrieval of genes and gene products sharing similar GO annotations. The results are shown in an association tree organized according to GO hierarchies and supported with many dynamic display options such as sorting tree nodes or changing orientation of the tree. For GO curators and frequent GO users, DynGO provides fast and convenient access to GO annotation data. DynGO is generally applicable to any data set where the records are annotated with GO terms, as illustrated by two examples. Conclusion: We have presented a standalone package DynGO that provides functionalities to search and browse GO and its association databases as well as several additional functions such as batch retrieval and semantic retrieval. The complete documentation and software are freely available for download from the website. PMID:16091147
EasyKSORD: A Platform of Keyword Search Over Relational Databases
NASA Astrophysics Data System (ADS)
Peng, Zhaohui; Li, Jing; Wang, Shan
Keyword Search Over Relational Databases (KSORD) enables casual users to use keyword queries (a set of keywords) to search relational databases just like searching the Web, without any knowledge of the database schema or any need of writing SQL queries. Based on our previous work, we design and implement a novel KSORD platform named EasyKSORD for users and system administrators to use and manage different KSORD systems in a novel and simple manner. EasyKSORD supports advanced queries, efficient data-graph-based search engines, multiform result presentations, and system logging and analysis. Through EasyKSORD, users can search relational databases easily and read search results conveniently, and system administrators can easily monitor and analyze the operations of KSORD and manage KSORD systems much better.
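At its simplest, keyword search over a relational database means matching the user's keywords against the text columns of several tables and then connecting the hits, so the user never writes SQL. The sqlite3 sketch below shows only the tuple-matching half on a toy schema; EasyKSORD's data-graph-based engines additionally join the matching tuples into connected answer trees.

```python
import sqlite3

# Toy database: the point is that a user types keywords, not SQL.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE authors(id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE papers(id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
INSERT INTO authors VALUES (1, 'Codd'), (2, 'Gray');
INSERT INTO papers VALUES (1, 'Relational model of data', 1),
                          (2, 'Transaction concept', 2);
""")

def keyword_search(keywords):
    """Return (table, row) pairs whose text columns contain any keyword.
    A real KSORD engine would additionally connect hits through foreign keys
    into joined tuple trees; this sketch only finds the matching tuples."""
    hits = []
    for table, column in [("authors", "name"), ("papers", "title")]:
        for kw in keywords:
            rows = db.execute(
                f"SELECT * FROM {table} WHERE {column} LIKE ?", (f"%{kw}%",)
            ).fetchall()
            hits.extend((table, row) for row in rows)
    return hits

print(keyword_search(["Relational", "Gray"]))
```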
Using SQL Databases for Sequence Similarity Searching and Analysis.
Pearson, William R; Mackey, Aaron J
2017-09-13
Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 by John Wiley & Sons, Inc.
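A minimal version of the workflow described in this unit is to load tabular similarity search results into a relational database and summarize homologs with SQL. The sketch below uses sqlite3 with an invented table and a few toy rows; it does not reproduce the unit's seqdb_demo or search_demo schemas.

```python
import sqlite3

# Illustrative only: the unit's seqdb_demo/search_demo schemas differ in detail,
# and the rows below are toy data, not real search results.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE hits (query TEXT, subject TEXT,
                                 taxon TEXT, evalue REAL)""")
db.executemany("INSERT INTO hits VALUES (?, ?, ?, ?)", [
    ("query_prot_1", "subj_prot_A", "Thermotoga", 1e-40),
    ("query_prot_1", "subj_prot_B", "Escherichia", 1e-180),
    ("query_prot_2", "subj_prot_C", "Aquifex", 1e-15),
])

# Count significant homologs per query, grouped by taxon - the kind of
# genome-scale summary that is awkward with flat result files.
for row in db.execute("""SELECT query, taxon, COUNT(*) AS n
                         FROM hits WHERE evalue < 1e-10
                         GROUP BY query, taxon ORDER BY query"""):
    print(row)
```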
Family Quality of Life: A Key Outcome in Early Childhood Intervention Services--A Scoping Review
ERIC Educational Resources Information Center
Bhopti, Anoo; Brown, Ted; Lentin, Primrose
2016-01-01
A scoping review was conducted to identify factors influencing the quality of life of families of children with disability. The review also explored the scales used to measure family quality of life (FQOL) as an outcome in early childhood intervention services (ECIS). Multiple databases were searched from 2000 to 2013 to include studies pertinent…
Got Power? A Systematic Review of Sample Size Adequacy in Health Professions Education Research
ERIC Educational Resources Information Center
Cook, David A.; Hatala, Rose
2015-01-01
Many education research studies employ small samples, which in turn lowers statistical power. We re-analyzed the results of a meta-analysis of simulation-based education to determine study power across a range of effect sizes, and the smallest effect that could be plausibly excluded. We systematically searched multiple databases through May 2011,…
Audain, Enrique; Uszkoreit, Julian; Sachsenberg, Timo; Pfeuffer, Julianus; Liang, Xiao; Hermjakob, Henning; Sanchez, Aniel; Eisenacher, Martin; Reinert, Knut; Tabb, David L; Kohlbacher, Oliver; Perez-Riverol, Yasset
2017-01-06
In mass spectrometry-based shotgun proteomics, protein identifications are usually the desired result. However, most of the analytical methods are based on the identification of reliable peptides and not the direct identification of intact proteins. Thus, assembling peptides identified from tandem mass spectra into a list of proteins, referred to as protein inference, is a critical step in proteomics research. Currently, different protein inference algorithms and tools are available for the proteomics community. Here, we evaluated five software tools for protein inference (PIA, ProteinProphet, Fido, ProteinLP, MSBayesPro) using three popular database search engines: Mascot, X!Tandem, and MS-GF+. All the algorithms were evaluated in a highly customizable KNIME workflow using four different public datasets with varying complexities (different sample preparation, species and analytical instruments). We defined a set of quality control metrics to evaluate the performance of each combination of search engines, protein inference algorithm, and parameters on each dataset. We show that the results for complex samples vary not only regarding the actual numbers of reported protein groups but also concerning the actual composition of groups. Furthermore, the robustness of reported proteins when using databases of differing complexities is strongly dependent on the applied inference algorithm. Finally, merging the identifications of multiple search engines does not necessarily increase the number of reported proteins, but does increase the number of peptides per protein and thus can generally be recommended. Protein inference is one of the major challenges in MS-based proteomics nowadays. Currently, there are a vast number of protein inference algorithms and implementations available for the proteomics community. Protein assembly impacts the final results of the research, the quantitation values and the final claims in the research manuscript. Even though protein inference is a crucial step in proteomics data analysis, a comprehensive evaluation of the many different inference methods has never been performed. The Journal of Proteomics has previously published multiple benchmarking studies of bioinformatics algorithms (PMID: 26585461; PMID: 22728601), making clear the importance of such studies for the proteomics community and the journal audience. This manuscript presents a new bioinformatics solution based on the KNIME/OpenMS platform that aims at providing a fair comparison of protein inference algorithms (https://github.com/KNIME-OMICS). Five different algorithms - ProteinProphet, MSBayesPro, ProteinLP, Fido and PIA - were evaluated using the highly customizable workflow on four public datasets with varying complexities. Three popular database search engines - Mascot, X!Tandem and MS-GF+ - and combinations thereof were evaluated for every protein inference tool. In total, >186 protein lists were analyzed and carefully compared using three metrics for quality assessment of the protein inference results: 1) the number of reported proteins, 2) peptides per protein, and 3) the number of uniquely reported proteins per inference method, to assess the quality of each inference method. We also examined how many proteins were reported by choosing each combination of search engines, protein inference algorithms and parameters on each dataset.
The results show that 1) PIA or Fido seems to be a good choice when studying the results of the analyzed workflow, regarding not only the reported proteins and the high-quality identifications, but also the required runtime. 2) Merging the identifications of multiple search engines almost always gives more confident results and increases the number of peptides per protein group. 3) The usage of databases containing not only the canonical sequences but also known isoforms of proteins has a small impact on the number of reported proteins. Depending on the question behind the study, the detection of specific isoforms could compensate for the slightly shorter parsimonious reports. 4) The current workflow can be easily extended to support new algorithms and search engine combinations. Copyright © 2016. Published by Elsevier B.V.
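For readers unfamiliar with the protein inference problem being benchmarked, the sketch below shows its simplest parsimonious form: greedily select proteins that explain the most still-unexplained peptides. The compared tools (PIA, Fido, ProteinProphet, ProteinLP, MSBayesPro) solve this far more rigorously, typically with probabilistic models; this is only a toy illustration of the assembly step.

```python
# Greedy parsimonious inference sketch: repeatedly pick the protein that
# explains the most not-yet-covered peptides. This only illustrates the
# peptide-to-protein assembly problem, not any of the benchmarked tools.
def parsimonious_proteins(pep_to_prots):
    prot_to_peps = {}
    for pep, prots in pep_to_prots.items():
        for prot in prots:
            prot_to_peps.setdefault(prot, set()).add(pep)
    uncovered = set(pep_to_prots)
    selected = []
    while uncovered:
        best = max(prot_to_peps, key=lambda p: len(prot_to_peps[p] & uncovered))
        gained = prot_to_peps[best] & uncovered
        if not gained:
            break
        selected.append((best, sorted(gained)))
        uncovered -= gained
    return selected

peptides = {"PEPTIDEA": ["ProtX"], "PEPTIDEB": ["ProtX", "ProtY"], "PEPTIDEC": ["ProtY"]}
print(parsimonious_proteins(peptides))
```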
The chordate proteome history database.
Levasseur, Anthony; Paganini, Julien; Dainat, Jacques; Thompson, Julie D; Poch, Olivier; Pontarotti, Pierre; Gouret, Philippe
2012-01-01
The chordate proteome history database (http://ioda.univ-provence.fr) comprises some 20,000 evolutionary analyses of proteins from chordate species. Our main objective was to characterize and study the evolutionary histories of the chordate proteome, and in particular to detect genomic events and to enable automatic functional searches. Firstly, phylogenetic analyses based on high quality multiple sequence alignments and a robust phylogenetic pipeline were performed for the whole protein and for each individual domain. Novel approaches were developed to identify orthologs/paralogs, and predict gene duplication/gain/loss events and the occurrence of new protein architectures (domain gains, losses and shuffling). These important genetic events were localized on the phylogenetic trees and on the genomic sequence. Secondly, the phylogenetic trees were enhanced by the creation of phylogroups, whereby groups of orthologous sequences created using OrthoMCL were corrected based on the phylogenetic trees; gene family size and gene gain/loss in a given lineage could be deduced from the phylogroups. For each ortholog group obtained from the phylogenetic or the phylogroup analysis, functional information and expression data can be retrieved. Database searches can be performed easily using biological objects: protein identifier, keyword or domain, but can also be based on events, e.g., domain exchange events can be retrieved. To our knowledge, this is the first database that links group clustering, phylogeny and automatic functional searches along with the detection of important events occurring during genome evolution, such as the appearance of a new domain architecture.
Database Resources of the BIG Data Center in 2018
Xu, Xingjian; Hao, Lili; Zhu, Junwei; Tang, Bixia; Zhou, Qing; Song, Fuhai; Chen, Tingting; Zhang, Sisi; Dong, Lili; Lan, Li; Wang, Yanqing; Sang, Jian; Hao, Lili; Liang, Fang; Cao, Jiabao; Liu, Fang; Liu, Lin; Wang, Fan; Ma, Yingke; Xu, Xingjian; Zhang, Lijuan; Chen, Meili; Tian, Dongmei; Li, Cuiping; Dong, Lili; Du, Zhenglin; Yuan, Na; Zeng, Jingyao; Zhang, Zhewen; Wang, Jinyue; Shi, Shuo; Zhang, Yadong; Pan, Mengyu; Tang, Bixia; Zou, Dong; Song, Shuhui; Sang, Jian; Xia, Lin; Wang, Zhennan; Li, Man; Cao, Jiabao; Niu, Guangyi; Zhang, Yang; Sheng, Xin; Lu, Mingming; Wang, Qi; Xiao, Jingfa; Zou, Dong; Wang, Fan; Hao, Lili; Liang, Fang; Li, Mengwei; Sun, Shixiang; Zou, Dong; Li, Rujiao; Yu, Chunlei; Wang, Guangyu; Sang, Jian; Liu, Lin; Li, Mengwei; Li, Man; Niu, Guangyi; Cao, Jiabao; Sun, Shixiang; Xia, Lin; Yin, Hongyan; Zou, Dong; Xu, Xingjian; Ma, Lina; Chen, Huanxin; Sun, Yubin; Yu, Lei; Zhai, Shuang; Sun, Mingyuan; Zhang, Zhang; Zhao, Wenming; Xiao, Jingfa; Bao, Yiming; Song, Shuhui; Hao, Lili; Li, Rujiao; Ma, Lina; Sang, Jian; Wang, Yanqing; Tang, Bixia; Zou, Dong; Wang, Fan
2018-01-01
The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides freely open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. PMID:29036542
Casuso-Holgado, María Jesús; Martín-Valero, Rocío; Carazo, Ana F; Medrano-Sánchez, Esther M; Cortés-Vega, M Dolores; Montero-Bancalero, Francisco José
2018-04-01
To evaluate the evidence for the use of virtual reality to treat balance and gait impairments in multiple sclerosis rehabilitation. Systematic review and meta-analysis of randomized controlled trials and quasi-randomized clinical trials. An electronic search was conducted using the following databases: MEDLINE (PubMed), Physiotherapy Evidence Database (PEDro), Cochrane Database of Systematic Reviews (CDSR) and CINAHL. A quality assessment was performed using the PEDro scale. The data were pooled and a meta-analysis was completed. This systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline statement. It was registered in the PROSPERO database (CRD42016049360). A total of 11 studies were included. The data were pooled, allowing meta-analysis of seven outcomes of interest. A total of 466 participants clinically diagnosed with multiple sclerosis were analysed. Results showed that virtual reality balance training is more effective than no intervention for postural control improvement (standard mean difference (SMD) = -0.64; 95% confidence interval (CI) = -1.05, -0.24; P = 0.002). However, no significant overall effect was shown when compared with conventional training (SMD = -0.04; 95% CI = -0.70, 0.62; P = 0.90). Inconclusive results were also observed for gait rehabilitation. Virtual reality training could be considered at least as effective as conventional training and more effective than no intervention to treat balance and gait impairments in multiple sclerosis rehabilitation.
Diagnostic Evaluation of Nontraumatic Chest Pain in Athletes.
Moran, Byron; Bryan, Sean; Farrar, Ted; Salud, Chris; Visser, Gary; Decuba, Raymond; Renelus, Deborah; Buckley, Tyler; Dressing, Michael; Peterkin, Nicholas; Coris, Eric
This article is a clinically relevant review of the existing medical literature relating to the assessment and diagnostic evaluation of athletes complaining of nontraumatic chest pain. The literature was searched using the following databases for the years 1975 forward: Cochrane Database of Systematic Reviews; CINAHL; PubMed (MEDLINE); and SportDiscus. The general search used the keywords chest pain and athletes. The search was revised to include subject headings and subheadings, including chest pain and prevalence and athletes. Cross-referencing published articles from the searched databases identified additional articles. No dissertations, theses, or meeting proceedings were reviewed. The authors discuss the scope of this complex problem and the diagnostic dilemma chest pain in athletes can pose. Next, the authors delve into the vast differential and attempt to simplify this process for the sports medicine physician by dividing potential etiologies into cardiac and noncardiac conditions. Life-threatening causes of chest pain in athletes may be cardiac or noncardiac in origin, which highlights the need for the sports medicine physician to consider pathology in multiple organ systems simultaneously. This article emphasizes the importance of ruling out immediately life-threatening diagnoses, while acknowledging that the most common causes of noncardiac chest pain in young athletes are benign. The authors propose a practical algorithm the sports medicine physician can use as a guide for the assessment and diagnostic work-up of the athlete with chest pain, designed to help the physician arrive at the correct diagnosis in a clinically efficient and cost-effective manner.
Search prefilters to assist in library searching of infrared spectra of automotive clear coats.
Lavine, Barry K; Fasasi, Ayuba; Mirjankar, Nikhil; White, Collin; Sandercock, Mark
2015-01-01
Clear coat searches of the infrared (IR) spectral library of the Paint Data Query (PDQ) forensic database often generate an unmanageably large number of hits that span multiple manufacturers, assembly plants, and years. To improve the accuracy of the hit list, pattern recognition methods have been used to develop search prefilters (i.e., principal component models) that differentiate between similar but non-identical IR spectra of clear coats on the basis of manufacturer (e.g., General Motors, Ford, Chrysler) or assembly plant. A two-step procedure was employed to develop these search prefilters. First, the discrete wavelet transform was used to decompose each IR spectrum into wavelet coefficients to enhance subtle but significant features in the spectral data. Second, a genetic algorithm for IR spectral pattern recognition was employed to identify wavelet coefficients characteristic of the manufacturer or assembly plant of the vehicle. Even in challenging trials where the paint samples evaluated were all from the same manufacturer (General Motors) within a limited production year range (2000-2006), the assembly plant of the vehicle was correctly identified. Search prefilters to identify assembly plants were successfully validated using 10 blind samples provided by the Royal Canadian Mounted Police (RCMP) as part of a study to populate PDQ to current production years, whereas the search prefilter to discriminate among automobile manufacturers was successfully validated using IR spectra obtained directly from the PDQ database.
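The two-step prefilter construction described above (wavelet decomposition followed by selection of discriminating coefficients and a principal component model) can be sketched as follows. This is a simplified stand-in, assuming the PyWavelets and scikit-learn libraries, synthetic spectra, and a simple variance-based coefficient selection in place of the published genetic-algorithm selection:

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA

# Toy stand-in for a set of IR spectra (rows = spectra, columns = wavenumber points).
rng = np.random.default_rng(0)
spectra = rng.normal(size=(40, 1024))

# Step 1: discrete wavelet transform of each spectrum; concatenate all coefficients.
def dwt_features(spectrum, wavelet="db4", level=4):
    coeffs = pywt.wavedec(spectrum, wavelet, level=level)
    return np.concatenate(coeffs)

features = np.array([dwt_features(s) for s in spectra])

# Step 2 (simplified): keep the most variable coefficients, then fit a PCA model.
# The published method selects coefficients with a genetic algorithm instead.
top = np.argsort(features.var(axis=0))[-100:]
model = PCA(n_components=3).fit(features[:, top])
scores = model.transform(features[:, top])
print(scores.shape)  # (40, 3) principal-component scores used as a search prefilter
```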
Opportunities for research to improve employment for people with spinal cord injuries.
Frieden, L; Winnegar, A J
2012-05-01
This paper reviews the literature pertaining to the employment of people who experience spinal cord injury (SCI) in the United States and recommends directions for future research. The literature was reviewed with search terms such as SCI, employment, working from home and telework, using databases in EBSCO, including Academic Search Complete and the American Psychological Association's databases. Literature and findings on key factors related to employment illustrate the multiple dimensions of work environments, and health demands, that affect employment outcomes for people with SCI. Employment is important for people with SCI and valued in society. The literature reviewed indicates that researchers understand the work demands for people with SCI, which may help to identify suitable supports, training and job opportunities. There remains a need for research focused on understanding future employment demands, necessary work skills, differing work environments and methods for increasing and preserving employment.
dbCPG: A web resource for cancer predisposition genes.
Wei, Ran; Yao, Yao; Yang, Wu; Zheng, Chun-Hou; Zhao, Min; Xia, Junfeng
2016-06-21
Cancer predisposition genes (CPGs) are genes in which inherited mutations confer highly or moderately increased risks of developing cancer. Identifying these genes and understanding the biological mechanisms that underlie them is crucial for the prevention, early diagnosis, and optimized management of cancer. Over the past decades, great efforts have been made to identify CPGs through multiple strategies. However, information on these CPGs and their molecular functions is scattered. To address this issue and provide a comprehensive resource for researchers, we developed the Cancer Predisposition Gene Database (dbCPG, Database URL: http://bioinfo.ahu.edu.cn:8080/dbCPG/index.jsp), the first literature-based gene resource for exploring human CPGs. It contains 827 human CPGs (724 protein-coding, 23 non-coding, and 80 of unknown type), 637 rat CPGs, and 658 mouse CPGs. Furthermore, data mining was performed to gain insight into the CPG data, including functional annotation, gene prioritization, network analysis of prioritized genes and overlap analysis across multiple cancer types. A user-friendly web interface with multiple browse, search, and upload functions was also developed to facilitate access to the latest information on CPGs. Taken together, the dbCPG database provides a comprehensive data resource for further studies of cancer predisposition genes.
Native Health Research Database
Interactive search page for the Native Health Database (Indian Health Board), offering basic and advanced search and a tutorial video on searching the database.
ChemProt-2.0: visual navigation in a disease chemical biology database
Kim Kjærulff, Sonny; Wich, Louis; Kringelum, Jens; Jacobsen, Ulrik P.; Kouskoumvekaki, Irene; Audouze, Karine; Lund, Ole; Brunak, Søren; Oprea, Tudor I.; Taboureau, Olivier
2013-01-01
ChemProt-2.0 (http://www.cbs.dtu.dk/services/ChemProt-2.0) is a publicly available compilation of multiple chemical–protein annotation resources integrated with disease and clinical outcome information. The database has been updated to >1.15 million compounds with 5.32 million bioactivity measurements for 15 290 proteins. Each protein is linked to quality-scored human protein–protein interaction data based on more than half a million interactions, for studying diseases and biological outcomes (diseases, pathways and GO terms) through protein complexes. In ChemProt-2.0, therapeutic effects as well as adverse drug reactions have been integrated, allowing proteins to be suggested for association with clinical outcomes. New chemical structure fingerprints were computed based on the similarity ensemble approach. Protein sequence similarity search was also integrated to evaluate the promiscuity of proteins, which can help in the prediction of off-target effects. Finally, the database was integrated into a visual interface that enables navigation of the pharmacological space for small molecules. Filtering options were included in order to facilitate and guide dynamic searches of specific queries. PMID:23185041
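Fingerprint-based similarity searching of the kind mentioned above can be illustrated with RDKit. Morgan fingerprints and Tanimoto similarity are used here as a generic stand-in; they are not necessarily the descriptors or the similarity ensemble procedure used by ChemProt-2.0:

```python
from rdkit import Chem
from rdkit.Chem import AllChem, DataStructs

# Two example molecules (aspirin and salicylic acid); any SMILES would do.
query = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")
target = Chem.MolFromSmiles("Oc1ccccc1C(=O)O")

# Circular (Morgan) fingerprints; radius and bit length are typical defaults.
fp_query = AllChem.GetMorganFingerprintAsBitVect(query, 2, nBits=2048)
fp_target = AllChem.GetMorganFingerprintAsBitVect(target, 2, nBits=2048)

# Tanimoto similarity in [0, 1]; a database search would rank targets by this score.
print(DataStructs.TanimotoSimilarity(fp_query, fp_target))
```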
Mi-DISCOVERER: A bioinformatics tool for the detection of mi-RNA in human genome.
Arshad, Saadia; Mumtaz, Asia; Ahmad, Freed; Liaquat, Sadia; Nadeem, Shahid; Mehboob, Shahid; Afzal, Muhammad
2010-11-27
MicroRNAs (miRNAs) are ~22-nucleotide non-coding RNAs that play pivotal regulatory roles in diverse organisms, including humans, and are difficult to identify because they lack distinctive sequence features and robust algorithms for efficient identification. Therefore, we developed Mi-Discoverer, a tool for the detection of miRNAs in the human genome. The tools used for the development of the software were Microsoft Office Access 2003, the JDK version 1.6.0, BioJava version 1.0, and the NetBeans IDE version 6.0. Existing miRNA software tools were web based, so the advantage of our project is a desktop application that lets the user run sequence alignment searches against already identified human genome miRNAs stored in the database. The user can also insert newly discovered human miRNAs into the database and update existing entries. Mi-Discoverer, a bioinformatics tool, successfully identifies human miRNAs based on multiple sequence alignment searches. It includes a non-redundant database containing a large collection of publicly available human miRNAs.
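A desktop alignment search against a small set of known miRNAs, as described above, can be sketched in a few lines with Biopython's PairwiseAligner (the tool itself is written in Java with BioJava; the sequences and scoring parameters below are illustrative assumptions):

```python
from Bio import Align

# Hypothetical candidate sequence and a tiny "database" of known human miRNAs.
candidate = "UGAGGUAGUAGGUUGUAUAGUU"          # let-7a-like sequence, for illustration
known_mirnas = {
    "hsa-let-7a": "UGAGGUAGUAGGUUGUAUAGUU",
    "hsa-miR-21": "UAGCUUAUCAGACUGAUGUUGA",
}

aligner = Align.PairwiseAligner()
aligner.mode = "local"          # local alignment, as in a database similarity search
aligner.match_score = 1
aligner.mismatch_score = -1
aligner.open_gap_score = -2
aligner.extend_gap_score = -0.5

# Rank database entries by alignment score against the candidate.
scores = {name: aligner.score(candidate, seq) for name, seq in known_mirnas.items()}
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(name, score)
```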
Methods for structuring scientific knowledge from many areas related to aging research.
Zhavoronkov, Alex; Cantor, Charles R
2011-01-01
Aging and age-related disease represent a substantial share of current natural, social and behavioral science research efforts. Presently, no centralized system exists for tracking aging research projects across numerous research disciplines. The multidisciplinary nature of this research complicates the understanding of underlying project categories, the establishment of project relations, and the development of a unified project classification scheme. We have developed a highly visual database, the International Aging Research Portfolio (IARP), available at AgingPortfolio.org, to address this issue. The database integrates information on research grants, peer-reviewed publications, and issued patent applications from multiple sources. Additionally, the database uses flexible project classification mechanisms and tools for analyzing project associations and trends. This system enables scientists to search the centralized project database, to classify and categorize aging projects, and to analyze funding across multiple research disciplines. The IARP is designed to improve the allocation and prioritization of scarce research funding, to reduce project overlap and to improve scientific collaboration, thereby accelerating scientific and medical progress in a rapidly growing area of research. Grant applications often precede publications and some grants do not result in publications; thus, this system offers an earlier and broader view of research activity in many research disciplines. This project is a first attempt to provide a centralized database system for research grants and to categorize aging research projects into multiple subcategories utilizing both automated classification algorithms and a hierarchical environment for scientific collaboration.
Library Instruction and Online Database Searching.
ERIC Educational Resources Information Center
Mercado, Heidi
1999-01-01
Reviews changes in online database searching in academic libraries. Topics include librarians conducting all searches; the advent of end-user searching and the need for user instruction; compact disk technology; online public catalogs; the Internet; full text databases; electronic information literacy; user education and the remote library user;…
New drug information resources for pharmacists at the National Library of Medicine.
Knoben, James E; Phillips, Steven J
2014-01-01
To provide an overview of selected drug information-related databases of the National Library of Medicine (NLM), with a focus on newer resources that support the professional information needs of pharmacists and other health care providers. NLM, which is the world's largest medical library, provides an array of bibliographic, factual, and evidence-based drug, herbal remedy, and dietary supplement information resources. Five of the more recently introduced online resources include areas of particular importance to pharmacists, including a repository of current product labeling/package inserts, with automated search links to associated information resources; a portal to drug information that allows pharmacists to search multiple databases simultaneously and link to related medication and health care information resources; authoritative information on the effects of medications, herbal remedies, and dietary supplements in nursing infants and their mothers; comprehensive information, including a case registry, on the potential for liver toxicity due to drugs, herbal remedies, and dietary supplements; and a pill identification system with two intuitive search methodologies. NLM provides several clinical-scientific drug information resources that are particularly useful in meeting the professional information needs of pharmacists.
Vogt, Martin; Bajorath, Jürgen
2008-01-01
Bayesian classifiers are increasingly being used to distinguish active from inactive compounds and search large databases for novel active molecules. We introduce an approach to directly combine the contributions of property descriptors and molecular fingerprints in the search for active compounds that is based on a Bayesian framework. Conventionally, property descriptors and fingerprints are used as alternative features for virtual screening methods. Following the approach introduced here, probability distributions of descriptor values and fingerprint bit settings are calculated for active and database molecules and the divergence between the resulting combined distributions is determined as a measure of biological activity. In test calculations on a large number of compound activity classes, this methodology was found to consistently perform better than similarity searching using fingerprints and multiple reference compounds or Bayesian screening calculations using probability distributions calculated only from property descriptors. These findings demonstrate that there is considerable synergy between different types of property descriptors and fingerprints in recognizing diverse structure-activity relationships, at least in the context of Bayesian modeling.
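As a rough sketch of the idea of combining property descriptors and fingerprint bits in one probabilistic score, the following uses a simple naive Bayes log-odds ranking over synthetic data. The published method instead estimates probability distributions for active and database molecules and scores by the divergence between the combined distributions; the simplification here is an assumption for illustration only:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: continuous property descriptors and binary fingerprint bits for
# "active" reference compounds and a background "database" set (all synthetic).
desc_active, desc_db = rng.normal(1.0, 1.0, (50, 3)), rng.normal(0.0, 1.0, (5000, 3))
fp_active = (rng.random((50, 64)) < 0.30).astype(int)
fp_db = (rng.random((5000, 64)) < 0.10).astype(int)

def gaussian_logpdf(x, mean, std):
    return -0.5 * np.log(2 * np.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

# Per-feature statistics estimated from actives vs. the database background.
mu_a, sd_a = desc_active.mean(0), desc_active.std(0) + 1e-6
mu_d, sd_d = desc_db.mean(0), desc_db.std(0) + 1e-6
p_a = (fp_active.mean(0) + 1e-3).clip(max=1 - 1e-3)   # bit-set probabilities, smoothed
p_d = (fp_db.mean(0) + 1e-3).clip(max=1 - 1e-3)

def score(desc, fp):
    # Naive Bayes log-odds that a compound resembles the actives, combining
    # descriptor likelihoods and fingerprint bit likelihoods in one sum.
    s = gaussian_logpdf(desc, mu_a, sd_a).sum() - gaussian_logpdf(desc, mu_d, sd_d).sum()
    s += (fp * np.log(p_a / p_d) + (1 - fp) * np.log((1 - p_a) / (1 - p_d))).sum()
    return s

ranking = np.argsort([-score(d, f) for d, f in zip(desc_db, fp_db)])
print(ranking[:5])  # indices of the top-ranked database compounds
```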
Learned suppression for multiple distractors in visual search.
Won, Bo-Yeong; Geng, Joy J
2018-05-07
Visual search for a target object occurs rapidly if there are no distractors to compete for attention, but this rarely happens in real-world environments. Distractors are almost always present and must be suppressed for target selection to succeed. Previous research suggests that one way this occurs is through the creation of a stimulus-specific distractor template. However, it remains unknown how the information within such templates scales up with multiple distractors. Here we investigated the informational content of distractor templates created from repeated exposures to multiple distractors. We investigated this question using a visual search task in which participants searched for a gray square among colored squares. During "training," participants always saw the same set of colored distractors. During "testing," new distractor sets were interleaved with the trained distractors. The critical manipulation in each study was the distance (in color space) of the new test distractors from the trained distractors. We hypothesized that the pattern of distractor interference during testing would reveal the tuning of the suppression template: RTs should be commensurate with the degree to which distractor colors are encoded within the suppression template. Results from four experiments converged on the notion that the distractor template includes information about specific color values but has broad "tuning," allowing suppression to generalize to new distractors. These results suggest that distractor templates, unlike target templates, encode multiple features and have broad representations, which have the advantage of generalizing suppression more easily to other potential distractors.
Addition of a breeding database in the Genome Database for Rosaceae
Evans, Kate; Jung, Sook; Lee, Taein; Brutcher, Lisa; Cho, Ilhyung; Peace, Cameron; Main, Dorrie
2013-01-01
Breeding programs produce large datasets that require efficient management systems to keep track of performance, pedigree, geographical and image-based data. With the development of DNA-based screening technologies, more breeding programs perform genotyping in addition to phenotyping for performance evaluation. The integration of breeding data with other genomic and genetic data is instrumental for the refinement of marker-assisted breeding tools, enhances genetic understanding of important crop traits and maximizes access and utility by crop breeders and allied scientists. New infrastructure in the Genome Database for Rosaceae (GDR) was designed and implemented to enable secure and efficient storage, management and analysis of large datasets from the Washington State University apple breeding program and was subsequently expanded to fit datasets from other Rosaceae breeders. The infrastructure was built using the software Chado and Drupal, making use of the Natural Diversity module to accommodate large-scale phenotypic and genotypic data. Breeders can search accessions within the GDR to identify individuals with specific trait combinations. Results from Search by Parentage list individuals with parents in common, and results from Individual Variety pages link to all data available on each chosen individual, including pedigree, phenotypic and genotypic information. Genotypic data are searchable by markers and alleles; results are linked to other pages in the GDR to enable the user to access tools such as GBrowse and CMap. This breeding database provides users with the opportunity to search datasets in a fully targeted manner, retrieve and compare performance data from multiple selections, years and sites, and output the data needed for variety release publications and patent applications. The breeding database facilitates efficient program management. Storing publicly available breeding data in a database together with genomic and genetic data will further accelerate the cross-utilization of diverse data types by researchers from various disciplines. Database URL: http://www.rosaceae.org/breeders_toolbox PMID:24247530
HOWDY: an integrated database system for human genome research
Hirakawa, Mika
2002-01-01
HOWDY is an integrated database system for accessing and analyzing human genomic information (http://www-alis.tokyo.jst.go.jp/HOWDY/). HOWDY stores information about relationships between genetic objects and the data extracted from a number of databases. HOWDY consists of an Internet-accessible user interface that allows thorough searching of the human genomic databases using gene symbols and their aliases. It also permits flexible editing of the sequence data. The database can be searched using simple words, and the search can be restricted to a specific cytogenetic location. Linear maps displaying markers and genes on contig sequences are available, from which an object can be chosen. Any search starting point identifies all the information matching the query. HOWDY provides a convenient environment for searching human genomic data for scientists who are unsure which database is most appropriate for their search. PMID:11752279
TryTransDB: A web-based resource for transport proteins in Trypanosomatidae.
Sonar, Krushna; Kabra, Ritika; Singh, Shailza
2018-03-12
TryTransDB is a web-based resource that stores transport protein data, which can be retrieved using a standalone BLAST tool. We have attempted to create an integrated database that can be a one-stop shop for researchers working with transport proteins of the Trypanosomatidae family. TryTransDB (Trypanosomatidae Transport Protein Database) is a web-based comprehensive resource that can run a BLAST search against most of the transport protein sequences (protein and nucleotide) from organisms of the Trypanosomatidae family. This web resource further allows users to compute a phylogenetic tree by performing multiple sequence alignment (MSA) with the embedded CLUSTALW suite. Cross-links to other databases also help gather more information about a given transport protein from a single website.
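A standalone BLAST search of the kind TryTransDB wraps can be run locally with the NCBI BLAST+ command-line tools. In this sketch the query file name and the database name "trytrans_proteins" are placeholders, not identifiers used by the actual resource:

```python
import subprocess

# Hypothetical local files; "trytrans_proteins" is a placeholder database name.
query_fasta = "query_transporter.fasta"

# Run a protein BLAST search with the NCBI BLAST+ command-line tools
# (requires blastp and a formatted protein database installed locally).
result = subprocess.run(
    ["blastp", "-query", query_fasta, "-db", "trytrans_proteins",
     "-outfmt", "6 sseqid pident evalue bitscore", "-evalue", "1e-5"],
    capture_output=True, text=True, check=True,
)

# Tabular output: one hit per line, columns as requested in -outfmt.
for line in result.stdout.splitlines()[:10]:
    print(line)
```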
Choosing an Optimal Database for Protein Identification from Tandem Mass Spectrometry Data.
Kumar, Dhirendra; Yadav, Amit Kumar; Dash, Debasis
2017-01-01
Database searching is the preferred method for protein identification from digital spectra of mass to charge ratios (m/z) detected for protein samples through mass spectrometers. The search database is one of the major influencing factors in discovering proteins present in the sample and thus in deriving biological conclusions. In most cases the choice of search database is arbitrary. Here we describe common search databases used in proteomic studies and their impact on final list of identified proteins. We also elaborate upon factors like composition and size of the search database that can influence the protein identification process. In conclusion, we suggest that choice of the database depends on the type of inferences to be derived from proteomics data. However, making additional efforts to build a compact and concise database for a targeted question should generally be rewarding in achieving confident protein identifications.
Towards computational improvement of DNA database indexing and short DNA query searching.
Stojanov, Done; Koceski, Sašo; Mileva, Aleksandra; Koceska, Nataša; Bande, Cveta Martinovska
2014-09-03
In order to facilitate and speed up the search of massive DNA databases, the database is indexed at the beginning, employing a mapping function. By searching through the indexed data structure, exact query hits can be identified. If the database is searched against an annotated DNA query, such as a known promoter consensus sequence, then the starting locations and the number of potential genes can be determined. This is particularly relevant if unannotated DNA sequences have to be functionally annotated. However, indexing a massive DNA database and searching an indexed data structure with millions of entries is a time-demanding process. In this paper, we propose a fast DNA database indexing and searching approach, identifying all query hits in the database, without having to examine all entries in the indexed data structure, limiting the maximum length of a query that can be searched against the database. By applying the proposed indexing equation, the whole human genome could be indexed in 10 hours on a personal computer, under the assumption that there is enough RAM to store the indexed data structure. Analysing the methodology proposed by Reneker, we observed that hits at starting positions [Formula: see text] are not reported, if the database is searched against a query shorter than [Formula: see text] nucleotides, such that [Formula: see text] is the length of the DNA database words being mapped and [Formula: see text] is the length of the query. A solution of this drawback is also presented.
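The index-then-search idea can be illustrated with a plain word-to-positions index: every fixed-length word in the database is mapped to its start positions, and a query is located by looking up one of its words and verifying the surrounding text. This is a simplified sketch, not the specific indexing equation proposed in the paper:

```python
from collections import defaultdict

def build_index(db_seq, w):
    """Map every length-w word in the database sequence to its start positions."""
    index = defaultdict(list)
    for i in range(len(db_seq) - w + 1):
        index[db_seq[i:i + w]].append(i)
    return index

def search(index, db_seq, query, w):
    """Find exact occurrences of query (len >= w) by seeding on its first w-mer."""
    hits = []
    for pos in index.get(query[:w], []):
        if db_seq[pos:pos + len(query)] == query:
            hits.append(pos)
    return hits

db = "ACGTACGTTTACGGACGTAC"
idx = build_index(db, w=4)
print(search(idx, db, "ACGTAC", w=4))   # -> [0, 14]
```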
NASA Astrophysics Data System (ADS)
Kutzleb, C. D.
1997-02-01
The high incidence of recidivism (repeat offenders) in the criminal population makes the use of the IAFIS III/FBI criminal database an important tool in law enforcement. The problems and solutions employed by IAFIS III/FBI criminal subject searches are discussed for the following topics: (1) subject search selectivity and reliability; (2) the difficulty and limitations of identifying subjects whose anonymity may be a prime objective; (3) database size, search workload, and search response time; (4) techniques and advantages of normalizing the variability in an individual's name and identifying features into identifiable and discrete categories; and (5) the use of database demographics to estimate the likelihood of a match between a search subject and database subjects.
Mirsky, Matthew M; Marrie, Ruth Ann; Rae-Grant, Alexander
2016-01-01
Background: The Explorys Enterprise Performance Management (EPM) database contains de-identified clinical data for 50 million patients. Multiple sclerosis (MS) disease-modifying therapies (DMTs), specifically interferon beta (IFNβ) treatments, may potentiate depression. Conflicting data have emerged, and a large-scale claims-based study by Patten et al. did not support such an association. This study compares the results of Patten et al. with those using the EPM database. Methods: "Power searches" were built to test the relationship between antidepressant drug use and DMT in the MS population. Searches were built to produce a cohort of individuals diagnosed as having MS in the past 3 years taking a specific DMT who were then given any antidepressant drug. The antidepressant drug therapy prevalence was tested in the MS population on the following DMTs: IFNβ-1a, IFNβ-1b, combined IFNβ, glatiramer acetate, natalizumab, fingolimod, and dimethyl fumarate. Results: In patients with MS, the rate of antidepressant drug use in those receiving DMTs was 40.60% to 44.57%. The rate of antidepressant drug use for combined IFNβ DMTs was 41.61% (males: 31.25%-39.62%; females: 43.10%-47.33%). Antidepressant drug use peaked in the group aged 45 to 54 years for five of six DMTs. Conclusions: We found no association between IFNβ treatment and antidepressant drug use in the MS population compared with other DMTs. The EPM database has been validated against the Patten et al. data for future use in the MS population.
Sathorn, C; Parashos, P; Messer, H
2008-02-01
The aim of this systematic review was to assess the evidence regarding postoperative pain and flare-up of single- or multiple-visit root canal treatment. CENTRAL, MEDLINE and EMBASE databases were searched. Reference lists from identified articles were scanned. A forward search was undertaken on the authors of the identified articles. Papers that had cited these articles were also identified through Science Citation Index to identify potentially relevant subsequent primary research. The included clinical studies compared the prevalence/severity of postoperative pain or flare-up in single- and multiple-visit root canal treatment. Data in those studies were extracted independently. Sixteen studies fitted the inclusion criteria in the review, with sample size varying from 60 to 1012 cases. The prevalence of postoperative pain ranged from 3% to 58%. The heterogeneity amongst included studies was far too great to conduct meta-analysis and yield meaningful results. Compelling evidence indicating a significantly different prevalence of postoperative pain/flare-up of either single- or multiple-visit root canal treatment is lacking.
The BioGRID Interaction Database: 2011 update
Stark, Chris; Breitkreutz, Bobby-Joe; Chatr-aryamontri, Andrew; Boucher, Lorrie; Oughtred, Rose; Livstone, Michael S.; Nixon, Julie; Van Auken, Kimberly; Wang, Xiaodong; Shi, Xiaoqi; Reguly, Teresa; Rust, Jennifer M.; Winter, Andrew; Dolinski, Kara; Tyers, Mike
2011-01-01
The Biological General Repository for Interaction Datasets (BioGRID) is a public database that archives and disseminates genetic and protein interaction data from model organisms and humans (http://www.thebiogrid.org). BioGRID currently holds 347 966 interactions (170 162 genetic, 177 804 protein) curated from both high-throughput data sets and individual focused studies, as derived from over 23 000 publications in the primary literature. Complete coverage of the entire literature is maintained for budding yeast (Saccharomyces cerevisiae), fission yeast (Schizosaccharomyces pombe) and thale cress (Arabidopsis thaliana), and efforts to expand curation across multiple metazoan species are underway. The BioGRID houses 48 831 human protein interactions that have been curated from 10 247 publications. Current curation drives are focused on particular areas of biology to enable insights into conserved networks and pathways that are relevant to human health. The BioGRID 3.0 web interface contains new search and display features that enable rapid queries across multiple data types and sources. An automated Interaction Management System (IMS) is used to prioritize, coordinate and track curation across international sites and projects. BioGRID provides interaction data to several model organism databases, resources such as Entrez-Gene and other interaction meta-databases. The entire BioGRID 3.0 data collection may be downloaded in multiple file formats, including PSI MI XML. Source code for BioGRID 3.0 is freely available without any restrictions. PMID:21071413
Non-redundant patent sequence databases with value-added annotations at two levels
Li, Weizhong; McWilliam, Hamish; de la Torre, Ana Richart; Grodowski, Adam; Benediktovich, Irina; Goujon, Mickael; Nauche, Stephane; Lopez, Rodrigo
2010-01-01
The European Bioinformatics Institute (EMBL-EBI) provides public access to patent data, including abstracts, chemical compounds and sequences. Sequences can appear multiple times due to the filing of the same invention with multiple patent offices, or the use of the same sequence by different inventors in different contexts. Information relating to the source invention may be incomplete, and biological information available in patent documents elsewhere may not be reflected in the annotation of the sequence. Search and analysis of these data have become increasingly challenging for both the scientific and intellectual-property communities. Here, we report a collection of non-redundant patent sequence databases, which cover the EMBL-Bank nucleotide patent class and the patent protein databases and contain value-added annotations from patent documents. The databases were created at two levels by the use of sequence MD5 checksums. Sequences within a level-1 cluster are 100% identical over their whole length. Level-2 clusters were defined by sub-grouping level-1 clusters based on patent family information. Value-added annotations, such as publication number corrections, earliest publication dates and feature collations, significantly enhance the quality of the data, allowing for better tracking and cross-referencing. The databases are available at: http://www.ebi.ac.uk/patentdata/nr/. PMID:19884134
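Level-1 clustering as described above groups sequences that are 100% identical over their whole length, which can be done by hashing each sequence with an MD5 checksum. A minimal sketch with made-up records:

```python
import hashlib
from collections import defaultdict

# Hypothetical patent sequence records: (record_id, sequence).
records = [
    ("US_0001_seq3", "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"),
    ("EP_0417_seq1", "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"),   # same sequence, other office
    ("WO_0092_seq7", "MSTNPKPQRKTKRNTNRRPQDVKFPGG"),
]

# Level-1 clusters: sequences with identical MD5 checksums are grouped together.
clusters = defaultdict(list)
for record_id, seq in records:
    digest = hashlib.md5(seq.encode("ascii")).hexdigest()
    clusters[digest].append(record_id)

for digest, members in clusters.items():
    print(digest[:12], members)
```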
Thomas, Paul D; Kejariwal, Anish; Campbell, Michael J; Mi, Huaiyu; Diemer, Karen; Guo, Nan; Ladunga, Istvan; Ulitsky-Lazareva, Betty; Muruganujan, Anushya; Rabkin, Steven; Vandergriff, Jody A; Doremieux, Olivier
2003-01-01
The PANTHER database was designed for high-throughput analysis of protein sequences. One of its key features is a simplified ontology of protein function, which allows browsing of the database by biological function. Biologist curators have associated the ontology terms with groups of protein sequences rather than individual sequences. Statistical models (hidden Markov models, or HMMs) are built from each of these groups. The advantage of this approach is that new sequences can be automatically classified as they become available. To ensure accurate functional classification, HMMs are constructed not only for families, but also for functionally distinct subfamilies. Multiple sequence alignments and phylogenetic trees, including curator-assigned information, are available for each family. The current version of the PANTHER database includes training sequences from all organisms in the GenBank non-redundant protein database, and the HMMs have been used to classify gene products across the entire genomes of human and Drosophila melanogaster. The ontology terms and protein families and subfamilies, as well as Drosophila gene classifications, can be browsed and searched for free. Due to outstanding contractual obligations, access to human gene classifications and to protein family trees and multiple sequence alignments will temporarily require a nominal registration fee. PANTHER is publicly available on the web at http://panther.celera.com.
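Scoring new sequences against a library of profile HMMs is the core classification step described above. The sketch below uses the HMMER command-line tool hmmscan as a generic example of this step; it is not necessarily the tooling used to build or apply the PANTHER models, and the file names are placeholders:

```python
import subprocess

# Placeholder file names; a real run needs an HMM library prepared with hmmpress
# and a FASTA file of protein sequences to classify.
hmm_library = "protein_families.hmm"
proteins = "new_proteins.fasta"

# Score every sequence against every profile HMM and write a parseable table.
subprocess.run(["hmmscan", "--tblout", "hits.tbl", hmm_library, proteins], check=True)

# Each non-comment line reports one family hit with an E-value and score; the
# best-scoring family (or subfamily) model gives the functional classification.
with open("hits.tbl") as fh:
    for line in fh:
        if not line.startswith("#"):
            fields = line.split()
            print(fields[0], fields[2], fields[4])   # family model, query, E-value
```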
An Information System for European culture collections: the way forward.
Casaregola, Serge; Vasilenko, Alexander; Romano, Paolo; Robert, Vincent; Ozerskaya, Svetlana; Kopf, Anna; Glöckner, Frank O; Smith, David
2016-01-01
Culture collections contain indispensable information about the microorganisms preserved in their repositories, such as taxonomical descriptions, origins, physiological and biochemical characteristics, bibliographic references, etc. However, information currently accessible in databases rarely adheres to common standard protocols. The resultant heterogeneity between culture collections, in terms of both content and format, notably hampers microorganism-based research and development (R&D). The optimized exploitation of these resources thus requires standardized, and simplified, access to the associated information. To this end, and in the interest of supporting R&D in the fields of agriculture, health and biotechnology, a pan-European distributed research infrastructure, MIRRI, including over 40 public culture collections and research institutes from 19 European countries, was established. A prime objective of MIRRI is to unite and provide universal access to the fragmented, and untapped, resources, information and expertise available in European public collections of microorganisms; a key component of which is to develop a dynamic Information System. For the first time, both culture collection curators as well as their users have been consulted and their feedback, concerning the needs and requirements for collection databases and data accessibility, utilised. Users primarily noted that databases were not interoperable, thus rendering a global search of multiple databases impossible. Unreliable or out-of-date and, in particular, non-homogenous, taxonomic information was also considered to be a major obstacle to searching microbial data efficiently. Moreover, complex searches are rarely possible in online databases thus limiting the extent of search queries. Curators also consider that overall harmonization-including Standard Operating Procedures, data structure, and software tools-is necessary to facilitate their work and to make high-quality data easily accessible to their users. Clearly, the needs of culture collection curators coincide with those of users on the crucial point of database interoperability. In this regard, and in order to design an appropriate Information System, important aspects on which the culture collection community should focus include: the interoperability of data sets with the ontologies to be used; setting best practice in data management, and the definition of an appropriate data standard.
Comparison Study of Overlap among 21 Scientific Databases in Searching Pesticide Information.
ERIC Educational Resources Information Center
Meyer, Daniel E.; And Others
1983-01-01
Evaluates overlapping coverage of 21 scientific databases used in 10 online pesticide searches in an attempt to identify minimum number of databases needed to generate 90 percent of unique, relevant citations for given search. Comparison of searches combined under given pesticide usage (herbicide, fungicide, insecticide) is discussed. Nine…
Content based information retrieval in forensic image databases.
Geradts, Zeno; Bijhold, Jurrien
2002-03-01
This paper gives an overview of the various available image databases and ways of searching these databases on image contents. Developments in research groups working on searching image databases are evaluated and compared with the forensic databases that exist. Forensic image databases of fingerprints, faces, shoeprints, handwriting, cartridge cases, drug tablets, and tool marks are described. The developments in these fields appear to be valuable for forensic databases, especially the framework in MPEG-7, where searching in image databases is standardized. In the future, the combination of these databases (including DNA databases) and the possibilities to combine them can result in stronger forensic evidence.
Mapping the literature of nursing: 1996–2000
Allen, Margaret (Peg); Jacobs, Susan Kaplan; Levy, June R.
2006-01-01
Introduction: This project is a collaborative effort of the Task Force on Mapping the Nursing Literature of the Nursing and Allied Health Resources Section of the Medical Library Association. This overview summarizes eighteen studies covering general nursing and sixteen specialties. Method: Following a common protocol, citations from source journals were analyzed for a three-year period within the years 1996 to 2000. Analysis included cited formats, age, and ranking of the frequency of cited journal titles. Highly cited journals were analyzed for coverage in twelve health sciences and academic databases. Results: Journals were the most frequently cited format, followed by books. More than 60% of the cited resources were published in the previous seven years. Bradford's law was validated, with a small core of cited journals accounting for a third of the citations. Medical and science databases provided the most comprehensive access for biomedical titles, while CINAHL and PubMed provided the best access for nursing journals. Discussion: Beyond a heavily cited core, nursing journal citations are widely dispersed among a variety of sources and disciplines, with corresponding access via a variety of bibliographic tools. Results underscore the interdisciplinary nature of the nursing profession. Conclusion: For comprehensive searches, nurses need to search multiple databases. Libraries need to provide access to databases beyond PubMed, including CINAHL and academic databases. Database vendors should improve their coverage of nursing, biomedical, and psychosocial titles identified in these studies. Additional research is needed to update these studies and analyze nursing specialties not covered. PMID:16636714
Wright, Judy M; Cottrell, David J; Mir, Ghazala
2014-07-01
To determine the optimal databases to search for studies of faith-sensitive interventions for treating depression. We examined 23 health, social science, religious, and grey literature databases searched for an evidence synthesis. Databases were prioritized by yield of (1) search results, (2) potentially relevant references identified during screening, (3) included references contained in the synthesis, and (4) included references that were available in the database. We assessed the impact of databases beyond MEDLINE, EMBASE, and PsycINFO by their ability to supply studies identifying new themes and issues. We also identified pragmatic workload factors that influence database selection. PsycINFO was the best-performing database within all priority lists. ArabPsyNet, CINAHL, Dissertations and Theses, EMBASE, Global Health, Health Management Information Consortium, MEDLINE, PsycINFO, and Sociological Abstracts were essential for our searches to retrieve the included references. Citation tracking activities and the personal library of one of the research teams made significant contributions of unique, relevant references. Religion studies databases (American Theological Library Association, FRANCIS) did not provide unique, relevant references. Literature searches for reviews and evidence syntheses of religion and health studies should include social science, grey literature, and non-Western databases, as well as personal libraries and citation tracking activities.
The SUPERFAMILY database in 2004: additions and improvements.
Madera, Martin; Vogel, Christine; Kummerfeld, Sarah K; Chothia, Cyrus; Gough, Julian
2004-01-01
The SUPERFAMILY database provides structural assignments to protein sequences and a framework for analysis of the results. At the core of the database is a library of profile Hidden Markov Models that represent all proteins of known structure. The library is based on the SCOP classification of proteins: each model corresponds to a SCOP domain and aims to represent an entire superfamily. We have applied the library to predicted proteins from all completely sequenced genomes (currently 154), the Swiss-Prot and TrEMBL databases and other sequence collections. Close to 60% of all proteins have at least one match, and one half of all residues are covered by assignments. All models and full results are available for download and online browsing at http://supfam.org. Users can study the distribution of their superfamily of interest across all completely sequenced genomes, investigate with which other superfamilies it combines and retrieve proteins in which it occurs. Alternatively, concentrating on a particular genome as a whole, it is possible first, to find out its superfamily composition, and secondly, to compare it with that of other genomes to detect superfamilies that are over- or under-represented. In addition, the webserver provides the following standard services: sequence search; keyword search for genomes, superfamilies and sequence identifiers; and multiple alignment of genomic, PDB and custom sequences.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zamora, Antonio
Advanced Natural Language Processing Tools for Web Information Retrieval, Content Analysis, and Synthesis. The goal of this SBIR was to implement and evaluate several advanced Natural Language Processing (NLP) tools and techniques to enhance the precision and relevance of search results by analyzing and augmenting search queries and by helping to organize the search output obtained from heterogeneous databases and web pages containing textual information of interest to DOE and the scientific-technical user communities in general. The SBIR investigated (1) the incorporation of spelling checkers in search applications, (2) identification of significant phrases and concepts using a combination of linguistic and statistical techniques, and (3) enhancement of the query interface and search retrieval results through the use of semantic resources, such as thesauri. A search program with a flexible query interface was developed to search reference databases with the objective of enhancing search results from web queries or queries of specialized search systems such as DOE's Information Bridge. The DOE ETDE/INIS Joint Thesaurus was processed to create a searchable database. Term frequencies and term co-occurrences were used to enhance web information retrieval by providing algorithmically derived, objective criteria to organize relevant documents into clusters containing significant terms. A thesaurus provides an authoritative overview and classification of a field of knowledge. By organizing the results of a search using the thesaurus terminology, the output is more meaningful than when the results are organized only by the terms that co-occur in the retrieved documents, some of which may not be significant. An attempt was made to take advantage of the hierarchy provided by broader and narrower terms, as well as other field-specific information in the thesauri. The search program uses linguistic morphological routines to find relevant entries regardless of whether terms are stored in singular or plural form. Implementation of additional inflectional morphology processes for verbs can enhance retrieval further, but this has to be balanced against the possibility of broadening the results too much. In addition to the DOE energy thesaurus, other sources of specialized organized knowledge, such as the Medical Subject Headings (MeSH), the Unified Medical Language System (UMLS), and Wikipedia, were investigated. The supporting role of the NLP thesaurus search program was enhanced by incorporating spelling aid and a part-of-speech tagger to cope with misspellings in the queries and to determine the grammatical roles of the query words and identify nouns for special processing. To improve precision, multiple modes of searching were implemented, including Boolean operators and field-specific searches. Programs to convert a thesaurus or reference file into searchable support files can be deployed easily, and the resulting files are immediately searchable to produce relevance-ranked results with built-in spelling aid, morphological processing, and advanced search logic. Demonstration systems were built for several databases, including the DOE energy thesaurus.
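The term-frequency, co-occurrence, and thesaurus-based techniques described above can be sketched with a toy corpus. The documents and the thesaurus fragment below are invented for illustration and do not come from the DOE thesaurus:

```python
import re
from collections import Counter
from itertools import combinations

# Tiny stand-in corpus of retrieved documents (titles or abstracts in practice).
docs = [
    "solar energy storage using molten salt",
    "wind energy storage and grid integration",
    "molten salt reactors for nuclear energy",
]

# Hypothetical thesaurus fragment mapping a preferred term to narrower terms.
thesaurus = {"energy storage": ["molten salt", "battery", "thermal storage"]}

def terms(text):
    return re.findall(r"[a-z]+", text.lower())

# Term frequencies and within-document co-occurrence counts, used to pick
# significant terms for organizing search output into clusters.
tf = Counter(t for d in docs for t in terms(d))
cooc = Counter()
for d in docs:
    for a, b in combinations(sorted(set(terms(d))), 2):
        cooc[(a, b)] += 1

print(tf.most_common(3))
print(cooc.most_common(3))

# Simple thesaurus-based query expansion: add narrower terms to the user's query.
query = "energy storage"
expanded = [query] + thesaurus.get(query, [])
print(expanded)
```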
Kiener, Joos
2013-12-11
Research in organic chemistry generates samples of novel chemicals together with their properties and other related data. The scientists involved must be able to store this data and search it by chemical structure. There are commercial solutions for common needs such as chemical registration systems or electronic lab notebooks. However, for the specific requirements of in-house databases and processes, no such solutions exist. Another issue is that commercial solutions carry the risk of vendor lock-in and may require an expensive license for a proprietary relational database management system. To speed up and simplify the development of applications that require chemical structure search capabilities, I have developed Molecule Database Framework. The framework abstracts the storing and searching of chemical structures into method calls, so software developers do not require extensive knowledge about chemistry and the underlying database cartridge. This decreases application development time. Molecule Database Framework is written in Java and was created by integrating existing free and open-source tools and frameworks. The core functionality includes support for multi-component compounds (mixtures), import and export of SD-files, and optional security (authorization). For chemical structure searching, Molecule Database Framework leverages the capabilities of the Bingo Cartridge for PostgreSQL and provides type-safe searching, caching, transactions and optional method-level security. Furthermore, the design of the entity classes and the reasoning behind it are explained. By means of a simple web application, I describe how the framework could be used; I then benchmarked this example application to create some basic performance expectations for chemical structure searches and import and export of SD-files. Using the simple web application, it was shown that Molecule Database Framework successfully abstracts chemical structure searches and SD-file import and export to simple method calls. The framework offers good search performance on a standard laptop without any database tuning, in part because chemical structure searches are paged and cached. Molecule Database Framework is available for download on the project's web page on Bitbucket: https://bitbucket.org/kienerj/moleculedatabaseframework.
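The framework abstracts structure searches into method calls backed by a database cartridge. As an in-memory illustration of the underlying operation, a substructure search with RDKit is sketched below; this is not the framework's Java API, and a production system would delegate the search to the cartridge:

```python
from rdkit import Chem

# Hypothetical in-memory "database" of compounds; a real system would delegate
# this search to a database cartridge such as Bingo on PostgreSQL.
database = {
    "CHEM-001": "CC(=O)Oc1ccccc1C(=O)O",   # aspirin
    "CHEM-002": "c1ccccc1",                # benzene
    "CHEM-003": "CCO",                     # ethanol
}

# Substructure query: any molecule containing a carboxylic acid group.
query = Chem.MolFromSmarts("C(=O)[OH]")

hits = [cid for cid, smiles in database.items()
        if Chem.MolFromSmiles(smiles).HasSubstructMatch(query)]
print(hits)   # -> ['CHEM-001']
```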
BioSearch: a semantic search engine for Bio2RDF
Qiu, Honglei; Huang, Jiacheng
2017-01-01
Biomedical data are growing at an incredible pace and require substantial expertise to organize them in a manner that makes them easily findable, accessible, interoperable and reusable. Massive effort has been devoted to using Semantic Web standards and technologies to create a network of Linked Data for the life sciences, among others. However, while these data are accessible through programmatic means, effective user interfaces to SPARQL endpoints for non-experts are few and far between. Contributing to user frustration is that data are not necessarily described using common vocabularies, thereby making it difficult to aggregate results, especially when they are distributed across multiple SPARQL endpoints. We propose BioSearch, a semantic search engine that uses ontologies to enhance federated query construction and organize search results. BioSearch also features a simplified query interface that allows users to optionally filter their keywords according to classes, properties and datasets. User evaluation demonstrated that BioSearch is more effective and usable than two state-of-the-art search and browsing solutions. Database URL: http://ws.nju.edu.cn/biosearch/ PMID:29220451
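Bio2RDF datasets are exposed through SPARQL endpoints, which is what a federated search engine like BioSearch queries under the hood. A minimal sketch using the SPARQLWrapper library is shown below; the endpoint URL and the query are illustrative assumptions and may not match current Bio2RDF deployments or BioSearch's internal queries:

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Endpoint URL and query are illustrative; actual Bio2RDF endpoints and graph
# layouts may differ from what is assumed here.
endpoint = SPARQLWrapper("https://bio2rdf.org/sparql")
endpoint.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?s ?label WHERE {
        ?s rdfs:label ?label .
        FILTER(CONTAINS(LCASE(STR(?label)), "insulin"))
    } LIMIT 10
""")
endpoint.setReturnFormat(JSON)

results = endpoint.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["s"]["value"], "-", binding["label"]["value"])
```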
Finney, Andrew; Healey, Emma; Jordan, Joanne L; Ryan, Sarah; Dziedzic, Krysia S
2016-07-08
The National Institute for Health and Care Excellence's Osteoarthritis (OA) guidelines recommended that future research should consider the benefits of combination therapies in people with OA across multiple joint sites. However, the clinical effectiveness of such approaches to OA management is unknown. This systematic review therefore aimed to identify the clinical and cost effectiveness of multidisciplinary approaches targeting multiple joint sites for OA in primary care. A systematic review of randomised controlled trials was conducted. Computerised bibliographic databases were searched (MEDLINE, EMBASE, CINAHL, PsycINFO, BNI, HBE, HMIC, AMED, Web of Science and Cochrane). Studies were included if they met the following criteria: a randomised controlled trial (RCT), a primary care population with OA across at least two different peripheral joint sites (multiple joint sites), and interventions undertaken by at least two different health disciplines (multidisciplinary). The Cochrane 'Risk of Bias' tool and PEDro were used for quality assessment of eligible studies. Clinical and cost effectiveness was determined by extracting and examining self-reported outcomes for pain, function, quality of life (QoL) and health care utilisation. The date range for the search was from database inception until August 2015. The search identified 1148 individual titles, of which four were included in the review. A narrative review was conducted due to the heterogeneity of the included trials. Each of the four trials used either educational or exercise interventions facilitated by a range of different health disciplines. Moderate clinical benefits on pain, function and QoL were reported across the studies. The beneficial effects of exercise generally decreased over time within all studies. Two studies were able to show a reduction in healthcare utilisation due to a reduction in visits to a physiotherapist or a reduction in x-rays and orthopaedic referrals. The approach that showed the most promise used educational interventions delivered by GPs with reinforcement by practice nurses. There are currently very few studies that evaluate multidisciplinary approaches for OA across multiple joint sites in primary care. A more consistent approach to outcome measurement in future studies of this nature should be considered to allow for better comparison.
Database Resources of the BIG Data Center in 2018.
2018-01-04
The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides freely open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters.
Wang, Chunlin; Lefkowitz, Elliot J
2004-10-28
Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. We describe the implementation of SS-Wrapper (Similarity Search Wrapper), a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search) approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST) that provides a complementary solution for BLAST searches when the database is too large to fit into the memory of a single node. Used together, QS-search and DS-BLAST provide a flexible solution to adapt sequential similarity searching applications in high performance computing environments. Their ease of use and their ability to wrap a variety of database search programs provide an analytical architecture to assist both the seasoned bioinformaticist and the wet-bench biologist.
SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters
Wang, Chunlin; Lefkowitz, Elliot J
2004-01-01
Background Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. Results We describe the implementation of SS-Wrapper (Similarity Search Wrapper), a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search) approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST) that provides a complementary solution for BLAST searches when the database is too large to fit into the memory of a single node. Conclusions Used together, QS-search and DS-BLAST provide a flexible solution to adapt sequential similarity searching applications in high performance computing environments. Their ease of use and their ability to wrap a variety of database search programs provide an analytical architecture to assist both the seasoned bioinformaticist and the wet-bench biologist. PMID:15511296
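The QS-search approach described above splits the query set across cluster nodes and runs the unmodified search program on each piece. As a rough illustration of that query-segmentation idea on a single multicore machine (not the SS-Wrapper code itself), the sketch below splits a FASTA query file into chunks and runs one BLAST process per chunk in parallel; it assumes NCBI BLAST+ `blastp` is on the PATH and that a formatted protein database named `mydb` exists.

```python
# Sketch of query segmentation (QS-search-style) parallel BLAST: split the
# query FASTA into one chunk per worker and run blastp on each chunk
# concurrently, then concatenate the tabular outputs.
import subprocess
from concurrent.futures import ProcessPoolExecutor

def split_fasta(path, n_chunks):
    """Read a FASTA file and distribute records round-robin into n_chunks files."""
    records, current = [], []
    with open(path) as fh:
        for line in fh:
            if line.startswith(">") and current:
                records.append("".join(current))
                current = []
            current.append(line)
        if current:
            records.append("".join(current))
    chunk_files = []
    for i in range(n_chunks):
        chunk_path = f"{path}.chunk{i}"
        with open(chunk_path, "w") as out:
            out.writelines(records[i::n_chunks])
        chunk_files.append(chunk_path)
    return chunk_files

def run_blastp(chunk):
    out = chunk + ".tab"
    subprocess.run(
        ["blastp", "-query", chunk, "-db", "mydb", "-outfmt", "6", "-out", out],
        check=True,
    )
    return out

if __name__ == "__main__":
    chunks = split_fasta("queries.fasta", n_chunks=4)
    with ProcessPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(run_blastp, chunks))
    with open("all_hits.tab", "w") as merged:
        for r in results:
            merged.write(open(r).read())
```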
Linder, Suzanne K; Kamath, Geetanjali R; Pratt, Gregory F; Saraykar, Smita S; Volk, Robert J
2015-04-01
To compare the effectiveness of two search methods in identifying studies that used the Control Preferences Scale (CPS), a health care decision-making instrument commonly used in clinical settings. We searched the literature using two methods: (1) keyword searching using variations of "Control Preferences Scale" and (2) cited reference searching using two seminal CPS publications. We searched three bibliographic databases [PubMed, Scopus, and Web of Science (WOS)] and one full-text database (Google Scholar). We report precision and sensitivity as measures of effectiveness. Keyword searches in bibliographic databases yielded high average precision (90%) but low average sensitivity (16%). PubMed was the most precise, followed closely by Scopus and WOS. The Google Scholar keyword search had low precision (54%) but provided the highest sensitivity (70%). Cited reference searches in all databases yielded moderate sensitivity (45-54%), but precision ranged from 35% to 75% with Scopus being the most precise. Cited reference searches were more sensitive than keyword searches, making it a more comprehensive strategy to identify all studies that use a particular instrument. Keyword searches provide a quick way of finding some but not all relevant articles. Goals, time, and resources should dictate the combination of which methods and databases are used. Copyright © 2015 Elsevier Inc. All rights reserved.
Linder, Suzanne K.; Kamath, Geetanjali R.; Pratt, Gregory F.; Saraykar, Smita S.; Volk, Robert J.
2015-01-01
Objective To compare the effectiveness of two search methods in identifying studies that used the Control Preferences Scale (CPS), a healthcare decision-making instrument commonly used in clinical settings. Study Design & Setting We searched the literature using two methods: 1) keyword searching using variations of “control preferences scale” and 2) cited reference searching using two seminal CPS publications. We searched three bibliographic databases [PubMed, Scopus, Web of Science (WOS)] and one full-text database (Google Scholar). We report precision and sensitivity as measures of effectiveness. Results Keyword searches in bibliographic databases yielded high average precision (90%), but low average sensitivity (16%). PubMed was the most precise, followed closely by Scopus and WOS. The Google Scholar keyword search had low precision (54%) but provided the highest sensitivity (70%). Cited reference searches in all databases yielded moderate sensitivity (45–54%), but precision ranged from 35–75% with Scopus being the most precise. Conclusion Cited reference searches were more sensitive than keyword searches, making it a more comprehensive strategy to identify all studies that use a particular instrument. Keyword searches provide a quick way of finding some but not all relevant articles. Goals, time and resources should dictate the combination of which methods and databases are used. PMID:25554521
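Precision and sensitivity, the two effectiveness measures reported above, can be computed directly from a retrieved set and a gold-standard set of relevant records. The sketch below uses made-up record identifiers; the example numbers are chosen to reproduce the 90% precision and 16% sensitivity figures quoted for the keyword searches.

```python
# Precision = relevant retrieved / total retrieved;
# sensitivity (recall) = relevant retrieved / all relevant records known to exist.
def precision_and_sensitivity(retrieved, relevant):
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    sensitivity = hits / len(relevant) if relevant else 0.0
    return precision, sensitivity

# Illustrative example: a keyword search returns 10 records, 9 of them relevant,
# out of 56 studies known to use the instrument.
retrieved = [f"pmid{i}" for i in range(10)]
relevant = [f"pmid{i}" for i in range(1, 57)]
p, s = precision_and_sensitivity(retrieved, relevant)
print(f"precision={p:.0%} sensitivity={s:.0%}")
```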
Evaluation of Proteomic Search Engines for the Analysis of Histone Modifications
2015-01-01
Identification of histone post-translational modifications (PTMs) is challenging for proteomics search engines. Including many histone PTMs in one search increases the number of candidate peptides dramatically, leading to low search speed and fewer identified spectra. To evaluate database search engines on identifying histone PTMs, we present a method in which one kind of modification is searched each time, for example, unmodified, individually modified, and multimodified, each search result is filtered with false discovery rate less than 1%, and the identifications of multiple search engines are combined to obtain confident results. We apply this method for eight search engines on histone data sets. We find that two search engines, pFind and Mascot, identify most of the confident results at a reasonable speed, so we recommend using them to identify histone modifications. During the evaluation, we also find some important aspects for the analysis of histone modifications. Our evaluation of different search engines on identifying histone modifications will hopefully help those who are hoping to enter the histone proteomics field. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the data set identifier PXD001118. PMID:25167464
Evaluation of proteomic search engines for the analysis of histone modifications.
Yuan, Zuo-Fei; Lin, Shu; Molden, Rosalynn C; Garcia, Benjamin A
2014-10-03
Identification of histone post-translational modifications (PTMs) is challenging for proteomics search engines. Including many histone PTMs in one search increases the number of candidate peptides dramatically, leading to low search speed and fewer identified spectra. To evaluate database search engines on identifying histone PTMs, we present a method in which one kind of modification is searched each time, for example, unmodified, individually modified, and multimodified, each search result is filtered with false discovery rate less than 1%, and the identifications of multiple search engines are combined to obtain confident results. We apply this method for eight search engines on histone data sets. We find that two search engines, pFind and Mascot, identify most of the confident results at a reasonable speed, so we recommend using them to identify histone modifications. During the evaluation, we also find some important aspects for the analysis of histone modifications. Our evaluation of different search engines on identifying histone modifications will hopefully help those who are hoping to enter the histone proteomics field. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the data set identifier PXD001118.
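The workflow above filters each per-modification search result at a 1% false discovery rate and then combines confident identifications from several engines. The sketch below illustrates that pattern with a simple target-decoy FDR estimate (decoys/targets); the PSM record format and the combination by union are simplifying assumptions, not the paper's exact pipeline.

```python
# Sketch: filter one search result at 1% FDR using the target-decoy approach,
# then pool confident spectra across engines.
def filter_at_fdr(psms, max_fdr=0.01):
    """psms: list of (spectrum_id, score, is_decoy); higher score = better."""
    psms = sorted(psms, key=lambda p: p[1], reverse=True)
    accepted, targets, decoys = [], 0, 0
    for spectrum_id, score, is_decoy in psms:
        decoys += is_decoy
        targets += not is_decoy
        fdr = decoys / max(targets, 1)
        if fdr > max_fdr:
            break  # simplification: stop at the first threshold crossing
        if not is_decoy:
            accepted.append(spectrum_id)
    return set(accepted)

def combine_engines(results_per_engine, max_fdr=0.01):
    """Union of spectra passing the FDR threshold in any engine."""
    confident = [filter_at_fdr(r, max_fdr) for r in results_per_engine]
    return set().union(*confident)
```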
VIEWCACHE: An incremental pointer-based access method for autonomous interoperable databases
NASA Technical Reports Server (NTRS)
Roussopoulos, N.; Sellis, Timos
1992-01-01
One of the biggest problems facing NASA today is to provide scientists efficient access to a large number of distributed databases. Our pointer-based incremental database access method, VIEWCACHE, provides such an interface for accessing distributed data sets and directories. VIEWCACHE allows database browsing and search, performing inter-database cross-referencing with no actual data movement between database sites. This organization and processing is especially suitable for managing Astrophysics databases which are physically distributed all over the world. Once the search is complete, the set of collected pointers pointing to the desired data is cached. VIEWCACHE includes spatial access methods for accessing image data sets, which provide much easier query formulation by referring directly to the image and very efficient search for objects contained within a two-dimensional window. We will develop and optimize a VIEWCACHE External Gateway Access to database management systems to facilitate distributed database search.
Is electroconvulsive therapy during pregnancy safe?
Jiménez-Cornejo, Magdalena; Zamorano-Levi, Natalia; Jeria, Álvaro
2016-12-07
Therapeutic options for psychiatric conditions are limited during pregnancy because many drugs are restricted or contraindicated. Electroconvulsive therapy constitutes an alternative, however there is controversy over its safety. Using the Epistemonikos database, which is maintained by searching multiple databases, we found five systematic reviews, including 81 studies overall describing case series or individual cases. Data were extracted from the identified reviews and summary tables of the results were prepared using the GRADE method. We concluded it is not clear what are the risks associated with electroconvulsive therapy during pregnancy because the certainty of the existing evidence is very low. Likewise, existing systematic reviews and international clinical guidelines differ in their conclusions and recommendations.
ClassLess: A Comprehensive Database of Young Stellar Objects
NASA Astrophysics Data System (ADS)
Hillenbrand, Lynne; Baliber, Nairn
2015-01-01
We have designed and constructed a database housing published measurements of Young Stellar Objects (YSOs) within ~1 kpc of the Sun. ClassLess, so called because it includes YSOs in all stages of evolution, is a relational database in which user interaction is conducted via HTML web browsers, queries are performed in scientific language, and all data are linked to the sources of publication. Each star is associated with a cluster (or clusters), and both spatially resolved and unresolved measurements are stored, allowing proper use of data from multiple star systems. With this fully searchable tool, myriad ground- and space-based instruments and surveys across wavelength regimes can be exploited. In addition to primary measurements, the database self consistently calculates and serves higher level data products such as extinction, luminosity, and mass. As a result, searches for young stars with specific physical characteristics can be completed with just a few mouse clicks.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Starr, D. L.; Wozniak, P. R.; Vestrand, W. T.
2002-01-01
SkyDOT (Sky Database for Objects in Time-Domain) is a Virtual Observatory currently comprised of data from the RAPTOR, ROTSE I, and OGLE II survey projects. This makes it a very large time domain database. In addition, the RAPTOR project provides SkyDOT with real-time variability data as well as stereoscopic information. With its web interface, we believe SkyDOT will be a very useful tool for both astronomers and the public. Our main task has been to construct an efficient relational database containing all existing data, while handling a real-time inflow of data. We also provide a useful web interface allowing easy access to both astronomers and the public. Initially, this server will allow common searches, specific queries, and access to light curves. In the future we will include machine learning classification tools and access to spectral information.
Evaluation of Federated Searching Options for the School Library
ERIC Educational Resources Information Center
Abercrombie, Sarah E.
2008-01-01
Three hosted federated search tools, Follett One Search, Gale PowerSearch Plus, and WebFeat Express, were configured and implemented in a school library. Databases from five vendors and the OPAC were systematically searched. Federated search results were compared with each other and to the results of the same searches in the database's native…
Protocol for developing a Database of Zoonotic disease Research in India (DoZooRI).
Chatterjee, Pranab; Bhaumik, Soumyadeep; Chauhan, Abhimanyu Singh; Kakkar, Manish
2017-12-10
Zoonotic and emerging infectious diseases (EIDs) represent a public health threat that has been acknowledged only recently although they have been on the rise for the past several decades. On an average, every year since the Second World War, one pathogen has emerged or re-emerged on a global scale. Low/middle-income countries such as India bear a significant burden of zoonotic and EIDs. We propose that the creation of a database of published, peer-reviewed research will open up avenues for evidence-based policymaking for targeted prevention and control of zoonoses. A large-scale systematic mapping of the published peer-reviewed research conducted in India will be undertaken. All published research will be included in the database, without any prejudice for quality screening, to broaden the scope of included studies. Structured search strategies will be developed for priority zoonotic diseases (leptospirosis, rabies, anthrax, brucellosis, cysticercosis, salmonellosis, bovine tuberculosis, Japanese encephalitis and rickettsial infections), and multiple databases will be searched for studies conducted in India. The database will be managed and hosted on a cloud-based platform called Rayyan. Individual studies will be tagged based on key preidentified parameters (disease, study design, study type, location, randomisation status and interventions, host involvement and others, as applicable). The database will incorporate already published studies, obviating the need for additional ethical clearances. The database will be made available online, and in collaboration with multisectoral teams, domains of enquiries will be identified and subsequent research questions will be raised. The database will be queried for these and resulting evidence will be analysed and published in peer-reviewed journals. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Online Patent Searching: The Realities.
ERIC Educational Resources Information Center
Kaback, Stuart M.
1983-01-01
Considers patent subject searching capabilities of major online databases, noting patent claims, "deep-indexed" files, test searches, retrieval of related references, multi-database searching, improvements needed in indexing of chemical structures, full text searching, improvements needed in handling numerical data, and augmenting a…
Lynx web services for annotations and systems analysis of multi-gene disorders.
Sulakhe, Dinanath; Taylor, Andrew; Balasubramanian, Sandhya; Feng, Bo; Xie, Bingqing; Börnigen, Daniela; Dave, Utpal J; Foster, Ian T; Gilliam, T Conrad; Maltsev, Natalia
2014-07-01
Lynx is a web-based integrated systems biology platform that supports annotation and analysis of experimental data and generation of weighted hypotheses on molecular mechanisms contributing to human phenotypes and disorders of interest. Lynx has integrated multiple classes of biomedical data (genomic, proteomic, pathways, phenotypic, toxicogenomic, contextual and others) from various public databases as well as manually curated data from our group and collaborators (LynxKB). Lynx provides tools for gene list enrichment analysis using multiple functional annotations and network-based gene prioritization. Lynx provides access to the integrated database and the analytical tools via REST based Web Services (http://lynx.ci.uchicago.edu/webservices.html). This comprises data retrieval services for specific functional annotations, services to search across the complete LynxKB (powered by Lucene), and services to access the analytical tools built within the Lynx platform. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
The Degradome database: mammalian proteases and diseases of proteolysis.
Quesada, Víctor; Ordóñez, Gonzalo R; Sánchez, Luis M; Puente, Xose S; López-Otín, Carlos
2009-01-01
The degradome is defined as the complete set of proteases present in an organism. The recent availability of whole genomic sequences from multiple organisms has led us to predict the contents of the degradomes of several mammalian species. To ensure the fidelity of these predictions, our methods have included manual curation of individual sequences and, when necessary, direct cloning and sequencing experiments. The results of these studies in human, chimpanzee, mouse and rat have been incorporated into the Degradome database, which can be accessed through a web interface at http://degradome.uniovi.es. The annotations about each individual protease can be retrieved by browsing catalytic classes and families or by searching specific terms. This web site also provides detailed information about genetic diseases of proteolysis, a growing field of great importance for multiple users. Finally, the user can find additional information about protease structures, protease inhibitors, ancillary domains of proteases and differences between mammalian degradomes.
The Degradome database: mammalian proteases and diseases of proteolysis
Quesada, Víctor; Ordóñez, Gonzalo R.; Sánchez, Luis M.; Puente, Xose S.; López-Otín, Carlos
2009-01-01
The degradome is defined as the complete set of proteases present in an organism. The recent availability of whole genomic sequences from multiple organisms has led us to predict the contents of the degradomes of several mammalian species. To ensure the fidelity of these predictions, our methods have included manual curation of individual sequences and, when necessary, direct cloning and sequencing experiments. The results of these studies in human, chimpanzee, mouse and rat have been incorporated into the Degradome database, which can be accessed through a web interface at http://degradome.uniovi.es. The annotations about each individual protease can be retrieved by browsing catalytic classes and families or by searching specific terms. This web site also provides detailed information about genetic diseases of proteolysis, a growing field of great importance for multiple users. Finally, the user can find additional information about protease structures, protease inhibitors, ancillary domains of proteases and differences between mammalian degradomes. PMID:18776217
The Chandra Source Catalog: Storage and Interfaces
NASA Astrophysics Data System (ADS)
van Stone, David; Harbo, Peter N.; Tibbetts, Michael S.; Zografou, Panagoula; Evans, Ian N.; Primini, Francis A.; Glotfelty, Kenny J.; Anderson, Craig S.; Bonaventura, Nina R.; Chen, Judy C.; Davis, John E.; Doe, Stephen M.; Evans, Janet D.; Fabbiano, Giuseppina; Galle, Elizabeth C.; Gibbs, Danny G., II; Grier, John D.; Hain, Roger; Hall, Diane M.; He, Xiang Qun (Helen); Houck, John C.; Karovska, Margarita; Kashyap, Vinay L.; Lauer, Jennifer; McCollough, Michael L.; McDowell, Jonathan C.; Miller, Joseph B.; Mitschang, Arik W.; Morgan, Douglas L.; Mossman, Amy E.; Nichols, Joy S.; Nowak, Michael A.; Plummer, David A.; Refsdal, Brian L.; Rots, Arnold H.; Siemiginowska, Aneta L.; Sundheim, Beth A.; Winkelman, Sherry L.
2009-09-01
The Chandra Source Catalog (CSC) is part of the Chandra Data Archive (CDA) at the Chandra X-ray Center. The catalog contains source properties and associated data objects such as images, spectra, and lightcurves. The source properties are stored in relational databases and the data objects are stored in files with their metadata stored in databases. The CDA supports different versions of the catalog: multiple fixed release versions and a live database version. There are several interfaces to the catalog: CSCview, a graphical interface for building and submitting queries and for retrieving data objects; a command-line interface for property and source searches using ADQL; and VO-compliant services discoverable though the VO registry. This poster describes the structure of the catalog and provides an overview of the interfaces.
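The record above mentions ADQL queries and VO-compliant services as programmatic routes into the catalog. As a generic, hedged illustration of that access pattern (not the actual CSC schema or service URL), the sketch below issues an ADQL cone query against a TAP service using the pyvo library; the URL, table, and column names are placeholders.

```python
# Sketch of an ADQL cone query against a VO-compliant TAP service with pyvo.
# The service URL, table name, and column names are hypothetical and do not
# reflect the actual Chandra Source Catalog schema.
import pyvo

TAP_URL = "https://example.org/tap"  # hypothetical TAP endpoint

adql = """
SELECT name, ra, dec, flux
FROM catalog.sources
WHERE 1 = CONTAINS(POINT('ICRS', ra, dec),
                   CIRCLE('ICRS', 187.7, 12.4, 0.1))
"""

service = pyvo.dal.TAPService(TAP_URL)
table = service.search(adql).to_table()
print(table)
```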
ERIC Educational Resources Information Center
Kern, Joanne F.
The lack of opportunity for high school sophomores to learn database searching was addressed by the implementation of a computerized magazine article search program. "Reader's Guide to Periodical Literature" on CD-ROM was used to train students in database searching during the time they were assigned to the library to do research papers…
Laptop Computer - Based Facial Recognition System Assessment
DOE Office of Scientific and Technical Information (OSTI.GOV)
R. A. Cain; G. B. Singleton
2001-03-01
The objective of this project was to assess the performance of the leading commercial-off-the-shelf (COTS) facial recognition software package when used as a laptop application. We performed the assessment to determine the system's usefulness for enrolling facial images in a database from remote locations and conducting real-time searches against a database of previously enrolled images. The assessment involved creating a database of 40 images and conducting 2 series of tests to determine the product's ability to recognize and match subject faces under varying conditions. This report describes the test results and includes a description of the factors affecting the results. After an extensive market survey, we selected Visionics' FaceIt® software package for evaluation and a review of the Facial Recognition Vendor Test 2000 (FRVT 2000). This test was co-sponsored by the US Department of Defense (DOD) Counterdrug Technology Development Program Office, the National Institute of Justice, and the Defense Advanced Research Projects Agency (DARPA). Administered in May-June 2000, the FRVT 2000 assessed the capabilities of facial recognition systems that were currently available for purchase on the US market. Our selection of this Visionics product does not indicate that it is the "best" facial recognition software package for all uses. It was the most appropriate package based on the specific applications and requirements for this specific application. In this assessment, the system configuration was evaluated for effectiveness in identifying individuals by searching for facial images captured from video displays against those stored in a facial image database. An additional criterion was that the system be capable of operating discretely. For this application, an operational facial recognition system would consist of one central computer hosting the master image database with multiple standalone systems configured with duplicates of the master operating in remote locations. Remote users could perform real-time searches where network connectivity is not available. As images are enrolled at the remote locations, periodic database synchronization is necessary.
Bishop, Malachy; Pionke, J.J.; Strauser, David; Santens, Ryan L.
2017-01-01
Background: Individuals with multiple sclerosis (MS) face a range of barriers to accessing and using health-care services. The aim of this review was to identify specific barriers to accessing and using health-care services based on a continuum of the health-care delivery system. Methods: Literature searches were conducted in the PubMed, PsycINFO, CINAHL, and Web of Science databases. The following terms were searched as subject headings, key words, or abstracts: health care, access, barriers, physical disability, and multiple sclerosis. The literature search produced 361 potentially relevant citations. After screening titles, abstracts, and citations, eight citations were selected for full-text review. Results: Health-care barriers were divided into three continuous phases of receiving health care. In the before-visit phase, the most commonly identified barrier was transportation. In the during-visit phase, communication quality was the major concern. In the after-visit phase, discontinued referral was the major barrier encountered. Conclusions: There are multiple interrelated barriers to accessing and using health-care services along the health-care delivery continuum for people with MS and its associated physical disabilities, ranging from complex and long-recognized barriers that will likely require extended advocacy to create policy changes to issues that can and should be addressed through relatively minor changes in health-care delivery practices, improved care coordination, and increased provider awareness, education, and responsiveness to patients' needs. PMID:29270089
Application of kernel functions for accurate similarity search in large chemical databases.
Wang, Xiaohong; Huan, Jun; Smalter, Aaron; Lushington, Gerald H
2010-04-29
Similarity search in chemical structure databases is an important problem with many applications in chemical genomics, drug design, and efficient chemical probe screening, among others. It is widely believed that structure-based methods provide an efficient way to perform such queries. Recently, various graph kernel functions have been designed to capture the intrinsic similarity of graphs. Though successful in constructing accurate predictive and classification models, graph kernel functions cannot be applied to large chemical compound databases due to the high computational complexity and the difficulties in indexing similarity search for large databases. To bridge graph kernel functions and similarity search in chemical databases, we applied a novel kernel-based similarity measurement, developed by our team, to measure the similarity of graph-represented chemicals. In our method, we utilize a hash table to support the new graph kernel function definition, efficient storage and fast search. We have applied our method, named G-hash, to large chemical databases. Our results show that the G-hash method achieves state-of-the-art performance for k-nearest neighbor (k-NN) classification. Moreover, the similarity measurement and the index structure are scalable to large chemical databases, with smaller index size and faster query processing time compared to state-of-the-art indexing methods such as Daylight fingerprints, C-tree and GraphGrep. Efficient similarity query processing for large chemical databases is challenging, since we need to balance running time efficiency and similarity search accuracy. Our previous similarity search method, G-hash, provides a new way to perform similarity search in chemical databases. Experimental study validates the utility of G-hash in chemical databases.
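G-hash itself hashes graph-kernel features, which is beyond the scope of a short sketch. As a generic illustration of the underlying idea of pairing a hash-table index with chemical similarity search (explicitly not the G-hash algorithm), the code below keeps an inverted index over fingerprint bits so that only compounds sharing at least one bit with the query are scored, and then ranks those survivors by Tanimoto similarity; the compounds and bit sets are toy data.

```python
# Generic illustration (not G-hash): an inverted index (hash table) over
# fingerprint bits restricts the Tanimoto scoring to compounds that share at
# least one bit with the query, then ranks the survivors for k-NN retrieval.
from collections import defaultdict
import heapq

class FingerprintIndex:
    def __init__(self):
        self.postings = defaultdict(set)   # bit -> set of compound ids
        self.fingerprints = {}             # compound id -> frozenset of on-bits

    def add(self, cid, on_bits):
        fp = frozenset(on_bits)
        self.fingerprints[cid] = fp
        for bit in fp:
            self.postings[bit].add(cid)

    def knn(self, query_bits, k=5):
        query = frozenset(query_bits)
        candidates = set().union(*(self.postings[b] for b in query if b in self.postings))
        scored = []
        for cid in candidates:
            fp = self.fingerprints[cid]
            inter = len(query & fp)
            tanimoto = inter / (len(query) + len(fp) - inter)
            scored.append((tanimoto, cid))
        return heapq.nlargest(k, scored)

index = FingerprintIndex()
index.add("aspirin", {3, 17, 42, 99})     # toy fingerprints
index.add("ibuprofen", {3, 17, 55})
print(index.knn({3, 17, 42}, k=2))
```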
Algorithms for database-dependent search of MS/MS data.
Matthiesen, Rune
2013-01-01
The frequently used bottom-up strategy for identification of proteins and their associated modifications nowadays typically generates thousands of MS/MS spectra that are normally matched automatically against a protein sequence database. Search engines that take as input MS/MS spectra and a protein sequence database are referred to as database-dependent search engines. Many programs, both commercial and freely available, exist for database-dependent search of MS/MS spectra, and most of the programs have excellent user documentation. The aim here is therefore to outline the algorithmic strategy behind different search engines rather than providing software user manuals. The process of database-dependent search can be divided into search strategy, peptide scoring, protein scoring, and finally protein inference. Most efforts in the literature have been put into comparing results from different software rather than discussing the underlying algorithms. Such practical comparisons can be cluttered by suboptimal implementations, and the observed differences are frequently caused by software parameter settings that have not been set properly to allow an even comparison. In other words, an algorithmic idea can still be worth considering even if the software implementation has been demonstrated to be suboptimal. The aim in this chapter is therefore to split the algorithms for database-dependent searching of MS/MS data into the above steps so that the different algorithmic ideas become more transparent and comparable. Most search engines provide good implementations of the first three data analysis steps mentioned above, whereas the final step of protein inference is much less developed for most search engines and is in many cases performed by external software. The final part of this chapter illustrates how protein inference is built into the VEMS search engine and discusses a stand-alone program, SIR, for protein inference that can import a Mascot search result.
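The "search strategy" step mentioned above typically narrows the database to candidate peptides whose calculated mass falls within a tolerance of the measured precursor mass, before any spectrum scoring happens. The sketch below shows that candidate-selection step in simplified form; the residue-mass table is abbreviated and approximate, and the example peptides are made up.

```python
# Simplified sketch of candidate selection in database-dependent search:
# keep only peptides whose monoisotopic mass lies within a ppm tolerance of
# the observed precursor mass. The residue-mass table is abbreviated and
# approximate, for illustration only.
RESIDUE_MASS = {          # monoisotopic residue masses (Da), approximate
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "K": 128.09496, "R": 156.10111,
}
WATER = 18.01056

def peptide_mass(seq):
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER

def candidates(peptides, precursor_mass, tol_ppm=10.0):
    tol = precursor_mass * tol_ppm / 1e6
    return [p for p in peptides if abs(peptide_mass(p) - precursor_mass) <= tol]

print(candidates(["GASP", "VLKR", "GGKR"], precursor_mass=514.36, tol_ppm=20))
```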
CoSMoS: Conserved Sequence Motif Search in the proteome
Liu, Xiao I; Korde, Neeraj; Jakob, Ursula; Leichert, Lars I
2006-01-01
Background With the ever-increasing number of gene sequences in the public databases, generating and analyzing multiple sequence alignments becomes increasingly time consuming. Nevertheless it is a task performed on a regular basis by researchers in many labs. Results We have now created a database called CoSMoS to find the occurrences and at the same time evaluate the significance of sequence motifs and amino acids encoded in the whole genome of the model organism Escherichia coli K12. We provide a precomputed set of multiple sequence alignments for each individual E. coli protein with all of its homologues in the RefSeq database. The alignments themselves, information about the occurrence of sequence motifs together with information on the conservation of each of the more than 1.3 million amino acids encoded in the E. coli genome can be accessed via the web interface of CoSMoS. Conclusion CoSMoS is a valuable tool to identify highly conserved sequence motifs, to find regions suitable for mutational studies in functional analyses and to predict important structural features in E. coli proteins. PMID:16433915
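A database of precomputed alignments of the kind described above ultimately serves per-residue conservation values. The sketch below shows one simple way such a per-column score can be derived from an alignment, as the fraction of homologues matching the reference sequence at each column; the toy alignment is illustrative only and is not taken from CoSMoS.

```python
# Sketch: per-residue conservation from a multiple sequence alignment, taken
# as the fraction of homologues matching the reference (first) sequence at
# each alignment column. Columns where the reference has a gap are skipped.
def conservation(alignment):
    reference, *homologs = alignment
    scores = []
    for col, ref_aa in enumerate(reference):
        if ref_aa == "-":
            continue
        same = sum(1 for seq in homologs if seq[col] == ref_aa)
        scores.append((ref_aa, same / len(homologs)))
    return scores

toy_alignment = [
    "MKT-LCA",   # reference (e.g. an E. coli protein)
    "MKTALCA",
    "MRT-LSA",
    "MKT-LCG",
]
for aa, score in conservation(toy_alignment):
    print(f"{aa}\t{score:.2f}")
```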
Mackey, Aaron J; Pearson, William R
2004-10-01
Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.
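The unit above uses a relational protein database to manage sequences and to derive subset libraries that are more likely to contain homologs. The sketch below shows that general idea with SQLite in place of a full database server; the schema and rows are hypothetical and are not the seqdb_demo schema described in the unit.

```python
# Minimal sketch of using a relational database to manage a sequence library
# and export a taxon-restricted subset library for similarity searching.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE protein (
        acc      TEXT PRIMARY KEY,
        taxon    TEXT,
        length   INTEGER,
        sequence TEXT
    )
""")
con.executemany(
    "INSERT INTO protein VALUES (?, ?, ?, ?)",
    [
        ("P00001", "Homo sapiens", 4, "MVLS"),       # toy records
        ("P00002", "Mus musculus", 4, "MVLT"),
        ("P00003", "Escherichia coli", 4, "MKTA"),
    ],
)

# Export a subset library (e.g. mammalian sequences only) as FASTA.
subset = con.execute(
    "SELECT acc, sequence FROM protein WHERE taxon IN (?, ?)",
    ("Homo sapiens", "Mus musculus"),
)
with open("mammals.fasta", "w") as out:
    for acc, seq in subset:
        out.write(f">{acc}\n{seq}\n")
```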
DNA profiles, computer searches, and the Fourth Amendment.
Kimel, Catherine W
2013-01-01
Pursuant to federal statutes and to laws in all fifty states, the United States government has assembled a database containing the DNA profiles of over eleven million citizens. Without judicial authorization, the government searches each of these profiles one-hundred thousand times every day, seeking to link database subjects to crimes they are not suspected of committing. Yet, courts and scholars that have addressed DNA databasing have focused their attention almost exclusively on the constitutionality of the government's seizure of the biological samples from which the profiles are generated. This Note fills a gap in the scholarship by examining the Fourth Amendment problems that arise when the government searches its vast DNA database. This Note argues that each attempt to match two DNA profiles constitutes a Fourth Amendment search because each attempted match infringes upon database subjects' expectations of privacy in their biological relationships and physical movements. The Note further argues that database searches are unreasonable as they are currently conducted, and it suggests an adaptation of computer-search procedures to remedy the constitutional deficiency.
NASA Technical Reports Server (NTRS)
Maluf, David A. (Inventor); Bell, David G. (Inventor); Gurram, Mohana M. (Inventor); Gawdiak, Yuri O. (Inventor)
2009-01-01
A system for managing a project that includes multiple tasks and a plurality of workers. Input information includes characterizations based upon a human model, a team model and a product model. Periodic reports, such as a monthly report, a task plan report, a budget report and a risk management report, are generated and made available for display or further analysis. An extensible database allows searching for information based upon context and upon content.
Chapter 51: How to Build a Simple Cone Search Service Using a Local Database
NASA Astrophysics Data System (ADS)
Kent, B. R.; Greene, G. R.
The cone search service protocol will be examined from the server side in this chapter. A simple cone search service will be setup and configured locally using MySQL. Data will be read into a table, and the Java JDBC will be used to connect to the database. Readers will understand the VO cone search specification and how to use it to query a database on their local systems and return an XML/VOTable file based on an input of RA/DEC coordinates and a search radius. The cone search in this example will be deployed as a Java servlet. The resulting cone search can be tested with a verification service. This basic setup can be used with other languages and relational databases.
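The chapter above implements the service with MySQL and a Java servlet; the core of any cone search, however, is the positional query itself. The sketch below shows that query in Python with SQLite for brevity: a cheap RA/Dec bounding-box prefilter in SQL followed by an exact angular-separation cut. The table and column names are assumptions, and RA wrap-around at 0/360 degrees is ignored for simplicity.

```python
# Sketch of the core cone-search query: a box prefilter on indexed RA/Dec
# columns, then an exact angular-separation test on the survivors.
import math
import sqlite3

def angular_sep_deg(ra1, dec1, ra2, dec2):
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    cos_sep = (math.sin(dec1) * math.sin(dec2)
               + math.cos(dec1) * math.cos(dec2) * math.cos(ra1 - ra2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_sep))))

def cone_search(con, ra, dec, radius):
    # Box prefilter, widened in RA by 1/cos(dec) to stay conservative.
    dra = radius / max(math.cos(math.radians(dec)), 1e-6)
    rows = con.execute(
        "SELECT name, ra, dec FROM sources "
        "WHERE ra BETWEEN ? AND ? AND dec BETWEEN ? AND ?",
        (ra - dra, ra + dra, dec - radius, dec + radius),
    )
    return [r for r in rows if angular_sep_deg(ra, dec, r[1], r[2]) <= radius]

if __name__ == "__main__":
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sources (name TEXT, ra REAL, dec REAL)")
    con.execute("INSERT INTO sources VALUES ('src1', 187.70, 12.39)")
    print(cone_search(con, ra=187.7, dec=12.4, radius=0.05))
```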
muBLASTP: database-indexed protein sequence search on multicore CPUs.
Zhang, Jing; Misra, Sanchit; Wang, Hao; Feng, Wu-Chun
2016-11-04
The Basic Local Alignment Search Tool (BLAST) is a fundamental program in the life sciences that searches databases for sequences that are most similar to a query sequence. Currently, the BLAST algorithm utilizes a query-indexed approach. Although many approaches suggest that sequence search with a database index can achieve much higher throughput (e.g., BLAT, SSAHA, and CAFE), they cannot deliver the same level of sensitivity as the query-indexed BLAST, i.e., NCBI BLAST, or they can only support nucleotide sequence search, e.g., MegaBLAST. Because query indexing and database indexing pose different challenges and have different characteristics, the existing techniques for query-indexed search cannot be applied directly to database-indexed search. muBLASTP, a novel database-indexed BLAST for protein sequence search, delivers hits identical to those returned by NCBI BLAST. On Intel Haswell multicore CPUs, for a single query, the single-threaded muBLASTP achieves up to a 4.41-fold speedup for the alignment stages, and up to a 1.75-fold end-to-end speedup over single-threaded NCBI BLAST. For a batch of queries, the multithreaded muBLASTP achieves up to a 5.7-fold speedup for the alignment stages, and up to a 4.56-fold end-to-end speedup over multithreaded NCBI BLAST. With a newly designed index structure for the protein database and associated optimizations of the BLASTP algorithm, we re-factored BLASTP for modern multicore processors to achieve much higher throughput with an acceptable memory footprint for the database index.
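The key distinction drawn above is between indexing the query (NCBI BLAST) and indexing the database (muBLASTP). The toy sketch below illustrates the database-indexing idea only, and is not the muBLASTP index: the database is scanned once to build an inverted index from k-mer seeds to (sequence, offset) pairs, so each query word becomes a hash lookup instead of a scan over every database sequence.

```python
# Toy illustration of database indexing (not muBLASTP): build a k-mer inverted
# index over the database once, then resolve query seeds by direct lookup.
from collections import defaultdict

def build_index(database, k=3):
    index = defaultdict(list)
    for seq_id, seq in database.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((seq_id, i))
    return index

def seed_hits(query, index, k=3):
    hits = []
    for q_off in range(len(query) - k + 1):
        for seq_id, s_off in index.get(query[q_off:q_off + k], ()):
            hits.append((seq_id, q_off, s_off))
    return hits

db = {"seqA": "MKVLAAGGT", "seqB": "GGTAKVLMM"}   # toy database
idx = build_index(db)
print(seed_hits("AKVL", idx))
```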
[Effects of nutritional status on the multiple sclerosis disease: systematic review].
Ródenas Esteve, Irene; Wanden-Berghe, Carmina; Sanz-Valero, Javier
2018-01-19
To review the available scientific literature about the effects of nutritional status on multiple sclerosis. A systematic review of the scientific literature in the Medline (PubMed), Scopus, Cochrane Library and Web of Science databases through November 2016. Search equation: ("Multiple Sclerosis"[Mesh] OR "Multiple Sclerosis"[Title/Abstract] OR "Disseminated Sclerosis"[Title/Abstract] OR "Multiple Sclerosis Acute Fulminating"[Title/Abstract]) AND ("Nutritional Status"[Mesh] OR "Nutritional Status"[Title/Abstract] OR "Nutrition Status"[Title/Abstract]). The quality of the selected articles was assessed using the STROBE questionnaire. The search was supplemented by expert consultation and additional review of the bibliographic references included in the selected papers. The concordance between authors (Kappa index) had to be higher than 80% for inclusion in this review. Of the 160 references recovered, after applying inclusion and exclusion criteria, 29 articles were selected for review. Concordance between evaluators was 100.00%. Most studies assessed vitamin D levels. Others focused their research on identifying which nutrient deficits might be related to the development of multiple sclerosis. Vitamin D may contribute to improvement in multiple sclerosis. Sunlight exposure and physical activity, together with nutritional status, appear to be important factors in the course of this disease. Further targeted studies are needed to clarify the relationship between nutritional status and multiple sclerosis.
What Is eHealth (4): A Scoping Exercise to Map the Field
Sloan, David; Gregor, Peter; Sullivan, Frank; Detmer, Don; Kahan, James P; Oortwijn, Wija; MacGillivray, Steve
2005-01-01
Background Lack of consensus on the meaning of eHealth has led to uncertainty among academics, policymakers, providers and consumers. This project was commissioned in light of the rising profile of eHealth on the international policy agenda and the emerging UK National Programme for Information Technology (now called Connecting for Health) and related developments in the UK National Health Service. Objectives To map the emergence and scope of eHealth as a topic and to identify its place within the wider health informatics field, as part of a larger review of research and expert analysis pertaining to current evidence, best practice and future trends. Methods Multiple databases of scientific abstracts were explored in a nonsystematic fashion to assess the presence of eHealth or conceptually related terms within their taxonomies, to identify journals in which articles explicitly referring to eHealth are contained and the topics covered, and to identify published definitions of the concept. The databases were Medline (PubMed), the Cumulative Index of Nursing and Allied Health Literature (CINAHL), the Science Citation Index (SCI), the Social Science Citation Index (SSCI), the Cochrane Database (including Dare, Central, NHS Economic Evaluation Database [NHS EED], Health Technology Assessment [HTA] database, NHS EED bibliographic) and ISTP (now known as ISI proceedings).We used the search query, “Ehealth OR e-health OR e*health”. The timeframe searched was 1997-2003, although some analyses contain data emerging subsequent to this period. This was supplemented by iterative searches of Web-based sources, such as commercial and policy reports, research commissioning programmes and electronic news pages. Definitions extracted from both searches were thematically analyzed and compared in order to assess conceptual heterogeneity. Results The term eHealth only came into use in the year 2000, but has since become widely prevalent. The scope of the topic was not immediately discernable from that of the wider health informatics field, for which over 320000 publications are listed in Medline alone, and it is not explicitly represented within the existing Medical Subject Headings (MeSH) taxonomy. Applying eHealth as narrative search term to multiple databases yielded 387 relevant articles, distributed across 154 different journals, most commonly related to information technology and telemedicine, but extending to such areas as law. Most eHealth articles are represented on Medline. Definitions of eHealth vary with respect to the functions, stakeholders, contexts and theoretical issues targeted. Most encompass a broad range of medical informatics applications either specified (eg, decision support, consumer health information) or presented in more general terms (eg, to manage, arrange or deliver health care). However the majority emphasize the communicative functions of eHealth and specify the use of networked digital technologies, primarily the Internet, thus differentiating eHealth from the field of medical informatics. While some definitions explicitly target health professionals or patients, most encompass applications for all stakeholder groups. The nature of the scientific and broader literature pertaining to eHealth closely reflects these conceptualizations. Conclusions We surmise that the field – as it stands today – may be characterized by the global definitions suggested by Eysenbach and Eng. PMID:15829481
dbCPG: A web resource for cancer predisposition genes
Wei, Ran; Yao, Yao; Yang, Wu; Zheng, Chun-Hou; Zhao, Min; Xia, Junfeng
2016-01-01
Cancer predisposition genes (CPGs) are genes in which inherited mutations confer highly or moderately increased risks of developing cancer. Identification of these genes and understanding the biological mechanisms that underlie them is crucial for the prevention, early diagnosis, and optimized management of cancer. Over the past decades, great efforts have been made to identify CPGs through multiple strategies. However, information on these CPGs and their molecular functions is scattered. To address this issue and provide a comprehensive resource for researchers, we developed the Cancer Predisposition Gene Database (dbCPG, Database URL: http://bioinfo.ahu.edu.cn:8080/dbCPG/index.jsp), the first literature-based gene resource for exploring human CPGs. It contains 827 human CPGs (724 protein-coding, 23 non-coding, and 80 genes of unknown type), 637 rat CPGs, and 658 mouse CPGs. Furthermore, data mining was performed to gain insights into the CPG data, including functional annotation, gene prioritization, network analysis of prioritized genes and overlap analysis across multiple cancer types. A user-friendly web interface with multiple browse, search, and upload functions was also developed to facilitate access to the latest information on CPGs. Taken together, the dbCPG database provides a comprehensive data resource for further studies of cancer predisposition genes. PMID:27192119
Alternative Databases for Anthropology Searching.
ERIC Educational Resources Information Center
Brody, Fern; Lambert, Maureen
1984-01-01
Examines online search results of sample questions in several databases covering linguistics, cultural anthropology, and physical anthropology in order to determine if and where any overlap in results might occur, and which files have greatest number of relevant hits. Search results by database are given for each subject area. (EJS)
Effects of therapy for dysphagia in Parkinson's disease: systematic review.
Baijens, Laura W J; Speyer, Renée
2009-03-01
This systematic review explores the effects of dysphagia treatment for Parkinson's disease. The review includes rehabilitative, surgical, pharmacologic, and other treatments. Only oropharyngeal dysphagia is selected for this literature search, excluding dysphagia due to esophageal or gastric disorders. The effects of deep brain stimulation on dysphagia are not included. In general, the literature concerning dysphagia treatment in Parkinson's disease is rather limited. Most effect studies show diverse methodologic problems. Multiple case studies and trials are identified by searching biomedical literature databases PubMed and Embase, and by hand-searching reference lists. The conclusions of most studies cannot be compared with one another because of heterogeneous therapy methods and outcome measures. Further research based on randomized controlled trials to determine the effectiveness of different therapies for dysphagia in Parkinson's disease is required.
Younger, Paula; Boddy, Kate
2009-06-01
The researchers involved in this study work at Exeter Health Library and at the Complementary Medicine Unit, Peninsula School of Medicine and Dentistry (PCMD). Within this collaborative environment it is possible to access the electronic resources of three institutions. This includes access to AMED and other databases using different interfaces. The aim of this study was to investigate whether searching different interfaces to the AMED allied health and complementary medicine database produced the same results when using identical search terms. The following Internet-based AMED interfaces were searched: DIALOG DataStar; EBSCOhost and OVID SP_UI01.00.02. Search results from all three databases were saved in an EndNote database to facilitate analysis. A checklist was also compiled comparing interface features. In our initial search, DIALOG returned 29 hits, OVID 14 and EBSCOhost 8. If we assume that DIALOG returned 100% of potential hits, OVID initially returned only 48% of hits and EBSCOhost only 28%. In our search, a researcher using the EBSCOhost interface to carry out a simple search on AMED would miss over 70% of possible search hits. Subsequent EBSCOhost searches on different subjects failed to find between 21% and 86% of the hits retrieved using the same keywords via DIALOG DataStar. In two cases, the simple EBSCOhost search failed to find any of the results found via DIALOG DataStar. Depending on the interface, the number of hits retrieved from the same database with the same simple search can vary dramatically. Some simple searches fail to retrieve a substantial percentage of citations. This may result in an uninformed literature review, research funding application or treatment intervention. In addition to ensuring that keywords, spelling and medical subject headings (MeSH) accurately reflect the nature of the search, database users should include wildcards and truncation and adapt their search strategy substantially to retrieve the maximum number of appropriate citations possible. Librarians should be aware of these differences when making purchasing decisions, carrying out literature searches and planning user education.
VIEWCACHE: An incremental pointer-based access method for autonomous interoperable databases
NASA Technical Reports Server (NTRS)
Roussopoulos, N.; Sellis, Timos
1993-01-01
One of the biggest problems facing NASA today is to provide scientists efficient access to a large number of distributed databases. Our pointer-based incremental data base access method, VIEWCACHE, provides such an interface for accessing distributed datasets and directories. VIEWCACHE allows database browsing and search performing inter-database cross-referencing with no actual data movement between database sites. This organization and processing is especially suitable for managing Astrophysics databases which are physically distributed all over the world. Once the search is complete, the set of collected pointers pointing to the desired data are cached. VIEWCACHE includes spatial access methods for accessing image datasets, which provide much easier query formulation by referring directly to the image and very efficient search for objects contained within a two-dimensional window. We will develop and optimize a VIEWCACHE External Gateway Access to database management systems to facilitate database search.
NCBI GEO: archive for functional genomics data sets--10 years on.
Barrett, Tanya; Troup, Dennis B; Wilhite, Stephen E; Ledoux, Pierre; Evangelista, Carlos; Kim, Irene F; Tomashevsky, Maxim; Marshall, Kimberly A; Phillippy, Katherine H; Sherman, Patti M; Muertter, Rolf N; Holko, Michelle; Ayanbule, Oluwabukunmi; Yefanov, Andrey; Soboleva, Alexandra
2011-01-01
A decade ago, the Gene Expression Omnibus (GEO) database was established at the National Center for Biotechnology Information (NCBI). The original objective of GEO was to serve as a public repository for high-throughput gene expression data generated mostly by microarray technology. However, the research community quickly applied microarrays to non-gene-expression studies, including examination of genome copy number variation and genome-wide profiling of DNA-binding proteins. Because the GEO database was designed with a flexible structure, it was possible to quickly adapt the repository to store these data types. More recently, as the microarray community switches to next-generation sequencing technologies, GEO has again adapted to host these data sets. Today, GEO stores over 20,000 microarray- and sequence-based functional genomics studies, and continues to handle the majority of direct high-throughput data submissions from the research community. Multiple mechanisms are provided to help users effectively search, browse, download and visualize the data at the level of individual genes or entire studies. This paper describes recent database enhancements, including new search and data representation tools, as well as a brief review of how the community uses GEO data. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/.
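GEO records can also be located programmatically through NCBI's E-utilities, which complements the browsing and search tools mentioned above. The sketch below uses Biopython's Entrez wrapper against the GEO DataSets ("gds") database; the search term and contact address are illustrative placeholders.

```python
# Sketch: searching GEO DataSets ("gds") through NCBI E-utilities via Biopython.
from Bio import Entrez

Entrez.email = "you@example.org"  # placeholder contact address, as NCBI requests

handle = Entrez.esearch(
    db="gds",
    term="breast cancer AND expression profiling by array",
    retmax=20,
)
record = Entrez.read(handle)
handle.close()

print(record["Count"], "matching GEO records")
for gds_id in record["IdList"]:
    summary = Entrez.read(Entrez.esummary(db="gds", id=gds_id))
    print(gds_id, summary[0].get("title", ""))
```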
Database Searching by Managers.
ERIC Educational Resources Information Center
Arnold, Stephen E.
Managers and executives need the easy and quick access to business and management information that online databases can provide, but many have difficulty articulating their search needs to an intermediary. One possible solution would be to encourage managers and their immediate support staff members to search textual databases directly as they now…
Constructing Effective Search Strategies for Electronic Searching.
ERIC Educational Resources Information Center
Flanagan, Lynn; Parente, Sharon Campbell
Electronic databases have grown tremendously in both number and popularity since their development during the 1960s. Access to electronic databases in academic libraries was originally offered primarily through mediated search services by trained librarians; however, the advent of CD-ROM and end-user interfaces for online databases has shifted the…
Subject searching of monographs online in the medical literature.
Brahmi, F A
1988-01-01
Searching by subject for monographic information online in the medical literature is a challenging task. The NLM database of choice is CATLINE. Other NLM databases of interest are BIOETHICSLINE, CANCERLIT, HEALTH, POPLINE, and TOXLINE. Ten BRS databases are also discussed. Of these, Books in Print, Bookinfo, and OCLC are explored further. The databases are compared with respect to the total number of records and the number and percentage of monographs. Three topics were searched on CROSS to compare hits on BBIP, BOOK, and OCLC. The same searches were run on CATLINE. The parameters of time coverage and language were equalized and the resulting citations were compared and analyzed for duplication and uniqueness. With the input of CATLINE tapes into OCLC, OCLC has become the database of choice for searching by subject for medical monographs.
Topological materials discovery using electron filling constraints
NASA Astrophysics Data System (ADS)
Chen, Ru; Po, Hoi Chun; Neaton, Jeffrey B.; Vishwanath, Ashvin
2018-01-01
Nodal semimetals are classes of topological materials that have nodal-point or nodal-line Fermi surfaces, which give them novel transport and topological properties. Although such materials are highly sought after, there are currently very few experimental realizations, and identifying new materials candidates has mainly relied on exhaustive database searches. Here we show how recent studies on the interplay between electron filling and nonsymmorphic space-group symmetries can guide the search for filling-enforced nodal semimetals. We recast the previously derived constraints on the allowed band-insulator fillings in any space group into a new form, which enables effective screening of materials candidates based solely on their space group, electron count in the formula unit, and multiplicity of the formula unit. This criterion greatly reduces the computational load for discovering topological materials in a database of previously synthesized compounds. As a demonstration, we focus on a few selected nonsymmorphic space groups which are predicted to host filling-enforced Dirac semimetals. Of the more than 30,000 entries listed, our filling criterion alone eliminates 96% of the entries before they are passed on for further analysis. We discover a handful of candidates from this guided search; among them, the monoclinic crystal Ca2Pt2Ga is particularly promising.
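The screening described above needs only three fields per database entry: space group, electron count per formula unit, and formula-unit multiplicity. The schematic sketch below shows such a filter loop; the allowed-filling lookup table is a hypothetical stand-in for the actual group-theoretic constraints derived in the paper, and the numerical values and example entries are illustrative only.

```python
# Schematic sketch of the screening loop: an entry survives only if its total
# electron count per cell is incompatible with every band-insulator filling
# allowed for its space group. ALLOWED_FILLING_MULTIPLE is a hypothetical
# lookup standing in for the real constraints; values are illustrative only.
ALLOWED_FILLING_MULTIPLE = {230: 8, 62: 4, 14: 4}

def is_filling_enforced_candidate(entry):
    filling = entry["electrons_per_formula_unit"] * entry["formula_units_per_cell"]
    allowed = ALLOWED_FILLING_MULTIPLE.get(entry["space_group"])
    if allowed is None:
        return False              # no constraint tabulated for this space group
    return filling % allowed != 0  # cannot be a band insulator -> keep as candidate

entries = [
    {"space_group": 230, "electrons_per_formula_unit": 26, "formula_units_per_cell": 2},
    {"space_group": 62, "electrons_per_formula_unit": 20, "formula_units_per_cell": 4},
]
candidates = [e for e in entries if is_filling_enforced_candidate(e)]
print(len(candidates), "entries pass the filling screen")
```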
Deleu, Dirk; Mesraoua, Boulenouar; Canibaño, Beatriz; Melikyan, Gayane; Al Hail, Hassan; El-Sheikh, Lubna; Ali, Musab; Al Hussein, Hassan; Ibrahim, Faiza; Hanssens, Yolande
2018-06-18
The introduction of new disease-modifying therapies (DMTs) for relapsing-remitting multiple sclerosis (RRMS) has considerably transformed the landscape of therapeutic opportunities for this chronic disabling disease. Unlike injectable drugs, oral DMTs promote patient satisfaction and increase therapeutic adherence. This article reviews the salient features of the mode of action, efficacy, safety, and tolerability profile of approved oral DMTs in RRMS, and discusses their place in clinical algorithms in the Middle East and North Africa (MENA) region. A systematic review was conducted using a comprehensive search of MEDLINE, PubMed, and the Cochrane Database of Systematic Reviews (period January 1, 1995-January 31, 2018). Additional searches of the American Academy of Neurology and European Committee for Treatment and Research in Multiple Sclerosis abstracts from 2012-2017 were performed, in addition to searches of the Food and Drug Administration and European Medicines Agency websites, to obtain relevant safety information on these DMTs. Four oral DMTs (fingolimod, teriflunomide, dimethyl fumarate, and cladribine) have been approved by the regulatory agencies. Based on the number needed to treat (NNT), the potential role of these DMTs in the management of active and highly active or rapidly evolving RRMS is assessed. Finally, the place of the oral DMTs in clinical algorithms in the MENA region is reviewed.
Pediatric multiple sclerosis: current perspectives on health behaviors.
Sikes, Elizabeth Morghen; Motl, Robert W; Ness, Jayne M
2018-01-01
Pediatric-onset multiple sclerosis (POMS) accounts for ~5% of all multiple sclerosis cases, and has a prevalence of ~10,000 children in the USA. POMS is associated with a higher relapse rate, and results in irreversible disability on average 10 years earlier than adult-onset multiple sclerosis. Other manifestations of POMS include mental and physical fatigue, cognitive impairment, and depression. We believe that the health behaviors of physical activity, diet, and sleep may have potential benefits in POMS, and present a scoping review of the existing literature. We identified papers by searching three electronic databases (PubMed, GoogleScholar, and CINAHL). Search terms included: pediatric multiple sclerosis OR pediatric onset multiple sclerosis OR POMS AND health behavior OR physical activity OR sleep OR diet OR nutrition OR obesity. Papers were included in this review if they were published in English, referenced nutrition, diet, obesity, sleep, exercise, or physical activity, and included pediatric-onset multiple sclerosis as a primary population. Twenty papers were identified via the literature search that addressed health-promoting behaviors in POMS, and 11, 8, and 3 papers focused on diet, activity, and sleep, respectively. Health-promoting behaviors were associated with markers of disease burden in POMS. Physical activity participation was associated with reduced relapse rate, disease burden, and sleep/rest fatigue symptoms. Nutritional factors, particularly vitamin D intake, may be associated with relapse rate. Obesity has been associated with increased risk of developing POMS. POMS is associated with better sleep hygiene, and this may benefit fatigue and quality of life. Participation in health behaviors, particularly physical activity, diet, and sleep, may have benefits for POMS. Nevertheless, there are currently no interventions targeting promotion of these behaviors and examining the benefits of managing the primary and secondary manifestations of POMS.
NASA Astrophysics Data System (ADS)
Yin, Lucy; Andrews, Jennifer; Heaton, Thomas
2018-05-01
Earthquake parameter estimation using nearest neighbor searching among a large database of observations can lead to reliable prediction results. However, in the real-time application of Earthquake Early Warning (EEW) systems, accurate prediction using a large database is penalized by a significant delay in processing time. We propose to use a multidimensional binary search tree (KD tree) data structure to organize large seismic databases and reduce the processing time of nearest neighbor searches for predictions. We evaluated the performance of the KD tree on the Gutenberg Algorithm, a database-searching algorithm for EEW. We constructed an offline test to predict peak ground motions using a database with feature sets of waveform filter-bank characteristics, and compared the results with the observed seismic parameters. We concluded that a large database provides more accurate predictions of ground motion information, such as peak ground acceleration, velocity, and displacement (PGA, PGV, PGD), than of source parameters such as hypocenter distance. Applying the KD tree search to organize the database reduced the average search time by 85% relative to the exhaustive method, making the approach feasible for real-time implementation. The algorithm is straightforward, and the results will reduce the overall time of warning delivery for EEW.
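As an illustration of the KD-tree idea, the following minimal sketch organizes a synthetic database of filter-bank feature vectors with SciPy's cKDTree and predicts PGA by averaging the nearest neighbors' records; the dimensions, record counts, and averaging rule are assumptions made for the example, not the Gutenberg Algorithm itself.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
# Hypothetical database: 100,000 past records, each a 9-dimensional vector of
# filter-bank amplitudes, paired with an observed peak ground acceleration.
features = rng.random((100_000, 9))
pga = rng.random(100_000)

tree = cKDTree(features)              # built once, offline

query = rng.random(9)                 # incoming real-time feature vector
dist, idx = tree.query(query, k=30)   # 30 nearest neighbours in the database
print(pga[idx].mean())                # predict PGA from the neighbours' records
```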
TokSearch: A search engine for fusion experimental data
Sammuli, Brian S.; Barr, Jayson L.; Eidietis, Nicholas W.; ...
2018-04-01
At a typical fusion research site, experimental data is stored using archive technologies that deal with each discharge as an independent set of data. These technologies (e.g. MDSplus or HDF5) are typically supplemented with a database that aggregates metadata for multiple shots to allow for efficient querying of certain predefined quantities. Often, however, a researcher will need to extract information from the archives, possibly for many shots, that is not available in the metadata store or otherwise indexed for quick retrieval. To address this need, a new search tool called TokSearch has been added to the General Atomics TokSys control design and analysis suite [1]. This tool provides the ability to rapidly perform arbitrary, parallelized queries of archived tokamak shot data (both raw and analyzed) over large numbers of shots. The TokSearch query API borrows concepts from SQL, and users can choose to implement queries in either MATLAB or Python.
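The sketch below mimics the kind of arbitrary, parallelized per-shot query described here using only the Python standard library and NumPy; it is not the TokSearch API, and the shot numbers and the stand-in signal loader are hypothetical.

```python
# Generic parallel per-shot query; NOT the TokSearch API. The shot numbers and
# the signal loader are stand-ins for reads from MDSplus/HDF5 archives.
from concurrent.futures import ProcessPoolExecutor
import numpy as np

def load_signal(shot):
    """Stand-in for fetching one signal of one shot from the archive."""
    rng = np.random.default_rng(shot)
    return rng.normal(size=10_000)

def query_one_shot(shot):
    signal = load_signal(shot)
    return {"shot": shot, "signal_max": float(signal.max())}

if __name__ == "__main__":
    shots = range(170_000, 170_050)
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(query_one_shot, shots))
    print(results[:3])
```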
PmiRExAt: plant miRNA expression atlas database and web applications
Gurjar, Anoop Kishor Singh; Panwar, Abhijeet Singh; Gupta, Rajinder; Mantri, Shrikant S.
2016-01-01
High-throughput small RNA (sRNA) sequencing technology enables an entirely new perspective for plant microRNA (miRNA) research and has immense potential to unravel regulatory networks. Novel insights gained through data mining in the publicly available rich resource of sRNA data will help in designing biotechnology-based approaches for crop improvement to enhance plant yield and nutritional value. Bioinformatics resources enabling meta-analysis of miRNA expression across multiple plant species are still evolving. Here, we report PmiRExAt, a new online database resource that serves as a plant miRNA expression atlas. The web-based repository comprises miRNA expression profiles and a query tool for 1859 wheat, 2330 rice and 283 maize miRNAs. The database interface offers open and easy access to miRNA expression profiles and helps in identifying tissue-preferential, differential and constitutively expressing miRNAs. A feature enabling expression study of conserved miRNAs across multiple species is also implemented. A custom expression analysis feature enables expression analysis of novel miRNAs in a total of 117 datasets. New sRNA datasets can also be uploaded for analysing miRNA expression profiles for 73 plant species. The PmiRExAt application programming interface, a simple object access protocol (SOAP) web service, allows other programmers to remotely invoke the methods written for programmatic search operations on the PmiRExAt database. Database URL: http://pmirexat.nabi.res.in. PMID:27081157
Dealing with the Data Deluge: Handling the Multitude Of Chemical Biology Data Sources
Guha, Rajarshi; Nguyen, Dac-Trung; Southall, Noel; Jadhav, Ajit
2012-01-01
Over the last 20 years, there has been an explosion in the amount and type of biological and chemical data that has been made publicly available in a variety of online databases. While this means that vast amounts of information can be found online, there is no guarantee that it can be found easily (or at all). A scientist searching for a specific piece of information is faced with a daunting task - many databases have overlapping content, use their own identifiers and, in some cases, have arcane and unintuitive user interfaces. In this overview, a variety of well known data sources for chemical and biological information are highlighted, focusing on those most useful for chemical biology research. The issue of using multiple data sources together and the associated problems such as identifier disambiguation are highlighted. A brief discussion is then provided on Tripod, a recently developed platform that supports the integration of arbitrary data sources, providing users a simple interface to search across a federated collection of resources. PMID:26609498
Interactive searching of facial image databases
NASA Astrophysics Data System (ADS)
Nicholls, Robert A.; Shepherd, John W.; Shepherd, Jean
1995-09-01
A set of psychological facial descriptors has been devised to enable computerized searching of criminal photograph albums. The descriptors have been used to encode image databases of up to twelve thousand images. Using a system called FACES, the databases are searched by translating a witness' verbal description into corresponding facial descriptors. Trials of FACES have shown that this coding scheme is more productive and efficient than searching traditional photograph albums. An alternative method of searching the encoded database using a genetic algorithm is currently being tested. The genetic search method does not require the witness to verbalize a description of the target but merely to indicate a degree of similarity between the target and a limited selection of images from the database. The major drawback of FACES is that it requires manual encoding of images. Research is being undertaken to automate the process; however, this will require an algorithm which can predict human descriptive values. Alternatives to human-derived coding schemes exist using statistical classifications of images. Since databases encoded using statistical classifiers do not have an obvious direct mapping to human-derived descriptors, a search method which does not require the entry of human descriptors is required. A genetic search algorithm is being tested for such a purpose.
Searching Harvard Business Review Online. . . Lessons in Searching a Full Text Database.
ERIC Educational Resources Information Center
Tenopir, Carol
1985-01-01
This article examines the Harvard Business Review Online (HBRO) database (bibliographic description fields, abstracts, extracted information, full text, subject descriptors) and reports on 31 sample HBRO searches conducted on Bibliographic Retrieval Services (BRS) to test differences between searching the full text and searching the bibliographic record. Sample…
In search of the emotional face: anger versus happiness superiority in visual search.
Savage, Ruth A; Lipp, Ottmar V; Craig, Belinda M; Becker, Stefanie I; Horstmann, Gernot
2013-08-01
Previous research has provided inconsistent results regarding visual search for emotional faces, yielding evidence for either anger superiority (i.e., more efficient search for angry faces) or happiness superiority effects (i.e., more efficient search for happy faces), suggesting that these results do not reflect on emotional expression, but on emotion (un-)related low-level perceptual features. The present study investigated possible factors mediating anger/happiness superiority effects; specifically search strategy (fixed vs. variable target search; Experiment 1), stimulus choice (Nimstim database vs. Ekman & Friesen database; Experiments 1 and 2), and emotional intensity (Experiment 3 and 3a). Angry faces were found faster than happy faces regardless of search strategy using faces from the Nimstim database (Experiment 1). By contrast, a happiness superiority effect was evident in Experiment 2 when using faces from the Ekman and Friesen database. Experiment 3 employed angry, happy, and exuberant expressions (Nimstim database) and yielded anger and happiness superiority effects, respectively, highlighting the importance of the choice of stimulus materials. Ratings of the stimulus materials collected in Experiment 3a indicate that differences in perceived emotional intensity, pleasantness, or arousal do not account for differences in search efficiency. Across three studies, the current investigation indicates that prior reports of anger or happiness superiority effects in visual search are likely to reflect on low-level visual features associated with the stimulus materials used, rather than on emotion. PsycINFO Database Record (c) 2013 APA, all rights reserved.
NASA Astrophysics Data System (ADS)
Lee, Sangho; Suh, Jangwon; Park, Hyeong-Dong
2015-03-01
Boring logs are widely used in geological field studies since the data describe various attributes of underground and surface environments. However, it is difficult to manage multiple boring logs in the field, as conventional management and visualization methods are not suitable for integrating and combining large data sets. We developed an iPad application that enables users to search boring logs rapidly and visualize them using the augmented reality (AR) technique. For the development of the application, a standard borehole database appropriate for a mobile-based borehole database management system was designed. The application consists of three modules: an AR module, a map module, and a database module. The AR module superimposes borehole data on camera imagery as viewed by the user and provides intuitive visualization of borehole locations. The map module shows the locations of corresponding borehole data on a 2D map with additional map layers. The database module provides data management functions for large borehole databases for the other modules. A field survey was also carried out using more than 100,000 borehole records.
The contribution of nurses to incident disclosure: a narrative review.
Harrison, Reema; Birks, Yvonne; Hall, Jill; Bosanquet, Kate; Harden, Melissa; Iedema, Rick
2014-02-01
To explore (a) how nurses feel about disclosing patient safety incidents to patients, (b) the current contribution that nurses make to the process of disclosing patient safety incidents to patients and (c) the barriers that nurses report as inhibiting their involvement in disclosure. A systematic search process was used to identify and select all relevant material. Heterogeneity in study design of the included articles prohibited a meta-analysis and findings were therefore synthesised in a narrative review. A range of text words, synonyms and subject headings were developed in conjunction with the York Centre for Reviews and Dissemination and used to undertake a systematic search of electronic databases (MEDLINE; EMBASE; CENTRAL; PsycINFO; Health Management and Information Consortium; CINAHL; ASSIA; Science Citation Index; Social Science Citation Index; Cochrane Database of Systematic Reviews; Database of Abstracts of Reviews of Effects; Health Technology Assessment Database; Health Systems Evidence; PASCAL; LILACS). Retrieval of studies was restricted to those published after 1980. Further data sources were: websites, grey literature, research in progress databases, hand-searching of relevant journals and author contact. The title and abstract of each citation were independently screened by two reviewers and disagreements resolved by consensus or consultation with a third person. Full-text articles retrieved were further screened against the inclusion and exclusion criteria and then checked by a second reviewer (YB). Relevant data were extracted and findings were synthesised in a narrative empirical synthesis. The systematic search and selection process identified 15 publications which included 11 unique studies that emerged from a range of locations. Findings suggest that nurses currently support both physicians and patients through incident disclosure, but may be ill-prepared to disclose incidents independently. Barriers to nurse involvement included a lack of opportunities for education and training, and the multiple and sometimes conflicting roles within nursing. Numerous potential benefits were identified that may result from nurses having a greater contribution to the disclosure process, but the provision of support and training is essential to overcome the reported barriers faced by nurses internationally. Copyright © 2013 Elsevier Ltd. All rights reserved.
Bond, John W; Weart, Jocelyn R
2017-05-01
Recovery, profiling, and speculative searching of trace DNA (not attributable to a body fluid/cell type) over a twelve-month period in a U.S. crime laboratory and a U.K. police force are compared. Results show greater numbers of U.S. firearm-related items submitted for analysis compared with the U.K., where the greatest numbers were submitted from burglary or vehicle offenses. U.S. multiple recovery techniques (double swabbing) occurred mainly during laboratory examination, whereas the majority of U.K. multiple recovery techniques occurred at the scene. No statistical difference was observed for useful profiles from single or multiple recovery. Database loading of interpretable profiles was most successful for U.K. items related to burglary or vehicle offenses. Database associations (matches) represented 7.0% of all U.S. items and 13.1% of all U.K. items. The U.K. strategy for burglary and vehicle examination demonstrated that careful selection of both items and sampling techniques is crucial to obtaining the observed results. © 2016 American Academy of Forensic Sciences.
Database Search Strategies & Tips. Reprints from the Best of "ONLINE" [and]"DATABASE."
ERIC Educational Resources Information Center
Online, Inc., Weston, CT.
Reprints of 17 articles presenting strategies and tips for searching databases online appear in this collection, which is one in a series of volumes of reprints from "ONLINE" and "DATABASE" magazines. Edited for information professionals who use electronically distributed databases, these articles address such topics as: (1)…
Grubert, Anna; Eimer, Martin
2016-08-01
To study whether top-down attentional control processes can be set simultaneously for different visual features, we employed a spatial cueing procedure to measure behavioral and electrophysiological markers of task-set contingent attentional capture during search for targets defined by 1 or 2 possible colors (one-color and two-color tasks). Search arrays were preceded by spatially nonpredictive color singleton cues. Behavioral spatial cueing effects indicative of attentional capture were elicited only by target-matching but not by distractor-color cues. However, when search displays contained 1 target-color and 1 distractor-color object among gray nontargets, N2pc components were triggered not only by target-color but also by distractor-color cues both in the one-color and two-color task, demonstrating that task-set nonmatching items attracted attention. When search displays contained 6 items in 6 different colors, so that participants had to adopt a fully feature-specific task set, the N2pc to distractor-color cues was eliminated in both tasks, indicating that nonmatching items were now successfully excluded from attentional processing. These results demonstrate that when observers adopt a feature-specific search mode, attentional task sets can be configured flexibly for multiple features within the same dimension, resulting in the rapid allocation of attention to task-set matching objects only. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Assigning statistical significance to proteotypic peptides via database searches
Alves, Gelio; Ogurtsov, Aleksey Y.; Yu, Yi-Kuo
2011-01-01
Querying MS/MS spectra against a database containing only proteotypic peptides reduces data analysis time due to the reduction of database size. Despite the speed advantage, this search strategy is challenged by issues of statistical significance and coverage. The former requires separating systematically significant identifications from less confident identifications, while the latter arises when the underlying peptide is not present in the proteotypic peptide libraries searched, due to single amino acid polymorphisms (SAPs) or post-translational modifications (PTMs). To address both issues simultaneously, we have extended RAId's knowledge database to include proteotypic information, utilized RAId's statistical strategy to assign statistical significance to proteotypic peptides, and modified RAId's programs to allow for consideration of proteotypic information during database searches. The extended database alleviates the coverage problem since all annotated modifications, even those occurring within proteotypic peptides, may be considered. Taking into account the likelihoods of observation, the statistical strategy of RAId provides accurate E-value assignments regardless of whether a candidate peptide is proteotypic or not. The advantage of including proteotypic information is evidenced by its superior retrieval performance when compared to regular database searches. PMID:21055489
Hale, Daniel R; Fitzgerald-Yau, Natasha; Viner, Russell Mark
2014-05-01
We systematically searched 9 biomedical and social science databases (1980-2012) for primary and secondary interventions that prevented or reduced 2 or more adolescent health risk behaviors (tobacco use, alcohol use, illicit drug use, risky sexual behavior, aggressive acts). We identified 44 randomized controlled trials of universal or selective interventions that were effective for multiple health risk behaviors. Most were school based, conducted in the United States, and effective for multiple forms of substance use. Effects were small, in line with findings for other universal prevention programs. In some studies, effects for more than 1 health risk behavior only emerged at long-term follow-up. Integrated prevention programs are feasible and effective and may be more efficient than discrete prevention strategies.
Information sources for obesity prevention policy research: a review of systematic reviews.
Hanneke, Rosie; Young, Sabrina K
2017-08-08
Systematic identification of evidence in health policy can be time-consuming and challenging. This study examines three questions pertaining to systematic reviews on obesity prevention policy, in order to identify the most efficient search methods: (1) What percentage of the primary studies selected for inclusion in the reviews originated in scholarly as opposed to gray literature? (2) How much of the primary scholarly literature in this topic area is indexed in PubMed/MEDLINE? (3) Which databases index the greatest number of primary studies not indexed in PubMed, and are these databases searched consistently across systematic reviews? We identified systematic reviews on obesity prevention policy and explored their search methods and citations. We determined the percentage of scholarly vs. gray literature cited, the most frequently cited journals, and whether each primary study was indexed in PubMed. We searched 21 databases for all primary study articles not indexed in PubMed to determine which database(s) indexed the highest number of these relevant articles. In total, 21 systematic reviews were identified. Ten of the 21 systematic reviews reported searching gray literature, and 12 reviews ultimately included gray literature in their analyses. Scholarly articles accounted for 577 of the 649 total primary study papers. Of these, 495 (76%) were indexed in PubMed. Google Scholar retrieved the highest number of the remaining 82 non-PubMed scholarly articles, followed by Scopus and EconLit. The Journal of the American Dietetic Association was the most-cited journal. Researchers can maximize search efficiency by searching a small yet targeted selection of both scholarly and gray literature resources. A highly sensitive search of PubMed and those databases that index the greatest number of relevant articles not indexed in PubMed, namely multidisciplinary and economics databases, could save considerable time and effort. When combined with a gray literature search and additional search methods, including cited reference searching and consulting with experts, this approach could help maintain broad retrieval of relevant studies while improving search efficiency. Findings also have implications for designing specialized databases for public health research.
Transterm—extended search facilities and improved integration with other databases
Jacobs, Grant H.; Stockwell, Peter A.; Tate, Warren P.; Brown, Chris M.
2006-01-01
Transterm has now been publicly available for >10 years. Major changes have been made since its last description in this database issue in 2002. The current database provides data for key regions of mRNA sequences, a curated database of mRNA motifs and tools to allow users to investigate their own motifs or mRNA sequences. The key mRNA regions database is derived computationally from GenBank. It contains 3′ and 5′ flanking regions, the initiation and termination signal context and coding sequence for annotated CDS features from GenBank and RefSeq. The database is non-redundant, enabling summary files and statistics to be prepared for each species. Advances include extended search facilities and the inclusion of RefSeq data: the database may now be searched by BLAST in addition to regular expressions (patterns), allowing users to search for motifs such as known miRNA sequences. The database contains >40 motifs or structural patterns important for translational control. In this release, patterns from UTRsite and Rfam are also incorporated with cross-referencing. Users may search their sequence data with Transterm or user-defined patterns. The system is accessible at . PMID:16381889
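A minimal sketch of the pattern-search idea, assuming made-up UTR sequences: a user-defined motif expressed as a regular expression is scanned against each sequence, in the spirit of Transterm's pattern search. The canonical AAUAAA polyadenylation signal is used as the example motif; the sequences are placeholders.

```python
import re

# Made-up 3' UTR sequences; the motif is the canonical AAUAAA polyadenylation
# signal written as a regular expression.
utrs = {
    "gene1": "AAUAAAGCGCGCUUUAUUUAUUGCGC",
    "gene2": "GGGCCCAUGCAUGCAAUAAACCCGGG",
}
motif = re.compile(r"AAUAAA")

for name, seq in utrs.items():
    for match in motif.finditer(seq):
        print(name, match.start(), match.group())
```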
The EMBL-EBI bioinformatics web and programmatic tools framework.
Li, Weizhong; Cowley, Andrew; Uludag, Mahmut; Gur, Tamer; McWilliam, Hamish; Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Lopez, Rodrigo
2015-07-01
Since 2009 the EMBL-EBI Job Dispatcher framework has provided free access to a range of mainstream sequence analysis applications. These include sequence similarity search services (https://www.ebi.ac.uk/Tools/sss/) such as BLAST, FASTA and PSI-Search, multiple sequence alignment tools (https://www.ebi.ac.uk/Tools/msa/) such as Clustal Omega, MAFFT and T-Coffee, and other sequence analysis tools (https://www.ebi.ac.uk/Tools/pfa/) such as InterProScan. Through these services users can search mainstream sequence databases such as ENA, UniProt and Ensembl Genomes, utilising a uniform web interface or systematically through Web Services interfaces (https://www.ebi.ac.uk/Tools/webservices/) using common programming languages, and obtain enriched results with novel visualisations. Integration with EBI Search (https://www.ebi.ac.uk/ebisearch/) and the dbfetch retrieval service (https://www.ebi.ac.uk/Tools/dbfetch/) further expands the usefulness of the framework. New tools and updates such as NCBI BLAST+, InterProScan 5 and PfamScan, new categories such as RNA analysis tools (https://www.ebi.ac.uk/Tools/rna/), new databases such as ENA non-coding, WormBase ParaSite, Pfam and Rfam, and new workflow methods, together with the retirement of deprecated services, ensure that the framework remains relevant to today's biological community. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Plikus, Maksim V; Zhang, Zina; Chuong, Cheng-Ming
2006-01-01
Background Understanding research activity within any given biomedical field is important. Search outputs generated by MEDLINE/PubMed are not well classified and require lengthy manual citation analysis. Automation of citation analytics can be very useful and timesaving for both novices and experts. Results The PubFocus web server automates analysis of MEDLINE/PubMed search queries by enriching them with two widely used human factor-based bibliometric indicators of publication quality: journal impact factor and volume of forward references. In addition to providing basic volumetric statistics, PubFocus also prioritizes citations and evaluates authors' impact on the field of search. PubFocus also analyses the presence and occurrence of biomedical key terms within citations by utilizing controlled vocabularies. Conclusion We have developed a citation prioritisation algorithm based on journal impact factor, forward referencing volume, referencing dynamics, and author's contribution level. It can be applied either to the primary set of PubMed search results or to subsets of these results identified through key terms from controlled biomedical vocabularies and ontologies. The NCI (National Cancer Institute) thesaurus and MGD (Mouse Genome Database) mammalian gene orthology have been implemented for key term analytics. PubFocus provides a scalable platform for the integration of multiple available ontology databases. PubFocus analytics can be adapted for input sources of biomedical citations other than PubMed. PMID:17014720
Accelerated Profile HMM Searches
Eddy, Sean R.
2011-01-01
Profile hidden Markov models (profile HMMs) and probabilistic inference methods have made important contributions to the theory of sequence database homology search. However, practical use of profile HMM methods has been hindered by the computational expense of existing software implementations. Here I describe an acceleration heuristic for profile HMMs, the “multiple segment Viterbi” (MSV) algorithm. The MSV algorithm computes an optimal sum of multiple ungapped local alignment segments using a striped vector-parallel approach previously described for fast Smith/Waterman alignment. MSV scores follow the same statistical distribution as gapped optimal local alignment scores, allowing rapid evaluation of significance of an MSV score and thus facilitating its use as a heuristic filter. I also describe a 20-fold acceleration of the standard profile HMM Forward/Backward algorithms using a method I call “sparse rescaling”. These methods are assembled in a pipeline in which high-scoring MSV hits are passed on for reanalysis with the full HMM Forward/Backward algorithm. This accelerated pipeline is implemented in the freely available HMMER3 software package. Performance benchmarks show that the use of the heuristic MSV filter sacrifices negligible sensitivity compared to unaccelerated profile HMM searches. HMMER3 is substantially more sensitive and 100- to 1000-fold faster than HMMER2. HMMER3 is now about as fast as BLAST for protein searches. PMID:22039361
Quantum search of a real unstructured database
NASA Astrophysics Data System (ADS)
Broda, Bogusław
2016-02-01
A simple circuit implementation of the oracle for Grover's quantum search of a real unstructured classical database is proposed. The oracle contains a kind of quantumly accessible classical memory, which stores the database.
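A small statevector simulation, not the paper's circuit, can illustrate how the oracle and diffusion steps amplify the amplitude of the marked database slot; the marked index below is arbitrary.

```python
import numpy as np

def grover_search(n_qubits, marked_index):
    """Statevector simulation of Grover's search over N = 2**n_qubits slots.

    The 'oracle' simply flips the sign of the marked slot's amplitude, playing
    the role of the quantumly accessible classical memory (a stand-in, not the
    paper's circuit).
    """
    N = 2 ** n_qubits
    n_iterations = int(round(np.pi / 4 * np.sqrt(N)))
    state = np.full(N, 1.0 / np.sqrt(N))       # uniform superposition
    for _ in range(n_iterations):
        state[marked_index] *= -1.0            # oracle: phase flip of the marked slot
        state = 2.0 * state.mean() - state     # diffusion: inversion about the mean
    probabilities = state ** 2
    return int(np.argmax(probabilities)), float(probabilities.max())

idx, prob = grover_search(n_qubits=6, marked_index=42)
print(idx, round(prob, 3))   # the marked slot is found with probability near 1
```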
NASA Astrophysics Data System (ADS)
Williams, J. W.; Ashworth, A. C.; Betancourt, J. L.; Bills, B.; Blois, J.; Booth, R.; Buckland, P.; Charles, D.; Curry, B. B.; Goring, S. J.; Davis, E.; Grimm, E. C.; Graham, R. W.; Smith, A. J.
2015-12-01
Community-supported data repositories (CSDRs) in paleoecology and paleoclimatology have a decades-long tradition and serve multiple critical scientific needs. CSDRs facilitate synthetic large-scale scientific research by providing open-access and curated data that employ community-supported metadata and data standards. CSDRs serve as a 'middle tail' or boundary organization between information scientists and the long-tail community of individual geoscientists collecting and analyzing paleoecological data. Over the past decades, a distributed network of CSDRs has emerged, each serving a particular suite of data and research communities, e.g. Neotoma Paleoecology Database, Paleobiology Database, International Tree Ring Database, NOAA NCEI for Paleoclimatology, Morphobank, iDigPaleo, and Integrated Earth Data Alliance. Recently, these groups have organized into a common Paleobiology Data Consortium dedicated to improving interoperability and sharing best practices and protocols. The Neotoma Paleoecology Database offers one example of an active and growing CSDR, designed to facilitate research into ecological and evolutionary dynamics during recent past global change. Neotoma combines a centralized database structure with distributed scientific governance via multiple virtual constituent data working groups. The Neotoma data model is flexible and can accommodate a variety of paleoecological proxies from many depositional contexts. Data input into Neotoma is done by trained Data Stewards, drawn from their communities. Neotoma data can be searched, viewed, and returned to users through multiple interfaces, including the interactive Neotoma Explorer map interface, REST-ful Application Programming Interfaces (APIs), the neotoma R package, and the Tilia stratigraphic software. Neotoma is governed by geoscientists and provides community engagement through training workshops for data contributors, stewards, and users. Neotoma is engaged in the Paleobiological Data Consortium and other efforts to improve interoperability among cyberinfrastructure in the paleogeosciences.
Renard, Bernhard Y.; Xu, Buote; Kirchner, Marc; Zickmann, Franziska; Winter, Dominic; Korten, Simone; Brattig, Norbert W.; Tzur, Amit; Hamprecht, Fred A.; Steen, Hanno
2012-01-01
Currently, the reliable identification of peptides and proteins is only feasible when thoroughly annotated sequence databases are available. Although sequencing capacities continue to grow, many organisms remain without reliable, fully annotated reference genomes required for proteomic analyses. Standard database search algorithms fail to identify peptides that are not exactly contained in a protein database. De novo searches are generally hindered by their restricted reliability, and current error-tolerant search strategies are limited by global, heuristic tradeoffs between database and spectral information. We propose a Bayesian information criterion-driven error-tolerant peptide search (BICEPS) and offer an open source implementation based on this statistical criterion to automatically balance the information of each single spectrum and the database, while limiting the run time. We show that BICEPS performs as well as current database search algorithms when such algorithms are applied to sequenced organisms, whereas BICEPS only uses a remotely related organism database. For instance, we use a chicken instead of a human database corresponding to an evolutionary distance of more than 300 million years (International Chicken Genome Sequencing Consortium (2004) Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432, 695–716). We demonstrate the successful application to cross-species proteomics with a 33% increase in the number of identified proteins for a filarial nematode sample of Litomosoides sigmodontis. PMID:22493179
LigandBox: A database for 3D structures of chemical compounds
Kawabata, Takeshi; Sugihara, Yusuke; Fukunishi, Yoshifumi; Nakamura, Haruki
2013-01-01
A database for the 3D structures of available compounds is essential for virtual screening by molecular docking. We have developed the LigandBox database (http://ligandbox.protein.osaka-u.ac.jp/ligandbox/) containing four million available compounds, collected from the catalogues of 37 commercial suppliers, and approved drugs and biochemical compounds taken from the KEGG_DRUG, KEGG_COMPOUND and PDB databases. Each chemical compound in the database has several 3D conformers with hydrogen atoms and atomic charges, which are ready to be docked into receptors using docking programs. The 3D conformations were generated using our molecular simulation program package, myPresto. Various physical properties, such as aqueous solubility (LogS) and carcinogenicity, have also been calculated to characterize the ADME-Tox properties of the compounds. The Web database provides two services for compound searches: a property/chemical ID search and a chemical structure search. The chemical structure search is performed by a combination of a descriptor search and a maximum common substructure (MCS) search, using our program kcombu. By specifying a query chemical structure, users can find similar compounds among the millions of compounds in the database within a few minutes. Our database is expected to assist a wide range of researchers in the fields of medical science, chemical biology, and biochemistry who are seeking to discover active chemical compounds by virtual screening. PMID:27493549
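A hedged sketch of the two-step structure search (a cheap descriptor pre-filter followed by an MCS comparison), using RDKit as a stand-in for kcombu; the SMILES strings and the molecular-weight window are arbitrary examples, not LigandBox data.

```python
# RDKit stands in for kcombu here; SMILES strings and the 100 Da molecular
# weight window are arbitrary examples.
from rdkit import Chem
from rdkit.Chem import Descriptors, rdFMCS

library = {
    "cpd1": "CC(=O)Oc1ccccc1C(=O)O",   # aspirin
    "cpd2": "c1ccccc1O",               # phenol
    "cpd3": "CCN(CC)CC",               # triethylamine
}
query = Chem.MolFromSmiles("Cc1ccccc1C(=O)O")   # o-toluic acid as the query

hits = []
for name, smiles in library.items():
    mol = Chem.MolFromSmiles(smiles)
    # Step 1: cheap descriptor pre-filter (molecular weight window).
    if abs(Descriptors.MolWt(mol) - Descriptors.MolWt(query)) > 100.0:
        continue
    # Step 2: maximum common substructure as the similarity measure.
    mcs = rdFMCS.FindMCS([query, mol], timeout=5)
    overlap = mcs.numAtoms / max(query.GetNumAtoms(), mol.GetNumAtoms())
    hits.append((name, round(overlap, 2)))

print(sorted(hits, key=lambda h: -h[1]))
```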
The MAO NASU Plate Archive Database. Current Status and Perspectives
NASA Astrophysics Data System (ADS)
Pakuliak, L. K.; Sergeeva, T. P.
2006-04-01
The preliminary online version of the database of the MAO NASU plate archive is built on the relational database management system MySQL. It permits easy supplementing of the database with new collections of astronegatives, provides high flexibility in constructing SQL queries for data search optimization, offers PHP Basic Authorization protected access to the administrative interface, and supports a wide range of search parameters. The current status of the database will be reported, and a brief description of the search engine and of the means of supporting database integrity will be given. Methods and means of data verification and tasks for further development will be discussed.
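A minimal sketch of the kind of SQL query such a plate-archive database supports; the table layout, column names, and rows are hypothetical, and Python's built-in sqlite3 is used only as a stand-in for the MySQL server behind the web interface.

```python
import sqlite3

# Hypothetical plate table; sqlite3 stands in for the MySQL server.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE plates (
    plate_id TEXT, collection TEXT, ra_deg REAL, dec_deg REAL, date_obs TEXT)""")
con.executemany(
    "INSERT INTO plates VALUES (?, ?, ?, ?, ?)",
    [("GUA040-001", "GUA040", 83.6, 22.0, "1978-11-02"),
     ("GUA040-002", "GUA040", 10.7, 41.3, "1981-09-14")],
)

# Find all plates inside a small box around a target position.
rows = con.execute(
    """SELECT plate_id, date_obs FROM plates
       WHERE ra_deg BETWEEN ? AND ? AND dec_deg BETWEEN ? AND ?""",
    (80.0, 90.0, 20.0, 25.0),
).fetchall()
print(rows)   # [('GUA040-001', '1978-11-02')]
```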
Fast online and index-based algorithms for approximate search of RNA sequence-structure patterns
2013-01-01
Background It is well known that the search for homologous RNAs is more effective if both sequence and structure information is incorporated into the search. However, current tools for searching with RNA sequence-structure patterns cannot fully handle mutations occurring on both these levels or are simply not fast enough for searching large sequence databases because of the high computational costs of the underlying sequence-structure alignment problem. Results We present new fast index-based and online algorithms for approximate matching of RNA sequence-structure patterns supporting a full set of edit operations on single bases and base pairs. Our methods efficiently compute semi-global alignments of structural RNA patterns and substrings of the target sequence whose costs satisfy a user-defined sequence-structure edit distance threshold. For this purpose, we introduce a new computing scheme to optimally reuse the entries of the required dynamic programming matrices for all substrings and combine it with a technique for avoiding the alignment computation of non-matching substrings. Our new index-based methods exploit suffix arrays preprocessed from the target database and achieve running times that are sublinear in the size of the searched sequences. To support the description of RNA molecules that fold into complex secondary structures with multiple ordered sequence-structure patterns, we use fast algorithms for the local or global chaining of approximate sequence-structure pattern matches. The chaining step removes spurious matches from the set of intermediate results, in particular of patterns with little specificity. In benchmark experiments on the Rfam database, our improved online algorithm is faster than the best previous method by up to factor 45. Our best new index-based algorithm achieves a speedup of factor 560. Conclusions The presented methods achieve considerable speedups compared to the best previous method. This, together with the expected sublinear running time of the presented index-based algorithms, allows for the first time approximate matching of RNA sequence-structure patterns in large sequence databases. Beyond the algorithmic contributions, we provide with RaligNAtor a robust and well documented open-source software package implementing the algorithms presented in this manuscript. The RaligNAtor software is available at http://www.zbh.uni-hamburg.de/ralignator. PMID:23865810
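To illustrate the chaining step mentioned above, the sketch below computes the score of a highest-scoring chain of colinear, non-overlapping pattern matches with a simple O(n^2) dynamic program; it is a toy stand-in for the fast chaining algorithms used by RaligNAtor, and the match coordinates and scores are invented.

```python
# Toy global chaining: pick the highest-scoring chain of colinear,
# non-overlapping matches (start, end, score) on the target sequence.
def best_global_chain_score(matches):
    matches = sorted(matches, key=lambda m: m[1])          # sort by end position
    best = [0.0] * len(matches)
    for i, (start_i, end_i, score_i) in enumerate(matches):
        prev_best = 0.0
        for j in range(i):
            if matches[j][1] <= start_i:                   # no overlap, colinear
                prev_best = max(prev_best, best[j])
        best[i] = prev_best + score_i
    return max(best, default=0.0)

matches = [(10, 25, 8.0), (20, 40, 6.0), (30, 55, 7.5), (60, 80, 5.0)]
print(best_global_chain_score(matches))   # 20.5: chains the 1st, 3rd and 4th match
```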
Geodata Modeling and Query in Geographic Information Systems
NASA Technical Reports Server (NTRS)
Adam, Nabil
1996-01-01
Geographic information systems (GIS) deal with collecting, modeling, managing, analyzing, and integrating spatial (locational) and non-spatial (attribute) data required for geographic applications. Examples of spatial data are digital maps, administrative boundaries, road networks, and those of non-spatial data are census counts, land elevations and soil characteristics. GIS shares common areas with a number of other disciplines such as computer-aided design, computer cartography, database management, and remote sensing. None of these disciplines, however, can by itself fully meet the requirements of a GIS application. Examples of such requirements include: the ability to use locational data to produce high quality plots, perform complex operations such as network analysis, enable spatial searching and overlay operations, support spatial analysis and modeling, and provide data management functions such as efficient storage, retrieval, and modification of large datasets; independence, integrity, and security of data; and concurrent access to multiple users. It is on the data management issues that we devote our discussions in this monograph. Traditionally, database management technology has been developed for business applications. Such applications require, among other things, capturing the data requirements of high-level business functions and developing machine-level implementations; supporting multiple views of data and yet providing integration that would minimize redundancy and maintain data integrity and security; providing a high-level language for data definition and manipulation; allowing concurrent access to multiple users; and processing user transactions in an efficient manner. The demands on database management systems have been for speed, reliability, efficiency, cost effectiveness, and user-friendliness. Significant progress has been made in all of these areas over the last two decades to the point that many generalized database platforms are now available for developing data intensive applications that run in real-time. While continuous improvement is still being made at a very fast-paced and competitive rate, new application areas such as computer-aided design, image processing, VLSI design, and GIS have been identified by many as the next generation of database applications. These new application areas pose serious challenges to the currently available database technology. At the core of these challenges is the nature of data that is manipulated. In traditional database applications, the database objects do not have any spatial dimension, and as such, can be thought of as point data in a multi-dimensional space. For example, each instance of an entity EMPLOYEE will have a unique value corresponding to every attribute such as employee id, employee name, employee address and so on. Thus, every Employee instance can be thought of as a point in a multi-dimensional space where each dimension is represented by an attribute. Furthermore, all operations on such data are one-dimensional. Thus, users may retrieve all entities satisfying one or more constraints. Examples of such constraints include employees with addresses in a certain area code, or salaries within a certain range. Even though constraints can be specified on multiple attributes (dimensions), the search for such data is essentially orthogonal across these dimensions.
Hierarchical content-based image retrieval by dynamic indexing and guided search
NASA Astrophysics Data System (ADS)
You, Jane; Cheung, King H.; Liu, James; Guo, Linong
2003-12-01
This paper presents a new approach to content-based image retrieval by using dynamic indexing and guided search in a hierarchical structure, and extending data mining and data warehousing techniques. The proposed algorithms include: a wavelet-based scheme for multiple image feature extraction, the extension of a conventional data warehouse and an image database to an image data warehouse for dynamic image indexing, an image data schema for hierarchical image representation and dynamic image indexing, a statistically based feature selection scheme to achieve flexible similarity measures, and a feature component code to facilitate query processing and guide the search for the best matching. A series of case studies are reported, which include a wavelet-based image color hierarchy, classification of satellite images, tropical cyclone pattern recognition, and personal identification using multi-level palmprint and face features.
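A minimal sketch of a wavelet-based image signature of the kind described, assuming random arrays as stand-ins for real images: subband energies of a two-level Haar decomposition form the feature vector, and retrieval is a nearest-neighbour search over those vectors. The feature design is a generic illustration, not the paper's hierarchical indexing scheme.

```python
import numpy as np
import pywt

def wavelet_signature(image, levels=2):
    """Energy of each subband of a 2-level Haar decomposition."""
    coeffs = pywt.wavedec2(image, "haar", level=levels)
    bands = [coeffs[0]] + [band for level in coeffs[1:] for band in level]
    return np.array([float(np.sum(b ** 2)) for b in bands])

rng = np.random.default_rng(1)
database = {f"img{i}": rng.random((64, 64)) for i in range(100)}
signatures = {name: wavelet_signature(img) for name, img in database.items()}

query = database["img7"] + 0.01 * rng.random((64, 64))    # near-duplicate query
q_sig = wavelet_signature(query)
best = min(signatures, key=lambda name: np.linalg.norm(signatures[name] - q_sig))
print(best)   # expected to retrieve img7
```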
Adaptation of Decoy Fusion Strategy for Existing Multi-Stage Search Workflows
NASA Astrophysics Data System (ADS)
Ivanov, Mark V.; Levitsky, Lev I.; Gorshkov, Mikhail V.
2016-09-01
A number of proteomic database search engines implement multi-stage strategies aiming at increasing the sensitivity of proteome analysis. These approaches often employ a subset of the original database for the secondary stage of analysis. However, if target-decoy approach (TDA) is used for false discovery rate (FDR) estimation, the multi-stage strategies may violate the underlying assumption of TDA that false matches are distributed uniformly across the target and decoy databases. This violation occurs if the numbers of target and decoy proteins selected for the second search are not equal. Here, we propose a method of decoy database generation based on the previously reported decoy fusion strategy. This method allows unbiased TDA-based FDR estimation in multi-stage searches and can be easily integrated into existing workflows utilizing popular search engines and post-search algorithms.
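For orientation, the sketch below shows the basic target-decoy FDR estimate that such strategies aim to keep unbiased; the scores and decoy labels are invented, and this is not the decoy fusion algorithm itself.

```python
def fdr_at_threshold(psms, threshold):
    """psms: list of (score, is_decoy); FDR is estimated as decoys / targets."""
    accepted = [p for p in psms if p[0] >= threshold]
    decoys = sum(1 for _, is_decoy in accepted if is_decoy)
    targets = len(accepted) - decoys
    return decoys / targets if targets else 0.0

psms = [(52.1, False), (48.0, False), (47.2, True), (40.5, False), (39.9, True)]
print(round(fdr_at_threshold(psms, threshold=40.0), 2))   # 1 decoy / 3 targets ~ 0.33
```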
Delaney, Aogán; Tamás, Peter A
2018-03-01
Despite recognition that database search alone is inadequate even within the health sciences, it appears that reviewers in fields that have adopted systematic review are choosing to rely primarily, or only, on database search for information retrieval. This commentary reminds readers of factors that call into question the appropriateness of default reliance on database searches particularly as systematic review is adapted for use in new and lower consensus fields. It then discusses alternative methods for information retrieval that require development, formalisation, and evaluation. Our goals are to encourage reviewers to reflect critically and transparently on their choice of information retrieval methods and to encourage investment in research on alternatives. Copyright © 2017 John Wiley & Sons, Ltd.
MICA: desktop software for comprehensive searching of DNA databases
Stokes, William A; Glick, Benjamin S
2006-01-01
Background Molecular biologists work with DNA databases that often include entire genomes. A common requirement is to search a DNA database to find exact matches for a nondegenerate or partially degenerate query. The software programs available for such purposes are normally designed to run on remote servers, but an appealing alternative is to work with DNA databases stored on local computers. We describe a desktop software program termed MICA (K-Mer Indexing with Compact Arrays) that allows large DNA databases to be searched efficiently using very little memory. Results MICA rapidly indexes a DNA database. On a Macintosh G5 computer, the complete human genome could be indexed in about 5 minutes. The indexing algorithm recognizes all 15 characters of the DNA alphabet and fully captures the information in any DNA sequence, yet for a typical sequence of length L, the index occupies only about 2L bytes. The index can be searched to return a complete list of exact matches for a nondegenerate or partially degenerate query of any length. A typical search of a long DNA sequence involves reading only a small fraction of the index into memory. As a result, searches are fast even when the available RAM is limited. Conclusion MICA is suitable as a search engine for desktop DNA analysis software. PMID:17018144
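The k-mer indexing idea can be sketched with a plain Python dictionary rather than MICA's compact arrays; the sequence and query below are toy examples, and the sketch handles only nondegenerate queries.

```python
from collections import defaultdict

def build_index(sequence, k=8):
    """Map every k-mer to the list of positions where it occurs."""
    index = defaultdict(list)
    for i in range(len(sequence) - k + 1):
        index[sequence[i:i + k]].append(i)
    return index

def find_exact(sequence, index, query, k=8):
    """Return all positions of exact matches of query (len(query) >= k)."""
    hits = []
    for pos in index.get(query[:k], []):
        if sequence[pos:pos + len(query)] == query:   # verify the full query
            hits.append(pos)
    return hits

genome = "ACGTACGTGGTACCGTACGTACGTTTGACGTACGTAA"
idx = build_index(genome, k=8)
print(find_exact(genome, idx, "ACGTACGT", k=8))
```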
Minkiewicz, Piotr; Darewicz, Małgorzata; Iwaniak, Anna; Bucholska, Justyna; Starowicz, Piotr; Czyrko, Emilia
2016-01-01
Internet databases of small molecules, their enzymatic reactions, and metabolism have emerged as useful tools in food science. Database searching is also introduced as part of chemistry or enzymology courses for food technology students. Such resources support the search for information about single compounds and facilitate the introduction of secondary analyses of large datasets. Information can be retrieved from databases by searching for the compound name or structure, annotating with the help of chemical codes or drawn using molecule editing software. Data mining options may be enhanced by navigating through a network of links and cross-links between databases. Exemplary databases reviewed in this article belong to two classes: tools concerning small molecules (including general and specialized databases annotating food components) and tools annotating enzymes and metabolism. Some problems associated with database application are also discussed. Data summarized in computer databases may be used for calculation of the daily intake of bioactive compounds, prediction of the metabolism of food components and their biological activity, as well as for prediction of interactions between food components and drugs. PMID:27929431
Zhao, Panpan; Zhong, Jiayong; Liu, Wanting; Zhao, Jing; Zhang, Gong
2017-12-01
Multiple search engines based on various models have been developed to search MS/MS spectra against a reference database, providing different results for the same data set. How to integrate these results efficiently with minimal compromise on false discoveries is an open question, due to the lack of an independent, reliable, and highly sensitive standard. We took advantage of the translating mRNA sequencing (RNC-seq) result as a standard to evaluate the integration strategies of the protein identifications from various search engines. We used seven mainstream search engines (Andromeda, Mascot, OMSSA, X!Tandem, pFind, InsPecT, and ProVerB) to search the same label-free MS data sets of human cell lines Hep3B, MHCCLM3, and MHCC97H from the Chinese C-HPP Consortium for Chromosomes 1, 8, and 20. As expected, the union of seven engines resulted in a boosted false identification rate, whereas the intersection of seven engines remarkably decreased the identification power. We found that requiring identifications by at least two out of seven engines maximized the protein identification power while minimizing the ratio of suspicious/translation-supported identifications (STR), as monitored by our STR index based on RNC-seq. Furthermore, this strategy also significantly improves the peptide coverage of the protein amino acid sequence. In summary, we demonstrated a simple strategy to significantly improve the performance of shotgun mass spectrometry by integrating multiple search engines at the protein level, maximizing the utilization of the current MS spectra without additional experimental work.
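The integration rule reported here (keep proteins identified by at least two of the engines) reduces to a simple counting step; the engine names and protein accessions in the sketch are placeholders, and only three of the seven engines are shown.

```python
from collections import Counter

# Placeholder engine outputs (protein accessions identified by each engine).
engine_results = {
    "Mascot":   {"P12345", "P67890", "Q11111"},
    "X!Tandem": {"P12345", "Q22222"},
    "OMSSA":    {"P67890", "Q11111", "Q33333"},
}

counts = Counter(protein for hits in engine_results.values() for protein in hits)
consensus = {protein for protein, n in counts.items() if n >= 2}
print(sorted(consensus))   # proteins reported by at least two engines
```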
ERIC Educational Resources Information Center
Saracevic, Tefko
2000-01-01
Summarizes a presentation that discussed findings and implications of research projects using an Internet search service and Internet-accessible vendor databases, representing the two sides of public database searching: query formulation and resource utilization. Presenters included: Tefko Saracevic, Amanda Spink, Dietmar Wolfram and Hong Xie.…
Matsuda, Fumio; Shinbo, Yoko; Oikawa, Akira; Hirai, Masami Yokota; Fiehn, Oliver; Kanaya, Shigehiko; Saito, Kazuki
2009-01-01
Background In metabolomics research using mass spectrometry (MS), systematic searching of high-resolution mass data against compound databases is often the first step of metabolite annotation, used to determine elemental compositions possessing similar theoretical masses. However, incorrect hits derived from errors in mass analyses will be included in the results of elemental composition searches. To assess the quality of peak annotation information, a novel methodology for false discovery rate (FDR) evaluation is presented in this study. Based on the FDR analyses, several aspects of an elemental composition search, including setting a threshold, estimating FDR, and the types of elemental composition databases most reliable for searching, are discussed. Methodology/Principal Findings The FDR can be determined from one measured value (i.e., the hit rate for search queries) and four parameters determined by Monte Carlo simulation. The results indicate that relatively high FDR values (30–50%) were obtained when searching time-of-flight (TOF)/MS data using the KNApSAcK and KEGG databases. In addition, searches against large all-in-one databases (e.g., PubChem) always produced unacceptable results (FDR >70%). The estimated FDRs suggest that the quality of search results can be improved not only by performing more accurate mass analysis but also by modifying the properties of the compound database. A theoretical analysis indicates that FDR could be improved by using a smaller compound database with more complete entries. Conclusions/Significance High-accuracy mass analysis, such as Fourier transform (FT)-MS, is needed for reliable annotation (FDR <10%). In addition, a small, customized compound database is preferable for high-quality annotation of metabolome data. PMID:19847304
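The exact parametrisation used in the study is not reproduced here; the following is a generic, hedged sketch of estimating such an FDR by Monte Carlo, comparing the hit rate of real query masses with that of randomly mass-shifted decoy queries against a sorted list of theoretical formula masses (all inputs and the shift range are purely illustrative):

```python
import bisect
import random

def hit(query_mass, sorted_masses, tol_ppm=5.0):
    """True if any database mass lies within tol_ppm of query_mass."""
    tol = query_mass * tol_ppm * 1e-6
    i = bisect.bisect_left(sorted_masses, query_mass - tol)
    return i < len(sorted_masses) and sorted_masses[i] <= query_mass + tol

def estimate_fdr(query_masses, db_masses, tol_ppm=5.0, n_sim=10_000):
    """Crude Monte Carlo FDR estimate: hit rate of mass-shifted decoy queries
    divided by the observed hit rate of the real queries."""
    db = sorted(db_masses)
    observed = sum(hit(m, db, tol_ppm) for m in query_masses) / len(query_masses)
    false_hits = 0
    for _ in range(n_sim):
        # shift well away from the true mass so a decoy hit is always spurious
        offset = random.uniform(3.0, 20.0) * random.choice([-1.0, 1.0])
        false_hits += hit(random.choice(query_masses) + offset, db, tol_ppm)
    false_rate = false_hits / n_sim
    return min(1.0, false_rate / observed) if observed else 0.0
```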
NREL: Renewable Resource Data Center - Biomass Resource Publications
Marginal Lands in APEC Economies. For a comprehensive list of other NREL biomass resource publications, explore NREL's Publications Database.
Kwon, Yoojin; Powelson, Susan E; Wong, Holly; Ghali, William A; Conly, John M
2014-11-11
The purpose of our study is to determine the value and efficacy of searching biomedical databases beyond MEDLINE for systematic reviews. We analyzed the results from a systematic review conducted by the authors and others on ward closure as an infection control practice. Ovid MEDLINE including In-Process & Other Non-Indexed Citations, Ovid Embase, CINAHL Plus, LILACS, and IndMED were systematically searched for articles of any study type discussing ward closure, as were bibliographies of selected articles and recent infection control conference abstracts. Search results were tracked, recorded, and analyzed using a relative recall method. The sensitivity of searching in each database was calculated. Two thousand ninety-five unique citations were identified and screened for inclusion in the systematic review: 2,060 from database searching and 35 from hand searching and other sources. Ninety-seven citations were included in the final review. MEDLINE and Embase searches each retrieved 80 of the 97 articles included, only 4 articles from each database were unique. The CINAHL search retrieved 35 included articles, and 4 were unique. The IndMED and LILACS searches did not retrieve any included articles, although 75 of the included articles were indexed in LILACS. The true value of using regional databases, particularly LILACS, may lie with the ability to search in the language spoken in the region. Eight articles were found only through hand searching. Identifying studies for a systematic review where the research is observational is complex. The value each individual study contributes to the review cannot be accurately measured. Consequently, we could not determine the value of results found from searching beyond MEDLINE, Embase, and CINAHL with accuracy. However, hand searching for serendipitous retrieval remains an important aspect due to indexing and keyword challenges inherent in this literature.
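The database sensitivities reported above follow the relative recall idea: the share of all finally included studies that a given database search retrieved. A trivial sketch using the figures quoted in the abstract:

```python
# Sensitivity (relative recall) of one database's search: included studies
# retrieved by that database divided by all included studies (97 here).
def sensitivity(retrieved_included, total_included):
    return retrieved_included / total_included

for name, n in [("MEDLINE", 80), ("Embase", 80), ("CINAHL", 35)]:
    print(f"{name}: {sensitivity(n, 97):.1%}")
# MEDLINE: 82.5%, Embase: 82.5%, CINAHL: 36.1%
```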
Should we search Chinese biomedical databases when performing systematic reviews?
Cohen, Jérémie F; Korevaar, Daniël A; Wang, Junfeng; Spijker, René; Bossuyt, Patrick M
2015-03-06
Chinese biomedical databases contain a large number of publications available to systematic reviewers, but it is unclear whether they are used for synthesizing the available evidence. We report a case of two systematic reviews on the accuracy of anti-cyclic citrullinated peptide for diagnosing rheumatoid arthritis. In one of these, the authors did not search Chinese databases; in the other, they did. We additionally assessed the extent to which Cochrane reviewers have searched Chinese databases in a systematic overview of the Cochrane Library (inception to 2014). The two diagnostic reviews included a total of 269 unique studies, but only 4 studies were included in both reviews. The first review included five studies published in the Chinese language (out of 151) while the second included 114 (out of 118). The summary accuracy estimates from the two reviews were comparable. Only 243 of the published 8,680 Cochrane reviews (less than 3%) searched one or more of the five major Chinese databases. These Chinese databases index about 2,500 journals, of which less than 6% are also indexed in MEDLINE. All 243 Cochrane reviews evaluated an intervention, 179 (74%) had at least one author with a Chinese affiliation; 118 (49%) addressed a topic in complementary or alternative medicine. Although searching Chinese databases may lead to the identification of a large amount of additional clinical evidence, Cochrane reviewers have rarely included them in their search strategy. We encourage future initiatives to evaluate more systematically the relevance of searching Chinese databases, as well as collaborative efforts to allow better incorporation of Chinese resources in systematic reviews.
NBIC: Search Ballast Report Database
The Smithsonian Environmental Research Center and the US Coast Guard developed an online database that can be queried through the NBIC website. Data, including those for the Great Lakes, have been incorporated into the NBIC database as of August 2004, and information on data availability is provided on the site.
Ocean Drilling Program: Science Operator Search Engine
Search the ODP/TAMU web site, including the online Janus database and drilling services and tools, plus IODP, ODP, and DSDP publications, together or separately.
Using the TIGR gene index databases for biological discovery.
Lee, Yuandan; Quackenbush, John
2003-11-01
The TIGR Gene Index web pages provide access to analyses of ESTs and gene sequences for nearly 60 species, as well as a number of resources derived from these. Each species-specific database is presented using a common format with a homepage. A variety of methods exist that allow users to search each species-specific database. Methods implemented currently include nucleotide or protein sequence queries using WU-BLAST, text-based searches using various sequence identifiers, searches by gene, tissue and library name, and searches using functional classes through Gene Ontology assignments. This protocol provides guidance for using the Gene Index Databases to extract information.
Weinreb, Jeffrey H; Yoshida, Ryu; Cote, Mark P; O'Sullivan, Michael B; Mazzocca, Augustus D
2017-01-01
The purpose of this study was to evaluate how database use has changed over time in Arthroscopy: The Journal of Arthroscopic and Related Surgery and to inform readers about available databases used in orthopaedic literature. An extensive literature search was conducted to identify databases used in Arthroscopy and other orthopaedic literature. All articles published in Arthroscopy between January 1, 2006, and December 31, 2015, were reviewed. A database was defined as a national, widely available set of individual patient encounters, applicable to multiple patient populations, used in orthopaedic research in a peer-reviewed journal, not restricted by encounter setting or visit duration, and with information available in English. Databases used in Arthroscopy included PearlDiver, the American College of Surgeons National Surgical Quality Improvement Program, the Danish Common Orthopaedic Database, the Swedish National Knee Ligament Register, the Hospital Episodes Statistics database, and the National Inpatient Sample. Database use increased significantly from 4 articles in 2013 to 11 articles in 2015 (P = .012), with no database use between January 1, 2006, and December 31, 2012. Database use increased significantly between January 1, 2006, and December 31, 2015, in Arthroscopy. Level IV, systematic review of Level II through IV studies. Copyright © 2016 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.
An Online Resource for Flight Test Safety Planning
NASA Technical Reports Server (NTRS)
Lewis, Greg
2007-01-01
A viewgraph presentation describing an online database for flight test safety techniques is shown. The topics include: 1) Goal; 2) Test Hazard Analyses; 3) Online Database Background; 4) Data Gathering; 5) NTPS Role; 6) Organizations; 7) Hazard Titles; 8) FAR Paragraphs; 9) Maneuver Name; 10) Identified Hazard; 11) Matured Hazard Titles; 12) Loss of Control Causes; 13) Mitigations; 14) Database Now Open to the Public; 15) FAR Reference Search; 16) Record Field Search; 17) Keyword Search; and 18) Results of FAR Reference Search.
Searching Databases without Query-Building Aids: Implications for Dyslexic Users
ERIC Educational Resources Information Center
Berget, Gerd; Sandnes, Frode Eika
2015-01-01
Introduction: Few studies document the information searching behaviour of users with cognitive impairments. This paper therefore addresses the effect of dyslexia on information searching in a database with no tolerance for spelling errors and no query-building aids. The purpose was to identify effective search interface design guidelines that…
ERIC Educational Resources Information Center
Miller-Whitehead, Marie
Keyword and text string searches of online library catalogs often provide different results according to library and database used and depending upon how books and journals are indexed. For this reason, online databases such as ERIC often provide tutorials and recommendations for searching their site, such as how to use Boolean search strategies.…
BEAUTY-X: enhanced BLAST searches for DNA queries.
Worley, K C; Culpepper, P; Wiese, B A; Smith, R F
1998-01-01
BEAUTY (BLAST Enhanced Alignment Utility) is an enhanced version of the BLAST database search tool that facilitates identification of the functions of matched sequences. Three recent improvements to the BEAUTY program described here make the enhanced output (1) available for DNA queries, (2) available for searches of any protein database, and (3) more up-to-date, with periodic updates of the domain information. BEAUTY searches of the NCBI and EMBL non-redundant protein sequence databases are available from the BCM Search Launcher Web pages (http://gc.bcm.tmc.edu:8088/search-launcher/launcher.html). BEAUTY Post-Processing of submitted search results is available using the BCM Search Launcher Batch Client (version 2.6) (ftp://gc.bcm.tmc.edu/pub/software/search-launcher/). Example figures are available at http://dot.bcm.tmc.edu:9331/papers/beautypp.html. Contact: (kworley,culpep)@bcm.tmc.edu
SNP-RFLPing 2: an updated and integrated PCR-RFLP tool for SNP genotyping.
Chang, Hsueh-Wei; Cheng, Yu-Huei; Chuang, Li-Yeh; Yang, Cheng-Hong
2010-04-08
PCR-restriction fragment length polymorphism (RFLP) assay is a cost-effective method for SNP genotyping and mutation detection, but the manual mining for restriction enzyme sites is challenging and cumbersome. Three years after we constructed SNP-RFLPing, a freely accessible database and analysis tool for restriction enzyme mining of SNPs, significant improvements over the 2006 version have been made and incorporated into the latest version, SNP-RFLPing 2. The primary aim of SNP-RFLPing 2 is to provide comprehensive PCR-RFLP information with multiple functionality about SNPs, such as SNP retrieval to multiple species, different polymorphism types (bi-allelic, tri-allelic, tetra-allelic or indels), gene-centric searching, HapMap tagSNPs, gene ontology-based searching, miRNAs, and SNP500Cancer. The RFLP restriction enzymes and the corresponding PCR primers for the natural and mutagenic types of each SNP are simultaneously analyzed. All the RFLP restriction enzyme prices are also provided to aid selection. Furthermore, the previously encountered updating problems for most SNP related databases are resolved by an on-line retrieval system. The user interfaces for functional SNP analyses have been substantially improved and integrated. SNP-RFLPing 2 offers a new and user-friendly interface for RFLP genotyping that can be used in association studies and is freely available at http://bio.kuas.edu.tw/snp-rflping2.
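As a toy illustration of the underlying PCR-RFLP idea (not the SNP-RFLPing implementation), the sketch below checks which of a few example restriction enzymes recognise a site in exactly one allele of a SNP; the enzyme subset and flanking sequence are hypothetical, and mutagenic primer design is omitted:

```python
ENZYMES = {            # recognition sequences (a small illustrative subset)
    "EcoRI": "GAATTC",
    "TaqI":  "TCGA",
    "HpaII": "CCGG",
}

def discriminating_enzymes(flank5, alleles, flank3, enzymes=ENZYMES):
    """Return enzymes whose site occurs in exactly one allelic sequence."""
    found = {}
    for allele in alleles:
        seq = (flank5 + allele + flank3).upper()
        for name, site in enzymes.items():
            found.setdefault(name, set())
            if site in seq:
                found[name].add(allele)
    return {name: alls for name, alls in found.items() if len(alls) == 1}

# Hypothetical SNP: ...GAATT[C/T]GGA... -> EcoRI cuts only the C allele.
print(discriminating_enzymes("TTGAATT", ["C", "T"], "GGACCGGA"))
# -> {'EcoRI': {'C'}}
```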
NASA Astrophysics Data System (ADS)
Kingdon, Andrew; Nayembil, Martin L.; Richardson, Anne E.; Smith, A. Graham
2016-11-01
New requirements to understand geological properties in three dimensions have led to the development of PropBase, a data structure and delivery tools to deliver this. At the BGS, relational database management systems (RDBMS) have facilitated effective data management using normalised, subject-based database designs with business rules in a centralised, vocabulary-controlled architecture. These have delivered effective data storage in a secure environment. However, isolated subject-oriented designs prevented efficient cross-domain querying of datasets. Additionally, the tools provided often did not enable effective data discovery, as they struggled to resolve the complex underlying normalised structures, resulting in poor data access speeds. Users developed bespoke access tools for structures they did not fully understand, sometimes obtaining incorrect results. Therefore, BGS has developed PropBase, a generic denormalised data structure within an RDBMS to store property data and facilitate rapid and standardised data discovery and access, incorporating 2D and 3D physical and chemical property data with associated metadata. This includes scripts to populate and synchronise the layer with its data sources through structured input and transcription standards. A core component of the architecture is an optimised query object that delivers geoscience information from a structure equivalent to a data warehouse. This enables optimised query performance and delivers data in multiple standardised formats using a web discovery tool. Semantic interoperability is enforced through vocabularies combined from all data sources, facilitating searching of related terms. PropBase holds 28.1 million spatially enabled property data points from 10 source databases, incorporating over 50 property data types with a vocabulary set that includes 557 property terms. By enabling property data searches across multiple databases, PropBase has facilitated new scientific research previously considered impractical. PropBase is easily extended to incorporate 4D data (time series) and is providing a baseline for new "big data" monitoring projects.
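A minimal sketch of the denormalised "one wide property table" idea described above, using Python's built-in sqlite3; the column names, property terms and source names are hypothetical, not the actual PropBase schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE propbase (
        source    TEXT,    -- originating subject database
        property  TEXT,    -- controlled-vocabulary property term
        value     REAL,
        unit      TEXT,
        x         REAL,    -- easting
        y         REAL,    -- northing
        depth_m   REAL
    )""")
con.executemany(
    "INSERT INTO propbase VALUES (?, ?, ?, ?, ?, ?, ?)",
    [("geochem_db",    "CaO",          5.20, "wt%",   4510.0, 3120.0, 12.0),
     ("geophysics_db", "bulk_density", 2.31, "g/cm3", 4511.5, 3119.0, 12.5)])

# One query across properties that previously lived in separate databases.
for row in con.execute(
        "SELECT source, property, value, unit FROM propbase "
        "WHERE x BETWEEN 4500 AND 4520 AND depth_m < 20 ORDER BY property"):
    print(row)
```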
An etiologic classification of autism spectrum disorders.
Gabis, Lidia V; Pomeroy, John
2014-05-01
Autism spectrum disorders (ASD) represent a common phenotype related to multiple etiologies, such as genetic, brain injury (e.g., prematurity), environmental (e.g., viral, toxic), multiple or unknown causes. To devise a clinical classification of children diagnosed with ASD according to etiologic workup. Children diagnosed with ASD (n = 436) from two databases were divided into groups of symptomatic cryptogenic or idiopathic, and variables within each database and diagnostic category were compared. By analyzing the two separate databases, 5.4% of the children were classified as symptomatic, 27% as cryptogenic and 67.75% as idiopathic. Among other findings, the entire symptomatic group demonstrated language delays, but almost none showed evidence for regression. Our results indicate similarities between the idiopathic and cryptogenic subgroups in most of the examined variables, and mutual differences from the symptomatic subgroup. The similarities between the first two subgroups support prior evidence that most perinatal factors and minor physical anomalies do not contribute to the development of core symptoms of autism. Differences in gender and clinical and diagnostic features were found when etiology was used to create subtypes of ASD. This classification could have heuristic importance in the search for an autism gene(s).
VisANT 3.0: new modules for pathway visualization, editing, prediction and construction.
Hu, Zhenjun; Ng, David M; Yamada, Takuji; Chen, Chunnuan; Kawashima, Shuichi; Mellor, Joe; Linghu, Bolan; Kanehisa, Minoru; Stuart, Joshua M; DeLisi, Charles
2007-07-01
With the integration of the KEGG and Predictome databases as well as two search engines for coexpressed genes/proteins using data sets obtained from the Stanford Microarray Database (SMD) and Gene Expression Omnibus (GEO) database, VisANT 3.0 supports exploratory pathway analysis, which includes multi-scale visualization of multiple pathways, editing and annotating pathways using a KEGG compatible visual notation and visualization of expression data in the context of pathways. Expression levels are represented either by color intensity or by nodes with an embedded expression profile. Multiple experiments can be navigated or animated. Known KEGG pathways can be enriched by querying either coexpressed components of known pathway members or proteins with known physical interactions. Predicted pathways for genes/proteins with unknown functions can be inferred from coexpression or physical interaction data. Pathways produced in VisANT can be saved as computer-readable XML format (VisML), graphic images or high-resolution Scalable Vector Graphics (SVG). Pathways in the format of VisML can be securely shared within an interested group or published online using a simple Web link. VisANT is freely available at http://visant.bu.edu.
A comparative study of six European databases of medically oriented Web resources.
Abad García, Francisca; González Teruel, Aurora; Bayo Calduch, Patricia; de Ramón Frias, Rosa; Castillo Blasco, Lourdes
2005-10-01
The paper describes six European medically oriented databases of Web resources, pertaining to five quality-controlled subject gateways, and compares their performance. The characteristics, coverage, procedure for selecting Web resources, record structure, searching possibilities, and existence of user assistance were described for each database. Performance indicators for each database were obtained by means of searches carried out using the key words, "myocardial infarction." Most of the databases originated in the 1990s in an academic or library context and include all types of Web resources of an international nature. Five databases use Medical Subject Headings. The number of fields per record varies between three and nineteen. The language of the search interfaces is mostly English, and some of them allow searches in other languages. In some databases, the search can be extended to Pubmed. Organizing Medical Networked Information, Catalogue et Index des Sites Médicaux Francophones, and Diseases, Disorders and Related Topics produced the best results. The usefulness of these databases as quick reference resources is clear. In addition, their lack of content overlap means that, for the user, they complement each other. Their continued survival faces three challenges: the instability of the Internet, maintenance costs, and lack of use in spite of their potential usefulness.
Verheggen, Kenneth; Raeder, Helge; Berven, Frode S; Martens, Lennart; Barsnes, Harald; Vaudel, Marc
2017-09-13
Sequence database search engines are bioinformatics algorithms that identify peptides from tandem mass spectra using a reference protein sequence database. Two decades of development, notably driven by advances in mass spectrometry, have provided scientists with more than 30 published search engines, each with its own properties. In this review, we present the common paradigm behind the different implementations, and its limitations for modern mass spectrometry datasets. We also detail how the search engines attempt to alleviate these limitations, and provide an overview of the different software frameworks available to the researcher. Finally, we highlight alternative approaches for the identification of proteomic mass spectrometry datasets, either as a replacement for, or as a complement to, sequence database search engines. © 2017 Wiley Periodicals, Inc.
GenoMycDB: a database for comparative analysis of mycobacterial genes and genomes.
Catanho, Marcos; Mascarenhas, Daniel; Degrave, Wim; Miranda, Antonio Basílio de
2006-03-31
Several databases and computational tools have been created with the aim of organizing, integrating and analyzing the wealth of information generated by large-scale sequencing projects of mycobacterial genomes and those of other organisms. However, with very few exceptions, these databases and tools do not allow for massive and/or dynamic comparison of these data. GenoMycDB (http://www.dbbm.fiocruz.br/GenoMycDB) is a relational database built for large-scale comparative analyses of completely sequenced mycobacterial genomes, based on their predicted protein content. Its central structure is composed of the results obtained after pair-wise sequence alignments among all the predicted proteins coded by the genomes of six mycobacteria: Mycobacterium tuberculosis (strains H37Rv and CDC1551), M. bovis AF2122/97, M. avium subsp. paratuberculosis K10, M. leprae TN, and M. smegmatis MC2 155. The database stores the computed similarity parameters of every aligned pair, providing for each protein sequence the predicted subcellular localization, the assigned cluster of orthologous groups, the features of the corresponding gene, and links to several important databases. Tables containing pairs or groups of potential homologs between selected species/strains can be produced dynamically by user-defined criteria, based on one or multiple sequence similarity parameters. In addition, searches can be restricted according to the predicted subcellular localization of the protein, the DNA strand of the corresponding gene and/or the description of the protein. Massive data search and/or retrieval are available, and different ways of exporting the result are offered. GenoMycDB provides an on-line resource for the functional classification of mycobacterial proteins as well as for the analysis of genome structure, organization, and evolution.
Privacy-preserving search for chemical compound databases.
Shimizu, Kana; Nuida, Koji; Arai, Hiromi; Mitsunari, Shigeo; Attrapadung, Nuttapong; Hamada, Michiaki; Tsuda, Koji; Hirokawa, Takatsugu; Sakuma, Jun; Hanaoka, Goichiro; Asai, Kiyoshi
2015-01-01
Searching for similar compounds in a database is the most important process for in-silico drug screening. Since a query compound is an important starting point for the new drug, a query holder, who is afraid of the query being monitored by the database server, usually downloads all the records in the database and uses them in a closed network. However, a serious dilemma arises when the database holder also wants to output no information except for the search results, and such a dilemma prevents the use of many important data resources. In order to overcome this dilemma, we developed a novel cryptographic protocol that enables database searching while keeping both the query holder's privacy and database holder's privacy. Generally, the application of cryptographic techniques to practical problems is difficult because versatile techniques are computationally expensive while computationally inexpensive techniques can perform only trivial computation tasks. In this study, our protocol is successfully built only from an additive-homomorphic cryptosystem, which allows only addition performed on encrypted values but is computationally efficient compared with versatile techniques such as general purpose multi-party computation. In an experiment searching ChEMBL, which consists of more than 1,200,000 compounds, the proposed method was 36,900 times faster in CPU time and 12,000 times as efficient in communication size compared with general purpose multi-party computation. We proposed a novel privacy-preserving protocol for searching chemical compound databases. The proposed method, easily scaling for large-scale databases, may help to accelerate drug discovery research by making full use of unused but valuable data that includes sensitive information.
Privacy-preserving search for chemical compound databases
2015-01-01
Background Searching for similar compounds in a database is the most important process for in-silico drug screening. Since a query compound is an important starting point for the new drug, a query holder, who is afraid of the query being monitored by the database server, usually downloads all the records in the database and uses them in a closed network. However, a serious dilemma arises when the database holder also wants to output no information except for the search results, and such a dilemma prevents the use of many important data resources. Results In order to overcome this dilemma, we developed a novel cryptographic protocol that enables database searching while keeping both the query holder's privacy and database holder's privacy. Generally, the application of cryptographic techniques to practical problems is difficult because versatile techniques are computationally expensive while computationally inexpensive techniques can perform only trivial computation tasks. In this study, our protocol is successfully built only from an additive-homomorphic cryptosystem, which allows only addition performed on encrypted values but is computationally efficient compared with versatile techniques such as general purpose multi-party computation. In an experiment searching ChEMBL, which consists of more than 1,200,000 compounds, the proposed method was 36,900 times faster in CPU time and 12,000 times as efficient in communication size compared with general purpose multi-party computation. Conclusion We proposed a novel privacy-preserving protocol for searching chemical compound databases. The proposed method, easily scaling for large-scale databases, may help to accelerate drug discovery research by making full use of unused but valuable data that includes sensitive information. PMID:26678650
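A sketch of the additive-homomorphic building block that such a protocol relies on, using the third-party python-paillier (`phe`) package. This is not the authors' full protocol: it only shows that a server can sum encrypted query fingerprint bits selected by its own plaintext fingerprint, so the client learns the number of bits in common without revealing its query in the clear.

```python
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair(n_length=1024)

query_fp = [1, 0, 1, 1, 0, 1]      # client's (secret) binary fingerprint
db_fp    = [1, 1, 1, 0, 0, 1]      # one database compound's fingerprint

# Client side: encrypt each bit and send the ciphertexts.
enc_query = [public_key.encrypt(b) for b in query_fp]

# Server side: add the encrypted bits at positions where its own bit is 1.
enc_common = public_key.encrypt(0)
for enc_bit, db_bit in zip(enc_query, db_fp):
    if db_bit:
        enc_common = enc_common + enc_bit   # addition on ciphertexts only

# Client side: decrypt the count of shared on-bits.
print(private_key.decrypt(enc_common))      # -> 3
```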
DOE Office of Scientific and Technical Information (OSTI.GOV)
Quock, D. E. R.; Cianciarulo, M. B.; APS Engineering Support Division
2007-01-01
The Integrated Relational Model of Installed Systems (IRMIS) is a relational database tool that has been implemented at the Advanced Photon Source to maintain an updated account of approximately 600 control system software applications, 400,000 process variables, and 30,000 control system hardware components. To effectively display this large amount of control system information to operators and engineers, IRMIS was initially built with nine Web-based viewers: Applications Organizing Index, IOC, PLC, Component Type, Installed Components, Network, Controls Spares, Process Variables, and Cables. However, since each viewer is designed to provide details from only one major category of the control system, the necessity for a one-stop global search tool for the entire database became apparent. The user requirements for extremely fast database search time and ease of navigation through search results led to the choice of Asynchronous JavaScript and XML (AJAX) technology in the implementation of the IRMIS global search tool. Unique features of the global search tool include a two-tier level of displayed search results, and a database data integrity validation and reporting mechanism.
[Profile of a systematic search. Search areas, databases and reports].
Korsbek, Lisa; Bendix, Ane Friis; Kidholm, Kristian
2006-04-03
Systematic literature searching is fundamental to evidence-based medicine, but it is not yet a widely used way of retrieving evidence-based information. This article profiles a systematic literature search for evidence-based literature. It covers the most central databases and gives an example of how to document the literature search. The article also summarizes the literature searches reported in all reviews published in Ugeskrift for Laeger in 2004.
Jagtap, Pratik; Goslinga, Jill; Kooren, Joel A; McGowan, Thomas; Wroblewski, Matthew S; Seymour, Sean L; Griffin, Timothy J
2013-04-01
Large databases (>10(6) sequences) used in metaproteomic and proteogenomic studies present challenges in matching peptide sequences to MS/MS data using database-search programs. Most notably, strict filtering to avoid false-positive matches leads to more false negatives, thus constraining the number of peptide matches. To address this challenge, we developed a two-step method wherein matches derived from a primary search against a large database were used to create a smaller subset database. The second search was performed against a target-decoy version of this subset database merged with a host database. High confidence peptide sequence matches were then used to infer protein identities. Applying our two-step method for both metaproteomic and proteogenomic analysis resulted in twice the number of high confidence peptide sequence matches in each case, as compared to the conventional one-step method. The two-step method captured almost all of the same peptides matched by the one-step method, with a majority of the additional matches being false negatives from the one-step method. Furthermore, the two-step method improved results regardless of the database search program used. Our results show that our two-step method maximizes the peptide matching sensitivity for applications requiring large databases, especially valuable for proteogenomics and metaproteomics studies. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
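A minimal sketch of the subset-database step in the two-step strategy described above: protein accessions matched in the permissive primary search are pulled out of the large FASTA into a small database for the refined second search. FASTA handling is deliberately simplified, and how the primary hit list is produced by the first search is assumed.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, seq = None, []
    with open(path) as fh:
        for line in fh:
            line = line.rstrip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(seq)
                header, seq = line[1:], []
            else:
                seq.append(line)
        if header is not None:
            yield header, "".join(seq)

def write_subset(big_fasta, subset_fasta, primary_hits):
    """Write only entries whose accession (first token of the header) was
    matched in the primary search."""
    with open(subset_fasta, "w") as out:
        for header, seq in read_fasta(big_fasta):
            if header.split()[0] in primary_hits:
                out.write(f">{header}\n{seq}\n")

# Hypothetical usage:
# write_subset("metaproteome.fasta", "subset.fasta", primary_hits={"P12345"})
```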
Shteynberg, David; Deutsch, Eric W.; Lam, Henry; Eng, Jimmy K.; Sun, Zhi; Tasman, Natalie; Mendoza, Luis; Moritz, Robert L.; Aebersold, Ruedi; Nesvizhskii, Alexey I.
2011-01-01
The combination of tandem mass spectrometry and sequence database searching is the method of choice for the identification of peptides and the mapping of proteomes. Over the last several years, the volume of data generated in proteomic studies has increased dramatically, which challenges the computational approaches previously developed for these data. Furthermore, a multitude of search engines have been developed that identify different, overlapping subsets of the sample peptides from a particular set of tandem mass spectrometry spectra. We present iProphet, the new addition to the widely used open-source suite of proteomic data analysis tools Trans-Proteomics Pipeline. Applied in tandem with PeptideProphet, it provides more accurate representation of the multilevel nature of shotgun proteomic data. iProphet combines the evidence from multiple identifications of the same peptide sequences across different spectra, experiments, precursor ion charge states, and modified states. It also allows accurate and effective integration of the results from multiple database search engines applied to the same data. The use of iProphet in the Trans-Proteomics Pipeline increases the number of correctly identified peptides at a constant false discovery rate as compared with both PeptideProphet and another state-of-the-art tool Percolator. As the main outcome, iProphet permits the calculation of accurate posterior probabilities and false discovery rate estimates at the level of sequence identical peptide identifications, which in turn leads to more accurate probability estimates at the protein level. Fully integrated with the Trans-Proteomics Pipeline, it supports all commonly used MS instruments, search engines, and computer platforms. The performance of iProphet is demonstrated on two publicly available data sets: data from a human whole cell lysate proteome profiling experiment representative of typical proteomic data sets, and from a set of Streptococcus pyogenes experiments more representative of organism-specific composite data sets. PMID:21876204
University Faculty Use of Computerized Databases: An Assessment of Needs and Resources.
ERIC Educational Resources Information Center
Borgman, Christine L.; And Others
1985-01-01
Results of survey indicate that: academic faculty are unaware of range of databases available; few recognize need for databases in research; most delegate searching to librarian or assistant, rather than perform searching themselves; and 39 database guides identified tended to be descriptive rather than evaluative. A comparison of the guides is…
Moeinipour, Aliasghar; Zarifian, Ahmadreza; Sheikh Andalibi, Mohammad Sobhan; Shamloo, Alireza Sepehri; Ahmadabadi, Ali; Amouzeshi, Ahmad; Hoseinikhah, Hamid
2015-12-22
It is common practice for patients with prosthetic cardiac devices, especially heart valve prostheses, arterial stents, defibrillators, and pacemakers, to use anticoagulation treatment. When these patients suffer multiple trauma after motor vehicle accidents, optimal medical management of this challenging situation is mandatory. This strategy should include rapid diagnosis of all possible multiple organ injuries, with special attention to anticoagulation therapy so as to minimize the risk of thromboembolic complications in prosthetic devices. In this review, we describe the best medical management for patients with multiple trauma who use anticoagulants after heart valve replacement. We searched the electronic databases PubMed/Medline, Scopus, Embase, and Google Scholar using the following terms: anticoagulant, warfarin, heparin, and multiple trauma. Similar studies suggested by the databases were also included. Non-English articles were excluded from the review. For patients who use anticoagulation therapy, teamwork between cardiac surgeons, general surgeons, anesthesiologists, and cardiologists is essential. For optimal medical management, multiple consultations between members of this team are mandatory for rapid diagnosis of all possibly damaged organs, with special attention to central nervous system, chest, and abdominal traumas. With this strategy, it is important to pay close attention to anticoagulation drugs to minimize the risk of thromboembolic complications in cardiac devices. The agents of choice for reversing anticoagulation before emergency operations in patients with multiple trauma who are using an anticoagulant after heart valve replacement are fresh frozen plasma (FFP) and prothrombin complex concentrates (PCC).
ClassLess: A Comprehensive Database of Young Stellar Objects
NASA Astrophysics Data System (ADS)
Hillenbrand, Lynne A.; Baliber, Nairn
2015-08-01
We have designed and constructed a database intended to house catalog and literature-published measurements of Young Stellar Objects (YSOs) within ~1 kpc of the Sun. ClassLess, so called because it includes YSOs in all stages of evolution, is a relational database in which user interaction is conducted via HTML web browsers, queries are performed in scientific language, and all data are linked to the sources of publication. Each star is associated with a cluster (or clusters), and both spatially resolved and unresolved measurements are stored, allowing proper use of data from multiple star systems. With this fully searchable tool, myriad ground- and space-based instruments and surveys across wavelength regimes can be exploited. In addition to primary measurements, the database self consistently calculates and serves higher level data products such as extinction, luminosity, and mass. As a result, searches for young stars with specific physical characteristics can be completed with just a few mouse clicks. We are in the database population phase now, and are eager to engage with interested experts worldwide on local galactic star formation and young stellar populations.
BIOPEP database and other programs for processing bioactive peptide sequences.
Minkiewicz, Piotr; Dziuba, Jerzy; Iwaniak, Anna; Dziuba, Marta; Darewicz, Małgorzata
2008-01-01
This review presents the potential for application of computational tools in peptide science based on a sample BIOPEP database and program as well as other programs and databases available via the World Wide Web. The BIOPEP application contains a database of biologically active peptide sequences and a program enabling construction of profiles of the potential biological activity of protein fragments, calculation of quantitative descriptors as measures of the value of proteins as potential precursors of bioactive peptides, and prediction of bonds susceptible to hydrolysis by endopeptidases in a protein chain. Other bioactive and allergenic peptide sequence databases are also presented. Programs enabling the construction of binary and multiple alignments between peptide sequences, the construction of sequence motifs attributed to a given type of bioactivity, searching for potential precursors of bioactive peptides, and the prediction of sites susceptible to proteolytic cleavage in protein chains are available via the Internet as are other approaches concerning secondary structure prediction and calculation of physicochemical features based on amino acid sequence. Programs for prediction of allergenic and toxic properties have also been developed. This review explores the possibilities of cooperation between various programs.
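As a small illustration of the kind of quantitative descriptor mentioned above (not BIOPEP's actual code or data), the sketch below counts occurrences of a hypothetical list of bioactive fragments in a protein sequence and divides by the protein length to give a frequency-of-occurrence value:

```python
ACE_INHIBITORY_FRAGMENTS = ["VPP", "IPP", "YP"]   # illustrative list only

def count_occurrences(sequence, fragment):
    """Count (possibly overlapping) occurrences of a fragment in a sequence."""
    return sum(1 for i in range(len(sequence) - len(fragment) + 1)
               if sequence[i:i + len(fragment)] == fragment)

def activity_frequency(sequence, fragments):
    """Occurrences of listed bioactive fragments per residue of the protein."""
    total = sum(count_occurrences(sequence, f) for f in fragments)
    return total / len(sequence)

protein = "MKVLVPPQIPPAYPGK"          # hypothetical sequence
print(round(activity_frequency(protein, ACE_INHIBITORY_FRAGMENTS), 3))
# -> 0.188  (3 fragment occurrences in 16 residues)
```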
Wang, Penghao; Wilson, Susan R
2013-01-01
Mass spectrometry-based protein identification is a very challenging task. The main identification approaches include de novo sequencing and database searching. Both approaches have shortcomings, so an integrative approach has been developed. The integrative approach first infers partial peptide sequences, known as tags, directly from tandem spectra through de novo sequencing, and then puts these sequences into a database search to see if a close peptide match can be found. However, the current implementation of this integrative approach has several limitations. Firstly, simplistic de novo sequencing is applied and only very short sequence tags are used. Secondly, most integrative methods apply an algorithm similar to BLAST to search for exact sequence matches and do not accommodate sequence errors well. Thirdly, in these methods the integrated de novo sequencing makes a limited contribution to the scoring model, which is still largely based on database searching. We have developed a new integrative protein identification method which integrates de novo sequencing more efficiently into database searching. Evaluated on large real datasets, our method outperforms popular identification methods.
LMSD: LIPID MAPS structure database
Sud, Manish; Fahy, Eoin; Cotter, Dawn; Brown, Alex; Dennis, Edward A.; Glass, Christopher K.; Merrill, Alfred H.; Murphy, Robert C.; Raetz, Christian R. H.; Russell, David W.; Subramaniam, Shankar
2007-01-01
The LIPID MAPS Structure Database (LMSD) is a relational database encompassing structures and annotations of biologically relevant lipids. Structures of lipids in the database come from four sources: (i) LIPID MAPS Consortium's core laboratories and partners; (ii) lipids identified by LIPID MAPS experiments; (iii) computationally generated structures for appropriate lipid classes; (iv) biologically relevant lipids manually curated from LIPID BANK, LIPIDAT and other public sources. All the lipid structures in LMSD are drawn in a consistent fashion. In addition to a classification-based retrieval of lipids, users can search LMSD using either text-based or structure-based search options. The text-based search implementation supports data retrieval by any combination of the following data fields: LIPID MAPS ID, systematic or common name, mass, formula, category, main class, and subclass. The structure-based search, in conjunction with optional data fields, provides the capability to perform a substructure search or exact match for the structure drawn by the user. Search results, in addition to structure and annotations, also include relevant links to external databases. The LMSD is publicly available online. PMID:17098933
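To illustrate the substructure-search idea mentioned above (not the LMSD implementation), the following sketch uses the open-source RDKit toolkit to test a few toy SMILES strings against a carboxylic-acid SMARTS pattern:

```python
from rdkit import Chem

molecules = {
    "palmitic acid": "CCCCCCCCCCCCCCCC(=O)O",
    "glycerol":      "OCC(O)CO",
    "ethanolamine":  "NCCO",
}
carboxylic_acid = Chem.MolFromSmarts("C(=O)[OX2H1]")   # query substructure

for name, smiles in molecules.items():
    mol = Chem.MolFromSmiles(smiles)
    if mol is not None and mol.HasSubstructMatch(carboxylic_acid):
        print(name)
# -> palmitic acid
```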
Bioinformatics Approaches to Classifying Allergens and Predicting Cross-Reactivity
Schein, Catherine H.; Ivanciuc, Ovidiu; Braun, Werner
2007-01-01
The major advances in understanding why patients respond to several seemingly different stimuli have been through the isolation, sequencing and structural analysis of proteins that induce an IgE response. The most significant finding is that allergenic proteins from very different sources can have nearly identical sequences and structures, and that this similarity can account for clinically observed cross-reactivity. The increasing amount of information on the sequence, structure and IgE epitopes of allergens is now available in several databases and powerful bioinformatics search tools allow user access to relevant information. Here, we provide an overview of these databases and describe state-of-the art bioinformatics tools to identify the common proteins that may be at the root of multiple allergy syndromes. Progress has also been made in quantitatively defining characteristics that discriminate allergens from non-allergens. Search and software tools for this purpose have been developed and implemented in the Structural Database of Allergenic Proteins (SDAP, http://fermi.utmb.edu/SDAP/). SDAP contains information for over 800 allergens and extensive bibliographic references in a relational database with links to other publicly available databases. SDAP is freely available on the Web to clinicians and patients, and can be used to find structural and functional relations among known allergens and to identify potentially cross-reacting antigens. Here we illustrate how these bioinformatics tools can be used to group allergens, and to detect areas that may account for common patterns of IgE binding and cross-reactivity. Such results can be used to guide treatment regimens for allergy sufferers. PMID:17276876
Analysis Tool Web Services from the EMBL-EBI.
McWilliam, Hamish; Li, Weizhong; Uludag, Mahmut; Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Cowley, Andrew Peter; Lopez, Rodrigo
2013-07-01
Since 2004 the European Bioinformatics Institute (EMBL-EBI) has provided access to a wide range of databases and analysis tools via Web Services interfaces. This comprises services to search across the databases available from the EMBL-EBI and to explore the network of cross-references present in the data (e.g. EB-eye), services to retrieve entry data in various data formats and to access the data in specific fields (e.g. dbfetch), and analysis tool services, for example, sequence similarity search (e.g. FASTA and NCBI BLAST), multiple sequence alignment (e.g. Clustal Omega and MUSCLE), pairwise sequence alignment and protein functional analysis (e.g. InterProScan and Phobius). The REST/SOAP Web Services (http://www.ebi.ac.uk/Tools/webservices/) interfaces to these databases and tools allow their integration into other tools, applications, web sites, pipeline processes and analytical workflows. To get users started using the Web Services, sample clients are provided covering a range of programming languages and popular Web Service tool kits, and a brief guide to Web Services technologies, including a set of tutorials, is available for those wishing to learn more and develop their own clients. Users of the Web Services are informed of improvements and updates via a range of methods.
Analysis Tool Web Services from the EMBL-EBI
McWilliam, Hamish; Li, Weizhong; Uludag, Mahmut; Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Cowley, Andrew Peter; Lopez, Rodrigo
2013-01-01
Since 2004 the European Bioinformatics Institute (EMBL-EBI) has provided access to a wide range of databases and analysis tools via Web Services interfaces. This comprises services to search across the databases available from the EMBL-EBI and to explore the network of cross-references present in the data (e.g. EB-eye), services to retrieve entry data in various data formats and to access the data in specific fields (e.g. dbfetch), and analysis tool services, for example, sequence similarity search (e.g. FASTA and NCBI BLAST), multiple sequence alignment (e.g. Clustal Omega and MUSCLE), pairwise sequence alignment and protein functional analysis (e.g. InterProScan and Phobius). The REST/SOAP Web Services (http://www.ebi.ac.uk/Tools/webservices/) interfaces to these databases and tools allow their integration into other tools, applications, web sites, pipeline processes and analytical workflows. To get users started using the Web Services, sample clients are provided covering a range of programming languages and popular Web Service tool kits, and a brief guide to Web Services technologies, including a set of tutorials, is available for those wishing to learn more and develop their own clients. Users of the Web Services are informed of improvements and updates via a range of methods. PMID:23671338
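A hedged sketch of driving one of these services (NCBI BLAST) through the REST interface with the `requests` package. The endpoint path and form-field names are written from memory of the EBI Job Dispatcher conventions and should be checked against the current documentation; the email address, database name and query sequence are placeholders.

```python
import time
import requests

BASE = "https://www.ebi.ac.uk/Tools/services/rest/ncbiblast"

# Submit a job (parameter names assumed from the Job Dispatcher docs).
job = requests.post(f"{BASE}/run", data={
    "email": "you@example.org",
    "program": "blastp",
    "stype": "protein",
    "database": "uniprotkb_swissprot",
    "sequence": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
})
job_id = job.text.strip()

# Poll until the job finishes, then fetch the plain-text result.
while requests.get(f"{BASE}/status/{job_id}").text.strip() == "RUNNING":
    time.sleep(5)

print(requests.get(f"{BASE}/result/{job_id}/out").text[:500])
```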
NCBI GEO: archive for functional genomics data sets—10 years on
Barrett, Tanya; Troup, Dennis B.; Wilhite, Stephen E.; Ledoux, Pierre; Evangelista, Carlos; Kim, Irene F.; Tomashevsky, Maxim; Marshall, Kimberly A.; Phillippy, Katherine H.; Sherman, Patti M.; Muertter, Rolf N.; Holko, Michelle; Ayanbule, Oluwabukunmi; Yefanov, Andrey; Soboleva, Alexandra
2011-01-01
A decade ago, the Gene Expression Omnibus (GEO) database was established at the National Center for Biotechnology Information (NCBI). The original objective of GEO was to serve as a public repository for high-throughput gene expression data generated mostly by microarray technology. However, the research community quickly applied microarrays to non-gene-expression studies, including examination of genome copy number variation and genome-wide profiling of DNA-binding proteins. Because the GEO database was designed with a flexible structure, it was possible to quickly adapt the repository to store these data types. More recently, as the microarray community switches to next-generation sequencing technologies, GEO has again adapted to host these data sets. Today, GEO stores over 20 000 microarray- and sequence-based functional genomics studies, and continues to handle the majority of direct high-throughput data submissions from the research community. Multiple mechanisms are provided to help users effectively search, browse, download and visualize the data at the level of individual genes or entire studies. This paper describes recent database enhancements, including new search and data representation tools, as well as a brief review of how the community uses GEO data. GEO is freely accessible at http://www.ncbi.nlm.nih.gov/geo/. PMID:21097893
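GEO can also be searched programmatically through the NCBI E-utilities; a minimal sketch querying the GEO DataSets index (db=gds) is shown below, with a purely illustrative search term:

```python
import requests

resp = requests.get(
    "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi",
    params={"db": "gds", "term": "hypoxia", "retmax": 20, "retmode": "json"},
)
ids = resp.json()["esearchresult"]["idlist"]   # GEO DataSets record IDs
print(len(ids), ids[:5])
```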
Large-Scale Concatenation cDNA Sequencing
Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.
1997-01-01
A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174
Identifying known unknowns using the US EPA's CompTox ...
Chemical features observed using high-resolution mass spectrometry can be tentatively identified using online chemical reference databases by searching molecular formulae and monoisotopic masses and then rank-ordering of the hits using appropriate relevance criteria. The most likely candidate “known unknowns,” which are those chemicals unknown to an investigator but contained within a reference database or literature source, rise to the top of a chemical list when rank-ordered by the number of associated data sources. The U.S. EPA’s CompTox Chemistry Dashboard is a curated and freely available resource for chemistry and computational toxicology research, containing more than 720,000 chemicals of relevance to environmental health science. In this research, the performance of the Dashboard for identifying “known unknowns” was evaluated against that of the online ChemSpider database, one of the primary resources used by mass spectrometrists, using multiple previously studied datasets reported in the peer-reviewed literature totaling 162 chemicals. These chemicals were examined using both applications via molecular formula and monoisotopic mass searches followed by rank-ordering of candidate compounds by associated references or data sources. A greater percentage of chemicals ranked in the top position when using the Dashboard, indicating an advantage of this application over ChemSpider for identifying known unknowns using data source ranking. Addition
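A trivial sketch of the rank-ordering step described above: candidate structures returned for a molecular formula are sorted by their number of associated data sources, so the most-referenced "known unknown" rises to the top (names and counts are hypothetical):

```python
candidates = [
    {"name": "candidate A", "formula": "C9H8O4", "data_sources": 112},
    {"name": "candidate B", "formula": "C9H8O4", "data_sources": 7},
    {"name": "candidate C", "formula": "C9H8O4", "data_sources": 45},
]

ranked = sorted(candidates, key=lambda c: c["data_sources"], reverse=True)
for rank, c in enumerate(ranked, start=1):
    print(rank, c["name"], c["data_sources"])
```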
ERIC Educational Resources Information Center
Spink, Amanda
1995-01-01
This study uses the human approach to examine the sources and effectiveness of search terms selected during 40 mediated interactive database searches and focuses on determining the retrieval effectiveness of search terms identified by users and intermediaries from retrieved items during term relevance feedback. (Author/JKP)
The Weaknesses of Full-Text Searching
ERIC Educational Resources Information Center
Beall, Jeffrey
2008-01-01
This paper provides a theoretical critique of the deficiencies of full-text searching in academic library databases. Because full-text searching relies on matching words in a search query with words in online resources, it is an inefficient method of finding information in a database. This matching fails to retrieve synonyms, and it also retrieves…
The effectiveness of exercise programmes in patients with multiple myeloma: A literature review.
Gan, J H; Sim, C Y L; Santorelli, L A
2016-02-01
A limited number of clinical studies have investigated the effectiveness of participation in exercise training programmes for patients with multiple myeloma (MM), exploring biomedical, physical, psychological and quality-of-life outcomes. The aim of this literature review is to evaluate current quantitative and qualitative evidence concerning the effectiveness of participation in exercise programmes for patients with MM in improving physiological and/or psychological status. A literature search encompassing studies published between January 1998 and July 2013 was conducted through ten electronic databases. This search was further expanded through citation chaining, manual grey literature searches, and peer review consultation. In total, seven interventional studies were identified and appraised using the Critical Appraisal Skills Programme (CASP) or the Centre for Evidence-Based Management of Amsterdam (CEBMa) tools. Although the majority of the studies presented encouraging data, three studies that implemented individualized exercise interventions for patients at different stages of MM and myeloablative treatment showed mixed results. In conclusion, the effectiveness of participation in exercise programmes remains unclear for patients with MM, as the studies reviewed were limited by relatively weak methodological approaches. Crown Copyright © 2015. Published by Elsevier Ireland Ltd. All rights reserved.
Royle, P; Waugh, N
2003-01-01
To contribute to making searching for Technology Assessment Reports (TARs) more cost-effective by suggesting an optimum literature retrieval strategy. A sample of 20 recent TARs. All sources used to search for clinical and cost-effectiveness studies were recorded. In addition, all studies that were included in the clinical and cost-effectiveness sections of the TARs were identified, and their characteristics recorded, including author, journal, year, study design, study size and quality score. Each was also classified by publication type, and then checked to see whether it was indexed in the following databases: MEDLINE, EMBASE, and then either the Cochrane Controlled Trials Register (CCTR) for clinical effectiveness studies or the NHS Economic Evaluation Database (NHS EED) for the cost-effectiveness studies. Any study not found in at least one of these databases was checked to see whether it was indexed in the Science Citation Index (SCI) and BIOSIS, and the American Society of Clinical Oncology (ASCO) Online if a cancer review. Any studies still not found were checked to see whether they were in a number of additional databases. The median number of sources searched per TAR was 20, and the range was from 13 to 33 sources. Six sources (CCTR, DARE, EMBASE, MEDLINE, NHS EED and sponsor/industry submissions to National Institute for Clinical Excellence) were used in all reviews. After searching the MEDLINE, EMBASE and NHS EED databases, 87.3% of the clinical effectiveness studies and 94.8% of the cost-effectiveness studies were found, rising to 98.2% when SCI, BIOSIS and ASCO Online and 97.9% when SCI and ASCO Online, respectively, were added. The median number of sources searched for the 14 TARs that included an economic model was 9.0 per TAR. A sensitive search filter for identifying non-randomised controlled trials (RCT), constructed for MEDLINE and using the search terms from the bibliographic records in the included studies, retrieved only 85% of the known sample. Therefore, it is recommended that when searching for non-RCT studies a search is done for the intervention alone, and records are then scanned manually for those that look relevant. Searching additional databases beyond the Cochrane Library (which includes CCTR, NHS EED and the HTA database), MEDLINE, EMBASE and SCI, plus BIOSIS limited to meeting abstracts only, was seldom found to be effective in retrieving additional studies for inclusion in the clinical and cost-effectiveness sections of TARs (apart from reviews of cancer therapies, where a search of the ASCO database is recommended). A more selective approach to database searching would suffice in most cases and would save resources, thereby making the TAR process more efficient. However, searching non-database sources (including submissions from manufacturers, recent meeting abstracts, contact with experts and checking reference lists) does appear to be a productive way of identifying further studies.
Shen, Yonghua; Liu, Mingdong; Chen, Min; Li, Yunhong; Lu, Ying; Zou, Xiaoping
2014-01-01
Refractory chronic pancreatitis remains a challenge for endoscopists after routine single plastic stenting. However, data on the efficacy and safety of further endoscopic stenting are still controversial. The current systematic review aimed to assess the efficacy and safety of placement of fully covered self-expandable metal stents (FCSEMS) and multiple plastic stents. Databases including MEDLINE, EMBASE, the Cochrane Library, CBM, CNKI, VIP, and the WANFANG Database were used to search for relevant trials. Published studies were assessed using well-defined inclusion and exclusion criteria. The process was independently performed by two investigators. A total of 5 studies provided data on 80 patients. Forest plots and publication bias assessments were not carried out because few relevant studies were available and all screened studies were case series. The technical success rate was 100% both for placement of FCSEMS and for multiple plastic stents. The functional success rate after placement of FCSEMS was 100%, compared with 94.7% for multiple plastic stents. Complications occurred in 26.2% of patients after FCSEMS placement; complications were not described in detail for multiple plastic stents. The stent migration rate was 8.2% for FCSEMS and 10.5% for multiple plastic stents. The reintervention rate was 9.8% for FCSEMS and 15.8% for multiple plastic stents. The pain improvement rate was 85.2% for FCSEMS and 84.2% for multiple plastic stents. FCSEMS appeared to show no significant difference from multiple plastic stents in the treatment of refractory chronic pancreatitis. Further investigations are needed. Copyright © 2014 IAP and EPC. Published by Elsevier B.V. All rights reserved.
Searching the scientific literature: implications for quantitative and qualitative reviews.
Wu, Yelena P; Aylward, Brandon S; Roberts, Michael C; Evans, Spencer C
2012-08-01
Literature reviews are an essential step in the research process and are included in all empirical and review articles. Electronic databases are commonly used to gather this literature. However, several factors can affect the extent to which relevant articles are retrieved, influencing future research and conclusions drawn. The current project examined articles obtained by comparable search strategies in two electronic archives using an exemplar search to illustrate factors that authors should consider when designing their own search strategies. Specifically, literature searches were conducted in PsycINFO and PubMed targeting review articles on two exemplar disorders (bipolar disorder and attention deficit/hyperactivity disorder) and issues of classification and/or differential diagnosis. Articles were coded for relevance and characteristics of article content. The two search engines yielded significantly different proportions of relevant articles overall and by disorder. Keywords differed across search engines for the relevant articles identified. Based on these results, it is recommended that when gathering literature for review papers, multiple search engines should be used, and search syntax and strategies be tailored to the unique capabilities of particular engines. For meta-analyses and systematic reviews, authors may consider reporting the extent to which different archives or sources yielded relevant articles for their particular review. Copyright © 2012 Elsevier Ltd. All rights reserved.
[A systematic review of the effectiveness of workplace safety interventions].
Baldasseroni, A; Olimpi, Nadia; Bonaccorsi, G
2009-01-01
The authors carried out a systematic review of the effectiveness of workplace safety interventions, as part of a wider project funded by CCM, the Centre for Disease Control. Several electronic bibliographic databases were checked, using a standardized search string. The string contained the following four items: the intervention; job features; type of injury; efficacy/effectiveness. Of the various databases consulted, Web of Science was the most efficient. Overall 5531 articles were retrieved. After reading the title and abstract, 4695 were excluded and eventually 35 systematic reviews were selected, which synthesized 769 original articles. The main topics of the selected systematic reviews were: certain sectors (building industry, agriculture, health care); personal protective equipment; work organization and prevention management at plant level; and evaluation of prevention policies by national and regional authorities. A clear need for searching multiple bibliographic databases emerged from this study.
PGMapper: a web-based tool linking phenotype to genes.
Xiong, Qing; Qiu, Yuhui; Gu, Weikuan
2008-04-01
With the availability of whole genome sequences in many species, linkage analysis, positional cloning and microarrays are gradually becoming powerful tools for investigating the links between phenotype and genotype or genes. However, in these methods, causative genes underlying a quantitative trait locus, or a disease, are usually located within a large genomic region or a large set of genes. Examining the function of every gene is very time consuming and requires retrieving and integrating information from multiple databases or genome resources. PGMapper is a software tool for automatically matching phenotype to genes from a defined genome region or a group of given genes by combining the mapping information from the Ensembl database and gene function information from the OMIM and PubMed databases. PGMapper is currently available for candidate gene search of human, mouse, rat, zebrafish and 12 other species. Available online at http://www.genediscovery.org/pgmapper/index.jsp.
Evidence-based librarianship: searching for the needed EBL evidence.
Eldredge, J D
2000-01-01
This paper discusses the challenges of finding the evidence needed to implement Evidence-Based Librarianship (EBL). Focusing first on database coverage for three health sciences librarianship journals, the article examines the information content of different databases, the strategies needed to search for relevant evidence in the library literature via these databases, and the problems associated with searching the grey literature of librarianship. Database coverage, plausible search strategies, and the grey literature of library science all pose challenges to finding the research evidence needed for practicing EBL. Health sciences librarians need to ensure that systems are designed that can track and provide access to the research evidence needed to support Evidence-Based Librarianship (EBL).
Finding similar nucleotide sequences using network BLAST searches.
Ladunga, Istvan
2009-06-01
The Basic Local Alignment Search Tool (BLAST) is a keystone of bioinformatics due to its performance and user-friendliness. Beginner and intermediate users will learn how to design and submit blastn and Megablast searches on the Web pages at the National Center for Biotechnology Information. We map nucleic acid sequences to genomes, find identical or similar mRNA, expressed sequence tag, and noncoding RNA sequences, and run Megablast searches, which are much faster than blastn. Understanding results is assisted by taxonomy reports, genomic views, and multiple alignments. We interpret expected frequency thresholds, biological significance, and statistical significance. Weak hits provide no evidence, but hints for further analyses. We find genes that may code for homologous proteins by translated BLAST. We reduce false positives by filtering out low-complexity regions. Parsed BLAST results can be integrated into analysis pipelines. Links in the output connect to Entrez, PUBMED, structural, sequence, interaction, and expression databases. This facilitates integration with a wide spectrum of biological knowledge.
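As an informal illustration (not part of the abstract above), the sketch below submits a remote blastn/Megablast search programmatically rather than through the NCBI web pages. Biopython's NCBIWWW interface and the placeholder query sequence are assumptions of this example, not tools named by the authors.

```python
# Minimal sketch of a remote blastn/Megablast search, assuming Biopython is installed.
# The query sequence and parameter choices are illustrative placeholders only.
from Bio.Blast import NCBIWWW, NCBIXML

query_seq = "ACGTACGTACGTACGTACGTACGTACGTACGT"  # hypothetical nucleotide query

# megablast=True selects the faster algorithm mentioned in the abstract;
# "nt" is the NCBI nucleotide collection.
handle = NCBIWWW.qblast("blastn", "nt", query_seq, megablast=True, expect=1e-10)

record = NCBIXML.read(handle)
for alignment in record.alignments[:5]:        # report the first few hits
    best_hsp = alignment.hsps[0]
    print(alignment.title, best_hsp.expect, best_hsp.identities)
```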
MIDAS: a database-searching algorithm for metabolite identification in metabolomics.
Wang, Yingfeng; Kora, Guruprasad; Bowen, Benjamin P; Pan, Chongle
2014-10-07
A database searching approach can be used for metabolite identification in metabolomics by matching measured tandem mass spectra (MS/MS) against the predicted fragments of metabolites in a database. Here, we present the open-source MIDAS algorithm (Metabolite Identification via Database Searching). To evaluate a metabolite-spectrum match (MSM), MIDAS first enumerates possible fragments from a metabolite by systematic bond dissociation, then calculates the plausibility of the fragments based on their fragmentation pathways, and finally scores the MSM to assess how well the experimental MS/MS spectrum from collision-induced dissociation (CID) is explained by the metabolite's predicted CID MS/MS spectrum. MIDAS was designed to search high-resolution tandem mass spectra acquired on time-of-flight or Orbitrap mass spectrometer against a metabolite database in an automated and high-throughput manner. The accuracy of metabolite identification by MIDAS was benchmarked using four sets of standard tandem mass spectra from MassBank. On average, for 77% of original spectra and 84% of composite spectra, MIDAS correctly ranked the true compounds as the first MSMs out of all MetaCyc metabolites as decoys. MIDAS correctly identified 46% more original spectra and 59% more composite spectra at the first MSMs than an existing database-searching algorithm, MetFrag. MIDAS was showcased by searching a published real-world measurement of a metabolome from Synechococcus sp. PCC 7002 against the MetaCyc metabolite database. MIDAS identified many metabolites missed in the previous study. MIDAS identifications should be considered only as candidate metabolites, which need to be confirmed using standard compounds. To facilitate manual validation, MIDAS provides annotated spectra for MSMs and labels observed mass spectral peaks with predicted fragments. The database searching and manual validation can be performed online at http://midas.omicsbio.org.
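The following toy sketch is not the MIDAS scoring function; assuming made-up fragment and peak lists, it only illustrates the general step of checking how many predicted fragment m/z values find a matching peak in an experimental spectrum within a tolerance.

```python
# Toy sketch of metabolite-spectrum matching by peak counting.
# This is NOT the MIDAS score; it only shows the idea of comparing predicted
# fragment m/z values against an experimental MS/MS peak list.
def match_score(predicted_mz, observed_peaks, tol=0.01):
    """Sum the intensities of observed peaks that lie within `tol` Da of a predicted fragment."""
    score = 0.0
    for mz in predicted_mz:
        for obs_mz, intensity in observed_peaks:
            if abs(mz - obs_mz) <= tol:
                score += intensity
                break
    return score

# Hypothetical data: predicted fragments for one candidate metabolite and one spectrum.
predicted = [59.013, 85.029, 113.024, 131.035]
spectrum = [(59.014, 1200.0), (85.030, 800.0), (150.100, 90.0)]
print(match_score(predicted, spectrum))  # 2000.0
```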
The PARIGA server for real time filtering and analysis of reciprocal BLAST results.
Orsini, Massimiliano; Carcangiu, Simone; Cuccuru, Gianmauro; Uva, Paolo; Tramontano, Anna
2013-01-01
BLAST-based similarity searches are commonly used in several applications involving both nucleotide and protein sequences. These applications span from simple tasks such as mapping sequences over a database to more complex procedures as clustering or annotation processes. When the amount of analysed data increases, manual inspection of BLAST results become a tedious procedure. Tools for parsing or filtering BLAST results for different purposes are then required. We describe here PARIGA (http://resources.bioinformatica.crs4.it/pariga/), a server that enables users to perform all-against-all BLAST searches on two sets of sequences selected by the user. Moreover, since it stores the two BLAST output in a python-serialized-objects database, results can be filtered according to several parameters in real-time fashion, without re-running the process and avoiding additional programming efforts. Results can be interrogated by the user using logical operations, for example to retrieve cases where two queries match same targets, or when sequences from the two datasets are reciprocal best hits, or when a query matches a target in multiple regions. The Pariga web server is designed to be a helpful tool for managing the results of sequence similarity searches. The design and implementation of the server renders all operations very fast and easy to use.
Treatment of Intravenous Leiomyomatosis with Cardiac Extension following Incomplete Resection.
Doyle, Mathew P; Li, Annette; Villanueva, Claudia I; Peeceeyen, Sheen C S; Cooper, Michael G; Hanel, Kevin C; Fermanis, Gary G; Robertson, Greg
2015-01-01
Aim. Intravenous leiomyomatosis (IVL) with cardiac extension (CE) is a rare variant of benign uterine leiomyoma. Incomplete resection has a recurrence rate of over 30%. Different hormonal treatments have been described following incomplete resection; however no standard therapy currently exists. We review the literature for medical treatment options following incomplete resection of IVL with CE. Methods. Electronic databases were searched for all studies reporting IVL with CE. These studies were then searched for reports of patients with inoperable or incomplete resection and any further medical treatments. Our database was searched for patients with medical therapy following incomplete resection of IVL with CE and their results were included. Results. All studies were either case reports or case series. Five literature reviews confirm that surgery is the only treatment to achieve cure. The uses of progesterone, estrogen modulation, gonadotropin-releasing hormone antagonism, and aromatase inhibition have been described following incomplete resection. Currently no studies have reviewed the outcomes of these treatments. Conclusions. Complete surgical resection is the only means of cure for IVL with CE, while multiple hormonal therapies have been used with varying results following incomplete resection. Aromatase inhibitors are the only reported treatment to prevent tumor progression or recurrence in patients with incompletely resected IVL with CE.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Park, Yubin; Shankar, Mallikarjun; Park, Byung H.
Designing a database system for both efficient data management and data services has been one of the enduring challenges in the healthcare domain. In many healthcare systems, data services and data management are often viewed as two orthogonal tasks; data services refer to retrieval and analytic queries such as search, joins, statistical data extraction, and simple data mining algorithms, while data management refers to building error-tolerant and non-redundant database systems. The gap between service and management has resulted in rigid database systems and schemas that do not support effective analytics. We compose a rich graph structure from an abstracted healthcare RDBMS to illustrate how we can fill this gap in practice. We show how a healthcare graph can be automatically constructed from a normalized relational database using the proposed 3NF Equivalent Graph (3EG) transformation. We discuss a set of real world graph queries such as finding self-referrals, shared providers, and collaborative filtering, and evaluate their performance over a relational database and its 3EG-transformed graph. Experimental results show that the graph representation serves as multiple de-normalized tables, thus reducing complexity in a database and enhancing data accessibility of users. Based on this finding, we propose an ensemble framework of databases for healthcare applications.
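A minimal sketch of the general idea of turning normalized relational rows into a queryable graph is given below; it uses networkx and hypothetical patient/provider tables, and is not the published 3EG transformation.

```python
# Illustrative sketch only: turning rows from a normalized relational schema into a graph,
# in the spirit of (but not identical to) the 3NF Equivalent Graph idea.
# Table and column names are hypothetical.
import networkx as nx

# Pretend rows fetched from two tables: patients, and a visits table that links
# patients to providers (foreign keys become edges).
patients = [(1, "Alice"), (2, "Bob")]
visits = [(100, 1, "prov_A"), (101, 1, "prov_B"), (102, 2, "prov_A")]

G = nx.Graph()
for pid, name in patients:
    G.add_node(("patient", pid), name=name)
for vid, pid, provider in visits:
    G.add_node(("provider", provider))
    G.add_edge(("patient", pid), ("provider", provider), visit=vid)

# A "shared providers" style query then becomes a simple graph traversal:
shared = [n for n in G if n[0] == "provider" and G.degree(n) > 1]
print(shared)  # [('provider', 'prov_A')]
```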
In Silico PCR Tools for a Fast Primer, Probe, and Advanced Searching.
Kalendar, Ruslan; Muterko, Alexandr; Shamekova, Malika; Zhambakin, Kabyl
2017-01-01
The polymerase chain reaction (PCR) is fundamental to molecular biology and is the most important practical molecular technique for the research laboratory. The principle of this technique has been further used and applied in numerous other simple or complex nucleic acid amplification technologies (NAAT). In parallel to laboratory "wet bench" experiments for nucleic acid amplification technologies, in silico or virtual (bioinformatics) approaches have been developed, among which is in silico PCR analysis. In silico NAAT analysis is a useful and efficient complementary method to ensure the specificity of primers or probes for an extensive range of PCR applications, including homology gene discovery, molecular diagnosis, DNA fingerprinting, and repeat searching. Predicting the sensitivity and specificity of primers and probes requires a search to determine whether they match a database with an optimal number of mismatches, similarity, and stability. In the development of in silico bioinformatics tools for nucleic acid amplification technologies, the prospects for the development of new NAAT or similar approaches should be taken into account, including forward-looking and comprehensive analysis that is not limited to only one PCR technique variant. The software FastPCR and the online Java web tool are integrated tools for in silico PCR of linear and circular DNA, multiple primer or probe searches in large or small databases, and advanced searches. These tools are suitable for processing batch files, which is essential for automation when working with large amounts of data. The FastPCR software is available for download at http://primerdigital.com/fastpcr.html and the online Java version at http://primerdigital.com/tools/pcr.html.
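The toy sketch below is not FastPCR; assuming a made-up template and primer, it merely illustrates the kind of mismatch-tolerant primer-site scan that an in silico PCR check performs.

```python
# Toy sketch (not FastPCR): scan a template for near-matches to a primer,
# reporting positions with at most `max_mismatches` mismatches.
def primer_sites(template, primer, max_mismatches=1):
    sites = []
    for i in range(len(template) - len(primer) + 1):
        window = template[i:i + len(primer)]
        mismatches = sum(1 for a, b in zip(window, primer) if a != b)
        if mismatches <= max_mismatches:
            sites.append((i, mismatches))
    return sites

# Hypothetical template and primer sequences.
print(primer_sites("ACGTTTACGAACGTTGACG", "ACGTT", max_mismatches=1))
```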
Fitzgerald-Yau, Natasha; Viner, Russell Mark
2014-01-01
We systematically searched 9 biomedical and social science databases (1980–2012) for primary and secondary interventions that prevented or reduced 2 or more adolescent health risk behaviors (tobacco use, alcohol use, illicit drug use, risky sexual behavior, aggressive acts). We identified 44 randomized controlled trials of universal or selective interventions that were effective for multiple health risk behaviors. Most were school based, conducted in the United States, and effective for multiple forms of substance use. Effects were small, in line with findings for other universal prevention programs. In some studies, effects for more than 1 health risk behavior only emerged at long-term follow-up. Integrated prevention programs are feasible and effective and may be more efficient than discrete prevention strategies. PMID:24625172
System, method and apparatus for conducting a keyterm search
NASA Technical Reports Server (NTRS)
McGreevy, Michael W. (Inventor)
2004-01-01
A keyterm search is a method of searching a database for subsets of the database that are relevant to an input query. First, a number of relational models of subsets of a database are provided. A query is then input. The query can include one or more keyterms. Next, a gleaning model of the query is created. The gleaning model of the query is then compared to each one of the relational models of subsets of the database. The identifiers of the relevant subsets are then output.
System, method and apparatus for conducting a phrase search
NASA Technical Reports Server (NTRS)
McGreevy, Michael W. (Inventor)
2004-01-01
A phrase search is a method of searching a database for subsets of the database that are relevant to an input query. First, a number of relational models of subsets of a database are provided. A query is then input. The query can include one or more sequences of terms. Next, a relational model of the query is created. The relational model of the query is then compared to each one of the relational models of subsets of the database. The identifiers of the relevant subsets are then output.
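The two records above describe comparing a model of the query against relational models of database subsets. The sketch below approximates that compare-and-rank skeleton with plain term-frequency vectors and cosine similarity; this simplification and the example documents are assumptions of the illustration, not the patented method.

```python
# Illustrative sketch only: ranking database subsets against a query by comparing
# simple term-frequency "models". The patented keyterm/phrase searches build richer
# relational (contextual) models; this just shows the compare-and-rank skeleton.
from collections import Counter
import math

def model(text):
    return Counter(text.lower().split())

def cosine(m1, m2):
    dot = sum(m1[t] * m2[t] for t in m1)
    norm = math.sqrt(sum(v * v for v in m1.values())) * math.sqrt(sum(v * v for v in m2.values()))
    return dot / norm if norm else 0.0

# Hypothetical database subsets (e.g. incident report narratives).
subsets = {
    "doc1": "engine fire warning during climb",
    "doc2": "runway incursion reported by tower",
}
query_model = model("engine fire")
ranked = sorted(subsets, key=lambda k: cosine(query_model, model(subsets[k])), reverse=True)
print(ranked)  # identifiers of the most relevant subsets first
```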
PIPI: PTM-Invariant Peptide Identification Using Coding Method.
Yu, Fengchao; Li, Ning; Yu, Weichuan
2016-12-02
In computational proteomics, the identification of peptides with an unlimited number of post-translational modification (PTM) types is a challenging task. The computational cost associated with database search increases exponentially with respect to the number of modified amino acids and linearly with respect to the number of potential PTM types at each amino acid. The problem becomes intractable very quickly if we want to enumerate all possible PTM patterns. To address this issue, one group of methods, named restricted tools (including Mascot, Comet, and MS-GF+), only allows a small number of PTM types in the database search process. Alternatively, the other group of methods, named unrestricted tools (including MS-Alignment, ProteinProspector, and MODa), avoids enumerating PTM patterns with an alignment-based approach to localizing and characterizing modified amino acids. However, because of the large search space and the PTM localization issue, the sensitivity of these unrestricted tools is low. This paper proposes a novel method named PIPI to achieve PTM-invariant peptide identification. PIPI belongs to the category of unrestricted tools. It first codes peptide sequences into Boolean vectors and codes experimental spectra into real-valued vectors. For each coded spectrum, it then searches the coded sequence database to find the top-scored peptide sequences as candidates. After that, PIPI uses dynamic programming to localize and characterize modified amino acids in each candidate. We used simulation experiments and real data experiments to evaluate the performance in comparison with restricted tools (i.e., Mascot, Comet, and MS-GF+) and unrestricted tools (i.e., Mascot with error tolerant search, MS-Alignment, ProteinProspector, and MODa). Comparison with restricted tools shows that PIPI has comparable sensitivity and running speed. Comparison with unrestricted tools shows that PIPI has the highest sensitivity except for Mascot with error tolerant search and ProteinProspector. These two tools simplify the task by only considering up to one modified amino acid in each peptide, which results in a higher sensitivity but has difficulty in dealing with multiple modified amino acids. The simulation experiments also show that PIPI has the lowest false discovery proportion, the highest PTM characterization accuracy, and the shortest running time among the unrestricted tools.
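A minimal sketch of the coding idea (not the PIPI implementation) follows: predicted fragment masses are coded into a Boolean vector over m/z bins, spectra into real-valued vectors over the same bins, and candidates are ranked by dot product. The bin width, fragment lists and sequences are placeholder assumptions.

```python
# Illustrative sketch only (not the PIPI implementation): code predicted fragment
# masses into a Boolean vector over m/z bins, code a spectrum into a real-valued
# vector over the same bins, and score candidates by dot product.
import numpy as np

BIN_WIDTH = 1.0   # Da; a coarse, hypothetical bin size
N_BINS = 2000

def code_fragments(fragment_mzs):
    v = np.zeros(N_BINS, dtype=bool)
    for mz in fragment_mzs:
        v[int(mz / BIN_WIDTH)] = True
    return v

def code_spectrum(peaks):
    v = np.zeros(N_BINS)
    for mz, intensity in peaks:
        v[int(mz / BIN_WIDTH)] += intensity
    return v

spectrum_vec = code_spectrum([(175.1, 300.0), (262.1, 150.0), (401.2, 90.0)])
candidates = {"CANDIDATE_A": [175.1, 262.1], "CANDIDATE_B": [120.0, 500.0]}  # hypothetical
scores = {p: float(code_fragments(f) @ spectrum_vec) for p, f in candidates.items()}
print(max(scores, key=scores.get))  # top-scoring candidate, passed on to PTM localization
```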
Physiotherapy Rehabilitation for People With Progressive Multiple Sclerosis: A Systematic Review.
Campbell, Evan; Coulter, Elaine H; Mattison, Paul G; Miller, Linda; McFadyen, Angus; Paul, Lorna
2016-01-01
To assess the efficacy of physiotherapy interventions, including exercise therapy, for the rehabilitation of people with progressive multiple sclerosis. Five databases (Cochrane Library, Physiotherapy Evidence Database [PEDro], Web of Science Core Collections, MEDLINE, Embase) and reference lists of relevant articles were searched. Randomized experimental trials, including participants with progressive multiple sclerosis and investigating a physiotherapy intervention or an intervention containing a physiotherapy element, were included. Data were independently extracted using a standardized form, and methodologic quality was assessed using the PEDro scale. Thirteen studies (described by 15 articles) were identified and scored between 5 and 9 out of 10 on the PEDro scale. Eight interventions were assessed: exercise therapy, multidisciplinary rehabilitation, functional electrical stimulation, botulinum toxin type A injections and manual stretches, inspiratory muscle training, therapeutic standing, acupuncture, and body weight-supported treadmill training. All studies, apart from 1, produced positive results in at least 1 outcome measure; however, only 1 article used a power calculation to determine the sample size and because of dropouts the results were subsequently underpowered. This review suggests that physiotherapy may be effective for the rehabilitation of people with progressive multiple sclerosis. However, further appropriately powered studies are required. Copyright © 2016 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
On Building a Search Interface Discovery System
NASA Astrophysics Data System (ADS)
Shestakov, Denis
A huge portion of the Web known as the deep Web is accessible via search interfaces to myriads of databases on the Web. While relatively good approaches for querying the contents of web databases have been proposed recently, one cannot fully utilize them while most search interfaces remain unlocated. Thus, the automatic recognition of search interfaces to online databases is crucial for any application accessing the deep Web. This paper describes the architecture of the I-Crawler, a system for finding and classifying search interfaces. The I-Crawler is intentionally designed to be used in deep web characterization surveys and for constructing directories of deep web resources.
Cruickshank, Travis M; Reyes, Alvaro R; Ziman, Melanie R
2015-01-01
Strength training has, in recent years, been shown to be beneficial for people with Parkinson disease and multiple sclerosis. Consensus regarding its utility for these disorders nevertheless remains contentious among healthcare professionals. Greater clarity is required, especially in regards to the type and magnitude of effects as well as the response differences to strength training between individuals with Parkinson disease or multiple sclerosis. This study examines the effects, magnitude of those effects, and response differences to strength training between patients with Parkinson disease or multiple sclerosis. A comprehensive search of electronic databases including Physiotherapy Evidence Database scale, PubMed, EMBASE, Cochrane Central Register of Controlled Trials, and CINAHL was conducted from inception to July 2014. English articles investigating the effect of strength training for individuals with neurodegenerative disorders were selected. Strength training trials that met the inclusion criteria were found for individuals with Parkinson disease or multiple sclerosis. Individuals with Parkinson disease or multiple sclerosis were included in the study. Strength training interventions included traditional (free weights/machine exercises) and nontraditional programs (eccentric cycling). Included articles were critically appraised using the Physiotherapy Evidence Database scale. Of the 507 articles retrieved, only 20 articles met the inclusion criteria. Of these, 14 were randomized and 6 were nonrandomized controlled articles in Parkinson disease or multiple sclerosis. Six randomized and 2 nonrandomized controlled articles originated from 3 trials and were subsequently pooled for systematic analysis. Strength training was found to significantly improve muscle strength in people with Parkinson disease (15%-83.2%) and multiple sclerosis (4.5%-36%). Significant improvements in mobility (11.4%) and disease progression were also reported in people with Parkinson disease after strength training. Furthermore, significant improvements in fatigue (8.2%), functional capacity (21.5%), quality of life (8.3%), power (17.6%), and electromyography activity (24.4%) were found in individuals with multiple sclerosis after strength training. The limitations of the study were the heterogeneity of interventions and study outcomes in Parkinson disease and multiple sclerosis trials. Strength training is useful for increasing muscle strength in Parkinson disease and to a lesser extent multiple sclerosis.
Mahar, Alyson L; Cobigo, Virginie; Stuart, Heather
2013-06-01
To develop a transdisciplinary conceptualization of social belonging that could be used to guide measurement approaches aimed at evaluating the effectiveness of community-based programs for people with disabilities. We conducted a narrative, scoping review of peer reviewed English language literature published between 1990 and July 2011 using multiple databases, with "sense of belonging" as a key search term. The search engine ranked articles for relevance to the search strategy. Articles were searched in order until theoretical saturation was reached. We augmented this search strategy by reviewing reference lists of relevant papers. Theoretical saturation was reached after 40 articles; 22 of which were qualitative accounts. We identified five intersecting themes: subjectivity; groundedness to an external referent; reciprocity; dynamism and self-determination. We define a sense of belonging as a subjective feeling of value and respect derived from a reciprocal relationship to an external referent that is built on a foundation of shared experiences, beliefs or personal characteristics. These feelings of external connectedness are grounded to the context or referent group, to whom one chooses, wants and feels permission to belong. This dynamic phenomenon may be either hindered or promoted by complex interactions between environmental and personal factors.
Barrett, Annette; Terry, Daniel R; Lê, Quynh; Hoang, Ha
2016-02-01
This review sought to better understand the issues and challenges experienced by community nurses working in rural areas and how these factors shape their role. Databases were searched to identify relevant studies, published between 1990 and 2015, that focussed on issues and challenges experienced by rural community nurses. Generic and grey literature relating to the subject was also searched. The search was systematically conducted multiple times to assure accuracy. A total of 14 articles met the inclusion criteria. This critical review identified common issues impacting community nursing and included role definition, organisational change, human resource, workplace and geographic challenges. Community nurses are flexible, autonomous, able to adapt care to the service delivery setting, and have a diversity of knowledge and skills. Considerably more research is essential to identify factors that impact rural community nursing practice. In addition, greater advocacy is required to develop the role.
Flowers, Natalie L
2010-01-01
CodeSlinger is a desktop application that was developed to aid medical professionals in the intertranslation, exploration, and use of biomedical coding schemes. The application was designed to provide a highly intuitive, easy-to-use interface that simplifies a complex business problem: a set of time-consuming, laborious tasks that were regularly performed by a group of medical professionals involving manually searching coding books, searching the Internet, and checking documentation references. A workplace observation session with a target user revealed the details of the current process and a clear understanding of the business goals of the target user group. These goals drove the design of the application's interface, which centers on searches for medical conditions and displays the codes found in the application's database that represent those conditions. The interface also allows the exploration of complex conceptual relationships across multiple coding schemes.
Mapping the literature of transcultural nursing*
Murphy, Sharon C.
2006-01-01
Overview: No bibliometric studies of the literature of the field of transcultural nursing have been published. This paper describes a citation analysis as part of the project undertaken by the Nursing and Allied Health Resources Section of the Medical Library Association to map the literature of nursing. Objective: The purpose of this study was to identify the core literature and determine which databases provided the most complete access to the transcultural nursing literature. Methods: Cited references from essential source journals were analyzed for a three-year period. Eight major databases were compared for indexing coverage of the identified core list of journals. Results: This study identifies 138 core journals. Transcultural nursing relies on journal literature from associated health sciences fields in addition to nursing. Books provide an important format. Nearly all cited references were from the previous 18 years. In comparing indexing coverage among 8 major databases, 3 databases rose to the top. Conclusions: No single database can claim comprehensive indexing coverage for this broad field. It is essential to search multiple databases. Based on this study, PubMed/MEDLINE, Social Sciences Citation Index, and CINAHL provide the best coverage. Collections supporting transcultural nursing require robust access to literature beyond nursing publications. PMID:16710461
Mathematical Notation in Bibliographic Databases.
ERIC Educational Resources Information Center
Pasterczyk, Catherine E.
1990-01-01
Discusses ways in which using mathematical symbols to search online bibliographic databases in scientific and technical areas can improve search results. The representations used for Greek letters, relations, binary operators, arrows, and miscellaneous special symbols in the MathSci, Inspec, Compendex, and Chemical Abstracts databases are…
Su, Xiaoquan; Xu, Jian; Ning, Kang
2012-10-01
Scientists have long been seeking ways to effectively compare different microbial communities (also referred to as 'metagenomic samples' here) on a large scale: given a set of unknown samples, find similar metagenomic samples from a large repository and examine how similar these samples are. With the metagenomic samples accumulated to date, it is possible to build a database of metagenomic samples of interest. Any metagenomic sample could then be searched against this database to find the most similar metagenomic sample(s). However, on one hand, current databases with a large number of metagenomic samples mostly serve as data repositories that offer few functionalities for analysis; and on the other hand, methods to measure the similarity of metagenomic data work well only for small sets of samples by pairwise comparison. It is not yet clear how to efficiently search for metagenomic samples against a large metagenomic database. In this study, we have proposed a novel method, Meta-Storms, that could systematically and efficiently organize and search metagenomic data. It includes the following components: (i) creating a database of metagenomic samples based on their taxonomical annotations, (ii) efficient indexing of samples in the database based on a hierarchical taxonomy indexing strategy, (iii) searching for a metagenomic sample against the database using a fast scoring function based on quantitative phylogeny and (iv) managing the database by index export, index import, data insertion, data deletion and database merging. We have collected more than 1300 metagenomic datasets from the public domain and in-house facilities, and tested the Meta-Storms method on these datasets. Our experimental results show that Meta-Storms is capable of database creation and effective searching for a large number of metagenomic samples, and it could achieve similar accuracies compared with the current popular significance testing-based methods. The Meta-Storms method would serve as a suitable database management and search system to quickly identify similar metagenomic samples from a large pool of samples. Contact: ningkang@qibebt.ac.cn. Supplementary data are available at Bioinformatics online.
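As a toy illustration only, the sketch below searches a small "database" of samples by cosine similarity over flat taxon abundance vectors; Meta-Storms itself uses a phylogeny-aware score and hierarchical indexing, and the sample profiles here are invented.

```python
# Toy sketch only: searching a metagenomic-sample "database" by comparing taxonomic
# abundance profiles. Meta-Storms uses a phylogeny-aware score and hierarchical
# indexing; a plain cosine similarity over flat taxon abundances stands in here.
import math

def cosine(a, b):
    taxa = set(a) | set(b)
    dot = sum(a.get(t, 0.0) * b.get(t, 0.0) for t in taxa)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical abundance profiles keyed by taxon name.
database = {
    "gut_sample_1": {"Bacteroides": 0.5, "Faecalibacterium": 0.3, "Escherichia": 0.2},
    "soil_sample_1": {"Streptomyces": 0.6, "Bradyrhizobium": 0.4},
}
query = {"Bacteroides": 0.45, "Escherichia": 0.25, "Faecalibacterium": 0.30}
best = max(database, key=lambda k: cosine(database[k], query))
print(best)  # gut_sample_1
```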
Secure searching of biomarkers through hybrid homomorphic encryption scheme.
Kim, Miran; Song, Yongsoo; Cheon, Jung Hee
2017-07-26
As genome sequencing technology develops rapidly, there has lately been an increasing need to keep genomic data secure even when stored in the cloud and still used for research. We are interested in designing a protocol for the secure outsourcing of the matching problem on encrypted data. We propose an efficient method to securely search for a matching position with the query data and extract some information at that position. After decryption, only a small number of comparisons with the query information need to be performed in the plaintext state. We apply this method to find a set of biomarkers in encrypted genomes. The important feature of our method is to encode a genomic database as a single element of a polynomial ring. Since our method requires a single homomorphic multiplication of the hybrid scheme for query computation, it has an advantage over previous methods in parameter size, computation complexity, and communication cost. In particular, the extraction procedure not only prevents leakage of database information that has not been queried by the user but also reduces the communication cost by half. We evaluate the performance of our method and verify that computation on large-scale personal data can be securely and practically outsourced to a cloud environment during data analysis. It takes about 3.9 s to search-and-extract the reference and alternate sequences at the queried position in a database of size 4M. Our solution for finding a set of biomarkers in DNA sequences shows the progress of cryptographic techniques in terms of their capability to support real-world genome data analysis in a cloud environment.
ERIC Educational Resources Information Center
Lowe, M. Sara; Maxson, Bronwen K.; Stone, Sean M.; Miller, Willie; Snajdr, Eric; Hanna, Kathleen
2018-01-01
Boolean logic can be a difficult concept for first-year, introductory students to grasp. This paper compares the results of Boolean and natural language searching across several databases with searches created from student research questions. Performance differences between databases varied. Overall, natural search language is at least as good as…
A World Wide Web (WWW) server database engine for an organelle database, MitoDat.
Lemkin, P F; Chipperfield, M; Merril, C; Zullo, S
1996-03-01
We describe a simple database search engine, "dbEngine", which may be used to quickly create a searchable database on a World Wide Web (WWW) server. Data may be prepared from spreadsheet programs (such as Excel, etc.) or from tables exported from relational database systems. This Common Gateway Interface (CGI-BIN) program is used with a WWW server such as those available commercially, or from the National Center for Supercomputing Applications (NCSA) or CERN. Its capabilities include: (i) searching records by combinations of terms connected with ANDs or ORs; (ii) returning search results as hypertext links to other WWW database servers; (iii) mapping lists of literature reference identifiers to the full references; (iv) creating bidirectional hypertext links between pictures and the database. DbEngine has been used to support the MitoDat database (Mendelian and non-Mendelian inheritance associated with the Mitochondrion) on the WWW.
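A minimal sketch of capability (i), filtering records by terms combined with AND or OR, is shown below; the records and field names are invented and this is not the dbEngine CGI code.

```python
# Toy sketch (not dbEngine itself): filtering spreadsheet-style records by terms
# combined with AND or OR, the first capability listed in the abstract.
records = [
    {"gene": "ATP5A1", "location": "inner membrane", "disease": "none"},
    {"gene": "MT-CO1", "location": "inner membrane", "disease": "mitochondrial myopathy"},
]

def matches(record, terms, mode="AND"):
    text = " ".join(str(v).lower() for v in record.values())
    hits = [term.lower() in text for term in terms]
    return all(hits) if mode == "AND" else any(hits)

print([r["gene"] for r in records if matches(r, ["membrane", "myopathy"], mode="AND")])
# ['MT-CO1']
```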
A comprehensive and scalable database search system for metaproteomics.
Chatterjee, Sandip; Stupp, Gregory S; Park, Sung Kyu Robin; Ducom, Jean-Christophe; Yates, John R; Su, Andrew I; Wolan, Dennis W
2016-08-16
Mass spectrometry-based shotgun proteomics experiments rely on accurate matching of experimental spectra against a database of protein sequences. Existing computational analysis methods are limited in the size of their sequence databases, which severely restricts the proteomic sequencing depth and functional analysis of highly complex samples. The growing amount of public high-throughput sequencing data will only exacerbate this problem. We designed a broadly applicable metaproteomic analysis method (ComPIL) that addresses protein database size limitations. Our approach to overcome this significant limitation in metaproteomics was to design a scalable set of sequence databases assembled for optimal library querying speeds. ComPIL was integrated with a modified version of the search engine ProLuCID (termed "Blazmass") to permit rapid matching of experimental spectra. Proof-of-principle analysis of human HEK293 lysate with a ComPIL database derived from high-quality genomic libraries was able to detect nearly all of the same peptides as a search with a human database (~500x fewer peptides in the database), with a small reduction in sensitivity. We were also able to detect proteins from the adenovirus used to immortalize these cells. We applied our method to a set of healthy human gut microbiome proteomic samples and showed a substantial increase in the number of identified peptides and proteins compared to previous metaproteomic analyses, while retaining a high degree of protein identification accuracy and allowing for a more in-depth characterization of the functional landscape of the samples. The combination of ComPIL with Blazmass allows proteomic searches to be performed with database sizes much larger than previously possible. These large database searches can be applied to complex meta-samples with unknown composition or proteomic samples where unexpected proteins may be identified. The protein database, proteomic search engine, and the proteomic data files for the 5 microbiome samples characterized and discussed herein are open source and available for use and additional analysis.
Moskovets, Eugene; Goloborodko, Anton A; Gorshkov, Alexander V; Gorshkov, Mikhail V
2012-07-01
A two-dimensional (2-D) liquid chromatography (LC) separation of complex peptide mixtures that combines a normal phase utilizing hydrophilic interactions with a reversed phase reportedly offers the highest level of 2-D LC orthogonality by providing an even spread of peptides across multiple LC fractions. Matching experimental peptide retention times to those predicted by empirical models describing chromatographic separation in each LC dimension leads to a significant reduction in the database search space. In this work, we calculated the retention times of tryptic peptides separated in the C18 reversed phase at different separation conditions (pH 2 and pH 10) and in the TSK gel Amide-80 normal phase. We show that retention times calculated for different 2-D LC separation schemes utilizing these phases start to correlate once the mass range of peptides under analysis becomes progressively narrower. This effect is explained by a high degree of correlation between retention coefficients in the considered phases. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Pathway Tools version 19.0 update: software for pathway/genome informatics and systems biology
Latendresse, Mario; Paley, Suzanne M.; Krummenacker, Markus; Ong, Quang D.; Billington, Richard; Kothari, Anamika; Weaver, Daniel; Lee, Thomas; Subhraveti, Pallavi; Spaulding, Aaron; Fulcher, Carol; Keseler, Ingrid M.; Caspi, Ron
2016-01-01
Pathway Tools is a bioinformatics software environment with a broad set of capabilities. The software provides genome-informatics tools such as a genome browser, sequence alignments, a genome-variant analyzer and comparative-genomics operations. It offers metabolic-informatics tools, such as metabolic reconstruction, quantitative metabolic modeling, prediction of reaction atom mappings and metabolic route search. Pathway Tools also provides regulatory-informatics tools, such as the ability to represent and visualize a wide range of regulatory interactions. This article outlines the advances in Pathway Tools in the past 5 years. Major additions include components for metabolic modeling, metabolic route search, computation of atom mappings and estimation of compound Gibbs free energies of formation; addition of editors for signaling pathways, for genome sequences and for cellular architecture; storage of gene essentiality data and phenotype data; display of multiple alignments, and of signaling and electron-transport pathways; and development of Python and web-services application programming interfaces. Scientists around the world have created more than 9800 Pathway/Genome Databases by using Pathway Tools, many of which are curated databases for important model organisms. PMID:26454094
Paracetamol (acetaminophen) poisoning.
Park, B Kevin; Dear, James W; Antoine, Daniel J
2015-10-19
Paracetamol directly causes around 150 deaths per year in the UK. We conducted a systematic overview, aiming to answer the following clinical question: What are the effects of treatments for acute paracetamol poisoning? We searched: Medline, Embase, The Cochrane Library, and other important databases up to October 2014 (Clinical Evidence overviews are updated periodically; please check our website for the most up-to-date version of this overview). At this update, searching of electronic databases retrieved 127 studies. After deduplication and removal of conference abstracts, 64 records were screened for inclusion in the overview. Appraisal of titles and abstracts led to the exclusion of 46 studies and the further review of 18 full publications. Of the 18 full articles evaluated, one systematic review was updated and one RCT was added at this update. In addition, two systematic reviews and three RCTs not meeting our inclusion criteria were added to the Comment sections. We performed a GRADE evaluation for three PICO combinations. In this systematic overview we categorised the efficacy for six interventions, based on information about the effectiveness and safety of activated charcoal (single or multiple dose), gastric lavage, haemodialysis, liver transplant, methionine, and acetylcysteine.
Network meta-analyses could be improved by searching more sources and by involving a librarian.
Li, Lun; Tian, Jinhui; Tian, Hongliang; Moher, David; Liang, Fuxiang; Jiang, Tongxiao; Yao, Liang; Yang, Kehu
2014-09-01
Network meta-analyses (NMAs) aim to rank the benefits (or harms) of interventions, based on all available randomized controlled trials. Thus, the identification of relevant data is critical. We assessed the conduct of the literature searches in NMAs. Published NMAs were retrieved by searching electronic bibliographic databases and other sources. Two independent reviewers selected studies and five trained reviewers abstracted data regarding literature searches, in duplicate. Search method details were examined using descriptive statistics. Two hundred forty-nine NMAs were included. Eight used previous systematic reviews to identify primary studies without further searching, and five did not report any literature searches. In the 236 studies that used electronic databases to identify primary studies, the median number of databases was 3 (interquartile range: 3-5). MEDLINE, EMBASE, and Cochrane Central Register of Controlled Trials were the most commonly used databases. The most common supplemental search methods included reference lists of included studies (48%), reference lists of previous systematic reviews (40%), and clinical trial registries (32%). None of these supplemental methods was conducted in more than 50% of the NMAs. Literature searches in NMAs could be improved by searching more sources, and by involving a librarian or information specialist. Copyright © 2014 Elsevier Inc. All rights reserved.
Single- or multiple-visit endodontics: which technique results in fewest postoperative problems?
Balto, Khaled
2009-01-01
The Cochrane Central Register of Controlled Trials, Medline, Embase, six thesis databases (Networked Digital Library of Theses and Dissertations, Proquest Digital Dissertations, OAIster, Index to Theses, Australian Digital Thesis Program and Dissertation.com) and one conference report database (BIOSIS Previews) were searched. There were no language restrictions. Studies were included if subjects had a noncontributory medical history; underwent nonsurgical root canal treatment during the study; there was comparison between single- and multiple-visit root canal treatment; and if outcome was measured in terms of pain degree or prevalence of flare-up. Data were extracted using a standard data extraction sheet. Because of variations in recorded outcomes and methodological and clinical heterogeneity, a meta-analysis was not carried out, although a qualitative synthesis was presented. Sixteen studies fitted the inclusion criteria in the review, with sample sizes varying from 60 to 1012 cases. The prevalence of postoperative pain ranged from 3% to 58%. The heterogeneity of the included studies was far too great to yield meaningful results from a meta-analysis. Compelling evidence is lacking to indicate any significantly different prevalence of postoperative pain or flare-up following either single- or multiple-visit root canal treatment.
cMapper: gene-centric connectivity mapper for EBI-RDF platform.
Shoaib, Muhammad; Ansari, Adnan Ahmad; Ahn, Sung-Min
2017-01-15
In this era of biological big data, data integration has become a common task and a challenge for biologists. The Resource Description Framework (RDF) was developed to enable interoperability of heterogeneous datasets. The EBI-RDF platform enables efficient data integration of six independent biological databases using RDF technologies and shared ontologies. However, to take advantage of this platform, biologists need to be familiar with RDF technologies and the SPARQL query language. To overcome this practical limitation of the EBI-RDF platform, we developed cMapper, a web-based tool that enables biologists to search the EBI-RDF databases in a gene-centric manner without a thorough knowledge of RDF and SPARQL. cMapper allows biologists to search data entities in the EBI-RDF platform that are connected to genes or small molecules of interest in multiple biological contexts. The input to cMapper consists of a set of genes or small molecules, and the output is the set of data entities in six independent EBI-RDF databases connected with the given genes or small molecules in the user's query. cMapper provides output to users in the form of a graph in which nodes represent data entities and edges represent connections between data entities and the inputted set of genes or small molecules. Furthermore, users can apply filters based on database, taxonomy, organ and pathways in order to focus on a core connectivity graph of their interest. Data entities from multiple databases are differentiated based on background colors. cMapper also enables users to investigate shared connections between genes or small molecules of interest. Users can view the output graph in a web browser or download it in either GraphML or JSON format. cMapper is available as a web application with an integrated MySQL database. The web application was developed using Java and deployed on a Tomcat server. We developed the user interface using HTML5, JQuery and the Cytoscape Graph API. cMapper can be accessed at http://cmapper.ewostech.net. Readers can download the development manual from http://cmapper.ewostech.net/docs/cMapperDocumentation.pdf. Source code is available at https://github.com/muhammadshoaib/cmapper. Contact: smahn@gachon.ac.kr. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Data Discovery and Access via the Heliophysics Events Knowledgebase (HEK)
NASA Astrophysics Data System (ADS)
Somani, A.; Hurlburt, N. E.; Schrijver, C. J.; Cheung, M.; Freeland, S.; Slater, G. L.; Seguin, R.; Timmons, R.; Green, S.; Chang, L.; Kobashi, A.; Jaffey, A.
2011-12-01
The HEK is an integrated system which helps direct scientists to solar events and data from a variety of providers. The system is fully operational and adoption of HEK has been growing since the launch of NASA's SDO mission. In this presentation we describe the different components that comprise HEK. The Heliophysics Events Registry (HER) and the Heliophysics Coverage Registry (HCR) form the two major databases behind the system. The HCR allows the user to search on coverage event metadata for a variety of instruments. The HER allows the user to search on annotated event metadata for a variety of instruments. Both the HCR and HER are accessible via a web API which can return search results in machine-readable formats (e.g. XML and JSON). A variety of SolarSoft services are also provided to allow users to search the HEK as well as obtain and manipulate data. Other components include: the Event Detection System (EDS), which continually runs feature finding algorithms on SDO data to populate the HER with relevant events; a web form for users to request SDO data cutouts for multiple AIA channels as well as HMI line-of-sight magnetograms; iSolSearch, which allows a user to browse events in the HER and search for specific events over a specific time interval, all within a graphical web page; Panorama, the software tool used for rapid visualization of large volumes of solar image data in multiple channels/wavelengths, from which the user can also easily create WYSIWYG movies and launch the Annotator tool to describe events and features; and EVACS, which provides a JOGL-powered client for the HER and HCR. EVACS displays the searched-for events on a full-disk magnetogram of the sun while displaying more detailed information for events.
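One assumed way to hit the HER web API programmatically is through the third-party sunpy HEK client, sketched below; sunpy is not part of the system described above, and the time range and event type ('FL' for flare) are placeholders.

```python
# Sketch of querying the HER through the sunpy HEK client (sunpy is assumed here as a
# convenient third-party wrapper around the web API; it is not part of the HEK system
# itself). The time range and event type are placeholder values.
from sunpy.net import hek

client = hek.HEKClient()
results = client.search(
    hek.attrs.Time("2011/08/09 07:00:00", "2011/08/09 12:00:00"),
    hek.attrs.EventType("FL"),
)
# Each row carries annotated event metadata stored in the HER.
for event in results[:5]:
    print(event["event_starttime"], event["fl_goescls"])
```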
Teaching Data Base Search Strategies.
ERIC Educational Resources Information Center
Hannah, Larry
1987-01-01
Discusses database searching as a method for developing thinking skills, and describes an activity suitable for fifth grade through high school using a president's and vice president's database. Teaching methods are presented, including student team activities, and worksheets designed for the AppleWorks database are included. (LRW)
Aagaard, Thomas; Lund, Hans; Juhl, Carsten
2016-11-22
When conducting systematic reviews, it is essential to perform a comprehensive literature search to identify all published studies relevant to the specific research question. The Cochrane Collaboration's Methodological Expectations of Cochrane Intervention Reviews (MECIR) guidelines state that searching MEDLINE, EMBASE and CENTRAL should be considered mandatory. The aim of this study was to evaluate the MECIR recommendations to use MEDLINE, EMBASE and CENTRAL combined, and to examine the yield of using these to find randomized controlled trials (RCTs) within the area of musculoskeletal disorders. Data sources were systematic reviews published by the Cochrane Musculoskeletal Review Group that included at least five RCTs, reported a search history, searched MEDLINE, EMBASE and CENTRAL, and added reference- and hand-searching. Additional databases were deemed eligible if they indexed RCTs, were in English and were used in more than three of the systematic reviews. Relative recall was calculated as the number of studies identified by the literature search divided by the number of eligible studies, i.e. the studies included in the individual systematic reviews. Finally, cumulative median recall was calculated for MEDLINE, EMBASE and CENTRAL combined, followed by the databases yielding additional studies. Twenty-three systematic reviews were deemed eligible, and the databases used other than MEDLINE, EMBASE and CENTRAL were AMED, CINAHL, HealthSTAR, MANTIS, OT-Seeker, PEDro, PsychINFO, SCOPUS, SportDISCUS and Web of Science. Cumulative median recall for combined searching in MEDLINE, EMBASE and CENTRAL was 88.9% and increased to 90.9% when adding 10 additional databases. Searching MEDLINE, EMBASE and CENTRAL was not sufficient for identifying all effect studies on musculoskeletal disorders, but the ten additional databases increased the median recall by only 2%. It is possible that searching databases is not sufficient to identify all relevant references, and that reviewers must rely upon additional sources in their literature search. However, further research is needed.
A Web-based Tool for SDSS and 2MASS Database Searches
NASA Astrophysics Data System (ADS)
Hendrickson, M. A.; Uomoto, A.; Golimowski, D. A.
We have developed a web site using HTML, PHP, Python, and MySQL that extracts, processes, and displays data from the Sloan Digital Sky Survey (SDSS) and the Two-Micron All-Sky Survey (2MASS). The goal is to locate brown dwarf candidates in the SDSS database by looking at color cuts; however, this site could also be useful for targeted searches of other databases. MySQL databases are created from broad searches of SDSS and 2MASS data. Broad queries on the SDSS and 2MASS database servers are run weekly so that observers have the most up-to-date information from which to select candidates for observation. Observers can look at detailed information about specific objects, including finding charts, images, and available spectra. In addition, updates from previous observations can be added by any collaborators; this format makes observational collaboration simple. Observers can also restrict the database search, just before or during an observing run, to select objects of special interest.
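As a toy illustration of the kind of color-cut selection such a MySQL-backed site might run, the sketch below builds an in-memory table and applies made-up i-z and z-J cuts. The table layout and the numeric thresholds are invented for illustration and are not the authors' schema or selection criteria.

import sqlite3

# In-memory stand-in for the survey cross-match table (invented columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE candidates (objid TEXT, imag REAL, zmag REAL, jmag REAL)")
conn.executemany(
    "INSERT INTO candidates VALUES (?, ?, ?, ?)",
    [("obj1", 21.5, 19.2, 16.8), ("obj2", 20.1, 19.9, 18.5)],
)

# Hypothetical brown-dwarf-like colour cuts: red in i-z and z-J.
rows = conn.execute(
    "SELECT objid, imag - zmag AS iz, zmag - jmag AS zj "
    "FROM candidates WHERE imag - zmag > 1.5 AND zmag - jmag > 2.0"
).fetchall()
for objid, iz, zj in rows:
    print(objid, round(iz, 2), round(zj, 2))   # only obj1 passes these toy cuts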
Sherwin, Trevor; Gilhotra, Amardeep K
2006-02-01
Literature databases are an ever-expanding resource available to the field of medical sciences. Understanding how to use such databases efficiently is critical for those involved in research. However, for the uninitiated, getting started is a major hurdle to overcome and for the occasional user, the finer points of database searching remain an unacquired skill. In the fifth and final article in this series aimed at those embarking on ophthalmology and vision science research, we look at how the beginning researcher can start to use literature databases and, by using a stepwise approach, how they can optimize their use. This instructional paper gives a hypothetical example of a researcher writing a review article and how he or she acquires the necessary scientific literature for the article. A prototype search of the Medline database is used to illustrate how even a novice might swiftly acquire the skills required for a medium-level search. It provides examples and key tips that can increase the proficiency of the occasional user. Pitfalls of database searching are discussed, as are the limitations of which the user should be aware.
SUGAR-FREE CHEWING GUM AND DENTAL CARIES – A SYSTEMATIC REVIEW
Mickenautsch, Steffen; Leal, Soraya Coelho; Yengopal, Veerasamy; Bezerra, Ana Cristina; Cruvinel, Vanessa
2007-01-01
Objective: To appraise existing evidence for a therapeutic / anti-cariogenic effect of sugar-free chewing gum for patients. Method: 9 English and 2 Portuguese databases were searched using English and Portuguese keywords. Relevant articles in English, German, Portuguese and Spanish were included for review. Trials were excluded on lack of randomisation, control group, blinding and baseline data, drop out rate >33%, no statistical adjustment of baseline differences and no assessment of clinically important outcomes. Reviews were excluded on lack of information, article selection criteria, search strategy followed, search keywords, searched databases or lack of study-by-study critique tables. In cases of multiple reports from the same study, the report covering the longest period was included. Two reviewers independently reviewed and assessed the quality of accepted articles. Results: Thirty-nine articles were included for review. Thirty were excluded and 9 accepted. Of the 9 accepted, 2 trials of reasonable and good evidence value did not demonstrate any anti-cariogenic effect of sugar-free chewing gum. However, 7 articles, with 1 of strong, and 6 of good evidence value, demonstrated anti-cariogenic effects of chewing Sorbitol, Xylitol or Sorbitol/Xylitol gum. This effect can be ascribed to saliva stimulation through the chewing process, particularly when gum is used immediately after meals; the lack of sucrose and the inability of bacteria to metabolize polyols into acids. Conclusion: The evidence suggests that sugar-free chewing gum has a caries-reducing effect. Further well-designed randomised trials are needed to confirm these findings. PMID:19089107
Faster sequence homology searches by clustering subsequences.
Suzuki, Shuji; Kakuta, Masanori; Ishida, Takashi; Akiyama, Yutaka
2015-04-15
Sequence homology searches are used in various fields. New sequencing technologies produce huge amounts of sequence data, which continuously increase the size of sequence databases. As a result, homology searches require large amounts of computational time, especially for metagenomic analysis. We developed a fast homology search method based on database subsequence clustering, and implemented it as GHOSTZ. This method clusters similar subsequences from a database to perform an efficient seed search and ungapped extension by reducing alignment candidates based on the triangle inequality. The database subsequence clustering technique achieved an ∼2-fold increase in speed without a large decrease in search sensitivity. When measured with metagenomic data, GHOSTZ is ∼2.2-2.8 times faster than RAPSearch and ∼185-261 times faster than BLASTX. The source code is freely available for download at http://www.bi.cs.titech.ac.jp/ghostz/. Contact: akiyama@cs.titech.ac.jp. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
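The following toy Python sketch illustrates the pruning idea behind the approach described above: database subsequences are grouped into clusters with precomputed distances to a representative, and the triangle inequality is used to skip members that cannot fall within the search threshold. The distance function and the data are toy stand-ins, not the GHOSTZ implementation.

def hamming(a, b):
    # Toy distance between equal-length subsequences.
    return sum(x != y for x, y in zip(a, b))

# Each cluster: a representative subsequence and its members, with the
# members' precomputed distances to the representative.
clusters = [
    ("ACDEFG", [("ACDEFH", 1), ("ACDQFG", 1)]),
    ("WWWWWW", [("WWWWWV", 1), ("WWWAAA", 3)]),
]

def candidates(query, threshold):
    hits = []
    for rep, members in clusters:
        d_rep = hamming(query, rep)
        for member, d_member_rep in members:
            # Triangle inequality lower bound on d(query, member):
            if abs(d_rep - d_member_rep) > threshold:
                continue            # safe to skip without computing the distance
            if hamming(query, member) <= threshold:
                hits.append(member)
    return hits

print(candidates("ACDEFG", 1))      # -> ['ACDEFH', 'ACDQFG']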
Search Filter Precision Can Be Improved By NOTing Out Irrelevant Content
Wilczynski, Nancy L.; McKibbon, K. Ann; Haynes, R. Brian
2011-01-01
Background: Most methodologic search filters developed for use in large electronic databases such as MEDLINE have low precision. One method that has been proposed but not tested for improving precision is NOTing out irrelevant content. Objective: To determine if search filter precision can be improved by NOTing out the text words and index terms assigned to those articles that are retrieved but are off-target. Design: Analytic survey. Methods: NOTing out unique terms in off-target articles and testing search filter performance in the Clinical Hedges Database. Main Outcome Measures: Sensitivity, specificity, precision and number needed to read (NNR). Results: For all purpose categories (diagnosis, prognosis and etiology) except treatment and for all databases (MEDLINE, EMBASE, CINAHL and PsycINFO), constructing search filters that NOTed out irrelevant content resulted in substantive improvements in NNR (over four-fold for some purpose categories and databases). Conclusion: Search filter precision can be improved by NOTing out irrelevant content. PMID:22195215
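A rough Python sketch of the "NOTing out" idea: terms that occur only in retrieved-but-off-target records are appended to the filter as NOT terms. The sample records and the resulting query string are purely illustrative, not the filters evaluated in the study.

# Index or text terms from records judged relevant vs. retrieved-but-off-target.
relevant = [{"randomized", "trial", "therapy"}, {"cohort", "therapy"}]
off_target = [{"editorial", "therapy"}, {"case", "report", "therapy"}]

relevant_terms = set().union(*relevant)
off_target_only = set().union(*off_target) - relevant_terms

base_filter = "therapy AND (randomized OR cohort)"
not_clause = " NOT (" + " OR ".join(sorted(off_target_only)) + ")"
print(base_filter + not_clause)
# -> therapy AND (randomized OR cohort) NOT (case OR editorial OR report)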
ERIC Educational Resources Information Center
Cavaleri, Piero
2008-01-01
Purpose: The purpose of this paper is to describe the use of AJAX for searching the Biblioteche Oggi database of bibliographic records. Design/methodology/approach: The paper is a demonstration of how bibliographic database single page interfaces allow the implementation of more user-friendly features for social and collaborative tasks. Findings:…
Amick, G D
1999-01-01
A database containing the names of mass spectral data files generated in a forensic toxicology laboratory and two Microsoft Visual Basic programs to maintain and search this database are described. The data files (approximately 0.5 KB each) were collected from six mass spectrometers during routine casework. Data files were archived on 650 MB (74 min) recordable CD-ROMs. Each recordable CD-ROM was given a unique name, and its list of data file names was placed into the database. The present manuscript describes the use of the search and maintenance programs for searching and routine upkeep of the database and the creation of CD-ROMs for archiving of data files.
Description of 'REQUEST-KYUSHYU' for KYUKEICHO regional data base
NASA Astrophysics Data System (ADS)
Takimoto, Shin'ichi
The Kyushu Economic Research Association (an incorporated foundation) recently initiated the regional database service 'REQUEST-Kyushu'. It is a full-scale database compiled from the information and know-how which the Association has accumulated over forty years. It covers a regional information database for journal and newspaper articles, and a statistical information database for economic statistics. The former is searched on a personal computer and the search result (original text) is then sent by facsimile. The latter is also searched on a personal computer, where the data can be processed, edited or downloaded. This paper describes the characteristics, content and system outline of 'REQUEST-Kyushu'.
Côté, Richard G; Jones, Philip; Martens, Lennart; Kerrien, Samuel; Reisinger, Florian; Lin, Quan; Leinonen, Rasko; Apweiler, Rolf; Hermjakob, Henning
2007-10-18
Each major protein database uses its own conventions when assigning protein identifiers. Resolving the various, potentially unstable, identifiers that refer to identical proteins is a major challenge. This is a common problem when attempting to unify datasets that have been annotated with proteins from multiple data sources or querying data providers with one flavour of protein identifiers when the source database uses another. Partial solutions for protein identifier mapping exist but they are limited to specific species or techniques and to a very small number of databases. As a result, we have not found a solution that is generic enough and broad enough in mapping scope to suit our needs. We have created the Protein Identifier Cross-Reference (PICR) service, a web application that provides interactive and programmatic (SOAP and REST) access to a mapping algorithm that uses the UniProt Archive (UniParc) as a data warehouse to offer protein cross-references based on 100% sequence identity to proteins from over 70 distinct source databases loaded into UniParc. Mappings can be limited by source database, taxonomic ID and activity status in the source database. Users can copy/paste or upload files containing protein identifiers or sequences in FASTA format to obtain mappings using the interactive interface. Search results can be viewed in simple or detailed HTML tables or downloaded as comma-separated values (CSV) or Microsoft Excel (XLS) files suitable for use in a local database or a spreadsheet. Alternatively, a SOAP interface is available to integrate PICR functionality in other applications, as is a lightweight REST interface. We offer a publicly available service that can interactively map protein identifiers and protein sequences to the majority of commonly used protein databases. Programmatic access is available through a standards-compliant SOAP interface or a lightweight REST interface. The PICR interface, documentation and code examples are available at http://www.ebi.ac.uk/Tools/picr.
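A hedged sketch of what a programmatic lookup against a PICR-style REST interface could look like is shown below; the URL path, parameter names and response handling are assumptions for illustration only and should be replaced with the ones given in the PICR documentation.

import requests

# Assumed REST path and parameters, for illustration only.
PICR_REST = "https://www.ebi.ac.uk/Tools/picr/rest/getUPIForAccession"

params = {
    "accession": "P12345",                     # identifier to map
    "database": ["SWISSPROT", "ENSEMBL"],      # assumed target-database parameter
}
resp = requests.get(PICR_REST, params=params, timeout=30)
resp.raise_for_status()
print(resp.text[:500])                         # payload with cross-references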
How much is the cost of multiple sclerosis--systematic literature review.
Kolasa, Katarzyna
2013-01-01
In Poland, data on MS costs are lacking. A systematic review of cost-of-illness studies was conducted to estimate the average annual cost of an MS patient and its breakdown. The PubMed database was searched for relevant literature. The following search criteria were used: "multiple sclerosis", "costs", "cost of illness" and "disease burden". Articles written in English including total costs published 2002-2012 were included. In total, 17 studies were classified. The costs were re-calculated into USD Purchasing Power Parity (PPP). The available approach from the literature was used for the cost breakdown presentation. The average patient was 47 years old, with an EDSS of 4 and 13 years from the date of diagnosis. The average annual cost was 41,133 US$ PPP. The direct costs did not exceed 70% of total costs in any study. Pharmaceutical expenses were one of the most important contributors to the direct costs. Only 40% of patients were active on the labor market, which translated into loss of productivity and consequently an increase in total costs. The performed systematic review revealed that multiple sclerosis imposes a huge economic burden on the healthcare system and society. This is due to productivity loss and caregiver burden.
Individual differences in working memory capacity predict learned control over attentional capture.
Robison, Matthew K; Unsworth, Nash
2017-11-01
Although individual differences in working memory capacity (WMC) typically predict susceptibility to attentional capture in various paradigms (e.g., Stroop, antisaccade, flankers), WMC sometimes fails to correlate with the magnitude of attentional capture effects in visual search (e.g., Stokes, 2016), which is one of the most frequently used tasks to study capture (Theeuwes, 2010). However, some studies have shown that search modes can mitigate the effects of attentional capture (Leber & Egeth, 2006). Therefore, the present study examined whether or not the relationship between WMC and attentional capture changes as a function of the search modes available. In Experiment 1, WMC was unrelated to attentional capture, but only one search mode (singleton detection) could be employed. In Experiment 2, greater WMC predicted smaller attentional capture effects, but only when multiple search modes (feature search and singleton detection) could be employed. Importantly, this relationship was entirely independent of variation in attention control, which suggests that this effect is driven by WMC-related long-term memory differences (Cosman & Vecera, 2013a, 2013b). The present set of findings helps to further our understanding of the nuanced ways in which memory and attention interact. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
RadSearch: a RIS/PACS integrated query tool
NASA Astrophysics Data System (ADS)
Tsao, Sinchai; Documet, Jorge; Moin, Paymann; Wang, Kevin; Liu, Brent J.
2008-03-01
Radiology Information Systems (RIS) contain a wealth of information that can be used for research, education, and practice management. However, the sheer amount of information available makes querying specific data difficult and time consuming. Previous work has shown that a clinical RIS database and its RIS text reports can be extracted, duplicated and indexed for searches while complying with HIPAA and IRB requirements. This project's intent is to provide a software tool, the RadSearch Toolkit, to allow intelligent indexing and parsing of RIS reports for easy yet powerful searches. In addition, the project aims to seamlessly query and retrieve associated images from the Picture Archiving and Communication System (PACS) in situations where an integrated RIS/PACS is in place - even subselecting individual series, such as in an MRI study. RadSearch's application of simple text parsing techniques to index text-based radiology reports will allow the search engine to quickly return relevant results. This powerful combination will be useful in both private practice and academic settings; administrators can easily obtain complex practice management information such as referral patterns; researchers can conduct retrospective studies with specific, multiple criteria; teaching institutions can quickly and effectively create thorough teaching files.
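As a minimal illustration of simple text parsing and indexing of report text, the Python sketch below builds a toy inverted index over two invented report strings and answers an AND query. This is not the RadSearch Toolkit code; it only shows the general technique.

from collections import defaultdict
import re

# Invented report texts keyed by a report identifier.
reports = {
    "rpt1": "MRI brain: no acute infarct. Chronic small vessel disease.",
    "rpt2": "CT abdomen: acute appendicitis with periappendiceal fat stranding.",
}

# Inverted index: token -> set of report ids containing it.
index = defaultdict(set)
for rid, text in reports.items():
    for token in re.findall(r"[a-z]+", text.lower()):
        index[token].add(rid)

def search(*terms):
    # AND semantics: reports containing every query term.
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()

print(search("acute", "appendicitis"))   # -> {'rpt2'}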
The CIS Database: Occupational Health and Safety Information Online.
ERIC Educational Resources Information Center
Siegel, Herbert; Scurr, Erica
1985-01-01
Describes document acquisition, selection, indexing, and abstracting and discusses online searching of the CIS database, an online system produced by the International Occupational Safety and Health Information Centre. This database comprehensively covers information in the field of occupational health and safety. Sample searches and search…
Federated Search Tools in Fusion Centers: Bridging Databases in the Information Sharing Environment
2012-09-01
considerable variation in how fusion centers plan for, gather requirements, select and acquire federated search tools to bridge disparate databases...centers, when considering integrating federated search tools; by evaluating the importance of the planning, requirements gathering, selection and...acquisition processes for integrating federated search tools; by acknowledging the challenges faced by some fusion centers during these integration processes
Famulari, Stevie; Witz, Kyla
2015-01-01
Designers, students, teachers, gardeners, farmers, landscape architects, architects, engineers, homeowners, and others have uses for the practice of phytoremediation. This research looks at the creation of a phytoremediation database which is designed for ease of use for a non-scientific user, as well as for students in an educational setting (http://www.steviefamulari.net/phytoremediation). During 2012, Environmental Artist & Professor of Landscape Architecture Stevie Famulari, with assistance from Kyla Witz, a landscape architecture student, created an online searchable database designed for high public accessibility. The database is a record of research of plant species that aid in the uptake of contaminants, including metals, organic materials, biodiesels & oils, and radionuclides. The database consists of multiple interconnected indexes categorized into common and scientific plant name, contaminant name, and contaminant type. It includes photographs, hardiness zones, specific plant qualities, full citations to the original research, and other relevant information intended to aid those designing with phytoremediation in searching for potential plants which may be used to address their site's needs. The objective of the terminology section is to remove uncertainty for more inexperienced users, and to clarify terms for a more user-friendly experience. Implications of the work, including education and ease of browsing, as well as use of the database in teaching, are discussed.
How to locate and appraise qualitative research in complementary and alternative medicine.
Franzel, Brigitte; Schwiegershausen, Martina; Heusser, Peter; Berger, Bettina
2013-06-03
The aim of this publication is to present a case study of how to locate and appraise qualitative studies for the conduct of a meta-ethnography in the field of complementary and alternative medicine (CAM). CAM is commonly associated with individualized medicine. However, one established scientific approach to the individual, qualitative research, has thus far been explicitly used very rarely. This article demonstrates a case example of how qualitative research in the field of CAM studies was identified and critically appraised. Several search terms and techniques were tested for the identification and appraisal of qualitative CAM research in the conduct of a meta-ethnography. Sixty-seven electronic databases were searched for the identification of qualitative CAM trials, including CAM databases, nursing, nutrition, psychological, social, medical databases, the Cochrane Library and DIMDI. 9578 citations were screened, 223 articles met the pre-specified inclusion criteria, 63 full text publications were reviewed, 38 articles were appraised qualitatively and 30 articles were included. The search began with PubMed, yielding 87% of the included publications of all databases with few additional relevant findings in the specific databases. CINAHL and DIMDI also revealed a high number of precise hits. Although CAMbase and CAM-QUEST® focus on CAM research only, almost no hits of qualitative trials were found there. Searching with broad text terms was the most effective search strategy in all databases. This publication presents a case study on how to locate and appraise qualitative studies in the field of CAM. The example shows that the literature search for qualitative studies in the field of CAM is most effective when the search is begun in PubMed followed by CINAHL or DIMDI using broad text terms. Exclusive CAM databases delivered no additional findings to locate qualitative CAM studies.
Biomedical Requirements for High Productivity Computing Systems
2005-04-01
server at http://www.ncbi.nlm.nih.gov/BLAST/. There are many variants of BLAST, including: 1. BLASTN - compares a DNA query to a DNA database. Searches ... database (3 reading frames from each strand of the DNA) searching. 4. TBLASTN - compares a protein query to a DNA database, in the 6 possible reading frames. ... the molecules during this phase. After eliminating molecules that could not match the query, an atom-by-atom search for the molecules is conducted.
SAMMD: Staphylococcus aureus microarray meta-database.
Nagarajan, Vijayaraj; Elasri, Mohamed O
2007-10-02
Staphylococcus aureus is an important human pathogen, causing a wide variety of diseases ranging from superficial skin infections to severe life-threatening infections. S. aureus is one of the leading causes of nosocomial infections. Its ability to resist multiple antibiotics poses a growing public health problem. In order to understand the mechanism of pathogenesis of S. aureus, several global expression profiles have been developed. These transcriptional profiles included regulatory mutants of S. aureus and growth of the wild type under different growth conditions. The abundance of these profiles has generated a large amount of data without a uniform annotation system to comprehensively examine them. We report the development of the Staphylococcus aureus Microarray meta-database (SAMMD), which includes data from all the published transcriptional profiles. SAMMD is a web-accessible database that helps users to perform a variety of analyses against and within the existing transcriptional profiles. SAMMD is a relational database that uses MySQL as the back end and PHP/JavaScript/DHTML as the front end. The database is normalized and consists of five tables, which hold information about gene annotations, regulated gene lists, experimental details, references, and other details. SAMMD data are collected from peer-reviewed published articles. Data extraction and conversion were done using Perl scripts, while data entry was done through the phpMyAdmin tool. The database is accessible via a web interface that contains several features such as a simple search by ORF ID, gene name, or gene product name, advanced search using gene lists, comparison among datasets, browsing, downloading, statistics, and help. The database is licensed under the General Public License (GPL). SAMMD is hosted and available at http://www.bioinformatics.org/sammd/. Currently there are over 9500 entries for regulated genes, from 67 microarray experiments. SAMMD will help staphylococcal scientists to analyze their expression data and understand it at a global level. It will also allow scientists to compare and contrast their transcriptome to that of other published transcriptomes.
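The sketch below shows, with an in-memory SQLite stand-in, the kind of relational query such a schema supports (e.g. "in which experiments is this ORF reported as regulated?"). The table names, column names and rows are invented placeholders, not the actual SAMMD schema or data.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE experiments (exp_id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE regulated_genes (exp_id INTEGER, orf_id TEXT, direction TEXT);
INSERT INTO experiments VALUES (1, 'agr mutant vs wild type'),
                               (2, 'growth in low iron');
INSERT INTO regulated_genes VALUES (1, 'SA0252', 'up'), (2, 'SA0252', 'down'),
                                   (2, 'SA1844', 'up');
""")

rows = conn.execute("""
    SELECT e.description, r.direction
    FROM regulated_genes r JOIN experiments e ON e.exp_id = r.exp_id
    WHERE r.orf_id = ?
""", ("SA0252",)).fetchall()
print(rows)   # e.g. [('agr mutant vs wild type', 'up'), ('growth in low iron', 'down')]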
Multi-Database Searching in the Behavioral Sciences--Part I: Basic Techniques and Core Databases.
ERIC Educational Resources Information Center
Angier, Jennifer J.; Epstein, Barbara A.
1980-01-01
Outlines practical searching techniques in seven core behavioral science databases accessing psychological literature: Psychological Abstracts, Social Science Citation Index, Biosis, Medline, Excerpta Medica, Sociological Abstracts, ERIC. Use of individual files is discussed and their relative strengths/weaknesses are compared. Appended is a list…
Subject Specific Databases: A Powerful Research Tool
ERIC Educational Resources Information Center
Young, Terrence E., Jr.
2004-01-01
Subject specific databases, or vortals (vertical portals), are databases that provide highly detailed research information on a particular topic. They are the smallest, most focused search tools on the Internet and, in recent years, they've been on the rise. Currently, more of the so-called "mainstream" search engines, subject directories, and…
Subject Retrieval from Full-Text Databases in the Humanities
ERIC Educational Resources Information Center
East, John W.
2007-01-01
This paper examines the problems involved in subject retrieval from full-text databases of secondary materials in the humanities. Ten such databases were studied and their search functionality evaluated, focusing on factors such as Boolean operators, document surrogates, limiting by subject area, proximity operators, phrase searching, wildcards,…
A Framework for Cloudy Model Optimization and Database Storage
NASA Astrophysics Data System (ADS)
Calvén, Emilia; Helton, Andrew; Sankrit, Ravi
2018-01-01
We present a framework for producing Cloudy photoionization models of the nebular emission from novae ejecta and storing a subset of the results in SQL database format for later use. The database can be searched for the models best fitting observed spectral line ratios. Additionally, the framework includes an optimization feature that can be used in tandem with the database to search for and improve on models by creating new Cloudy models while varying the parameters. The database search and optimization can be used to explore the structures of nebulae by deriving their properties from the best-fit models. The goal is to provide the community with a large database of Cloudy photoionization models, generated from parameters reflecting conditions within novae ejecta, that can be easily fitted to observed spectral lines, either by directly accessing the database using the framework code or by using a website specifically made for this purpose.
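A minimal sketch of the "search the stored models for the best fit to observed line ratios" step: a least-squares rank over a toy in-memory model table. The parameter names, line ratios and values below are placeholders, not the framework's schema or real Cloudy output.

# Observed line ratios to match (invented values).
observed = {"[OIII]/Hbeta": 8.2, "[NII]/Halpha": 0.35}

# Toy stand-in for rows pulled from the model database.
models = [
    {"id": 1, "log_density": 6.0, "ratios": {"[OIII]/Hbeta": 7.9, "[NII]/Halpha": 0.40}},
    {"id": 2, "log_density": 7.0, "ratios": {"[OIII]/Hbeta": 3.1, "[NII]/Halpha": 0.90}},
]

def misfit(model):
    # Fractional least-squares misfit over all observed ratios.
    return sum((model["ratios"][k] - v) ** 2 / v ** 2 for k, v in observed.items())

best = min(models, key=misfit)
print(best["id"], round(misfit(best), 4))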
ERIC Educational Resources Information Center
Jensen, Chad D.; Cushing, Christopher C.; Aylward, Brandon S.; Craig, James T.; Sorell, Danielle M.; Steele, Ric G.
2011-01-01
Objective: This study was designed to quantitatively evaluate the effectiveness of motivational interviewing (MI) interventions for adolescent substance use behavior change. Method: Literature searches of electronic databases were undertaken in addition to manual reference searches of identified review articles. Databases searched include…
The LAILAPS search engine: a feature model for relevance ranking in life science databases.
Lange, Matthias; Spies, Karl; Colmsee, Christian; Flemming, Steffen; Klapperstück, Matthias; Scholz, Uwe
2010-03-25
Efficient and effective information retrieval in life sciences is one of the most pressing challenges in bioinformatics. The incredible growth of life science databases into a vast network of interconnected information systems is both a big challenge and a great opportunity for life science research. The knowledge found on the Web, in particular in life-science databases, is a valuable major resource. In order to bring it to the scientist's desktop, it is essential to have well-performing search engines. Here, neither the response time nor the number of results is the most important factor; for millions of query results, the most crucial factor is the relevance ranking. In this paper, we present a feature model for relevance ranking in life science databases and its implementation in the LAILAPS search engine. Motivated by the observation of user behavior during inspection of search engine results, we condensed a set of 9 relevance-discriminating features. These features are intuitively used by scientists who briefly screen database entries for potential relevance. The features are both sufficient to estimate the potential relevance and efficiently quantifiable. The derivation of a relevance prediction function that computes the relevance from these features constitutes a regression problem. To solve this problem, we used artificial neural networks that were trained with a reference set of relevant database entries for 19 protein queries. Supporting a flexible text index and a simple data import format, these concepts are implemented in the LAILAPS search engine. It can easily be used both as a search engine for comprehensive integrated life science databases and for small in-house project databases. LAILAPS is publicly available for SWISSPROT data at http://lailaps.ipk-gatersleben.de.
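To make the feature-based ranking idea concrete, the sketch below runs a tiny feed-forward network that maps a 9-dimensional feature vector to a relevance score. The feature interpretation, network size and weights are made up for illustration; the real LAILAPS model is trained on curated reference entries as described above.

import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(9, 5)), np.zeros(5)   # placeholder "trained" weights
W2, b2 = rng.normal(size=(5, 1)), np.zeros(1)

def relevance(features):
    h = np.tanh(features @ W1 + b1)
    score = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    return score.item()                          # scalar score in (0, 1)

# e.g. [term frequency, hit in title, hit in description, ...] - 9 assumed features
entry_features = np.array([0.4, 1.0, 0.0, 0.2, 0.0, 1.0, 0.3, 0.0, 0.1])
print(round(relevance(entry_features), 3))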
77 FR 6535 - Notice of Intent To Seek Approval To Collect Information
Federal Register 2010, 2011, 2012, 2013, 2014
2012-02-08
... information from participants: Contact information, affiliation, and database searching experience... and fax numbers, and email address. Six questions are asked regarding: database searching experience...
Searching mixed DNA profiles directly against profile databases.
Bright, Jo-Anne; Taylor, Duncan; Curran, James; Buckleton, John
2014-03-01
DNA databases have revolutionised forensic science. They are a powerful investigative tool as they have the potential to identify persons of interest in criminal investigations. Routinely, a DNA profile generated from a crime sample could only be searched for in a database of individuals if the stain was from a single contributor (single source) or if a contributor could unambiguously be determined from a mixed DNA profile. This meant that a significant number of samples were unsuitable for database searching. The advent of continuous methods for the interpretation of DNA profiles offers an advanced way to draw inferential power from the considerable investment made in DNA databases. Using these methods, each profile on the database may be considered a possible contributor to a mixture and a likelihood ratio (LR) can be formed. Those profiles which produce a sufficiently large LR can serve as an investigative lead. In this paper, empirical studies are described to determine what constitutes a large LR. We investigate the effect on a database search of complex mixed DNA profiles with contributors in equal proportions, with dropout as a consideration, and also the effect of an incorrect assignment of the number of contributors to a profile. In addition, we give, as a demonstration of the method, the results using two crime samples that were previously unsuitable for database comparison. We show that effective management of the selection of samples for searching and the interpretation of the output can be highly informative. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Lee, Taein; Cheng, Chun-Huai; Ficklin, Stephen; Yu, Jing; Humann, Jodi; Main, Dorrie
2017-01-01
Tripal is an open-source database platform primarily used for development of genomic, genetic and breeding databases. We report here on the release of the Chado Loader, Chado Data Display and Chado Search modules to extend the functionality of the core Tripal modules. These new extension modules provide additional tools for (1) data loading, (2) customized visualization and (3) advanced search functions for supported data types such as organism, marker, QTL/Mendelian Trait Loci, germplasm, map, project, phenotype, genotype and their respective metadata. The Chado Loader module provides data collection templates in Excel with defined metadata and data loaders with front-end forms. The Chado Data Display module contains tools to visualize each data type and the metadata, which can be used as is or customized as desired. The Chado Search module provides search and download functionality for the supported data types. Also included are tools to visualize map and species summaries. The use of materialized views in the Chado Search module enables better performance as well as flexibility of data modeling in Chado, allowing existing Tripal databases with different metadata types to utilize the module. These Tripal extension modules are implemented in the Genome Database for Rosaceae (rosaceae.org), CottonGen (cottongen.org), Citrus Genome Database (citrusgenomedb.org), Genome Database for Vaccinium (vaccinium.org) and the Cool Season Food Legume Database (coolseasonfoodlegume.org). Database URL: https://www.citrusgenomedb.org/, https://www.coolseasonfoodlegume.org/, https://www.cottongen.org/, https://www.rosaceae.org/, https://www.vaccinium.org/
Wachtel, Ruth E; Dexter, Franklin
2013-12-01
The purpose of this article is to teach operating room managers, financial analysts, and those with a limited knowledge of search engines, including PubMed, how to locate articles they need in the areas of operating room and anesthesia group management. Many physicians are unaware of current literature in their field and evidence-based practices. The most common source of information is colleagues. Many people making management decisions do not read published scientific articles. Databases such as PubMed are available to search for such articles. Other databases, such as citation indices and Google Scholar, can be used to uncover additional articles. Nevertheless, most people who do not know how to use these databases are reluctant to utilize help resources when they do not know how to accomplish a task. Most people are especially reluctant to use on-line help files. Help files and search databases are often difficult to use because they have been designed for users already familiar with the field. The help files and databases have specialized vocabularies unique to the application. MeSH terms in PubMed are not useful alternatives for operating room management, an important limitation, because MeSH is the default when search terms are entered in PubMed. Librarians or those trained in informatics can be valuable assets for searching unusual databases, but they must possess the domain knowledge relative to the subject they are searching. The search methods we review are especially important when the subject area (e.g., anesthesia group management) is so specific that only 1 or 2 articles address the topic of interest. The materials are presented broadly enough that the reader can extrapolate the findings to other areas of clinical and management issues in anesthesiology.
HBVPathDB: a database of HBV infection-related molecular interaction network.
Zhang, Yi; Bo, Xiao-Chen; Yang, Jing; Wang, Sheng-Qi
2005-03-21
To describe the molecular and gene interactions between hepatitis B virus (HBV) and its host, in order to understand how viral and host genes and molecules are networked to form a biological system and to elucidate the mechanism of HBV infection. The knowledge of HBV infection-related reactions was organized into various kinds of pathways with carefully drawn graphs in HBVPathDB. Pathway information is stored with a relational database management system (DBMS), which is currently the most efficient way to manage large amounts of data, and queries are implemented with the powerful Structured Query Language (SQL). The search engine is written in PHP with embedded SQL, and a web retrieval interface for searching is developed in Hypertext Markup Language (HTML). We present the first version of HBVPathDB, an HBV infection-related molecular interaction network database composed of 306 pathways with 1,050 molecules involved. With carefully drawn graphs, pathway information stored in HBVPathDB can be browsed in an intuitive way. We developed an easy-to-use interface for flexible access to the details of the database. Convenient software is implemented to query and browse the pathway information of HBVPathDB. Four search page layout options (category search, gene search, description search and unitized search) are supported by the search engine of the database. The database is freely available at http://www.bio-inf.net/HBVPathDB/HBV/. HBVPathDB already contains a considerable amount of HBV infection-related pathway information, which is suitable for in-depth analysis of the molecular interaction network of virus and host. HBVPathDB integrates pathway data sets with convenient software for query, browsing and visualization, which provides users more opportunities to identify key regulatory molecules as potential drug targets and to explore the possible mechanism of HBV infection based on gene expression datasets.
Yap, Kevin Yi-Lwern; Ho, Yasmin Xiu Xiu; Chui, Wai Keung; Chan, Alexandre
2010-11-01
Concomitant use of anticancer drugs (ACDs) and antidepressants (ADs) in the treatment of depression in patients with cancer may result in potentially harmful drug-drug interactions (DDIs). It is crucial that clinicians make timely, accurate, safe and effective decisions regarding drug therapies in patients. The ubiquitous nature of the internet or "cloud" has enabled easy dissemination of DDI information, but there is currently no database dedicated to searching for ACD interactions by chemotherapy regimen. We describe the implementation of an AD interaction module in a previously published oncology-specific DDI database for clinicians which focuses on ACDs, single-agent and multiple-agent chemotherapy regimens. Drug- and DDI-related information was collated from drug information handbooks, databases, package inserts, and published literature from PubMed, Scopus and Science Direct. Web documents were constructed using Adobe software and programming scripts, and mounted on a domain served from the internet cloud. OncoRx is an oncology-specific DDI database whose structure is designed around all the major classes of ACDs and their frequently prescribed chemotherapy regimens. There are 117 ACDs and 256 regimens in OncoRx, and it can detect over 1,500 interactions with 21 ADs. Clinicians are provided with the pharmacokinetic parameters of the drugs, information on the regimens and details of the detected DDIs during an interaction search. OncoRx is the first database of its kind that allows detection of ACD and chemotherapy regimen interactions with ADs. This tool will assist clinicians in improving clinical response and reducing adverse effects based on the therapeutic and toxicity profiles of the drugs.
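A simplified sketch of a regimen-level interaction check of the kind such a database performs: the drugs of a regimen are looked up against a table of flagged (ACD, AD) pairs. The regimen composition and the flagged pairs below are invented examples for illustration, carry no clinical meaning, and are not OncoRx data.

regimens = {"CHOP": ["cyclophosphamide", "doxorubicin", "vincristine", "prednisone"]}

interactions = {
    # (anticancer drug, antidepressant) -> note; invented entries, no clinical meaning
    ("doxorubicin", "fluoxetine"): "flagged pair (illustrative only)",
    ("vincristine", "nefazodone"): "flagged pair (illustrative only)",
}

def check(regimen, antidepressant):
    drugs = regimens[regimen]
    return [(acd, note) for (acd, ad), note in interactions.items()
            if ad == antidepressant and acd in drugs]

print(check("CHOP", "fluoxetine"))
# -> [('doxorubicin', 'flagged pair (illustrative only)')]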
JAMSTEC DARWIN Database Assimilates GANSEKI and COEDO
NASA Astrophysics Data System (ADS)
Tomiyama, T.; Toyoda, Y.; Horikawa, H.; Sasaki, T.; Fukuda, K.; Hase, H.; Saito, H.
2017-12-01
Introduction: The Japan Agency for Marine-Earth Science and Technology (JAMSTEC) archives data and samples obtained by JAMSTEC research vessels and submersibles. As a common property of human society, the JAMSTEC archive is open to public users for scientific and educational purposes [1]. To publicize its data and samples online, JAMSTEC operates the NUUNKUI data sites [2], a group of several databases for various data and sample types. For years, data and metadata of JAMSTEC rock samples, sediment core samples and cruise/dive observations were publicized through databases named GANSEKI, COEDO, and DARWIN, respectively. However, because they had different user interfaces and data structures, these services were somewhat confusing for unfamiliar users. Maintenance costs of multiple hardware and software systems were also problematic for sustaining the services and making continuous improvements. Database Integration: In 2017, GANSEKI, COEDO and DARWIN were integrated into DARWIN+ [3]. The update also included the implementation of a map-search function as a substitute for the closed portal site. Major functions of the previous systems were incorporated into the new system; users can perform complex searches by thumbnail browsing, map area, keyword filtering, and metadata constraints. As for data handling, the new system is more flexible, allowing the entry of a variety of additional data types. Data Management: Since the major DARWIN update, the JAMSTEC data & sample team has been dealing with minor issues in individual sample data/metadata, which sometimes need manual modification to be transferred to the new system. Some new data sets, such as onboard sample photos and surface close-up photos of rock samples, are becoming available online. Geochemical data of sediment core samples are expected to be added in the near future. Reference: [1] http://www.jamstec.go.jp/e/database/data_policy.html [2] http://www.godac.jamstec.go.jp/jmedia/portal/e/ [3] http://www.godac.jamstec.go.jp/darwin/e/
Multi-source and ontology-based retrieval engine for maize mutant phenotypes
Green, Jason M.; Harnsomburana, Jaturon; Schaeffer, Mary L.; Lawrence, Carolyn J.; Shyu, Chi-Ren
2011-01-01
Model Organism Databases, including the various plant genome databases, collect and enable access to massive amounts of heterogeneous information, including sequence data, gene product information, images of mutant phenotypes, etc., as well as textual descriptions of many of these entities. While a variety of basic browsing and search capabilities are available to allow researchers to query and peruse the names and attributes of phenotypic data, next-generation search mechanisms that allow querying and ranking of text descriptions are much less common. In addition, the plant community needs an innovative way to leverage the existing links in these databases to search groups of text descriptions simultaneously. Furthermore, though much time and effort have been afforded to the development of plant-related ontologies, the knowledge embedded in these ontologies remains largely unused in available plant search mechanisms. Addressing these issues, we have developed a unique search engine for mutant phenotypes from MaizeGDB. This advanced search mechanism integrates various text description sources in MaizeGDB to aid a user in retrieving desired mutant phenotype information. Currently, descriptions of mutant phenotypes, loci and gene products are utilized collectively for each search, though expansion of the search mechanism to include other sources is straightforward. The retrieval engine, to our knowledge, is the first engine to exploit the content and structure of available domain ontologies, currently the Plant and Gene Ontologies, to expand and enrich retrieval results in major plant genomic databases. Database URL: http://www.PhenomicsWorld.org/QBTA.php PMID:21558151
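A bare-bones sketch of ontology-driven query expansion of the sort described above: the query term is expanded with its synonyms and descendant terms before matching against text descriptions. The tiny ontology fragment and descriptions are hypothetical, not the Plant or Gene Ontology or MaizeGDB records.

# Hypothetical ontology fragment: term -> synonyms and child terms.
ontology = {
    "leaf development": {"synonyms": ["leaf morphogenesis"],
                         "children": ["leaf vascular development"]},
    "leaf vascular development": {"synonyms": [], "children": []},
}

def expand(term):
    terms, stack = set(), [term]
    while stack:
        t = stack.pop()
        if t in terms:
            continue
        terms.add(t)
        node = ontology.get(t, {"synonyms": [], "children": []})
        terms.update(node["synonyms"])
        stack.extend(node["children"])
    return terms

# Invented phenotype descriptions to search.
descriptions = {"mu1": "dwarf plant, abnormal leaf vascular development",
                "mu2": "kernel pigmentation defect"}

query_terms = expand("leaf development")
hits = [name for name, text in descriptions.items()
        if any(t in text for t in query_terms)]
print(sorted(query_terms), hits)   # the expanded query now matches mu1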
Concentrations of indoor pollutants database: User's manual
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
1992-05-01
This manual describes the computer-based database on indoor air pollutants. This comprehensive database helps utility personnel perform rapid searches of literature related to indoor air pollutants. Besides general information, it provides guidance for finding specific information on concentrations of indoor air pollutants. The manual includes information on installing and using the database as well as a tutorial to assist the user in becoming familiar with the procedures involved in doing bibliographic and summary section searches. The manual demonstrates how to search for information by going through a series of questions that provide search parameters such as pollutant type, year, building type, keywords (from a specific list), country, geographic region, author's last name, and title. As more and more parameters are specified, the list of references found in the data search becomes smaller and more specific to the user's needs. Appendixes list the types of information that can be input into the database when making a request. The CIP database allows individual utilities to obtain information on indoor air quality based on building types and other factors in their own service territory. This information is useful for utilities with concerns about indoor air quality and the control of indoor air pollutants. The CIP database itself is distributed by the Electric Power Software Center and runs on IBM PC-compatible computers.
Finfgeld-Connett, Deborah; Johnson, E Diane
2013-01-01
To report literature search strategies for the purpose of conducting knowledge-building and theory-generating qualitative systematic reviews. Qualitative systematic reviews lie on a continuum from knowledge-building and theory-generating to aggregating and summarizing. Different types of literature searches are needed to optimally support these dissimilar reviews. Articles published between 1989 and Autumn 2011. These documents were identified using a hermeneutic approach and multiple literature search strategies. Redundancy is not the sole measure of validity when conducting knowledge-building and theory-generating systematic reviews. When conducting these types of reviews, literature searches should be consistent with the goal of fully explicating concepts and the interrelationships among them. To accomplish this objective, a 'berry picking' approach is recommended along with strategies for overcoming barriers to finding qualitative research reports. To enhance the integrity of knowledge-building and theory-generating systematic reviews, reviewers are urged to make literature search processes as transparent as possible, despite their complexity. This includes fully explaining and rationalizing what databases were used and how they were searched. It also means describing how literature tracking was conducted and grey literature was searched. In the end, the decision to cease searching also needs to be fully explained and rationalized. Predetermined linear search strategies are unlikely to generate search results that are adequate for purposes of conducting knowledge-building and theory-generating qualitative systematic reviews. Instead, it is recommended that iterative search strategies take shape as reviews evolve. © 2012 Blackwell Publishing Ltd.
DRUMS: a human disease related unique gene mutation search engine.
Li, Zuofeng; Liu, Xingnan; Wen, Jingran; Xu, Ye; Zhao, Xin; Li, Xuan; Liu, Lei; Zhang, Xiaoyan
2011-10-01
With the completion of the Human Genome Project and the development of new methods for gene variant detection, the integration of mutation data and its phenotypic consequences has become more important than ever. Among all available resources, locus-specific databases (LSDBs) curate one or more specific genes' mutation data along with high-quality phenotypes. Although some genotype-phenotype data from LSDBs have been integrated into central databases, little effort has been made to integrate all these data through a search engine approach. In this work, we have developed the disease related unique gene mutation search engine (DRUMS), a search engine for human disease-related unique gene mutations, as a convenient tool for biologists or physicians to retrieve gene variant and related phenotype information. Gene variant and phenotype information is stored in a gene-centred relational database. Moreover, the relationships between mutations and diseases are indexed by the uniform resource identifier from the LSDB or another central database. By querying DRUMS, users can access the most popular mutation databases under one interface. DRUMS can be treated as a domain-specific search engine. By using web crawling, indexing, and searching technologies, it provides a competitively efficient interface for searching and retrieving mutation data and their relationships to diseases. The present system is freely accessible at http://www.scbit.org/glif/new/drums/index.html. © 2011 Wiley-Liss, Inc.
Accelerating Information Retrieval from Profile Hidden Markov Model Databases.
Tamimi, Ahmad; Ashhab, Yaqoub; Tamimi, Hashem
2016-01-01
The Profile Hidden Markov Model (profile-HMM) is an efficient statistical approach to represent protein families. Currently, several databases maintain valuable protein sequence information as profile-HMMs. There is increasing interest in improving the efficiency of searching profile-HMM databases to detect sequence-profile or profile-profile homology. However, most efforts to enhance searching efficiency have focused on improving the alignment algorithms. Although the performance of these algorithms is fairly acceptable, the growing size of these databases, as well as the increasing demand for a batch query searching approach, are strong motivations that call for further enhancement of information retrieval from profile-HMM databases. This work presents a heuristic method to accelerate the current profile-HMM homology searching approaches. The method works by cluster-based remodeling of the database to reduce the search space, rather than focusing on the alignment algorithms. Using different clustering techniques, 4284 TIGRFAM profiles were clustered based on their similarities. A representative for each cluster was assigned. To enhance sensitivity, we proposed an extended step that allows overlapping among clusters. A validation benchmark of 6000 randomly selected protein sequences was used to query the clustered profiles. To evaluate the efficiency of our approach, speed and recall values were measured and compared with the sequential search approach. Using hierarchical, k-means, and connected component clustering techniques followed by the extended overlapping step, we obtained an average reduction in time of 41% and an average recall of 96%. Our results demonstrate that representation of profile-HMMs using a clustering-based approach can significantly accelerate data retrieval from profile-HMM databases.
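The Python sketch below illustrates the cluster-then-search idea in the abstract above: the query is scored against cluster representatives first, and only members of the top-ranked clusters are scored in full. The scoring function and profile names are toy stand-ins for real profile-HMM alignment scores.

# Clusters: representative profile -> member profiles (invented names).
clusters = {
    "repA": ["fam_A1", "fam_A2"],
    "repB": ["fam_B1"],
    "repC": ["fam_C1", "fam_C2", "fam_C3"],
}

def score(profile_name, query):
    # Placeholder similarity; a real system would run profile-HMM alignment here.
    return sum(ch in profile_name for ch in query)

def search(query, top_clusters=1):
    # Stage 1: rank clusters by representative score; Stage 2: score only the
    # members of the best clusters.
    ranked = sorted(clusters, key=lambda rep: score(rep, query), reverse=True)
    hits = []
    for rep in ranked[:top_clusters]:
        hits.extend((m, score(m, query)) for m in clusters[rep])
    return sorted(hits, key=lambda x: x[1], reverse=True)

print(search("ABBA"))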
THGS: a web-based database of Transmembrane Helices in Genome Sequences
Fernando, S. A.; Selvarani, P.; Das, Soma; Kumar, Ch. Kiran; Mondal, Sukanta; Ramakumar, S.; Sekar, K.
2004-01-01
Transmembrane Helices in Genome Sequences (THGS) is an interactive web-based database developed to search for transmembrane helices in user-selected gene sequences available in the Genome Database (GDB). The database also provides for searching sequence motifs in transmembrane and globular proteins. In addition, a motif can be searched in other sequence databases (Swiss-Prot and PIR) or in the macromolecular structure database, the Protein Data Bank (PDB). Further, the 3D structure corresponding to the queried motif, if available among the solved protein structures deposited in the Protein Data Bank, can be visualized using the widely used graphics package RASMOL. All the sequence databases used in the present work are updated frequently and hence the results produced are up to date. THGS is freely available via the world wide web at http://pranag.physics.iisc.ernet.in/thgs/ or http://144.16.71.10/thgs/. PMID:14681375
PIA: An Intuitive Protein Inference Engine with a Web-Based User Interface.
Uszkoreit, Julian; Maerkens, Alexandra; Perez-Riverol, Yasset; Meyer, Helmut E; Marcus, Katrin; Stephan, Christian; Kohlbacher, Oliver; Eisenacher, Martin
2015-07-02
Protein inference connects the peptide spectrum matches (PSMs) obtained from database search engines back to proteins, which are typically at the heart of most proteomics studies. Different search engines yield different PSMs and thus different protein lists. Analysis of results from one or multiple search engines is often hampered by different data exchange formats and lack of convenient and intuitive user interfaces. We present PIA, a flexible software suite for combining PSMs from different search engine runs and turning these into consistent results. PIA can be integrated into proteomics data analysis workflows in several ways. A user-friendly graphical user interface can be run either locally or (e.g., for larger core facilities) from a central server. For automated data processing, stand-alone tools are available. PIA implements several established protein inference algorithms and can combine results from different search engines seamlessly. On several benchmark data sets, we show that PIA can identify a larger number of proteins at the same protein FDR than inference based on a single search engine. PIA supports the majority of established search engines and data in the mzIdentML standard format. It is implemented in Java and freely available at https://github.com/mpc-bioinformatics/pia.
Kwak, Namhee; Swan, Joshua T; Thompson-Moore, Nathaniel; Liebl, Michael G
2016-08-01
This study aims to develop a systematic search strategy and test its validity and reliability in terms of identifying projects published in peer-reviewed journals as reported by residency graduates through an online survey. This study was a prospective blind comparison to a reference standard. Pharmacy residency projects conducted at the study institution between 2001 and 2012 were included. A step-wise, systematic procedure containing up to 8 search strategies in PubMed and EMBASE for each project was created using the names of authors and abstract keywords. In order to further maximize sensitivity, complex phrases with multiple variations were truncated to the root word. Validity was assessed by obtaining information on publications from an online survey deployed to residency graduates. The search strategy identified 13 publications (93% sensitivity, 100% specificity, and 99% accuracy). Both methods identified a similar proportion achieving publication (19.7% search strategy vs 21.2% survey, P = 1.00). Reliability of the search strategy was affirmed by the perfect agreement between 2 investigators (k = 1.00). This systematic search strategy demonstrated a high sensitivity, specificity, and accuracy for identifying publications resulting from pharmacy residency projects using information available in residency conference abstracts. © The Author(s) 2015.
Data-driven indexing mechanism for the recognition of polyhedral objects
NASA Astrophysics Data System (ADS)
McLean, Stewart; Horan, Peter; Caelli, Terry M.
1992-02-01
This paper is concerned with the problem of searching large model databases. To date, most object recognition systems have concentrated on the problem of matching using simple searching algorithms. This is quite acceptable when the number of object models is small. However, in the future, general purpose computer vision systems will be required to recognize hundreds or perhaps thousands of objects and, in such circumstances, efficient searching algorithms will be needed. The problem of searching a large model database is one which must be addressed if future computer vision systems are to be at all effective. In this paper we present a method we call data-driven feature-indexed hypothesis generation as one solution to the problem of searching large model databases.
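As a rough illustration of feature-indexed hypothesis generation, the following Python sketch builds an offline index from quantized model features and lets scene features vote for candidate models; the feature representation and vote threshold are hypothetical, not the specific scheme of the paper.

```python
# Generic sketch of data-driven, feature-indexed hypothesis generation
# (geometric-hashing style). Feature keys and the vote threshold are
# hypothetical, not the representation used in the paper.
from collections import defaultdict

def build_index(models, extract_features):
    """Offline: map each quantized feature key to the models that contain it."""
    index = defaultdict(set)
    for model_id, model in models.items():
        for key in extract_features(model):
            index[key].add(model_id)
    return index

def generate_hypotheses(scene, index, extract_features, min_votes=3):
    """Online: scene features vote for models through the index; only models
    with enough votes are passed on to detailed matching and verification."""
    votes = defaultdict(int)
    for key in extract_features(scene):
        for model_id in index.get(key, ()):
            votes[model_id] += 1
    return sorted((m for m, v in votes.items() if v >= min_votes),
                  key=lambda m: votes[m], reverse=True)
```

The point of the index is that recognition cost grows with the number of scene features rather than with the number of models in the database.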
Uses of the Word “Macula” in Written English, 1400-Present
Schwartz, Stephen G.; Leffler, Christopher T.
2014-01-01
We compiled uses of the word “macula” in written English by searching multiple databases, including the Early English Books Online Text Creation Partnership, America’s Historical Newspapers, the Gale Cengage Collections, and others. “Macula” has been used: as a non-medical “spot” or “stain”, literal or figurative, including in astronomy and in Shakespeare; as a medical skin lesion, occasionally with a following descriptive adjective, such as a color (macula alba); as a corneal lesion, including the earliest identified use in English, circa 1400; and to describe the center of the retina. Francesco Buzzi described a yellow color in the posterior pole (“retina tinta di un color giallo”) in 1782, but did not use the word “macula”. “Macula lutea” was published by Samuel Thomas von Sömmering by 1799, and subsequently used in 1818 by James Wardrop, which appears to be the first known use in English. The Google n-gram database shows a marked increase in the frequencies of both “macula” and “macula lutea” following the introduction of the ophthalmoscope in 1850. “Macula” has been used in multiple contexts in written English. Modern databases provide powerful tools to explore historical uses of this word, which may be underappreciated by contemporary ophthalmologists. PMID:24913329
NASA Technical Reports Server (NTRS)
Dickson, J.; Drury, H.; Van Essen, D. C.
2001-01-01
Surface reconstructions of the cerebral cortex are increasingly widely used in the analysis and visualization of cortical structure, function and connectivity. From a neuroinformatics perspective, dealing with surface-related data poses a number of challenges. These include the multiplicity of configurations in which surfaces are routinely viewed (e.g. inflated maps, spheres and flat maps), plus the diversity of experimental data that can be represented on any given surface. To address these challenges, we have developed a surface management system (SuMS) that allows automated storage and retrieval of complex surface-related datasets. SuMS provides a systematic framework for the classification, storage and retrieval of many types of surface-related data and associated volume data. Within this classification framework, it serves as a version-control system capable of handling large numbers of surface and volume datasets. With built-in database management system support, SuMS provides rapid search and retrieval capabilities across all the datasets, while also incorporating multiple security levels to regulate access. SuMS is implemented in Java and can be accessed via a Web interface (WebSuMS) or using downloaded client software. Thus, SuMS is well positioned to act as a multiplatform, multi-user 'surface request broker' for the neuroscience community.
What hysteria? A systematic study of newspaper coverage of accused child molesters.
Cheit, Ross E
2003-06-01
There were three aims: First, to determine the extent to which those charged with child molestation receive newspaper coverage; second, to analyze the nature of that coverage; and third, to compare the universe of coverage to the nature of child molestation charges in the criminal justice system as a whole. Two databases were created. The first one identified all defendants charged with child molestation in Rhode Island in 1993. The database was updated after 5 years to include relevant information about case disposition. The second database was created by electronically searching the Providence Journal for every story that mentioned each defendant. Most defendants (56.1%) were not mentioned in the newspaper. Factors associated with a greater chance of coverage include: cases involving first-degree charges, cases with multiple counts, cases involving additional violence or multiple victims, and cases resulting in long prison sentences. The data indicate that the press exaggerates "stranger danger," while intra-familial cases are underreported. Newspaper accounts also minimize the extent to which guilty defendants avoid prison. Generalizing about the nature of child molestation cases in criminal court on the basis of newspaper coverage is inappropriate. The coverage is less extensive than often claimed, and it is skewed in ways that are typical of the mass media.
Westbom, Lena; Hagglund, Gunnar; Nordmark, Eva
2007-12-05
The aim of this study was to investigate the prevalence of cerebral palsy (CP) as well as to characterize the CP population, its participation in a secondary prevention programme (CPUP) and to validate the CPUP database. The study population was born 1990-1997 and resident in Skåne/Blekinge on Jan 1st 2002. Multiple sources were used. Irrespective of earlier diagnoses, neuropaediatrician and other professional medical records were evaluated for all children at the child habilitation units. The CPUP database and diagnosis registers at hospital departments were searched for children with CP or psychomotor retardation, whose records were then evaluated. To enhance early prevention, CP/probable CP was searched for also in children below four years of age born 1998-2001. The prevalence of CP was 2.4/1,000 (95% CI 2.1-2.6) in children 4-11 years of age born in Sweden, excluding post-neonatally acquired CP. Children born abroad had a higher prevalence of CP with more severe functional limitations. In the total population, the prevalence of CP was 2.7/1,000 (95% CI 2.4-3.0) and 48% were GMFCS-level I (the mildest limitation of gross motor function). One third of the children with CP, who were born or had moved into the area after a previous study in 1998, were not in the CPUP database. The subtype classification in the CPUP database was adjusted in the case of every fifth child aged 4-7 years not previously reviewed. The prevalence of CP and the subtype distribution did not differ from that reported in other studies, although the proportion of mild CP tended to be higher. The availability of a second opinion about the classification of CP/CP subtypes is necessary in order to keep a CP register valid, as well as an active search for undiagnosed CP among children with other impairments.
Pan, Min; Cong, Peikuan; Wang, Yue; Lin, Changsong; Yuan, Ying; Dong, Jian; Banerjee, Santasree; Zhang, Tao; Chen, Yanling; Zhang, Ting; Chen, Mingqing; Hu, Peter; Zheng, Shu; Zhang, Jin; Qi, Ming
2011-12-01
The Human Variome Project (HVP) is an international consortium of clinicians, geneticists, and researchers from over 30 countries, aiming to facilitate the establishment and maintenance of standards, systems, and infrastructure for the worldwide collection and sharing of all genetic variations affecting human disease. The HVP-China Node will build new and supplement existing databases of genetic diseases. As the first effort, we have created a novel variant database of BRCA1 and BRCA2, mismatch repair genes (MMR), and APC genes for breast cancer, Lynch syndrome, and familial adenomatous polyposis (FAP), respectively, in the Chinese population using the Leiden Open Variation Database (LOVD) format. We searched PubMed and some Chinese search engines to collect all the variants of these genes in the Chinese population that have already been detected and reported. There are some differences in the gene variants between the Chinese population and those of other ethnicities. The database is available online at http://www.genomed.org/LOVD/. Our database is visible to users who search other LOVD databases (e.g., through a Google search or an NCBI GeneTests search). Remote submissions are accepted, and the information is updated monthly. © 2011 Wiley Periodicals, Inc.
Comparison of CINAHL, EMBASE, and MEDLINE databases for the nurse researcher.
Burnham, J; Shearer, B
1993-01-01
The purpose of this research was to determine which of three databases, CINAHL, EMBASE or MEDLINE, should be accessed when researching nursing topics. The three databases were searched for citations on topics selected by three nurse researchers and the results were compared. For the search of nursing care literature on a medical condition, it was helpful to search both CINAHL and MEDLINE. CINAHL provided the majority of relevant articles for the second search, on computers and privacy, but inclusion of MEDLINE and EMBASE enhanced retrieval somewhat. The search on substance abuse in pregnancy, not restricted to nursing literature, retrieved better results when searching both MEDLINE and EMBASE. Due to the nature and distribution of the nursing literature, it is especially important for the searcher to understand and respond to the focus of the researcher.
ERIC Educational Resources Information Center
Flatley, Robert K.; Lilla, Rick; Widner, Jack
2007-01-01
This study compared Social Work Abstracts and Social Services Abstracts databases in terms of indexing, journal coverage, and searches. The authors interviewed editors, analyzed journal coverage, and compared searches. It was determined that the databases complement one another more than compete. The authors conclude with some considerations.
The Philip Morris Information Network: A Library Database on an In-House Timesharing System.
ERIC Educational Resources Information Center
DeBardeleben, Marian Z.; And Others
1983-01-01
Outlines a database constructed at Philip Morris Research Center Library which encompasses holdings and circulation and acquisitions records for all items in the library. Host computer (DECSYSTEM-2060), software (BASIC), database design, search methodology, cataloging, and accessibility are noted; sample search, circ-in profile, end user profiles,…
How Many People Search the ERIC Database Each Day?
ERIC Educational Resources Information Center
Rudner, Lawrence
This study estimated the number of people searching the ERIC database each day. The Educational Resources Information Center (ERIC) is a national information system designed to provide ready access to an extensive body of education-related literature. Federal funds traditionally have paid for the development of the database, but not the…
FirstSearch and NetFirst--Web and Dial-up Access: Plus Ca Change, Plus C'est la Meme Chose?
ERIC Educational Resources Information Center
Koehler, Wallace; Mincey, Danielle
1996-01-01
Compares and evaluates the differences between OCLC's dial-up and World Wide Web FirstSearch access methods and their interfaces with the underlying databases. Also examines NetFirst, OCLC's new Internet catalog, the only Internet tracking database from a "traditional" database service. (Author/PEN)
ERIC Educational Resources Information Center
Turner, Laura
2001-01-01
Focuses on the Deep Web, defined as Web content in searchable databases of the type that can be found only by direct query. Discusses the problems of indexing; inability to find information not indexed in the search engine's database; and metasearch engines. Describes 10 sites created to access online databases or directly search them. Lists ways…
A practical approach for inexpensive searches of radiology report databases.
Desjardins, Benoit; Hamilton, R Curtis
2007-06-01
We present a method to perform full text searches of radiology reports for the large number of departments that do not have this ability as part of their radiology or hospital information system. A tool written in Microsoft Access (front-end) has been designed to search a server (back-end) containing an indexed weekly backup copy of the full relational database extracted from a radiology information system (RIS). This front-end/back-end approach has been implemented in a large academic radiology department, and is used for teaching, research and administrative purposes. The second weekly backup of the 80 GB, 4 million record RIS database takes 2 hours. Further indexing of the exported radiology reports takes 6 hours. Individual searches typically take less than 1 minute on the indexed database and 30-60 minutes on the nonindexed database. Guidelines to properly address privacy and institutional review board issues are closely followed by all users. This method has the potential to improve teaching, research, and administrative programs within radiology departments that cannot afford more expensive technology.
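The essential pattern, an indexed copy of the report text queried by a lightweight front end, can be sketched with SQLite's FTS5 full-text index in Python; this is only an analogy to the Access/RIS setup described above, and the table and column names are invented for illustration.

```python
# Analogous sketch using SQLite's FTS5 full-text index in place of the paper's
# Access front-end and indexed RIS back-end; table and column names are
# invented for illustration.
import sqlite3

conn = sqlite3.connect("radiology_reports.db")
conn.execute(
    "CREATE VIRTUAL TABLE IF NOT EXISTS reports USING fts5(accession, report_text)"
)

def load_weekly_backup(rows):
    """rows: iterable of (accession_number, full_report_text) exported from the RIS."""
    with conn:
        conn.executemany("INSERT INTO reports VALUES (?, ?)", rows)

def search_reports(query, limit=50):
    """Full-text search over the indexed copy; the paper reports under a minute
    on the indexed database versus 30-60 minutes on the nonindexed one."""
    cur = conn.execute(
        "SELECT accession, snippet(reports, 1, '[', ']', '...', 10) "
        "FROM reports WHERE reports MATCH ? LIMIT ?",
        (query, limit),
    )
    return cur.fetchall()

# Example: search_reports('"pulmonary embolism"')
```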
Labrecque, Michel; Ratté, Stéphane; Frémont, Pierre; Cauchon, Michel; Ouellet, Jérôme; Hogg, William; McGowan, Jessie; Gagnon, Marie-Pierre; Njoya, Merlin; Légaré, France
2013-10-01
To compare the ability of users of 2 medical search engines, InfoClinique and the Trip database, to provide correct answers to clinical questions and to explore the perceived effects of the tools on the clinical decision-making process. Randomized trial. Three family medicine units of the family medicine program of the Faculty of Medicine at Laval University in Quebec city, Que. Fifteen second-year family medicine residents. Residents generated 30 structured questions about therapy or preventive treatment (2 questions per resident) based on clinical encounters. Using an Internet platform designed for the trial, each resident answered 20 of these questions (their own 2, plus 18 of the questions formulated by other residents, selected randomly) before and after searching for information with 1 of the 2 search engines. For each question, 5 residents were randomly assigned to begin their search with InfoClinique and 5 with the Trip database. The ability of residents to provide correct answers to clinical questions using the search engines, as determined by third-party evaluation. After answering each question, participants completed a questionnaire to assess their perception of the engine's effect on the decision-making process in clinical practice. Of 300 possible pairs of answers (1 answer before and 1 after the initial search), 254 (85%) were produced by 14 residents. Of these, 132 (52%) and 122 (48%) pairs of answers concerned questions that had been assigned an initial search with InfoClinique and the Trip database, respectively. Both engines produced an important and similar absolute increase in the proportion of correct answers after searching (26% to 62% for InfoClinique, for an increase of 36%; 24% to 63% for the Trip database, for an increase of 39%; P = .68). For all 30 clinical questions, at least 1 resident produced the correct answer after searching with either search engine. The mean (SD) time of the initial search for each question was 23.5 (7.6) minutes with InfoClinique and 22.3 (7.8) minutes with the Trip database (P = .30). Participants' perceptions of each engine's effect on the decision-making process were very positive and similar for both search engines. Family medicine residents' ability to provide correct answers to clinical questions increased dramatically and similarly with the use of both InfoClinique and the Trip database. These tools have strong potential to increase the quality of medical care.
Alam-mehrjerdi, Zahra; Mokri, Azarakhsh; Dolan, Kate
2015-08-01
Methamphetamine use is a new health concern in Iran, the most populated Persian Gulf country. However, there is no well-documented literature. The current study objectives were to systematically review all published English and Persian studies of the prevalence of methamphetamine use, the general physical and psychiatric-related harms and the availability of methamphetamine treatment and harm reduction services for adult users in Iran. A comprehensive search of the international peer-reviewed and gray literature was undertaken. Multiple electronic and scientific English and Persian databases were systematically searched from January 2002 to September 2014. Additionally, English and Persian gray literature on methamphetamine use was sought using online gray literature databases, library databases and general online searches over the same period of time. Nineteen thousand and two hundred and eight studies, reports and conference papers were identified but only 42 studies were relevant to the study objectives. They were mainly published in 2010-2014. The search results confirmed the seizures of methamphetamine (six studies), the prevalence of methamphetamine use among the general population (three studies), drug users (four studies), women (nine studies) and opiate users in opiate treatment programs (five studies). In addition, methamphetamine use had resulted in blood-borne viral infections (one study), psychosis and intoxication (ten studies). Different reasons had facilitated methamphetamine use. However, the Matrix Model, community therapy and harm reduction services (four studies) had been provided for methamphetamine users in some cities. The current situation of methamphetamine use necessitates more research on the epidemiology and health-related implications. These studies should help in identifying priorities for designing and implementing prevention and educational programs. More active models of engagement with Persian methamphetamine users and the provision of services that meet their specific treatment needs are required. Copyright © 2015 Elsevier B.V. All rights reserved.
Tai, David; Fang, Jianwen
2012-08-27
The large sizes of today's chemical databases require efficient algorithms to perform similarity searches, and comparing two large chemical databases can be very time consuming. This paper builds upon existing research efforts by describing a novel strategy for accelerating existing search algorithms for comparing large chemical collections. The quest for efficiency has focused on developing better indexing algorithms, creating heuristics for searching an individual chemical against a chemical library by detecting and eliminating needless similarity calculations. For comparing two chemical collections, these algorithms simply execute searches for each chemical in the query set sequentially. The strategy presented in this paper achieves a speedup over these algorithms by indexing the set of all query chemicals so that redundant calculations arising from sequential searches are eliminated. We implement this strategy in a similarity search program called Symmetric inDexing, or SymDex. SymDex shows a maximum speedup of over 232% compared with the state-of-the-art single-query search algorithm on real data for various fingerprint lengths. Considerable speedup is seen even for batch searches where query set sizes are relatively small compared with typical database sizes. To the best of our knowledge, SymDex is the first search algorithm designed specifically for comparing chemical libraries. It can be adapted to most, if not all, existing indexing algorithms and shows potential for accelerating future similarity search algorithms for comparing chemical databases.
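A simplified Python sketch of the batch-indexing idea follows: grouping query fingerprints by popcount lets the standard Tanimoto popcount bound prune whole groups of queries at once, instead of re-evaluating it per query as sequential single-query searching does. This is only in the spirit of query-set indexing, not the SymDex implementation itself.

```python
# Illustrative sketch of batch fingerprint search with a shared popcount bound,
# in the spirit of indexing the query set; this is not the SymDex algorithm.
from collections import defaultdict

def tanimoto(a, b):
    """a, b: bit-vector fingerprints encoded as Python ints."""
    inter = bin(a & b).count("1")
    union = bin(a).count("1") + bin(b).count("1") - inter
    return inter / union if union else 0.0

def batch_search(queries, library, threshold=0.7):
    """Group queries by popcount so the bound min(|A|,|B|)/max(|A|,|B|) is
    checked once per group for each library entry, rather than once per
    (query, entry) pair as in sequential single-query searches."""
    by_count = defaultdict(list)
    for qid, q in queries.items():
        if q:
            by_count[bin(q).count("1")].append((qid, q))

    hits = defaultdict(list)
    for lid, fp in library.items():
        n = bin(fp).count("1")
        if n == 0:
            continue
        for qc, group in by_count.items():
            if min(qc, n) / max(qc, n) < threshold:  # prune the whole query group
                continue
            for qid, q in group:
                s = tanimoto(q, fp)
                if s >= threshold:
                    hits[qid].append((lid, s))
    return hits
```

The bound is valid because the intersection count can never exceed the smaller popcount and the union count can never be smaller than the larger one.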
The MIGenAS integrated bioinformatics toolkit for web-based sequence analysis
Rampp, Markus; Soddemann, Thomas; Lederer, Hermann
2006-01-01
We describe a versatile and extensible integrated bioinformatics toolkit for the analysis of biological sequences over the Internet. The web portal offers convenient interactive access to a growing pool of chainable bioinformatics software tools and databases that are centrally installed and maintained by the RZG. Currently, supported tasks comprise sequence similarity searches in public or user-supplied databases, computation and validation of multiple sequence alignments, phylogenetic analysis and protein–structure prediction. Individual tools can be seamlessly chained into pipelines allowing the user to conveniently process complex workflows without the necessity to take care of any format conversions or tedious parsing of intermediate results. The toolkit is part of the Max-Planck Integrated Gene Analysis System (MIGenAS) of the Max Planck Society available at (click ‘Start Toolkit’). PMID:16844980
NASA Astrophysics Data System (ADS)
Bulan, Orhan; Bernal, Edgar A.; Loce, Robert P.; Wu, Wencheng
2013-03-01
Video cameras are widely deployed along city streets, interstate highways, traffic lights, stop signs and toll booths by entities that perform traffic monitoring and law enforcement. The videos captured by these cameras are typically compressed and stored in large databases. A rapid search for a specific vehicle within a large database of compressed videos is often required and can be time-critical, even a matter of life or death. In this paper, we propose video compression and decompression algorithms that enable fast and efficient vehicle or, more generally, event searches in large video databases. While compressing a video sequence, the proposed algorithm selects reference frames (i.e., I-frames) based on a vehicle having been detected at a specified position within the scene being monitored. A search for a specific vehicle in the compressed video stream is then performed across the reference frames only, which does not require decompression of the full video sequence as in traditional search algorithms. Our experimental results on videos captured on a local road show that the proposed algorithm significantly reduces the search space (thus reducing time and computational resources) in vehicle search tasks within compressed video streams, particularly those captured in light traffic conditions.
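A conceptual Python sketch of the two pieces, detection-driven reference-frame selection during compression and a search restricted to reference frames, is shown below; the detector, matcher, and codec integration are placeholders rather than the authors' algorithm.

```python
# Conceptual sketch of detection-driven reference-frame (I-frame) selection and
# of searching over reference frames only. The detector, matcher, and codec
# integration are placeholders, not the authors' compression algorithm.

def select_reference_frames(frames, vehicle_detected, gop_max=250):
    """Mark a frame as a reference frame when a vehicle is detected at the
    trigger position, or when the group of pictures grows too long; all other
    frames would be encoded as predicted frames by the codec."""
    refs, since_last = [], gop_max
    for i, frame in enumerate(frames):
        since_last += 1
        if vehicle_detected(frame) or since_last >= gop_max:
            refs.append(i)
            since_last = 0
    return refs

def search_reference_frames(decoded_refs, matches_target):
    """decoded_refs: iterable of (frame_index, frame) for reference frames only,
    so the rest of the compressed stream never needs to be decoded."""
    return [i for i, frame in decoded_refs if matches_target(frame)]
```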
Mobilio, Dominick; Walker, Gary; Brooijmans, Natasja; Nilakantan, Ramaswamy; Denny, R Aldrin; Dejoannis, Jason; Feyfant, Eric; Kowticwar, Rupesh K; Mankala, Jyoti; Palli, Satish; Punyamantula, Sairam; Tatipally, Maneesh; John, Reji K; Humblet, Christine
2010-08-01
The Protein Data Bank is the most comprehensive source of experimental macromolecular structures. It can, however, be difficult at times to locate relevant structures with the Protein Data Bank search interface. This is particularly true when searching for complexes containing specific interactions between protein and ligand atoms. Moreover, searching within a family of proteins can be tedious. For example, one cannot search for a particular conserved residue, as residue numbers vary across structures. We describe herein three databases, Protein Relational Database, Kinase Knowledge Base, and Matrix Metalloproteinase Knowledge Base, containing protein structures from the Protein Data Bank. In Protein Relational Database, atom-atom distances between protein and ligand have been precalculated, allowing millisecond retrieval based on atom identity and distance constraints. Ring centroids, centroid-centroid and centroid-atom distances and angles have also been included, permitting queries for pi-stacking interactions and other structural motifs involving rings. Other geometric features can be searched through the inclusion of residue pair and triplet distances. In Kinase Knowledge Base and Matrix Metalloproteinase Knowledge Base, the catalytic domains have been aligned into common residue numbering schemes. Thus, by searching across Protein Relational Database and Kinase Knowledge Base, one can easily retrieve structures wherein, for example, a ligand of interest is making contact with the gatekeeper residue.
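A toy version of such a precomputed-contact store can be sketched with SQLite in Python; the schema and column names below are hypothetical stand-ins for the Protein Relational Database tables, which the abstract does not specify.

```python
# Hypothetical schema in the spirit of a precalculated protein-ligand contact
# table; the real Protein Relational Database tables and column names are not
# given in the abstract.
import sqlite3

conn = sqlite3.connect("contacts.db")
conn.executescript("""
CREATE TABLE IF NOT EXISTS contact (
    pdb_id    TEXT,
    prot_res  TEXT,   -- e.g. 'ASP65'
    prot_atom TEXT,   -- e.g. 'OD1'
    lig_name  TEXT,
    lig_atom  TEXT,   -- e.g. 'N1'
    dist      REAL    -- precomputed distance in angstroms
);
CREATE INDEX IF NOT EXISTS contact_dist ON contact (dist);
""")

def polar_contacts(max_dist=3.5):
    """Distance-constrained retrieval becomes an indexed range query instead of
    recomputing geometry for every structure at search time."""
    return conn.execute(
        "SELECT pdb_id, prot_res, prot_atom, lig_name, lig_atom, dist "
        "FROM contact "
        "WHERE prot_atom LIKE 'O%' AND lig_atom LIKE 'N%' AND dist <= ? "
        "ORDER BY dist",
        (max_dist,),
    ).fetchall()
```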
Pre-Service Teachers' Use of Library Databases: Some Insights
ERIC Educational Resources Information Center
Lamb, Janeen; Howard, Sarah; Easey, Michael
2014-01-01
The aim of this study is to investigate if providing mathematics education pre-service teachers with animated library tutorials on library and database searches changes their searching practices. This study involved the completion of a survey by 138 students and seven individual interviews before and after library search demonstration videos were…
Common hyperspectral image database design
NASA Astrophysics Data System (ADS)
Tian, Lixun; Liao, Ningfang; Chai, Ali
2009-11-01
This paper introduces the Common Hyperspectral Image Database (CHIDB), built with a demand-oriented database design method, which brings together ground-based spectra, standardized hyperspectral cubes and spectral analysis to serve a range of applications. The paper presents an integrated approach to retrieving spectral and spatial patterns from remotely sensed imagery using state-of-the-art data mining and advanced database technologies; data mining ideas and functions were incorporated into CHIDB to make it more suitable for agricultural, geological and environmental applications. A broad range of data from multiple regions of the electromagnetic spectrum is supported, including ultraviolet, visible, near-infrared, thermal infrared, and fluorescence. CHIDB is based on the .NET framework and designed with an MVC architecture comprising five main functional modules: Data Importer/Exporter, Image/Spectrum Viewer, Data Processor, Parameter Extractor, and On-line Analyzer. The original data are stored in SQL Server 2008 for efficient search, query and update, and advanced spectral image data processing techniques such as parallel processing in C# are used. Finally, an application case in agricultural disease detection is presented.
MGDB: a comprehensive database of genes involved in melanoma.
Zhang, Di; Zhu, Rongrong; Zhang, Hanqian; Zheng, Chun-Hou; Xia, Junfeng
2015-01-01
The Melanoma Gene Database (MGDB) is a manually curated catalog of molecular genetic data relating to genes involved in melanoma. The main purpose of this database is to establish a network of melanoma related genes and to facilitate the mechanistic study of melanoma tumorigenesis. The entries describing the relationships between melanoma and genes in the current release were manually extracted from PubMed abstracts; the database currently contains 527 human melanoma genes (422 protein-coding and 105 non-coding genes). Each melanoma gene was annotated in seven different aspects (General Information, Expression, Methylation, Mutation, Interaction, Pathway and Drug). In addition, manually curated literature references have also been provided to support the inclusion of the gene in MGDB and establish its association with melanoma. MGDB has a user-friendly web interface with multiple browse and search functions. We hope MGDB will enrich our knowledge about melanoma genetics and serve as a useful complement to the existing public resources. Database URL: http://bioinfo.ahu.edu.cn:8080/Melanoma/index.jsp. © The Author(s) 2015. Published by Oxford University Press.
An annotated database of Arabidopsis mutants of acyl lipid metabolism
McGlew, Kathleen; Shaw, Vincent; Zhang, Meng; ...
2014-12-10
Mutants have played a fundamental role in gene discovery and in understanding the function of genes involved in plant acyl lipid metabolism. The first mutant in Arabidopsis lipid metabolism (fad4) was described in 1985. Since that time, characterization of mutants in more than 280 genes associated with acyl lipid metabolism has been reported. This review provides a brief background and history on identification of mutants in acyl lipid metabolism, an analysis of the distribution of mutants in different areas of acyl lipid metabolism and presents an annotated database (ARALIPmutantDB) of these mutants. The database provides information on the phenotypes of mutants, pathways and enzymes/proteins associated with the mutants, and allows rapid access via hyperlinks to summaries of information about each mutant and to literature that provides information on the lipid composition of the mutants. Mutants for at least 30 % of the genes in the database have multiple names, which have been compiled here to reduce ambiguities in searches for information. Furthermore, the database should also provide a tool for exploring the relationships between mutants in acyl lipid-related genes and their lipid phenotypes and point to opportunities for further research.
Chang, Suhua; Zhang, Jiajie; Liao, Xiaoyun; Zhu, Xinxing; Wang, Dahai; Zhu, Jiang; Feng, Tao; Zhu, Baoli; Gao, George F; Wang, Jian; Yang, Huanming; Yu, Jun; Wang, Jing
2007-01-01
Frequent outbreaks of highly pathogenic avian influenza and the increasing data available for comparative analysis require a central database specialized in influenza viruses (IVs). We have established the Influenza Virus Database (IVDB) to integrate information and create an analysis platform for genetic, genomic, and phylogenetic studies of the virus. IVDB hosts complete genome sequences of influenza A virus generated by Beijing Institute of Genomics (BIG) and curates all other published IV sequences after expert annotation. Our Q-Filter system classifies and ranks all nucleotide sequences into seven categories according to sequence content and integrity. IVDB provides a series of tools and viewers for comparative analysis of the viral genomes, genes, genetic polymorphisms and phylogenetic relationships. A search system has been developed for users to retrieve a combination of different data types by setting search options. To facilitate analysis of global viral transmission and evolution, the IV Sequence Distribution Tool (IVDT) has been developed to display the worldwide geographic distribution of chosen viral genotypes and to couple genomic data with epidemiological data. The BLAST, multiple sequence alignment and phylogenetic analysis tools were integrated for online data analysis. Furthermore, IVDB offers instant access to pre-computed alignments and polymorphisms of IV genes and proteins, and presents the results as SNP distribution plots and minor allele distributions. IVDB is publicly available at http://influenza.genomics.org.cn.
Inoue, J
1991-12-01
When occupational health personnel, especially occupational physicians, search bibliographies, they usually have to do so by themselves. Also, if a library is not available because of the location of their workplace, they might have to rely on online databases. Although there are many commercial databases in the world, people who seldom use them will have problems with online searching, such as the user-computer interface, keywords, and so on. The present study surveyed, by questionnaire, the best bibliographic searching system in the field of occupational medicine, using DIALOG OnDisc MEDLINE as the commercial database. To ascertain the problems involved in determining the best bibliographic searching system, a prototype bibliographic searching system was constructed and then evaluated. Finally, solutions for the problems were discussed. This led to the following conclusions for constructing the best bibliographic searching system at the present time: 1) a concept of micro-to-mainframe links (MML) is needed for the computer hardware network; 2) multilingual font standards and an excellent common user-computer interface are needed for the computer software; and 3) a short course on database management systems and support for personal information processing of retrieved data are necessary for the practical use of the system.
What Searches Do Users Run on PEDro? An Analysis of 893,971 Search Commands Over a 6-Month Period.
Stevens, Matthew L; Moseley, Anne; Elkins, Mark R; Lin, Christine C-W; Maher, Chris G
2016-08-05
Clinicians must be able to search effectively for relevant research if they are to provide evidence-based healthcare. It is therefore relevant to consider how users search databases of evidence in healthcare, including what information users look for and what search strategies they employ. To date such analyses have been restricted to the PubMed database. Although the Physiotherapy Evidence Database (PEDro) is searched millions of times each year, no studies have investigated how users search PEDro. To assess the content and quality of searches conducted on PEDro. Searches conducted on the PEDro website over 6 months were downloaded and the 'get' commands and page-views extracted. The following data were tabulated: the 25 most common searches; the number of search terms used; the frequency of use of simple and advanced searches, including the use of each advanced search field; and the frequency of use of various search strategies. Between August 2014 and January 2015, 893,971 search commands were entered on PEDro. Fewer than 18 % of these searches used the advanced search features of PEDro. 'Musculoskeletal' was the most common subdiscipline searched, while 'low back pain' was the most common individual search. Around 20 % of all searches contained errors. PEDro is a commonly used evidence resource, but searching appears to be sub-optimal in many cases. The effectiveness of searches conducted by users needs to improve, which could be facilitated by methods such as targeted training and amending the search interface.
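The kind of tabulation described can be sketched in a few lines of Python, assuming a hypothetical log file with one GET request URL per line and the free-text query carried in a 'q' parameter; the real PEDro field names are not given in the abstract.

```python
# Sketch of the tabulation described, assuming a hypothetical log format with
# one GET request URL per line and the free-text query in a 'q' parameter.
from collections import Counter
from urllib.parse import parse_qs, urlparse

def summarize_searches(log_path, top_n=25):
    queries, term_counts, advanced = Counter(), [], 0
    with open(log_path) as fh:
        for line in fh:
            params = parse_qs(urlparse(line.strip()).query)
            text = " ".join(params.pop("q", [])).lower().strip()
            if text:
                queries[text] += 1
                term_counts.append(len(text.split()))
            if any(v and v != [""] for v in params.values()):
                advanced += 1   # any other field filled in counts as an advanced search
    return queries.most_common(top_n), term_counts, advanced
```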
Microlithography and resist technology information at your fingertips via SciFinder
NASA Astrophysics Data System (ADS)
Konuk, Rengin; Macko, John R.; Staggenborg, Lisa
1997-07-01
Finding and retrieving the information you need about microlithography and resist technology in a timely fashion can make or break your competitive edge in today's business environment. Chemical Abstracts Service (CAS) provides the most complete and comprehensive database of the chemical literature in the CAplus, REGISTRY, and CASREACT files including 13 million document references, 15 million substance records and over 1.2 million reactions. This includes comprehensive coverage of positive and negative resist formulations and processing, photoacid generation, silylation, single and multilayer resist systems, photomasks, dry and wet etching, photolithography, electron-beam, ion-beam and x-ray lithography technologies and process control, optical tools, exposure systems, radiation sources and steppers. Journal articles, conference proceedings and patents related to microlithography and resist technology are analyzed and indexed by scientific information analysts with strong technical background in these areas. The full CAS database, which is updated weekly with new information, is now available at your desktop, via a convenient, user-friendly tool called 'SciFinder.' Author, subject and chemical substance searching is simplified by SciFinder's smart search features. Chemical substances can be searched by chemical structure, chemical name, CAS registry number or molecular formula. Drawing chemical structures in SciFinder is easy and does not require compliance with CA conventions. Built-in intelligence of SciFinder enables users to retrieve substances with multiple components, tautomeric forms and salts.
The Giardia genome project database.
McArthur, A G; Morrison, H G; Nixon, J E; Passamaneck, N Q; Kim, U; Hinkle, G; Crocker, M K; Holder, M E; Farr, R; Reich, C I; Olsen, G E; Aley, S B; Adam, R D; Gillin, F D; Sogin, M L
2000-08-15
The Giardia genome project database provides an online resource for Giardia lamblia (WB strain, clone C6) genome sequence information. The database includes edited single-pass reads, the results of BLASTX searches, and details of progress towards sequencing the entire 12 million-bp Giardia genome. Pre-sorted BLASTX results can be retrieved based on keyword searches and BLAST searches of the high throughput Giardia data can be initiated from the web site or through NCBI. Descriptions of the genomic DNA libraries, project protocols and summary statistics are also available. Although the Giardia genome project is ongoing, new sequences are made available on a bi-monthly basis to ensure that researchers have access to information that may assist them in the search for genes and their biological function. The current URL of the Giardia genome project database is www.mbl.edu/Giardia.
Multimedia explorer: image database, image proxy-server and search-engine.
Frankewitsch, T.; Prokosch, U.
1999-01-01
Multimedia plays a major role in medicine. Databases containing images, movies or other types of multimedia objects are increasing in number, especially on the WWW. However, no good retrieval mechanism or search engine currently exists to efficiently track down such multimedia sources in the vast amount of information provided by the WWW. Secondly, the tools for searching databases are usually not adapted to the properties of images. HTML pages do not allow complex searches. Therefore, establishing more convenient retrieval involves the use of a higher programming level such as JAVA. With this platform-independent language it is possible to create extensions to commonly used web browsers. These applets offer a graphical user interface for high-level navigation. We implemented a database using JAVA objects as the primary storage containers, which are then stored in a JAVA-controlled ORACLE8 database. Navigation depends on a structured vocabulary enhanced by a semantic network. With this approach, multimedia objects can be encapsulated within a logical module for quick data retrieval. PMID:10566463
DOE Office of Scientific and Technical Information (OSTI.GOV)
Merkley, Eric D.; Anderson, Brian J.; Park, Jea H.
2012-12-07
Multiheme c-type cytochromes (proteins with covalently attached heme c moieties) play important roles in extracellular metal respiration in dissimilatory metal-reducing bacteria. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) characterization of c-type cytochromes is hindered by the presence of multiple heme groups, since the heme c modified peptides are typically not observed, or if observed, not identified. Using a recently reported histidine affinity chromatography (HAC) procedure, we enriched heme c tryptic peptides from purified bovine heart cytochrome c and a bacterial decaheme cytochrome, and subjected these samples to LC-MS/MS analysis. Enriched bovine cytochrome c samples yielded three- to six-fold more confident peptide-spectrum matches to heme c-containing peptides than unenriched digests. In unenriched digests of the decaheme cytochrome MtoA from Sideroxydans lithotrophicus ES-1, heme c peptides for four of the ten expected sites were observed by LC-MS/MS; following HAC fractionation, peptides covering nine out of ten sites were obtained. Heme c peptide spiked into E. coli lysates at mass ratios as low as 10(-4) was detected with good signal-to-noise after HAC and LC-MS/MS analysis. In addition to HAC, we have developed a proteomics database search strategy that takes into account the unique physicochemical properties of heme c peptides. The results suggest that accounting for the double thioether link between heme c and peptide, and the use of the labile heme fragment as a reporter ion, can improve database searching results. The combination of affinity chromatography and heme-specific informatics yielded increases in the number of peptide-spectrum matches of 20-100-fold for bovine cytochrome c.
Chiu, Yi-Yuan; Lin, Chun-Yu; Lin, Chih-Ta; Hsu, Kai-Cheng; Chang, Li-Zen; Yang, Jinn-Moon
2012-01-01
To discover a compound inhibiting multiple proteins (i.e. polypharmacological targets) is a new paradigm for treating complex diseases (e.g. cancers and diabetes). In general, polypharmacological proteins often share similar local binding environments and motifs. With the exponential growth in the number of protein structures, finding similar structural binding motifs (pharma-motifs) has become an urgent task for drug discovery (e.g. side effects and new uses for old drugs) and for understanding protein functions. We have developed a Space-Related Pharmamotifs (called SRPmotif) method to recognize binding motifs by searching against a protein structure database. SRPmotif is able to recognize conserved binding environments containing spatially discontinuous pharma-motifs, which are often short conserved peptides with specific physico-chemical properties important for protein functions. Among 356 pharma-motifs, 56.5% of interacting residues are highly conserved. Experimental results indicate that 81.1% and 92.7% of polypharmacological targets of each protein-ligand complex are annotated with the same biological process (BP) and molecular function (MF) terms, respectively, based on Gene Ontology (GO). Our experimental results show that the identified pharma-motifs often consist of key residues in functional (active) sites and play key roles in protein functions. SRPmotif is available at http://gemdock.life.nctu.edu.tw/SRP/. SRPmotif is able to identify similar pharma-interfaces and pharma-motifs sharing similar binding environments for polypharmacological targets by rapidly searching against the protein structure database. Pharma-motifs describe the conservation of binding environments for drug discovery and protein functions. Additionally, these pharma-motifs provide clues for discovering new sequence-based motifs to predict protein functions from protein sequence databases. We believe that SRPmotif is useful for elucidating protein functions and for drug discovery.
Zhang, Zhe-wen; Cheng, Juan; Liu, Zhuan; Ma, Ji-chun; Li, Jin-long; Wang, Jing; Yang, Ke-hu
2015-12-07
The aim of this study was to examine the epidemiological and reporting characteristics as well as the methodological quality of meta-analyses (MAs) of observational studies published in Chinese journals. Five Chinese databases were searched for MAs of observational studies published from January 1978 to May 2014. Data were extracted into Excel spreadsheets, and Meta-analysis of Observational Studies in Epidemiology (MOOSE) and Assessment of Multiple Systematic Reviews (AMSTAR) checklists were used to assess reporting characteristics and methodological quality, respectively. A total of 607 MAs were included. Only 52.2% of the MAs assessed the quality of the included primary studies, and the retrieval information was not comprehensive in most (85.8%) of the MAs. In addition, 50 (8.2%) MAs did not search any Chinese databases, while 126 (20.8%) studies did not search any English databases. Approximately 41.2% of the MAs did not describe the statistical methods in sufficient detail, and most (95.5%) MAs did not report on conflicts of interest. However, compared with the period before publication of the MOOSE Checklist, the quality of reporting improved significantly for 20 subitems after its publication, and 7 items of the included MAs demonstrated significant improvement after publication of the AMSTAR Checklist (p<0.05). Although many MAs of observational studies have been published in Chinese journals, the reporting quality is questionable. Thus, there is an urgent need to increase the use of reporting guidelines and methodological tools in China; we recommend that Chinese journals adopt the MOOSE and AMSTAR criteria. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/
Gacesa, Ranko; Zucko, Jurica; Petursdottir, Solveig K; Gudmundsdottir, Elisabet Eik; Fridjonsson, Olafur H; Diminic, Janko; Long, Paul F; Cullum, John; Hranueli, Daslav; Hreggvidsson, Gudmundur O; Starcevic, Antonio
2017-06-01
The MEGGASENSE platform constructs relational databases of DNA or protein sequences. The default functional analysis uses 14 106 hidden Markov model (HMM) profiles based on sequences in the KEGG database. The Solr search engine allows sophisticated queries and a BLAST search function is also incorporated. These standard capabilities were used to generate the SCATT database from the predicted proteome of Streptomyces cattleya . The implementation of a specialised metagenome database (AMYLOMICS) for bioprospecting of carbohydrate-modifying enzymes is described. In addition to standard assembly of reads, a novel 'functional' assembly was developed, in which screening of reads with the HMM profiles occurs before the assembly. The AMYLOMICS database incorporates additional HMM profiles for carbohydrate-modifying enzymes and it is illustrated how the combination of HMM and BLAST analyses helps identify interesting genes. A variety of different proteome and metagenome databases have been generated by MEGGASENSE.
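The 'functional assembly' step can be sketched as a read filter applied before assembly; in the Python sketch below, hmm_hits() and assemble() are placeholders for a real HMM scorer (e.g., an hmmsearch wrapper) and an assembler, and are not part of the MEGGASENSE code base.

```python
# Rough sketch of the 'functional assembly' idea: screen raw reads against the
# HMM profiles of interest first and assemble only the reads that hit.
# `hmm_hits(read, profiles)` and `assemble(reads)` are placeholders for a real
# HMM scorer (e.g. an hmmsearch wrapper) and an assembler.

def functional_assembly(reads, profiles, hmm_hits, assemble, min_score=20.0):
    """Keep only reads matching at least one target profile, then assemble the
    reduced read set instead of assembling the whole metagenome first."""
    selected = [r for r in reads
                if any(score >= min_score for score in hmm_hits(r, profiles))]
    return assemble(selected)
```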
Characterizing the genetic structure of a forensic DNA database using a latent variable approach.
Kruijver, Maarten
2016-07-01
Several problems in forensic genetics require a representative model of a forensic DNA database. Obtaining an accurate representation of the offender database can be difficult, since databases typically contain groups of persons with unregistered ethnic origins in unknown proportions. We propose to estimate the allele frequencies of the subpopulations comprising the offender database and their proportions from the database itself using a latent variable approach. We present a model for which parameters can be estimated using the expectation maximization (EM) algorithm. This approach does not rely on relatively small and possibly unrepresentative population surveys, but is driven by the actual genetic composition of the database only. We fit the model to a snapshot of the Dutch offender database (2014), which contains close to 180,000 profiles, and find that three subpopulations suffice to describe a large fraction of the heterogeneity in the database. We demonstrate the utility and reliability of the approach with three applications. First, we use the model to predict the number of false leads obtained in database searches. We assess how well the model predicts the number of false leads obtained in mock searches in the Dutch offender database, both for the case of familial searching for first degree relatives of a donor and searching for contributors to three-person mixtures. Second, we study the degree of partial matching between all pairs of profiles in the Dutch database and compare this to what is predicted using the latent variable approach. Third, we use the model to provide evidence to support that the Dutch practice of estimating match probabilities using the Balding-Nichols formula with a native Dutch reference database and θ=0.03 is conservative. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
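A toy Python sketch of the latent-variable estimation follows, assuming unlinked loci in Hardy-Weinberg equilibrium within each subpopulation; the data layout, random initialisation, and use of plain (non-log) likelihoods are simplifications, and this is not the paper's implementation.

```python
# Toy EM for the latent-subpopulation model: each profile is a genotype at
# independent loci, each subpopulation has its own allele frequencies, and the
# database is a finite mixture of subpopulations.
import random
from collections import Counter, defaultdict

def geno_prob(freqs, a, b, floor=1e-6):
    """Hardy-Weinberg genotype probability under one subpopulation."""
    pa, pb = max(freqs.get(a, 0.0), floor), max(freqs.get(b, 0.0), floor)
    return pa * pa if a == b else 2.0 * pa * pb

def em(profiles, k, n_iter=100, seed=0):
    """profiles: list of dicts {locus: (allele_a, allele_b)}.
    Returns mixture weights and per-subpopulation allele frequencies."""
    rng = random.Random(seed)
    loci = sorted(profiles[0])
    # initialise each component from overall allele counts perturbed with noise
    overall = {loc: Counter(a for p in profiles for a in p[loc]) for loc in loci}
    freqs = []
    for _ in range(k):
        comp = {}
        for loc in loci:
            noisy = {al: c * (1.0 + rng.random()) for al, c in overall[loc].items()}
            total = sum(noisy.values())
            comp[loc] = {al: v / total for al, v in noisy.items()}
        freqs.append(comp)
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: posterior membership of each profile in each subpopulation
        resp = []
        for p in profiles:
            liks = []
            for j in range(k):
                lik = weights[j]
                for loc in loci:
                    a, b = p[loc]
                    lik *= geno_prob(freqs[j][loc], a, b)
                liks.append(lik)
            z = sum(liks) or 1.0
            resp.append([l / z for l in liks])
        # M-step: update weights and allele frequencies from weighted counts
        weights = [sum(r[j] for r in resp) / len(profiles) for j in range(k)]
        for j in range(k):
            for loc in loci:
                counts = defaultdict(float)
                for p, r in zip(profiles, resp):
                    for allele in p[loc]:
                        counts[allele] += r[j]
                total = sum(counts.values()) or 1.0
                freqs[j][loc] = {al: c / total for al, c in counts.items()}
    return weights, freqs
```

At the scale of a real national database, log-space likelihoods and several random restarts would be needed, but the E-step/M-step structure is the same.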
Kobayashi, Norio; Ishii, Manabu; Takahashi, Satoshi; Mochizuki, Yoshiki; Matsushima, Akihiro; Toyoda, Tetsuro
2011-07-01
As global cloud frameworks for bioinformatics research databases become huge and heterogeneous, solutions face diametrically opposed challenges of cross-integration, retrieval, security and openness. To address this, as of March 2011 organizations including RIKEN published 192 mammalian, plant and protein life sciences databases having 8.2 million data records, integrated as Linked Open or Private Data (LOD/LPD) using SciNetS.org, the Scientists' Networking System. The huge quantity of linked data this database integration framework covers is based on the Semantic Web, where researchers collaborate by managing metadata across public and private databases in a secured data space. This outstripped the data query capacity of existing interface tools like SPARQL. Actual research also requires specialized tools for data analysis using raw original data. To solve these challenges, in December 2009 we developed the lightweight Semantic-JSON interface to access each fragment of linked and raw life sciences data securely under the control of programming languages popularly used by bioinformaticians such as Perl and Ruby. Researchers successfully used the interface across 28 million semantic relationships for biological applications including genome design, sequence processing, inference over phenotype databases, full-text search indexing and human-readable contents like ontology and LOD tree viewers. Semantic-JSON services of SciNetS.org are provided at http://semanticjson.org.
MetaRNA-Seq: An Interactive Tool to Browse and Annotate Metadata from RNA-Seq Studies.
Kumar, Pankaj; Halama, Anna; Hayat, Shahina; Billing, Anja M; Gupta, Manish; Yousri, Noha A; Smith, Gregory M; Suhre, Karsten
2015-01-01
The number of RNA-Seq studies has grown in recent years. The design of RNA-Seq studies varies from very simple (e.g., two-condition case-control) to very complicated (e.g., time series involving multiple samples at each time point with separate drug treatments). Most of these publicly available RNA-Seq studies are deposited in NCBI databases, but their metadata are scattered throughout four different databases: Sequence Read Archive (SRA), Biosample, Bioprojects, and Gene Expression Omnibus (GEO). Although the NCBI web interface is able to provide all of the metadata information, retrieving study- or project-level information often requires significant effort, traversing multiple hyperlinks from page to page. Moreover, project- and study-level metadata lack manual or automatic curation by categories, such as disease type, time series, case-control, or replicate type, which are vital to comprehending any RNA-Seq study. Here we describe "MetaRNA-Seq," a new tool for interactively browsing, searching, and annotating RNA-Seq metadata with the capability of semiautomatic curation at the study level.
Is Library Database Searching a Language Learning Activity?
ERIC Educational Resources Information Center
Bordonaro, Karen
2010-01-01
This study explores how non-native speakers of English think of words to enter into library databases when they begin the process of searching for information in English. At issue is whether or not language learning takes place when these students use library databases. Language learning in this study refers to the use of strategies employed by…
Parallel database search and prime factorization with magnonic holographic memory devices
DOE Office of Scientific and Technical Information (OSTI.GOV)
Khitun, Alexander
In this work, we describe the capabilities of Magnonic Holographic Memory (MHM) for parallel database search and prime factorization. MHM is a type of holographic device, which utilizes spin waves for data transfer and processing. Its operation is based on the correlation between the phases and the amplitudes of the input spin waves and the output inductive voltage. The input of MHM is provided by the phased array of spin wave generating elements, allowing the production of phase patterns of arbitrary form. The latter makes it possible to code logic states into the phases of propagating waves and exploit wave superposition for parallel data processing. We present the results of numerical modeling illustrating parallel database search and prime factorization. The results of numerical simulations on the database search are in agreement with the available experimental data. The use of classical wave interference may result in a significant speedup over conventional digital logic circuits in special task data processing (e.g., √n in database search). Potentially, magnonic holographic devices can be implemented as complementary logic units to digital processors. Physical limitations and technological constraints of the spin wave approach are also discussed.
Parallel database search and prime factorization with magnonic holographic memory devices
NASA Astrophysics Data System (ADS)
Khitun, Alexander
2015-12-01
In this work, we describe the capabilities of Magnonic Holographic Memory (MHM) for parallel database search and prime factorization. MHM is a type of holographic device, which utilizes spin waves for data transfer and processing. Its operation is based on the correlation between the phases and the amplitudes of the input spin waves and the output inductive voltage. The input of MHM is provided by the phased array of spin wave generating elements, allowing the production of phase patterns of arbitrary form. The latter makes it possible to code logic states into the phases of propagating waves and exploit wave superposition for parallel data processing. We present the results of numerical modeling illustrating parallel database search and prime factorization. The results of numerical simulations on the database search are in agreement with the available experimental data. The use of classical wave interference may result in a significant speedup over conventional digital logic circuits in special task data processing (e.g., √n in database search). Potentially, magnonic holographic devices can be implemented as complementary logic units to digital processors. Physical limitations and technological constraints of the spin wave approach are also discussed.
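To make the quoted √n scaling concrete, a back-of-envelope comparison (illustrative numbers, not taken from the paper):

% Rough scaling comparison for an unsorted database of n records. A classical
% exhaustive search inspects O(n) records, while a wave-interference
% (Grover-like) search scales as O(\sqrt{n}):
\[
  T_{\text{classical}} \sim n, \qquad T_{\text{wave}} \sim \sqrt{n},
  \qquad \text{speedup} \sim \frac{n}{\sqrt{n}} = \sqrt{n}.
\]
% Example: for $n = 10^6$ records, $\sqrt{n} = 10^3$, i.e. roughly a
% thousand-fold reduction in the number of elementary search steps.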
Piehowski, Paul D; Petyuk, Vladislav A; Sandoval, John D; Burnum, Kristin E; Kiebel, Gary R; Monroe, Matthew E; Anderson, Gordon A; Camp, David G; Smith, Richard D
2013-03-01
For bottom-up proteomics, there is a wide variety of database-searching algorithms in use for matching peptide sequences to tandem MS spectra. Likewise, there are numerous strategies being employed to produce a confident list of peptide identifications from the different search algorithm outputs. Here we introduce a grid-search approach for determining optimal database filtering criteria in shotgun proteomics data analyses that is easily adaptable to any search. Systematic Trial and Error Parameter Selection--referred to as STEPS--utilizes user-defined parameter ranges to test a wide array of parameter combinations to arrive at an optimal "parameter set" for data filtering, thus maximizing confident identifications. The benefits of this approach in terms of numbers of true-positive identifications are demonstrated using datasets derived from immunoaffinity-depleted blood serum and a bacterial cell lysate, two common proteomics sample types. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
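STEPS is described only at a high level here; the following sketch shows the generic shape of such a grid search, with invented filter parameters (a match score and a precursor mass error) and a simple target-decoy FDR estimate rather than the tool's own criteria.

# Generic sketch of a grid search over filtering criteria for peptide-spectrum
# matches (PSMs). Parameter names and the FDR estimate are illustrative.
from itertools import product

def passes(psm, min_score, max_ppm_error):
    return psm["score"] >= min_score and abs(psm["ppm_error"]) <= max_ppm_error

def grid_search(psms, score_grid, ppm_grid, fdr_limit=0.01):
    best = None
    for min_score, max_ppm in product(score_grid, ppm_grid):
        kept = [p for p in psms if passes(p, min_score, max_ppm)]
        decoys = sum(p["is_decoy"] for p in kept)
        targets = len(kept) - decoys
        fdr = decoys / targets if targets else 1.0
        # keep the parameter set that maximizes confident (target) identifications
        if fdr <= fdr_limit and (best is None or targets > best[0]):
            best = (targets, {"min_score": min_score, "max_ppm_error": max_ppm, "fdr": fdr})
    return best

# Toy usage with fabricated PSMs:
psms = [{"score": s / 10, "ppm_error": e, "is_decoy": d}
        for s, e, d in [(95, 2.1, 0), (40, 9.5, 1), (78, 1.2, 0), (55, 4.8, 0), (35, 7.7, 1)]]
print(grid_search(psms, score_grid=[3.0, 5.0, 7.0], ppm_grid=[3.0, 5.0, 10.0]))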
Analysis of Users' Searches of CD-ROM Databases in the National and University Library in Zagreb.
ERIC Educational Resources Information Center
Jokic, Maja
1997-01-01
Investigates the search behavior of CD-ROM database users in Zagreb (Croatia) libraries: one group needed a minimum of technical assistance, and the other was completely independent. Highlights include the use of questionnaires and transaction log analysis and the need for end-user education. The questionnaire and definitions of search process…
Non-Coding RNA Analysis Using the Rfam Database.
Kalvari, Ioanna; Nawrocki, Eric P; Argasinska, Joanna; Quinones-Olvera, Natalia; Finn, Robert D; Bateman, Alex; Petrov, Anton I
2018-06-01
Rfam is a database of non-coding RNA families in which each family is represented by a multiple sequence alignment, a consensus secondary structure, and a covariance model. Using a combination of manual and literature-based curation and a custom software pipeline, Rfam converts descriptions of RNA families found in the scientific literature into computational models that can be used to annotate RNAs belonging to those families in any DNA or RNA sequence. Valuable research outputs that are often locked up in figures and supplementary information files are encapsulated in Rfam entries and made accessible through the Rfam Web site. The data produced by Rfam have a broad application, from genome annotation to providing training sets for algorithm development. This article gives an overview of how to search and navigate the Rfam Web site, and how to annotate sequences with RNA families. The Rfam database is freely available at http://rfam.org. © 2018 by John Wiley & Sons, Inc. Copyright © 2018 John Wiley & Sons, Inc.
Tautomerism in chemical information management systems
NASA Astrophysics Data System (ADS)
Warr, Wendy A.
2010-06-01
Tautomerism has an impact on many of the processes in chemical information management systems including novelty checking during registration into chemical structure databases; storage of structures; exact and substructure searching in chemical structure databases; and depiction of structures retrieved by a search. The approaches taken by 27 different software vendors and database producers are compared. It is hoped that this comparison will act as a discussion document that could ultimately improve databases and software for researchers in the future.
NASA Technical Reports Server (NTRS)
McGreevy, Michael W.; Connors, Mary M. (Technical Monitor)
2001-01-01
To support Search Requests and Quick Responses at the Aviation Safety Reporting System (ASRS), four new QUORUM methods have been developed: keyword search, phrase search, phrase generation, and phrase discovery. These methods build upon the core QUORUM methods of text analysis, modeling, and relevance-ranking. QUORUM keyword search retrieves ASRS incident narratives that contain one or more user-specified keywords in typical or selected contexts, and ranks the narratives on their relevance to the keywords in context. QUORUM phrase search retrieves narratives that contain one or more user-specified phrases, and ranks the narratives on their relevance to the phrases. QUORUM phrase generation produces a list of phrases from the ASRS database that contain a user-specified word or phrase. QUORUM phrase discovery finds phrases that are related to topics of interest. Phrase generation and phrase discovery are particularly useful for finding query phrases for input to QUORUM phrase search. The presentation of the new QUORUM methods includes: a brief review of the underlying core QUORUM methods; an overview of the new methods; numerous, concrete examples of ASRS database searches using the new methods; discussion of related methods; and, in the appendices, detailed descriptions of the new methods.
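The QUORUM algorithms themselves are not detailed in this abstract; as a rough stand-in for phrase generation and phrase search, the sketch below lists n-grams containing a seed word from a small set of narratives and ranks the narratives by how often those phrases occur. QUORUM's actual modeling and relevance ranking are more sophisticated than this.

# Crude stand-in for the phrase-generation / phrase-search idea.
import re
from collections import Counter

def tokens(text):
    return re.findall(r"[a-z']+", text.lower())

def generate_phrases(corpus, seed, n=2, top=10):
    counts = Counter()
    for doc in corpus:
        toks = tokens(doc)
        for i in range(len(toks) - n + 1):
            gram = tuple(toks[i:i + n])
            if seed in gram:
                counts[" ".join(gram)] += 1
    return [p for p, _ in counts.most_common(top)]

def rank_documents(corpus, phrases):
    # score each document by the total number of generated phrases it contains
    scores = [(sum(doc.lower().count(p) for p in phrases), doc) for doc in corpus]
    return sorted(scores, reverse=True)

narratives = [
    "Runway incursion during taxi; tower issued go-around instruction.",
    "Altitude deviation after autopilot disconnect on approach.",
    "Taxi clearance misunderstood, crossed active runway without clearance.",
]
phrases = generate_phrases(narratives, seed="runway")
for score, doc in rank_documents(narratives, phrases):
    print(score, doc[:60])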
Zhang, Jiyang; Ma, Jie; Dou, Lei; Wu, Songfeng; Qian, Xiaohong; Xie, Hongwei; Zhu, Yunping; He, Fuchu
2009-02-01
The hybrid linear trap quadrupole Fourier-transform (LTQ-FT) ion cyclotron resonance mass spectrometer, an instrument with high accuracy and resolution, is widely used in the identification and quantification of peptides and proteins. However, time-dependent errors in the system may lead to deterioration of the accuracy of these instruments, negatively influencing the determination of the mass error tolerance (MET) in database searches. Here, a comprehensive discussion of LTQ/FT precursor ion mass error is provided. On the basis of an investigation of the mass error distribution, we propose an improved recalibration formula and introduce a new tool, FTDR (Fourier-transform data recalibration), that employs a graphic user interface (GUI) for automatic calibration. It was found that the calibration could adjust the mass error distribution to more closely approximate a normal distribution and reduce the standard deviation (SD). Consequently, we present a new strategy, LDSF (Large MET database search and small MET filtration), for database search MET specification and validation of database search results. As the name implies, a large-MET database search is conducted and the search results are then filtered using the statistical MET estimated from high-confidence results. By applying this strategy to a standard protein data set and a complex data set, we demonstrate the LDSF can significantly improve the sensitivity of the result validation procedure.
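A minimal sketch of the LDSF idea as stated: search with a wide mass error tolerance, estimate the true error distribution from high-confidence identifications, then filter the full result list with mean plus or minus k standard deviations. Field names, the confidence threshold and k are illustrative choices, not the paper's settings.

# Minimal sketch of large-MET search followed by small-MET filtration.
import statistics

def ldsf_filter(hits, high_conf_score=0.99, k=3.0):
    # estimate the precursor mass error distribution from high-confidence hits
    confident = [h["ppm_error"] for h in hits if h["score"] >= high_conf_score]
    mu = statistics.mean(confident)
    sd = statistics.stdev(confident)
    lo, hi = mu - k * sd, mu + k * sd
    # filter everything else with the tighter, statistically estimated tolerance
    return [h for h in hits if lo <= h["ppm_error"] <= hi], (mu, sd)

hits = [{"score": 0.995, "ppm_error": 1.1}, {"score": 0.999, "ppm_error": 0.8},
        {"score": 0.992, "ppm_error": 1.4}, {"score": 0.70, "ppm_error": 9.6},
        {"score": 0.60, "ppm_error": 1.0}]
kept, (mu, sd) = ldsf_filter(hits)
print(f"mean={mu:.2f} ppm, sd={sd:.2f} ppm, kept {len(kept)} of {len(hits)}")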
Use of model organism and disease databases to support matchmaking for human disease gene discovery.
Mungall, Christopher J; Washington, Nicole L; Nguyen-Xuan, Jeremy; Condit, Christopher; Smedley, Damian; Köhler, Sebastian; Groza, Tudor; Shefchek, Kent; Hochheiser, Harry; Robinson, Peter N; Lewis, Suzanna E; Haendel, Melissa A
2015-10-01
The Matchmaker Exchange application programming interface (API) allows searching a patient's genotypic or phenotypic profiles across clinical sites, for the purposes of cohort discovery and variant disease causal validation. This API can be used not only to search for matching patients, but also to match against public disease and model organism data. This public disease data enable matching known diseases and variant-phenotype associations using phenotype semantic similarity algorithms developed by the Monarch Initiative. The model data can provide additional evidence to aid diagnosis, suggest relevant models for disease mechanism and treatment exploration, and identify collaborators across the translational divide. The Monarch Initiative provides an implementation of this API for searching multiple integrated sources of data that contextualize the knowledge about any given patient or patient family into the greater biomedical knowledge landscape. While this corpus of data can aid diagnosis, it is also the beginning of research to improve understanding of rare human diseases. © 2015 WILEY PERIODICALS, INC.
Beyond scene gist: Objects guide search more than scene background.
Koehler, Kathryn; Eckstein, Miguel P
2017-06-01
Although the facilitation of visual search by contextual information is well established, there is little understanding of the independent contributions of different types of contextual cues in scenes. Here we manipulated 3 types of contextual information: object co-occurrence, multiple object configurations, and background category. We isolated the benefits of each contextual cue to target detectability, its impact on decision bias, confidence, and the guidance of eye movements. We find that object-based information guides eye movements and facilitates perceptual judgments more than scene background. The degree of guidance and facilitation of each contextual cue can be related to its inherent informativeness about the target spatial location as measured by human explicit judgments about likely target locations. Our results improve the understanding of the contributions of distinct contextual scene components to search and suggest that the brain's utilization of cues to guide eye movements is linked to the cue's informativeness about the target's location. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
A Standardized Protocol for the Prospective Follow-Up of Cleft Lip and Palate Patients.
Salimi, Negar; Jolanta, Aleksejūnienė; Edwin, Yen; Angelina, Loo
2018-01-01
To develop a standardized all-encompassing protocol for the assessment of cleft lip and palate patients with clinical and research implications. Electronic database searches were conducted and 13 major cleft centers worldwide were contacted in order to prepare for the development of the protocol. In preparation, the available evidence was reviewed and potential fistula-related risk determinants from 4 different domains were identified. No standardized protocol for the assessment of cleft patients could be found in any of the electronic database searches that were conducted. Interviews with representatives from several major centers revealed that the majority of centers do not have a standardized comprehensive strategy for the reporting and follow-up of cleft lip and palate patients. The protocol was developed and consisted of the following domains of determinants: (1) the sociodemographic domain, (2) the cleft defect domain, (3) the surgery domain, and (4) the fistula domain. The proposed protocol has the potential to enhance the quality of patient care by ensuring that multiple patient-related aspects are consistently reported. It may also facilitate future multicenter research, which could contribute to the reduction of fistula occurrence in cleft lip and palate patients.
Onyura, Betty; Baker, Lindsay; Cameron, Blair; Friesen, Farah; Leslie, Karen
2016-01-01
An umbrella review compiles evidence from multiple reviews into a single accessible document. This umbrella review synthesizes evidence from systematic reviews on curricular and instructional design approaches in undergraduate medical education, focusing on learning outcomes. We conducted bibliographic database searches in Medline, EMBASE and ERIC from database inception to May 2013 inclusive, and digital keyword searches of leading medical education journals. We identified 18,470 abstracts; 467 underwent duplicate full-text scrutiny. Thirty-six articles met all eligibility criteria. Articles were abstracted independently by three authors, using a modified Kirkpatrick model for evaluating learning outcomes. Evidence for the effectiveness of diverse educational approaches is reported. This review maps out empirical knowledge on the efficacy of a broad range of educational approaches in medical education. Critical knowledge gaps, and lapses in methodological rigour, are discussed, providing valuable insight for future research. The findings call attention to the need for adopting evaluative strategies that explore how contextual variabilities and individual (teacher/learner) differences influence efficacy of educational interventions. Additionally, the results underscore that extant empirical evidence does not always provide unequivocal answers about what approaches are most effective. Educators should incorporate best available empirical knowledge with experiential and contextual knowledge.
Pathway Tools version 19.0 update: software for pathway/genome informatics and systems biology.
Karp, Peter D; Latendresse, Mario; Paley, Suzanne M; Krummenacker, Markus; Ong, Quang D; Billington, Richard; Kothari, Anamika; Weaver, Daniel; Lee, Thomas; Subhraveti, Pallavi; Spaulding, Aaron; Fulcher, Carol; Keseler, Ingrid M; Caspi, Ron
2016-09-01
Pathway Tools is a bioinformatics software environment with a broad set of capabilities. The software provides genome-informatics tools such as a genome browser, sequence alignments, a genome-variant analyzer and comparative-genomics operations. It offers metabolic-informatics tools, such as metabolic reconstruction, quantitative metabolic modeling, prediction of reaction atom mappings and metabolic route search. Pathway Tools also provides regulatory-informatics tools, such as the ability to represent and visualize a wide range of regulatory interactions. This article outlines the advances in Pathway Tools in the past 5 years. Major additions include components for metabolic modeling, metabolic route search, computation of atom mappings and estimation of compound Gibbs free energies of formation; addition of editors for signaling pathways, for genome sequences and for cellular architecture; storage of gene essentiality data and phenotype data; display of multiple alignments, and of signaling and electron-transport pathways; and development of Python and web-services application programming interfaces. Scientists around the world have created more than 9800 Pathway/Genome Databases by using Pathway Tools, many of which are curated databases for important model organisms. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
Reefgenomics.Org - a repository for marine genomics data.
Liew, Yi Jin; Aranda, Manuel; Voolstra, Christian R
2016-01-01
Over the last decade, technological advancements have substantially decreased the cost and time of obtaining large amounts of sequencing data. Paired with the exponentially increased computing power, individual labs are now able to sequence genomes or transcriptomes to investigate biological questions of interest. This has led to a significant increase in available sequence data. Although the bulk of data published in articles are stored in public sequence databases, very often, only raw sequencing data are available; miscellaneous data such as assembled transcriptomes, genome annotations etc. are not easily obtainable through the same means. Here, we introduce our website (http://reefgenomics.org) that aims to centralize genomic and transcriptomic data from marine organisms. Besides providing convenient means to download sequences, we provide (where applicable) a genome browser to explore available genomic features, and a BLAST interface to search through the hosted sequences. Through the interface, multiple datasets can be queried simultaneously, allowing for the retrieval of matching sequences from organisms of interest. The minimalistic, no-frills interface reduces visual clutter, making it convenient for end-users to search and explore processed sequence data. DATABASE URL: http://reefgenomics.org. © The Author(s) 2016. Published by Oxford University Press.
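The hosted BLAST interface is browser-based; a local analogue would run NCBI BLAST+ against sequences downloaded from the repository. The database name (reef_db), file names and the presence of blastn on PATH are assumptions for illustration, not instructions from the site.

# Local analogue of the web BLAST interface using the standard BLAST+ tools.
import subprocess

def blast_query(query_fasta, db="reef_db", evalue=1e-5, out="hits.tsv"):
    # tabular output (-outfmt 6): query id, subject id, % identity, etc.
    cmd = ["blastn", "-query", query_fasta, "-db", db,
           "-evalue", str(evalue), "-outfmt", "6", "-out", out]
    subprocess.run(cmd, check=True)
    with open(out) as fh:
        return [line.rstrip("\n").split("\t") for line in fh]

# hits = blast_query("my_transcript.fasta")
# for qid, sid, pct_id, *rest in hits[:5]:
#     print(qid, sid, pct_id)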
Buehler, Stephanie S.; Madison, Bereneice; Snyder, Susan R.; Derzon, James H.; Saubolle, Michael A.; Weissfeld, Alice S.; Weinstein, Melvin P.; Liebow, Edward B.; Wolk, Donna M.
2015-01-01
SUMMARY Background. Bloodstream infection (BSI) is a major cause of morbidity and mortality throughout the world. Rapid identification of bloodstream pathogens is a laboratory practice that supports strategies for rapid transition to direct targeted therapy by providing for timely and effective patient care. In fact, the more rapidly that appropriate antimicrobials are prescribed, the lower the mortality for patients with sepsis. Rapid identification methods may have multiple positive impacts on patient outcomes, including reductions in mortality, morbidity, hospital lengths of stay, and antibiotic use. In addition, the strategy can reduce the cost of care for patients with BSIs. Objectives. The purpose of this review is to evaluate the evidence for the effectiveness of three rapid diagnostic practices in decreasing the time to targeted therapy for hospitalized patients with BSIs. The review was performed by applying the Centers for Disease Control and Prevention's (CDC's) Laboratory Medicine Best Practices Initiative (LMBP) systematic review methods for quality improvement (QI) practices and translating the results into evidence-based guidance (R. H. Christenson et al., Clin Chem 57:816–825, 2011, http://dx.doi.org/10.1373/clinchem.2010.157131). Search strategy. A comprehensive literature search was conducted to identify studies with measurable outcomes. A search of three electronic bibliographic databases (PubMed, Embase, and CINAHL), databases containing “gray” literature (unpublished academic, government, or industry evidence not governed by commercial publishing) (CIHI, NIHR, SIGN, and other databases), and the Cochrane database for English-language articles published between 1990 and 2011 was conducted in July 2011. Dates of search. The dates of our search were from 1990 to July 2011. Selection criteria. Animal studies and non-English publications were excluded. The search contained the following medical subject headings: bacteremia; bloodstream infection; time factors; health care costs; length of stay; morbidity; mortality; antimicrobial therapy; rapid molecular techniques, polymerase chain reaction (PCR); in situ hybridization, fluorescence; treatment outcome; drug therapy; patient care team; pharmacy service, hospital; hospital information systems; Gram stain; pharmacy service; and spectrometry, mass, matrix-assisted laser desorption-ionization. Phenotypic as well as the following key words were searched: targeted therapy; rapid identification; rapid; Gram positive; Gram negative; reduce(ed); cost(s); pneumoslide; PBP2; tube coagulase; matrix-assisted laser desorption/ionization time of flight; MALDI TOF; blood culture; EMR; electronic reporting; call to provider; collaboration; pharmacy; laboratory; bacteria; yeast; ICU; and others. In addition to the electronic search being performed, a request for unpublished quality improvement data was made to the clinical laboratory community. Main results. Rapid molecular testing with direct communication significantly improves timeliness compared to standard testing. Rapid phenotypic techniques with direct communication likely improve the timeliness of targeted therapy. Studies show a significant and homogeneous reduction in mortality associated with rapid molecular testing combined with direct communication. Authors' conclusions. No recommendation is made for or against the use of the three assessed practices of this review due to insufficient evidence. 
The overall strength of evidence is suggestive; the data suggest that each of these three practices has the potential to improve the time required to initiate targeted therapy and possibly improve other patient outcomes, such as mortality. The meta-analysis results suggest that the implementation of any of the three practices may be more effective at increasing timeliness to targeted therapy than routine microbiology techniques for identification of the microorganisms causing BSIs. Based on the included studies, results for all three practices appear applicable across multiple microorganisms, including methicillin-resistant Staphylococcus aureus (MRSA), methicillin-sensitive S. aureus (MSSA), Candida species, and Enterococcus species. PMID:26598385
Role of Proangiogenic Factors in Immunopathogenesis of Multiple Sclerosis.
Hamid, Kabir Magaji; Mirshafiey, Abbas
2016-02-01
Angiogenesis is a complex and balanced process in which new blood vessels form from preexisting ones by sprouting, splitting, growth and remodeling. This phenomenon plays a vital role in many physiological and pathological processes. However, the disturbance in physiological process can play a role in pathogenesis of some chronic inflammatory diseases, including multiple sclerosis (MS) in human and its animal model. Although the relation between abnormal blood vessels and MS lesions was established in previous studies, but the role of pathological angiogenesis remains unclear. In this study, the link between proangiogenic factors and multiple sclerosis pathogenesis was examined by conducting a systemic review. Thus we searched the English medical literature via PubMed, ISI web of knowledge, Medline and virtual health library (VHL) databases. In this review, we describe direct and indirect roles of some proangiogenic factors in MS pathogenesis and report the association of these factors with pathological and inflammatory angiogenesis.
G-Hash: Towards Fast Kernel-based Similarity Search in Large Graph Databases.
Wang, Xiaohong; Smalter, Aaron; Huan, Jun; Lushington, Gerald H
2009-01-01
Structured data including sets, sequences, trees and graphs pose significant challenges to fundamental aspects of data management such as efficient storage, indexing, and similarity search. With the fast accumulation of graph databases, similarity search in graph databases has emerged as an important research topic. Graph similarity search has applications in a wide range of domains including cheminformatics, bioinformatics, sensor network management, social network management, and XML documents, among others. Most of the current graph indexing methods focus on subgraph query processing, i.e. determining the set of database graphs that contains the query graph, and hence do not directly support similarity search. In data mining and machine learning, various graph kernel functions have been designed to capture the intrinsic similarity of graphs. Though successful in constructing accurate predictive and classification models for supervised learning, graph kernel functions suffer from (i) high computational complexity and (ii) the non-trivial difficulty of being indexed in a graph database. Our objective is to bridge graph kernel functions and similarity search in graph databases by proposing (i) a novel kernel-based similarity measurement and (ii) an efficient indexing structure for graph data management. Our method of similarity measurement builds upon local features extracted from each node and their neighboring nodes in graphs. A hash table is utilized to support efficient storage and fast search of the extracted local features. Using the hash table, a graph kernel function is defined to capture the intrinsic similarity of graphs and for fast similarity query processing. We have implemented our method, which we have named G-hash, and have demonstrated its utility on large chemical graph databases. Our results show that the G-hash method achieves state-of-the-art performance for k-nearest neighbor (k-NN) classification. Most importantly, the new similarity measurement and the index structure are scalable to large databases, with smaller index size, faster index construction time, and faster query processing time than state-of-the-art indexing methods such as C-tree, gIndex, and GraphGrep.
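As a toy version of the hashed local-feature idea (each node described by its own label plus its neighbours' labels, stored in a hash table, with a kernel computed from matching features), the sketch below is a simplification, not the published G-hash algorithm.

# Toy hashed local-feature kernel for labeled graphs.
from collections import Counter

def local_features(graph):
    """graph: {node: {"label": str, "nbrs": [node, ...]}}"""
    feats = Counter()
    for node, data in graph.items():
        nbr_labels = sorted(graph[n]["label"] for n in data["nbrs"])
        # hash the node label together with the sorted neighbour labels
        feats[hash((data["label"], tuple(nbr_labels)))] += 1
    return feats

def kernel(g1, g2):
    f1, f2 = local_features(g1), local_features(g2)
    # dot product of hashed-feature counts over shared hash keys
    return sum(f1[k] * f2[k] for k in f1.keys() & f2.keys())

# Two tiny molecular-style graphs (labels only; edges as adjacency lists).
g_a = {0: {"label": "C", "nbrs": [1]}, 1: {"label": "O", "nbrs": [0]}}
g_b = {0: {"label": "C", "nbrs": [1]}, 1: {"label": "N", "nbrs": [0]}}
print(kernel(g_a, g_a), kernel(g_a, g_b))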
Cost and Search Result Comparisons of BRS After Dark and Knowledge Index.
ERIC Educational Resources Information Center
Cloud, Gayla Staples; Hambric, Jacqueline
This two-part study was designed (1) to determine differences in the costs of searching BRS After Dark (BRS AD) and Knowledge Index (KI) generally and across ten selected databases, and (2) to determine whether there is a difference in the citations retrieved when the same search is conducted on the same database in both systems. Study methodology…
PubMed searches: overview and strategies for clinicians.
Lindsey, Wesley T; Olin, Bernie R
2013-04-01
PubMed is a biomedical and life sciences database maintained by a division of the National Library of Medicine known as the National Center for Biotechnology Information (NCBI). It is a large resource with more than 5600 journals indexed and greater than 22 million total citations. Searches conducted in PubMed provide references that are more specific for the intended topic compared with other popular search engines. Effective PubMed searches allow the clinician to remain current on the latest clinical trials, systematic reviews, and practice guidelines. PubMed continues to evolve by allowing users to create a customized experience through the My NCBI portal, new arrangements and options in search filters, and supporting scholarly projects through exportation of citations to reference managing software. Prepackaged search options available in the Clinical Queries feature also allow users to efficiently search for clinical literature. PubMed also provides information regarding the source journals themselves through the Journals in NCBI Databases link. This article provides an overview of the PubMed database's structure and features as well as strategies for conducting an effective search.
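The web searches described above also have a programmatic counterpart in the NCBI E-utilities; a minimal ESearch call is shown below (the query string is only an example, and heavy use should respect NCBI's rate limits and identify the calling tool).

# Programmatic PubMed search via the NCBI E-utilities ESearch endpoint.
import json
import urllib.parse
import urllib.request

def pubmed_search(term, retmax=20):
    base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    params = urllib.parse.urlencode(
        {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"})
    with urllib.request.urlopen(f"{base}?{params}", timeout=30) as resp:
        data = json.load(resp)
    return data["esearchresult"]["idlist"]   # list of matching PMIDs

# pmids = pubmed_search('"multiple sclerosis"[MeSH Terms] AND meditation')
# print(pmids)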
An ontology-based search engine for protein-protein interactions
2010-01-01
Background Keyword matching or ID matching is the most common searching method in a large database of protein-protein interactions. They are purely syntactic methods, and retrieve the records in the database that contain a keyword or ID specified in a query. Such syntactic search methods often retrieve too few search results or no results despite many potential matches present in the database. Results We have developed a new method for representing protein-protein interactions and the Gene Ontology (GO) using modified Gödel numbers. This representation is hidden from users but enables a search engine using the representation to efficiently search protein-protein interactions in a biologically meaningful way. Given a query protein with optional search conditions expressed in one or more GO terms, the search engine finds all the interaction partners of the query protein by unique prime factorization of the modified Gödel numbers representing the query protein and the search conditions. Conclusion Representing the biological relations of proteins and their GO annotations by modified Gödel numbers makes a search engine efficiently find all protein-protein interactions by prime factorization of the numbers. Keyword matching or ID matching search methods often miss the interactions involving a protein that has no explicit annotations matching the search condition, but our search engine retrieves such interactions as well if they satisfy the search condition with a more specific term in the ontology. PMID:20122195
An ontology-based search engine for protein-protein interactions.
Park, Byungkyu; Han, Kyungsook
2010-01-18
Keyword matching or ID matching is the most common searching method in a large database of protein-protein interactions. They are purely syntactic methods, and retrieve the records in the database that contain a keyword or ID specified in a query. Such syntactic search methods often retrieve too few search results or no results despite many potential matches present in the database. We have developed a new method for representing protein-protein interactions and the Gene Ontology (GO) using modified Gödel numbers. This representation is hidden from users but enables a search engine using the representation to efficiently search protein-protein interactions in a biologically meaningful way. Given a query protein with optional search conditions expressed in one or more GO terms, the search engine finds all the interaction partners of the query protein by unique prime factorization of the modified Gödel numbers representing the query protein and the search conditions. Representing the biological relations of proteins and their GO annotations by modified Gödel numbers makes a search engine efficiently find all protein-protein interactions by prime factorization of the numbers. Keyword matching or ID matching search methods often miss the interactions involving a protein that has no explicit annotations matching the search condition, but our search engine retrieves such interactions as well if they satisfy the search condition with a more specific term in the ontology.
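A toy illustration of the modified Gödel-number encoding described in the two records above: assign a distinct prime to each GO term, encode a protein's annotation set as the product of its primes, and turn "annotated with term t" into a divisibility test. The published method also folds the ontology hierarchy and the interactions themselves into the encoding, which is omitted here.

# Toy prime-factorization encoding of GO annotations.
go_terms = ["GO:0005515", "GO:0005634", "GO:0006355", "GO:0003677"]
primes = [2, 3, 5, 7]                                  # one distinct prime per term
term_prime = dict(zip(go_terms, primes))

def encode(annotations):
    code = 1
    for t in annotations:
        code *= term_prime[t]                          # product of the term primes
    return code

def has_term(code, term):
    return code % term_prime[term] == 0                # divisibility = annotation test

proteins = {"P1": encode(["GO:0005515", "GO:0005634"]),
            "P2": encode(["GO:0006355", "GO:0003677", "GO:0005634"])}

query = "GO:0005634"   # find proteins annotated with this term
print([name for name, code in proteins.items() if has_term(code, query)])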
Using the Turning Research Into Practice (TRIP) database: how do clinicians really search?*
Meats, Emma; Brassey, Jon; Heneghan, Carl; Glasziou, Paul
2007-01-01
Objectives: Clinicians and patients are increasingly accessing information through Internet searches. This study aimed to examine clinicians' current search behavior when using the Turning Research Into Practice (TRIP) database to examine search engine use and the ways it might be improved. Methods: A Web log analysis was undertaken of the TRIP database—a meta-search engine covering 150 health resources including MEDLINE, The Cochrane Library, and a variety of guidelines. The connectors for terms used in searches were studied, and observations were made of 9 users' search behavior when working with the TRIP database. Results: Of 620,735 searches, most used a single term, and 12% (n = 75,947) used a Boolean operator: 11% (n = 69,006) used “AND” and 0.8% (n = 4,941) used “OR.” Of the elements of a well-structured clinical question (population, intervention, comparator, and outcome), the population was most commonly used, while fewer searches included the intervention. Comparator and outcome were rarely used. Participants in the observational study were interested in learning how to formulate better searches. Conclusions: Web log analysis showed most searches used a single term and no Boolean operators. Observational study revealed users were interested in conducting efficient searches but did not always know how. Therefore, either better training or better search interfaces are required to assist users and enable more effective searching. PMID:17443248
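The kind of connector tally reported above is straightforward to reproduce on any query log; a small sketch follows (the log format, one query string per line, is an assumption).

# Count Boolean connectors and single-term queries in a list of logged queries.
import re
from collections import Counter

def connector_stats(queries):
    stats = Counter()
    for q in queries:
        terms = q.strip()
        stats["total"] += 1
        if re.search(r"\bAND\b", terms, re.IGNORECASE):
            stats["AND"] += 1
        if re.search(r"\bOR\b", terms, re.IGNORECASE):
            stats["OR"] += 1
        if len(terms.split()) == 1:
            stats["single term"] += 1
    return stats

queries = ["asthma", "asthma AND children", "statin OR fibrate", "diabetes"]
print(connector_stats(queries))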
LiverTox: Clinical and Research Information on Drug-Induced Liver Injury
Information Retrieval in Telemedicine: a Comparative Study on Bibliographic Databases
Ahmadi, Maryam; Sarabi, Roghayeh Ershad; Orak, Roohangiz Jamshidi; Bahaadinbeigy, Kambiz
2015-01-01
Background and Aims: The first step in each systematic review is selection of the most valid database that can provide the highest number of relevant references. This study was carried out to determine the most suitable database for information retrieval in the telemedicine field. Methods: The CINAHL, PubMed, Web of Science and Scopus databases were searched for telemedicine matched with education, cost benefit and patient satisfaction. After analysis of the obtained results, the accuracy coefficient, sensitivity, uniqueness and overlap of the databases were calculated. Results: The studied databases differed in the number of retrieved articles. PubMed was identified as the most suitable database for retrieving information on the selected topics, with accuracy and sensitivity ratios of 50.7% and 61.4%, respectively. The percentage of unique retrieved articles ranged from 38% for PubMed to 3.0% for CINAHL. The highest overlap rate (18.6%) was found between PubMed and Web of Science. Less than 1% of articles have been indexed in all searched databases. Conclusion: PubMed is suggested as the most suitable database for starting a search in telemedicine; after PubMed, Scopus and Web of Science can retrieve about 90% of the relevant articles. PMID:26236086
Information Retrieval in Telemedicine: a Comparative Study on Bibliographic Databases.
Ahmadi, Maryam; Sarabi, Roghayeh Ershad; Orak, Roohangiz Jamshidi; Bahaadinbeigy, Kambiz
2015-06-01
The first step in each systematic review is selection of the most valid database that can provide the highest number of relevant references. This study was carried out to determine the most suitable database for information retrieval in the telemedicine field. The CINAHL, PubMed, Web of Science and Scopus databases were searched for telemedicine matched with education, cost benefit and patient satisfaction. After analysis of the obtained results, the accuracy coefficient, sensitivity, uniqueness and overlap of the databases were calculated. The studied databases differed in the number of retrieved articles. PubMed was identified as the most suitable database for retrieving information on the selected topics, with accuracy and sensitivity ratios of 50.7% and 61.4%, respectively. The percentage of unique retrieved articles ranged from 38% for PubMed to 3.0% for CINAHL. The highest overlap rate (18.6%) was found between PubMed and Web of Science. Less than 1% of articles have been indexed in all searched databases. PubMed is suggested as the most suitable database for starting a search in telemedicine; after PubMed, Scopus and Web of Science can retrieve about 90% of the relevant articles.
Chambers, Duncan; Paton, Fiona; Wilson, Paul; Eastwood, Alison; Craig, Dawn; Fox, Dave; Jayne, David; McGinnes, Erika
2014-01-01
Objectives To identify and critically assess the extent to which systematic reviews of enhanced recovery programmes for patients undergoing colorectal surgery differ in their methodology and reported estimates of effect. Design Review of published systematic reviews. We searched the Cochrane Database of Systematic Reviews, the Database of Abstracts of Reviews of Effects (DARE) and Health Technology Assessment (HTA) Database from 1990 to March 2013. Systematic reviews of enhanced recovery programmes for patients undergoing colorectal surgery were eligible for inclusion. Primary and secondary outcome measures The primary outcome was length of hospital stay. We assessed changes in pooled estimates of treatment effect over time and how these might have been influenced by decisions taken by researchers as well as by the availability of new trials. The quality of systematic reviews was assessed using the Centre for Reviews and Dissemination (CRD) DARE critical appraisal process. Results 10 systematic reviews were included. Systematic reviews of randomised controlled trials have consistently shown a reduction in length of hospital stay with enhanced recovery compared with traditional care. The estimated effect tended to increase from 2006 to 2010 as more trials were published but has not altered significantly in the most recent review, despite the inclusion of several unique trials. The best estimate appears to be an average reduction of around 2.5 days in primary postoperative length of stay. Differences between reviews reflected differences in interpretation of inclusion criteria, searching and analytical methods or software. Conclusions Systematic reviews of enhanced recovery programmes show a high level of research waste, with multiple reviews covering identical or very similar groups of trials. Where multiple reviews exist on a topic, interpretation may require careful attention to apparently minor differences between reviews. Researchers can help readers by acknowledging existing reviews and through clear reporting of key decisions, especially on inclusion/exclusion and on statistical pooling. PMID:24879828
Multiple sclerosis: Pregnancy and women's health issues.
Mendibe Bilbao, M; Boyero Durán, S; Bárcena Llona, J; Rodriguez-Antigüedad, A
2016-08-18
The course of multiple sclerosis (MS) is influenced by sex, pregnancy and hormonal factors. To analyse the influence of the above factors in order to clarify the aetiopathogenic mechanisms involved in the disease. We conducted a comprehensive review of scientific publications in the PubMed database using a keyword search for 'multiple sclerosis', 'MS', 'EAE', 'pregnancy', 'hormonal factors', 'treatment', and related terms. We reviewed the advances presented at the meeting held by the European Committee for Treatment and Research in Multiple Sclerosis (ECTRIMS) in March 2013 in London, as well as recommendations by international experts. We provide recommendations for counselling and treating women with MS prior to and during pregnancy and after delivery. Current findings on the effects of treatment on the mother, fetus, and newborn are also presented. We issue recommendations for future research in order to address knowledge gaps and clarify any inconsistencies in currently available data. Copyright © 2016 Sociedad Española de Neurología. Publicado por Elsevier España, S.L.U. All rights reserved.
Meditation as an Adjunct to the Management of Multiple Sclerosis
Levin, Adam B.; Hadgkiss, Emily J.; Weiland, Tracey J.; Jelinek, George A.
2014-01-01
Background. Multiple sclerosis (MS) disease course is known to be adversely affected by several factors including stress. A proposed mechanism for decreasing stress and therefore decreasing MS morbidity and improving quality of life is meditation. This review aims to critically analyse the current literature regarding meditation and MS. Methods. Four major databases were used to search for English language papers published before March 2014 with the terms MS, multiple sclerosis, meditation, and mindfulness. Results. 12 pieces of primary literature fitting the selection criteria were selected: two were randomised controlled studies, four were cohort studies, and six were surveys. The current literature varies in quality; however common positive effects of meditation include improved quality of life (QOL) and improved coping skills. Conclusion. All studies suggest possible benefit to the use of meditation as an adjunct to the management of multiple sclerosis. Additional rigorous clinical trials are required to validate the existing findings and determine if meditation has an impact on disease course over time. PMID:25105026
Ghandikota, Sudhir; Hershey, Gurjit K Khurana; Mersha, Tesfaye B
2018-03-24
Advances in high-throughput sequencing technologies have made it possible to generate multiple omics data at an unprecedented rate and scale. The accumulation of these omics data far outpaces the rate at which biologists can mine them and generate new hypotheses to test experimentally. There is an urgent need to develop powerful tools to efficiently and effectively search and filter these resources to address specific post-GWAS functional genomics questions. However, to date, these resources are scattered across several databases and often lack a unified portal for data annotation and analytics. In addition, existing tools to analyze and visualize these databases are highly fragmented, requiring researchers to access multiple applications and perform manual interventions for each gene or variant in an ad hoc fashion until all the questions are answered. In this study, we present GENEASE, a web-based one-stop bioinformatics tool designed not only to query and explore multi-omics and phenotype databases (e.g., GTEx, ClinVar, dbGaP, GWAS Catalog, ENCODE, Roadmap Epigenomics, KEGG, Reactome, Gene and Phenotype Ontology) in a single web interface but also to perform seamless post genome-wide association downstream functional and overlap analysis for non-coding regulatory variants. GENEASE accesses over 50 different databases in the public domain, including model organism-specific databases, to facilitate gene/variant and disease exploration, enrichment and overlap analysis in real time. It is a user-friendly tool with a point-and-click interface containing links to support information, including a user manual and examples. GENEASE can be accessed freely at http://research.cchmc.org/mershalab/genease_new/login.html. Tesfaye.Mersha@cchmc.org, Sudhir.Ghandikota@cchmc.org. Supplementary data are available at Bioinformatics online.
Lyon, Jennifer A; Garcia-Milian, Rolando; Norton, Hannah F; Tennant, Michele R
2014-01-01
Expert-mediated literature searching, a keystone service in biomedical librarianship, would benefit significantly from regular methodical review. This article describes the novel use of Research Electronic Data Capture (REDCap) software to create a database of literature searches conducted at a large academic health sciences library. An archive of paper search requests was entered into REDCap, and librarians now prospectively enter records for current searches. Having search data readily available allows librarians to reuse search strategies and track their workload. In aggregate, this data can help guide practice and determine priorities by identifying users' needs, tracking librarian effort, and focusing librarians' continuing education.
Mining large heterogeneous data sets in drug discovery.
Wild, David J
2009-10-01
Increasingly, effective drug discovery involves the searching and data mining of large volumes of information from many sources covering the domains of chemistry, biology and pharmacology, amongst others. This has led to a proliferation of databases and data sources relevant to drug discovery. This paper provides a review of the publicly available large-scale databases relevant to drug discovery, describes the kinds of data mining approaches that can be applied to them and discusses recent work in integrative data mining that looks for associations that span multiple sources, including the use of Semantic Web techniques. The future of mining large data sets for drug discovery requires intelligent, semantic aggregation of information from all of the data sources described in this review, along with the application of advanced methods such as intelligent agents and inference engines in client applications.
BIG: a large-scale data integration tool for renal physiology.
Zhao, Yue; Yang, Chin-Rang; Raghuram, Viswanathan; Parulekar, Jaya; Knepper, Mark A
2016-10-01
Due to recent advances in high-throughput techniques, we and others have generated multiple proteomic and transcriptomic databases to describe and quantify gene expression, protein abundance, or cellular signaling on the scale of the whole genome/proteome in kidney cells. The existence of so much data from diverse sources raises the following question: "How can researchers find information efficiently for a given gene product over all of these data sets without searching each data set individually?" This is the type of problem that has motivated the "Big-Data" revolution in Data Science, which has driven progress in fields such as marketing. Here we present an online Big-Data tool called BIG (Biological Information Gatherer) that allows users to submit a single online query to obtain all relevant information from all indexed databases. BIG is accessible at http://big.nhlbi.nih.gov/.
Efficient bibliographic searches on allergy using ISI databases.
Sáez Gómez, J M; Annan, J W; Negro Alvarez, J M; Guillen-Grima, F; Bozzola, C M; Ivancevich, J C; Aguinaga Ontoso, E
2008-01-01
The aim of this article is to provide an introduction to using databases from the Thomson ISI Web of Knowledge, with special reference to the Citation Indexes as a tool for analysing publications, and also to explain the meaning of the well-known Impact Factor. We present the recently modified consultation interface and the ways it enhances the search routines of these databases, and we describe methods for bibliographic searching, including the correct application of the analysis tools, with particular attention to Journal Citation Reports and the Impact Factor. We finish with comments on the consequences of using the Impact Factor as a quality indicator for the assessment of journals and publications, and on how to meet the requirements for indexing in the Thomson ISI databases.
Treatment Effects for Dysphagia in Adults with Multiple Sclerosis: A Systematic Review.
Alali, Dalal; Ballard, Kirrie; Bogaardt, Hans
2016-10-01
Dysphagia or swallowing difficulties have been reported to be a concern in adults with multiple sclerosis (MS). This problem can result in several complications including aspiration pneumonia, reduced quality of life and an increase in mortality rate. No previous systematic reviews on treatment effects for dysphagia in MS have been published. The main objective of this study is to summarise and qualitatively analyse published studies on treatment effects for dysphagia in MS. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines were applied to conduct a systematic search of seven databases, using relevant key words, and subsequent analysis of the identified studies. The studies were required to meet all three inclusion criteria of including a statement on intention to treat, or measure the effects of treatment for dysphagia in adults with MS and data on treatment outcomes for at least one adult diagnosed with MS. Retained studies were evaluated by two independent reviewers using a critical appraisal tool. This study has not been registered. A total of 563 studies were identified from the database searches. After screening and assessment of full articles for eligibility, five studies were included in the review. Three examined electrical stimulation and two examined the use of botulinum toxin. One study testing electrical stimulation was a randomised controlled trial, two were well-designed case series and two were case series lacking experimental control. All studies reported some positive effects on dysphagia; however, treatments that involved the use of electrical stimulation showed larger effect sizes. There is a paucity of evidence to guide treatment of dysphagia in MS, with only electrical stimulation and botulinum toxin treatment represented in the literature search conducted here. While both treatments show initial promise for reducing the swallowing impairment, they require further research using well-controlled experimental designs to determine their clinical applicability and long-term treatment effects for dysphagia across different types and severity of MS.
Bertran, E A; Berlie, H D; Taylor, A; Divine, G; Jaber, L A
2017-02-01
To examine differences in the performance of HbA1c for diagnosing diabetes in Arabs compared with Europeans. The PubMed, Embase and Cochrane Library databases were searched for records published between 1998 and 2015. Estimates of sensitivity, specificity and log diagnostic odds ratios for an HbA1c cut-point of 48 mmol/mol (6.5%) were compared between Arabs and Europeans, using a bivariate linear mixed-model approach. For studies reporting multiple cut-points, population-specific summary receiver operating characteristic (SROC) curves were constructed. In addition, sensitivity, specificity and Youden Index were estimated for strata defined by HbA1c cut-point and population type. Database searches yielded 1912 unique records; 618 full-text articles were reviewed. Fourteen studies met the inclusion criteria; hand-searching yielded three additional eligible studies. Three Arab (N = 2880) and 16 European populations (N = 49,127) were included in the analysis. Summary sensitivity and specificity for an HbA1c cut-point of 48 mmol/mol (6.5%) in both populations were 42% (33-51%) and 97% (95-98%). There was no difference in area under SROC curves between Arab and European populations (0.844 vs. 0.847; P = 0.867), suggesting no difference in HbA1c diagnostic accuracy between populations. Multiple cut-point summary estimates stratified by population suggest that Arabs have lower sensitivity and higher specificity at an HbA1c cut-point of 44 mmol/mol (6.2%) compared with European populations. Estimates also suggest similar test performance at cut-points of 44 mmol/mol (6.2%) and 48 mmol/mol (6.5%) for Arabs. Given the low sensitivity of HbA1c in the high-risk Arab American population, we recommend a combination of glucose-based and HbA1c testing to ensure an accurate and timely diagnosis of diabetes. © 2016 Diabetes UK.
Understanding PubMed user search behavior through log analysis.
Islamaj Dogan, Rezarta; Murray, G Craig; Névéol, Aurélie; Lu, Zhiyong
2009-01-01
This article reports on a detailed investigation of PubMed users' needs and behavior as a step toward improving biomedical information retrieval. PubMed is providing free service to researchers with access to more than 19 million citations for biomedical articles from MEDLINE and life science journals. It is accessed by millions of users each day. Efficient search tools are crucial for biomedical researchers to keep abreast of the biomedical literature relating to their own research. This study provides insight into PubMed users' needs and their behavior. This investigation was conducted through the analysis of one month of log data, consisting of more than 23 million user sessions and more than 58 million user queries. Multiple aspects of users' interactions with PubMed are characterized in detail with evidence from these logs. Despite having many features in common with general Web searches, biomedical information searches have unique characteristics that are made evident in this study. PubMed users are more persistent in seeking information and they reformulate queries often. The three most frequent types of search are search by author name, search by gene/protein, and search by disease. Use of abbreviation in queries is very frequent. Factors such as result set size influence users' decisions. Analysis of characteristics such as these plays a critical role in identifying users' information needs and their search habits. In turn, such an analysis also provides useful insight for improving biomedical information retrieval.Database URL:http://www.ncbi.nlm.nih.gov/PubMed.
LigSearch: a knowledge-based web server to identify likely ligands for a protein target
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beer, Tjaart A. P. de; Laskowski, Roman A.; Duban, Mark-Eugene
Identifying which ligands might bind to a protein before crystallization trials could provide a significant saving in time and resources. LigSearch, a web server aimed at predicting ligands that might bind to and stabilize a given protein, has been developed. Using a protein sequence and/or structure, the system searches against a variety of databases, combining available knowledge, and provides a clustered and ranked output of possible ligands. LigSearch can be accessed at http://www.ebi.ac.uk/thornton-srv/databases/LigSearch.
Bramer, Wichor M; Giustini, Dean; Kramer, Bianca Mr; Anderson, Pf
2013-12-23
The usefulness of Google Scholar (GS) as a bibliographic database for biomedical systematic review (SR) searching is a subject of current interest and debate in research circles. Recent research has suggested GS might even be used alone in SR searching. This assertion is challenged here by testing whether GS can locate all studies included in 21 previously published SRs. Second, it examines the recall of GS, taking into account the maximum number of items that can be viewed, and tests whether more complete searches created by an information specialist will improve recall compared to the searches used in the 21 published SRs. The authors identified 21 biomedical SRs that had used GS and PubMed as information sources and reported their use of identical, reproducible search strategies in both databases. These search strategies were rerun in GS and PubMed, and analyzed as to their coverage and recall. Efforts were made to improve searches that underperformed in each database. GS' overall coverage was higher than PubMed (98% versus 91%) and overall recall is higher in GS: 80% of the references included in the 21 SRs were returned by the original searches in GS versus 68% in PubMed. Only 72% of the included references could be used as they were listed among the first 1,000 hits (the maximum number shown). Practical precision (the number of included references retrieved in the first 1,000, divided by 1,000) was on average 1.9%, which is only slightly lower than in other published SRs. Improving searches with the lowest recall resulted in an increase in recall from 48% to 66% in GS and, in PubMed, from 60% to 85%. Although its coverage and precision are acceptable, GS, because of its incomplete recall, should not be used as a single source in SR searching. A specialized, curated medical database such as PubMed provides experienced searchers with tools and functionality that help improve recall, and numerous options in order to optimize precision. Searches for SRs should be performed by experienced searchers creating searches that maximize recall for as many databases as deemed necessary by the search expert.
Alving, Berit Elisabeth; Christensen, Janne Buck; Thrysøe, Lars
2018-03-01
The purpose of this literature review is to provide an overview of the information retrieval behaviour of clinical nurses, in terms of the databases and other information resources they use and how often they use them. Systematic searches carried out in five databases, together with handsearching, were used to identify studies from 2010 to 2016, with a populations, exposures and outcomes (PEO) search strategy focusing on the question: In which databases or other information resources do hospital nurses search for evidence-based information, and how often? Of 5272 titles retrieved by the search strategy, only nine studies fulfilled the criteria for inclusion. The studies are from the United States, Canada, Taiwan and Nigeria. The results show that hospital nurses' primary sources of evidence-based information are Google and peers, while bibliographic databases such as PubMed are secondary choices. Data on frequency are included in only four of the studies, and these data are heterogeneous. The reasons for choosing Google and peers are primarily lack of time, lack of information, lack of retrieval skills, or lack of training in database searching. Only a few studies have been published on clinical nurses' retrieval behaviours, and more studies are needed from Europe and Australia. © 2018 Health Libraries Group.
DePriest, Adam D; Fiandalo, Michael V; Schlanger, Simon; Heemers, Frederike; Mohler, James L; Liu, Song; Heemers, Hannelore V
2016-01-01
Androgen receptor (AR) is a ligand-activated transcription factor that is the main target for treatment of non-organ-confined prostate cancer (CaP). Failure of life-prolonging AR-targeting androgen deprivation therapy is due to flexibility in the steroidogenic pathways that control intracrine androgen levels and to variability in the AR transcriptional output. Androgen biosynthesis enzymes, androgen transporters and AR-associated coregulators are attractive novel CaP treatment targets. These proteins, however, are characterized by multiple transcript variants and isoforms, are subject to genomic alterations, and are differentially expressed among CaPs. Determining their therapeutic potential requires evaluation of extensive, diverse datasets that are dispersed over multiple databases, websites and literature reports. Mining and integrating these datasets are cumbersome, time-consuming tasks and provide only snapshots of relevant information. To overcome this impediment to effective, efficient study of AR and potential drug targets, we developed the Regulators of Androgen Action Resource (RAAR), a non-redundant, curated and user-friendly searchable web interface. RAAR centralizes information on gene function, clinical relevance, and resources for 55 genes that encode proteins involved in biosynthesis, metabolism and transport of androgens and for 274 AR-associated coregulator genes. Data in RAAR are organized on two levels: (i) information pertaining to the production of androgens is contained in a 'pre-receptor level' database and coregulator gene information in a 'post-receptor level' database, and (ii) an 'other resources' database contains links to additional databases that complement the information provided in RAAR and are useful for pursuing it further. For each of its 329 entries, RAAR provides access to more than 20 well-curated, publicly available databases, and thus to thousands of data points. Hyperlinks provide direct access to gene-specific entries in the respective database(s). RAAR is a novel, freely available resource that provides fast, reliable and easy access to the integrated information needed to develop alternative CaP therapies. Database URL: http://www.lerner.ccf.org/cancerbio/heemers/RAAR/search/. © The Author(s) 2016. Published by Oxford University Press.
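To make the two-level organization concrete, here is a small Python sketch of how RAAR-style entries might be represented, with a 'pre-receptor level' flag for androgen-production genes and a 'post-receptor level' flag for coregulators; the field names, annotations and link URLs are hypothetical placeholders, not RAAR's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class RaarEntry:
    """Illustrative record shape for one gene entry (field names are hypothetical)."""
    symbol: str
    level: str           # "pre-receptor" (androgen production) or "post-receptor" (coregulator)
    function: str
    external_links: Dict[str, str] = field(default_factory=dict)  # database name -> URL (placeholders)

def by_level(entries: List[RaarEntry], level: str) -> List[RaarEntry]:
    """Select entries belonging to one of the two organizational levels."""
    return [e for e in entries if e.level == level]

entries = [
    RaarEntry("SRD5A2", "pre-receptor", "androgen biosynthesis",
              {"UniProt": "https://www.uniprot.org/"}),            # placeholder link
    RaarEntry("NCOA2", "post-receptor", "AR-associated coregulator"),
]
print([e.symbol for e in by_level(entries, "pre-receptor")])       # -> ['SRD5A2']
```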
Knowlton, Michelle N; Li, Tongbin; Ren, Yongliang; Bill, Brent R; Ellis, Lynda Bm; Ekker, Stephen C
2008-01-07
The zebrafish is a powerful model vertebrate amenable to high-throughput in vivo genetic analyses. Examples include reverse genetic screens using morpholino knockdown, expression-based screening using enhancer trapping and forward genetic screening using transposon insertional mutagenesis. We have created a database to facilitate web-based distribution of data from such genetic studies. The MOrpholino DataBase (MODB) is a MySQL relational database with an online PHP interface. Multiple quality control levels allow differential access to data in raw and finished formats. MODBv1 includes sequence information relating to almost 800 morpholinos and their targets, and phenotypic data regarding the dose effect of each morpholino (mortality, toxicity and defects). To improve the searchability of this database, we have incorporated a fixed-vocabulary defect ontology that allows morpholino effects to be organized by the anatomical structure affected and the defect produced. This also allows comparison between species using Phenotypic Attribute Trait Ontology (PATO)-designated terminology. MODB is also cross-linked with ZFIN, allowing full searches between the two databases. MODB offers users the ability to retrieve morpholino data by morpholino or target sequence, name of target, anatomical structure affected and defect produced. MODB data can be used for functional genomic analysis of morpholino design to maximize efficacy and minimize toxicity. MODB also serves as a template for future sequence-based functional genetic screen databases, and it is currently being used as a model for the creation of a mutagenic insertional transposon database.
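As an illustration of the kind of relational query such a database supports (retrieval by target gene and by anatomical structure affected), here is a self-contained Python/SQLite sketch; MODB itself uses MySQL with a PHP front end, and the table layout, column names and example records below are hypothetical.

```python
import sqlite3

# A minimal, hypothetical slice of a MODB-like schema (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE morpholino (id INTEGER PRIMARY KEY, sequence TEXT, target_gene TEXT);
CREATE TABLE phenotype  (morpholino_id INTEGER REFERENCES morpholino(id),
                         structure_affected TEXT, defect TEXT, dose_ng REAL, mortality REAL);
""")
conn.execute("INSERT INTO morpholino VALUES (1, 'ACGTTGCA...', 'ntl')")
conn.execute("INSERT INTO phenotype  VALUES (1, 'notochord', 'absent', 4.0, 0.12)")

# Retrieve morpholinos by target gene and by the anatomical structure affected,
# mirroring the search modes described in the abstract.
rows = conn.execute("""
    SELECT m.target_gene, p.structure_affected, p.defect, p.dose_ng
    FROM morpholino m JOIN phenotype p ON p.morpholino_id = m.id
    WHERE m.target_gene = ? AND p.structure_affected = ?
""", ("ntl", "notochord")).fetchall()
print(rows)
```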
PlantGI: a database for searching gene indices in agricultural plants developed at NIAB, Korea
Kim, Chang Kug; Choi, Ji Weon; Park, DongSuk; Kang, Man Jung; Seol, Young-Joo; Hyun, Do Yoon; Hahn, Jang Ho
2008-01-01
The Plant Gene Index (PlantGI) database was developed as a web-based search system that provides keyword searching of gene index information specifically for agricultural plants. The database contains gene index information for ten agricultural species, namely rice, Chinese cabbage, wheat, maize, soybean, barley, mushroom, Arabidopsis, hot pepper and tomato. PlantGI differs from other gene index databases in being specific to agricultural plant species and thus complements similar resources. The database includes options for interactive mining of EST contigs and assembled EST data through user-specified keyword queries. The current version of PlantGI contains a total of about 34,000 EST contig records: rice (8488 records), wheat (8560 records), maize (4570 records), soybean (3726 records), barley (3417 records), Chinese cabbage (3602 records), tomato (1236 records), hot pepper (998 records), mushroom (130 records) and Arabidopsis (8 records). The database is freely available at http://www.niab.go.kr/nabic/.
Paz, Sonia Patricia Castedo; Branco, Luciana; Pereira, Marina Alves de Camargo; Spessotto, Caroline; Fragoso, Yara Dadalti
2018-01-01
John Cunningham virus (JCV) is a polyoma virus that infects humans, mainly in childhood or adolescence, and presents no symptomatic manifestations. JCV can cause progressive multifocal leukoencephalopathy (PML) in immunosuppressed individuals, including those undergoing treatment for multiple sclerosis (MS) and neuromyelitis optica (NMO). PML is a severe and potentially fatal disease of the brain. The prevalence of JCV antibodies in human serum has been reported to be between 50.0 and 90.0%. The aim of the present study was to review worldwide data on populations of patients with MS and NMO in order to establish the rates of JCV seropositivity in these individuals. The present review followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines and used the following search terms: "JCV" OR "JC virus" AND "multiple sclerosis" OR "MS" OR "NMO" OR "neuromyelitis optica" AND "prevalence." These terms were searched for both in smaller and in larger clusters of words. The databases searched included PubMed, MEDLINE, SciELO, LILACS, Google Scholar, and Embase. After the initial selection, 18 papers were included in the review. These articles reported the prevalence of JCV antibodies in the serum of patients with MS or NMO living in 26 countries. The systematic review identified data on 29,319 patients with MS/NMO and found that 57.1% of them (16,730 individuals) were seropositive for the anti-JCV antibody (range, 40.0 to 69.0%). The median worldwide prevalence of JCV among adults with MS or NMO was found to be 57.1%.
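As a quick arithmetic check, the pooled figure follows directly from the counts reported above (Python, for illustration):

```python
# Pooled seroprevalence from the counts reported in the review.
seropositive = 16_730
total_patients = 29_319
print(f"{seropositive / total_patients:.1%}")   # -> 57.1%
```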
Specialist Bibliographic Databases.
Gasparyan, Armen Yuri; Yessirkepov, Marlen; Voronov, Alexander A; Trukhachev, Vladimir I; Kostyukova, Elena I; Gerasimov, Alexey N; Kitas, George D
2016-05-01
Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of established specialist databases that may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and database vendors, such as EBSCOhost and ProQuest, facilitate advanced searches supported by specialist keyword thesauri. Searches of items through specialist databases are complementary to those through multidisciplinary research platforms such as PubMed, Web of Science, and Google Scholar. Familiarity with the functional characteristics of biomedical and nonbiomedical bibliographic search tools is essential for researchers, authors, editors, and publishers. Database users are offered updates of the indexed journal lists, abstracts, author profiles, and links to other metadata. Editors and publishers may find the source selection criteria particularly useful and may apply for coverage of their peer-reviewed journals and grey literature sources. These criteria are aimed at accepting relevant sources with established editorial policies and quality controls.
SinEx DB: a database for single exon coding sequences in mammalian genomes.
Jorquera, Roddy; Ortiz, Rodrigo; Ossandon, F; Cárdenas, Juan Pablo; Sepúlveda, Rene; González, Carolina; Holmes, David S
2016-01-01
Eukaryotic genes are typically interrupted by intragenic, noncoding sequences termed introns. However, some genes lack introns in their coding sequence (CDS) and are generally known as 'single exon genes' (SEGs). In this work, a SEG is defined as a nuclear, protein-coding gene that lacks introns in its CDS. Whereas many public databases of eukaryotic multi-exon genes are available, there are only two specialized databases for SEGs. The present work addresses the need for a more extensive and diverse database by creating SinEx DB, a publicly available, searchable database of predicted SEGs from 10 completely sequenced mammalian genomes, including human. SinEx DB houses the DNA and protein sequence information of these SEGs and includes their functional predictions (KOG) and the relative distribution of these functions within species. The information is stored in a relational database built with MySQL Server 5.1.33, and the complete dataset of SEG sequences and their functional predictions is available for download. SinEx DB can be interrogated by (i) browsing a phylogenetic schema, (ii) carrying out BLAST searches against the in-house SinEx DB of SEGs and (iii) using an advanced search mode in which the database can be searched by keywords and by any combination of species and predicted functions. SinEx DB provides a rich source of information for advancing our understanding of the evolution and function of SEGs. Database URL: www.sinex.cl. © The Author(s) 2016. Published by Oxford University Press.
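A hedged sketch of how a BLAST search against an in-house SEG set can be scripted; it assumes NCBI BLAST+ is installed locally, and the database name, input file and cutoff below are placeholders rather than SinEx DB's actual configuration.

```python
import subprocess

def blast_against_segs(query_fasta: str, seg_db: str = "sinex_segs") -> str:
    """Run blastp against a locally formatted database of SEG proteins (illustrative)."""
    result = subprocess.run(
        ["blastp",
         "-query", query_fasta,   # user-supplied protein sequence(s)
         "-db", seg_db,           # pre-built BLAST database of SEG proteins (placeholder name)
         "-evalue", "1e-5",       # illustrative significance cutoff
         "-outfmt", "6"],         # tabular output for easy parsing
        capture_output=True, text=True, check=True)
    return result.stdout

# Usage (requires NCBI BLAST+ and a formatted database):
# print(blast_against_segs("my_protein.fasta"))
```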
ERIC Educational Resources Information Center
Dalrymple, Prudence W.; Roderer, Nancy K.
1994-01-01
Highlights the changes that have occurred from 1987-93 in database access systems. Topics addressed include types of databases, including CD-ROMs; enduser interface; database selection; database access management, including library instruction and use of primary literature; economic issues; database users; the search process; and improving…