Science.gov

Sample records for keyword search engine

  1. Interest in Anesthesia as Reflected by Keyword Searches using Common Search Engines

    PubMed Central

    Liu, Renyu; García, Paul S.; Fleisher, Lee A.

    2012-01-01

    Background Since current general interest in anesthesia is unknown, we analyzed internet keyword searches to gauge general interest in anesthesia in comparison with surgery and pain. Methods The trend of keyword searches from 2004 to 2010 related to anesthesia and anaesthesia was investigated using Google Insights for Search. The trend of number of peer reviewed articles on anesthesia cited on PubMed and Medline from 2004 to 2010 was investigated. The average cost on advertising on anesthesia, surgery and pain was estimated using Google AdWords. Searching results in other common search engines were also analyzed. Correlation between year and relative number of searches was determined with p< 0.05 considered statistically significant. Results Searches for the keyword “anesthesia” or “anaesthesia” diminished since 2004 reflected by Google Insights for Search (p< 0.05). The search for “anesthesia side effects” is trending up over the same time period while the search for “anesthesia and safety” is trending down. The search phrase “before anesthesia” is searched more frequently than “preanesthesia” and the search for “before anesthesia” is trending up. Using “pain” as a keyword is steadily increasing over the years indicated. While different search engines may provide different total number of searching results (available posts), the ratios of searching results between some common keywords related to perioperative care are comparable, indicating similar trend. The peer reviewed manuscripts on “anesthesia” and the proportion of papers on “anesthesia and outcome” are trending up. Estimates for spending of advertising dollars are less for anesthesia-related terms when compared to that for pain or surgery due to relative smaller number of searching traffic. Conclusions General interest in anesthesia (anaesthesia) as measured by internet searches appears to be decreasing. Pain, preanesthesia evaluation, anesthesia and outcome and side

  2. Interest in Anesthesia as Reflected by Keyword Searches using Common Search Engines.

    PubMed

    Liu, Renyu; García, Paul S; Fleisher, Lee A

    2012-01-23

    Since current general interest in anesthesia is unknown, we analyzed internet keyword searches to gauge general interest in anesthesia in comparison with surgery and pain. The trend of keyword searches from 2004 to 2010 related to anesthesia and anaesthesia was investigated using Google Insights for Search. The trend of number of peer reviewed articles on anesthesia cited on PubMed and Medline from 2004 to 2010 was investigated. The average cost on advertising on anesthesia, surgery and pain was estimated using Google AdWords. Searching results in other common search engines were also analyzed. Correlation between year and relative number of searches was determined with p< 0.05 considered statistically significant. Searches for the keyword "anesthesia" or "anaesthesia" diminished since 2004 reflected by Google Insights for Search (p< 0.05). The search for "anesthesia side effects" is trending up over the same time period while the search for "anesthesia and safety" is trending down. The search phrase "before anesthesia" is searched more frequently than "preanesthesia" and the search for "before anesthesia" is trending up. Using "pain" as a keyword is steadily increasing over the years indicated. While different search engines may provide different total number of searching results (available posts), the ratios of searching results between some common keywords related to perioperative care are comparable, indicating similar trend. The peer reviewed manuscripts on "anesthesia" and the proportion of papers on "anesthesia and outcome" are trending up. Estimates for spending of advertising dollars are less for anesthesia-related terms when compared to that for pain or surgery due to relative smaller number of searching traffic. General interest in anesthesia (anaesthesia) as measured by internet searches appears to be decreasing. Pain, preanesthesia evaluation, anesthesia and outcome and side effects of anesthesia are the critical areas that anesthesiologists should

  3. Metadata Effectiveness in Internet Discovery: An Analysis of Digital Collection Metadata Elements and Internet Search Engine Keywords

    ERIC Educational Resources Information Center

    Yang, Le

    2016-01-01

    This study analyzed digital item metadata and keywords from Internet search engines to learn what metadata elements actually facilitate discovery of digital collections through Internet keyword searching and how significantly each metadata element affects the discovery of items in a digital repository. The study found that keywords from Internet…

  4. Standardization of Keyword Search Mode

    ERIC Educational Resources Information Center

    Su, Di

    2010-01-01

    In spite of its popularity, keyword search mode has not been standardized. Though information professionals are quick to adapt to various presentations of keyword search mode, novice end-users may find keyword search confusing. This article compares keyword search mode in some major reference databases and calls for standardization. (Contains 3…

  5. Score Normalization for Keyword Search

    DTIC Science & Technology

    2016-06-23

    Anahtar Sözcük Arama için Skor Düzgeleme Score Normalization for Keyword Search Leda Sarı, Murat Saraçlar Elektrik ve Elektronik Mühendisliği Bölümü...skor düzgeleme. Abstract—In this work, keyword search (KWS) is based on a symbolic index that uses posteriorgram representation of the speech data...For each query, sum-to-one normalization or keyword specific thresholding is applied to the search results. The effect of these methods on the proposed

  6. SPARK: Adapting Keyword Query to Semantic Search

    NASA Astrophysics Data System (ADS)

    Zhou, Qi; Wang, Chong; Xiong, Miao; Wang, Haofen; Yu, Yong

    Semantic search promises to provide more accurate result than present-day keyword search. However, progress with semantic search has been delayed due to the complexity of its query languages. In this paper, we explore a novel approach of adapting keywords to querying the semantic web: the approach automatically translates keyword queries into formal logic queries so that end users can use familiar keywords to perform semantic search. A prototype system named 'SPARK' has been implemented in light of this approach. Given a keyword query, SPARK outputs a ranked list of SPARQL queries as the translation result. The translation in SPARK consists of three major steps: term mapping, query graph construction and query ranking. Specifically, a probabilistic query ranking model is proposed to select the most likely SPARQL query. In the experiment, SPARK achieved an encouraging translation result.

  7. Keywords

    ERIC Educational Resources Information Center

    Doecke, Brenton; Howie, Mark; Sawyer, Wayne

    2007-01-01

    Borrowing the title of Raymond Williams' famous study, the following reflections--sometimes collective and sometimes individual--are based on a series of "Keywords", specifically: "fear", "community" and "creativity". By reflecting on the meanings these words have for us today, we attempt to capture their dialogical character, posing them as sites…

  8. Attribute-Based Proxy Re-Encryption with Keyword Search

    PubMed Central

    Shi, Yanfeng; Liu, Jiqiang; Han, Zhen; Zheng, Qingji; Zhang, Rui; Qiu, Shuo

    2014-01-01

    Keyword search on encrypted data allows one to issue the search token and conduct search operations on encrypted data while still preserving keyword privacy. In the present paper, we consider the keyword search problem further and introduce a novel notion called attribute-based proxy re-encryption with keyword search (), which introduces a promising feature: In addition to supporting keyword search on encrypted data, it enables data owners to delegate the keyword search capability to some other data users complying with the specific access control policy. To be specific, allows (i) the data owner to outsource his encrypted data to the cloud and then ask the cloud to conduct keyword search on outsourced encrypted data with the given search token, and (ii) the data owner to delegate other data users keyword search capability in the fine-grained access control manner through allowing the cloud to re-encrypted stored encrypted data with a re-encrypted data (embedding with some form of access control policy). We formalize the syntax and security definitions for , and propose two concrete constructions for : key-policy and ciphertext-policy . In the nutshell, our constructions can be treated as the integration of technologies in the fields of attribute-based cryptography and proxy re-encryption cryptography. PMID:25549257

  9. Attribute-based proxy re-encryption with keyword search.

    PubMed

    Shi, Yanfeng; Liu, Jiqiang; Han, Zhen; Zheng, Qingji; Zhang, Rui; Qiu, Shuo

    2014-01-01

    Keyword search on encrypted data allows one to issue the search token and conduct search operations on encrypted data while still preserving keyword privacy. In the present paper, we consider the keyword search problem further and introduce a novel notion called attribute-based proxy re-encryption with keyword search (ABRKS), which introduces a promising feature: In addition to supporting keyword search on encrypted data, it enables data owners to delegate the keyword search capability to some other data users complying with the specific access control policy. To be specific, ABRKS allows (i) the data owner to outsource his encrypted data to the cloud and then ask the cloud to conduct keyword search on outsourced encrypted data with the given search token, and (ii) the data owner to delegate other data users keyword search capability in the fine-grained access control manner through allowing the cloud to re-encrypted stored encrypted data with a re-encrypted data (embedding with some form of access control policy). We formalize the syntax and security definitions for ABRKS, and propose two concrete constructions for ABRKS: key-policy ABRKS and ciphertext-policy ABRKS. In the nutshell, our constructions can be treated as the integration of technologies in the fields of attribute-based cryptography and proxy re-encryption cryptography.

  10. Searching the ASRS Database Using QUORUM Keyword Search, Phrase Search, Phrase Generation, and Phrase Discovery

    NASA Technical Reports Server (NTRS)

    McGreevy, Michael W.; Connors, Mary M. (Technical Monitor)

    2001-01-01

    To support Search Requests and Quick Responses at the Aviation Safety Reporting System (ASRS), four new QUORUM methods have been developed: keyword search, phrase search, phrase generation, and phrase discovery. These methods build upon the core QUORUM methods of text analysis, modeling, and relevance-ranking. QUORUM keyword search retrieves ASRS incident narratives that contain one or more user-specified keywords in typical or selected contexts, and ranks the narratives on their relevance to the keywords in context. QUORUM phrase search retrieves narratives that contain one or more user-specified phrases, and ranks the narratives on their relevance to the phrases. QUORUM phrase generation produces a list of phrases from the ASRS database that contain a user-specified word or phrase. QUORUM phrase discovery finds phrases that are related to topics of interest. Phrase generation and phrase discovery are particularly useful for finding query phrases for input to QUORUM phrase search. The presentation of the new QUORUM methods includes: a brief review of the underlying core QUORUM methods; an overview of the new methods; numerous, concrete examples of ASRS database searches using the new methods; discussion of related methods; and, in the appendices, detailed descriptions of the new methods.

  11. EasyKSORD: A Platform of Keyword Search Over Relational Databases

    NASA Astrophysics Data System (ADS)

    Peng, Zhaohui; Li, Jing; Wang, Shan

    Keyword Search Over Relational Databases (KSORD) enables casual users to use keyword queries (a set of keywords) to search relational databases just like searching the Web, without any knowledge of the database schema or any need of writing SQL queries. Based on our previous work, we design and implement a novel KSORD platform named EasyKSORD for users and system administrators to use and manage different KSORD systems in a novel and simple manner. EasyKSORD supports advanced queries, efficient data-graph-based search engines, multiform result presentations, and system logging and analysis. Through EasyKSORD, users can search relational databases easily and read search results conveniently, and system administrators can easily monitor and analyze the operations of KSORD and manage KSORD systems much better.

  12. Supporting ontology-based keyword search over medical databases.

    PubMed

    Kementsietsidis, Anastasios; Lim, Lipyeow; Wang, Min

    2008-11-06

    The proliferation of medical terms poses a number of challenges in the sharing of medical information among different stakeholders. Ontologies are commonly used to establish relationships between different terms, yet their role in querying has not been investigated in detail. In this paper, we study the problem of supporting ontology-based keyword search queries on a database of electronic medical records. We present several approaches to support this type of queries, study the advantages and limitations of each approach, and summarize the lessons learned as best practices.

  13. Citation searches are more sensitive than keyword searches to identify studies using specific measurement instruments.

    PubMed

    Linder, Suzanne K; Kamath, Geetanjali R; Pratt, Gregory F; Saraykar, Smita S; Volk, Robert J

    2015-04-01

    To compare the effectiveness of two search methods in identifying studies that used the Control Preferences Scale (CPS), a health care decision-making instrument commonly used in clinical settings. We searched the literature using two methods: (1) keyword searching using variations of "Control Preferences Scale" and (2) cited reference searching using two seminal CPS publications. We searched three bibliographic databases [PubMed, Scopus, and Web of Science (WOS)] and one full-text database (Google Scholar). We report precision and sensitivity as measures of effectiveness. Keyword searches in bibliographic databases yielded high average precision (90%) but low average sensitivity (16%). PubMed was the most precise, followed closely by Scopus and WOS. The Google Scholar keyword search had low precision (54%) but provided the highest sensitivity (70%). Cited reference searches in all databases yielded moderate sensitivity (45-54%), but precision ranged from 35% to 75% with Scopus being the most precise. Cited reference searches were more sensitive than keyword searches, making it a more comprehensive strategy to identify all studies that use a particular instrument. Keyword searches provide a quick way of finding some but not all relevant articles. Goals, time, and resources should dictate the combination of which methods and databases are used. Copyright © 2015 Elsevier Inc. All rights reserved.

  14. Citation searches are more sensitive than keyword searches to identify studies using specific measurement instruments

    PubMed Central

    Linder, Suzanne K.; Kamath, Geetanjali R.; Pratt, Gregory F.; Saraykar, Smita S.; Volk, Robert J.

    2015-01-01

    Objective To compare the effectiveness of two search methods in identifying studies that used the Control Preferences Scale (CPS), a healthcare decision-making instrument commonly used in clinical settings. Study Design & Setting We searched the literature using two methods: 1) keyword searching using variations of “control preferences scale” and 2) cited reference searching using two seminal CPS publications. We searched three bibliographic databases [PubMed, Scopus, Web of Science (WOS)] and one full-text database (Google Scholar). We report precision and sensitivity as measures of effectiveness. Results Keyword searches in bibliographic databases yielded high average precision (90%), but low average sensitivity (16%). PubMed was the most precise, followed closely by Scopus and WOS. The Google Scholar keyword search had low precision (54%) but provided the highest sensitivity (70%). Cited reference searches in all databases yielded moderate sensitivity (45–54%), but precision ranged from 35–75% with Scopus being the most precise. Conclusion Cited reference searches were more sensitive than keyword searches, making it a more comprehensive strategy to identify all studies that use a particular instrument. Keyword searches provide a quick way of finding some but not all relevant articles. Goals, time and resources should dictate the combination of which methods and databases are used. PMID:25554521

  15. WORDGRAPH: Keyword-in-Context Visualization for NETSPEAK's Wildcard Search.

    PubMed

    Riehmann, Patrick; Gruendl, Henning; Potthast, Martin; Trenkmann, Martin; Stein, Benno; Froehlich, Benno

    2012-09-01

    The WORDGRAPH helps writers in visually choosing phrases while writing a text. It checks for the commonness of phrases and allows for the retrieval of alternatives by means of wildcard queries. To support such queries, we implement a scalable retrieval engine, which returns high-quality results within milliseconds using a probabilistic retrieval strategy. The results are displayed as WORDGRAPH visualization or as a textual list. The graphical interface provides an effective means for interactive exploration of search results using filter techniques, query expansion, and navigation. Our observations indicate that, of three investigated retrieval tasks, the textual interface is sufficient for the phrase verification task, wherein both interfaces support context-sensitive word choice, and the WORDGRAPH best supports the exploration of a phrase's context or the underlying corpus. Our user study confirms these observations and shows that WORDGRAPH is generally the preferred interface over the textual result list for queries containing multiple wildcards.

  16. Meta Search Engines.

    ERIC Educational Resources Information Center

    Garman, Nancy

    1999-01-01

    Describes common options and features to consider in evaluating which meta search engine will best meet a searcher's needs. Discusses number and names of engines searched; other sources and specialty engines; search queries; other search options; and results options. (AEF)

  17. Development of a Taxonomy of Keywords for Engineering Education Research

    ERIC Educational Resources Information Center

    Finelli, Cynthia J.; Borrego, Maura; Rasoulifar, Golnoosh

    2016-01-01

    The diversity of engineering education research provides an opportunity for cross-fertilisation of ideas and creativity, but it also can result in fragmentation of the field and duplication of effort. One solution is to establish a standardised taxonomy of engineering education terms to map the field and communicate and connect research…

  18. Annual variation in Internet keyword searches: Linking dieting interest to obesity and negative health outcomes.

    PubMed

    Markey, Patrick M; Markey, Charlotte N

    2013-07-01

    This study investigated the annual variation in Internet searches regarding dieting. Time-series analysis was first used to examine the annual trends of Google keyword searches during the past 7 years for topics related to dieting within the United States. The results indicated that keyword searches for dieting fit a consistent 12-month linear model, peaking in January (following New Year's Eve) and then linearly decreasing until surging again the following January. Additional state-level analyses revealed that the size of the December-January dieting-related keyword surge was predictive of both obesity and mortality rates due to diabetes, heart disease, and stroke.

  19. Is There a Standard Default Keyword Operator? A Bibliometric Analysis of Processing Options Chosen by Libraries To Execute Keyword Searches in Online Public Access Catalogs.

    ERIC Educational Resources Information Center

    Klein, Gary M.

    1994-01-01

    Online public access catalogs from 67 libraries using NOTIS software were searched using Internet connections to determine the positional operators selected as the default keyword operator on each catalog. Results indicate the lack of a processing standard for keyword searches. Five tables provide information. (Author/AEF)

  20. Effect of Reading Ability and Internet Experience on Keyword-Based Image Search

    ERIC Educational Resources Information Center

    Lei, Pei-Lan; Lin, Sunny S. J.; Sun, Chuen-Tsai

    2013-01-01

    Image searches are now crucial for obtaining information, constructing knowledge, and building successful educational outcomes. We investigated how reading ability and Internet experience influence keyword-based image search behaviors and performance. We categorized 58 junior-high-school students into four groups of high/low reading ability and…

  1. User Practices in Keyword and Boolean Searching on an Online Public Access Catalog.

    ERIC Educational Resources Information Center

    Ensor, Pat

    1992-01-01

    Discussion of keyword and Boolean searching techniques in online public access catalogs (OPACs) focuses on a study conducted at Indiana State University that examined users' attitudes toward searching on NOTIS (Northwestern Online Total Integrated System). Relevant literature is reviewed, and implications for library instruction are suggested. (17…

  2. Seasonal variation in internet keyword searches: a proxy assessment of sex mating behaviors.

    PubMed

    Markey, Patrick M; Markey, Charlotte N

    2013-05-01

    The current study investigated seasonal variation in internet searches regarding sex and mating behaviors. Harmonic analyses were used to examine the seasonal trends of Google keyword searches during the past 5 years for topics related to pornography, prostitution, and mate-seeking. Results indicated a consistent 6-month harmonic cycle with the peaks of keyword searches related to sex and mating behaviors occurring most frequently during winter and early summer. Such results compliment past research that has found similar seasonal trends of births, sexually transmitted infections, condom sales, and abortions.

  3. Universal Keyword Classifier on Public Key Based Encrypted Multikeyword Fuzzy Search in Public Cloud

    PubMed Central

    Munisamy, Shyamala Devi; Chokkalingam, Arun

    2015-01-01

    Cloud computing has pioneered the emerging world by manifesting itself as a service through internet and facilitates third party infrastructure and applications. While customers have no visibility on how their data is stored on service provider's premises, it offers greater benefits in lowering infrastructure costs and delivering more flexibility and simplicity in managing private data. The opportunity to use cloud services on pay-per-use basis provides comfort for private data owners in managing costs and data. With the pervasive usage of internet, the focus has now shifted towards effective data utilization on the cloud without compromising security concerns. In the pursuit of increasing data utilization on public cloud storage, the key is to make effective data access through several fuzzy searching techniques. In this paper, we have discussed the existing fuzzy searching techniques and focused on reducing the searching time on the cloud storage server for effective data utilization. Our proposed Asymmetric Classifier Multikeyword Fuzzy Search method provides classifier search server that creates universal keyword classifier for the multiple keyword request which greatly reduces the searching time by learning the search path pattern for all the keywords in the fuzzy keyword set. The objective of using BTree fuzzy searchable index is to resolve typos and representation inconsistencies and also to facilitate effective data utilization. PMID:26380364

  4. Universal Keyword Classifier on Public Key Based Encrypted Multikeyword Fuzzy Search in Public Cloud.

    PubMed

    Munisamy, Shyamala Devi; Chokkalingam, Arun

    2015-01-01

    Cloud computing has pioneered the emerging world by manifesting itself as a service through internet and facilitates third party infrastructure and applications. While customers have no visibility on how their data is stored on service provider's premises, it offers greater benefits in lowering infrastructure costs and delivering more flexibility and simplicity in managing private data. The opportunity to use cloud services on pay-per-use basis provides comfort for private data owners in managing costs and data. With the pervasive usage of internet, the focus has now shifted towards effective data utilization on the cloud without compromising security concerns. In the pursuit of increasing data utilization on public cloud storage, the key is to make effective data access through several fuzzy searching techniques. In this paper, we have discussed the existing fuzzy searching techniques and focused on reducing the searching time on the cloud storage server for effective data utilization. Our proposed Asymmetric Classifier Multikeyword Fuzzy Search method provides classifier search server that creates universal keyword classifier for the multiple keyword request which greatly reduces the searching time by learning the search path pattern for all the keywords in the fuzzy keyword set. The objective of using BTree fuzzy searchable index is to resolve typos and representation inconsistencies and also to facilitate effective data utilization.

  5. Efficient secure-channel free public key encryption with keyword search for EMRs in cloud storage.

    PubMed

    Guo, Lifeng; Yau, Wei-Chuen

    2015-02-01

    Searchable encryption is an important cryptographic primitive that enables privacy-preserving keyword search on encrypted electronic medical records (EMRs) in cloud storage. Efficiency of such searchable encryption in a medical cloud storage system is very crucial as it involves client platforms such as smartphones or tablets that only have constrained computing power and resources. In this paper, we propose an efficient secure-channel free public key encryption with keyword search (SCF-PEKS) scheme that is proven secure in the standard model. We show that our SCF-PEKS scheme is not only secure against chosen keyword and ciphertext attacks (IND-SCF-CKCA), but also secure against keyword guessing attacks (IND-KGA). Furthermore, our proposed scheme is more efficient than other recent SCF-PEKS schemes in the literature.

  6. Successful Keyword Searching: Initiating Research on Popular Topics Using Electronic Databases.

    ERIC Educational Resources Information Center

    MacDonald, Randall M.; MacDonald, Susan Priest

    Students are using electronic resources more than ever before to locate information for assignments. Without the proper search terms, results are incomplete, and students are frustrated. Using the keywords, key people, organizations, and Web sites provided in this book and compiled from the most commonly used databases, students will be able to…

  7. A Comparison of Keyword Subject Searching on Six British University OPACs Online Public Access Catalogs.

    ERIC Educational Resources Information Center

    Aanonson, John

    1987-01-01

    Compares features of online public access catalogs (OPACs) at six British universities: (1) Cambridge; (2) Hull; (3) Newcastle; (4) Surrey; (5) Sussex; and (6) York. Results of keyword subject searches on two topics performed on each of the OPACs are reported and compared. Six references are listed. (MES)

  8. SIRW: A web server for the Simple Indexing and Retrieval System that combines sequence motif searches with keyword searches.

    PubMed

    Ramu, Chenna

    2003-07-01

    SIRW (http://sirw.embl.de/) is a World Wide Web interface to the Simple Indexing and Retrieval System (SIR) that is capable of parsing and indexing various flat file databases. In addition it provides a framework for doing sequence analysis (e.g. motif pattern searches) for selected biological sequences through keyword search. SIRW is an ideal tool for the bioinformatics community for searching as well as analyzing biological sequences of interest.

  9. Assessing explicit error reporting in the narrative electronic medical record using keyword searching.

    PubMed

    Cao, Hui; Stetson, Peter; Hripcsak, George

    2003-01-01

    Many types of medical errors occur in and outside of hospitals, some of which have very serious consequences and increase cost. Identifying errors is a critical step for managing and preventing them. In this study, we assessed the explicit reporting of medical errors in the electronic record. We used five search terms "mistake," "error," "incorrect," "inadvertent," and "iatrogenic" to survey several sets of narrative reports including discharge summaries, sign-out notes, and outpatient notes from 1991 to 2000. We manually reviewed all the positive cases and identified them based on the reporting of physicians. We identified 222 explicitly reported medical errors. The positive predictive value varied with different keywords. In general, the positive predictive value for each keyword was low, ranging from 3.4 to 24.4%. Therapeutic-related errors were the most common reported errors and these reported therapeutic-related errors were mainly medication errors. Keyword searches combined with manual review indicated some medical errors that were reported in medical records. It had a low sensitivity and a moderate positive predictive value, which varied by search term. Physicians were most likely to record errors in the Hospital Course and History of Present Illness sections of discharge summaries. The reported errors in medical records covered a broad range and were related to several types of care providers as well as non-health care professionals.

  10. With News Search Engines

    ERIC Educational Resources Information Center

    Gunn, Holly

    2005-01-01

    Although there are many news search engines on the Web, finding the news items one wants can be challenging. Choosing appropriate search terms is one of the biggest challenges. Unless one has seen the article that one is seeking, it is often difficult to select words that were used in the headline or text of the article. The limited archives of…

  11. Custom Search Engines: Tools & Tips

    ERIC Educational Resources Information Center

    Notess, Greg R.

    2008-01-01

    Few have the resources to build a Google or Yahoo! from scratch. Yet anyone can build a search engine based on a subset of the large search engines' databases. Use Google Custom Search Engine or Yahoo! Search Builder or any of the other similar programs to create a vertical search engine targeting sites of interest to users. The basic steps to…

  12. Searches Conducted for Engineers.

    ERIC Educational Resources Information Center

    Lorenz, Patricia

    This paper reports an industrial information specialist's experience in performing online searches for engineers and surveys the databases used. Engineers seeking assistance fall into three categories: (1) those who recognize the value of online retrieval; (2) referrals by colleagues; and (3) those who do not seek help. As more successful searches…

  13. A Step Beyond Simple Keyword Searches: Services Enabled by a Full Content Digital Journal Archive

    NASA Technical Reports Server (NTRS)

    Boccippio, Dennis J.

    2003-01-01

    The problems of managing and searching large archives of scientific journal articles can potentially be addressed through data mining and statistical techniques matured primarily for quantitative scientific data analysis. A journal paper could be represented by a multivariate descriptor, e.g., the occurrence counts of a number key technical terms or phrases (keywords), perhaps derived from a controlled vocabulary ( e . g . , the American Meteorological Society's Glossary of Meteorology) or bootstrapped from the journal archive itself. With this technique, conventional statistical classification tools can be leveraged to address challenges faced by both scientists and professional societies in knowledge management. For example, cluster analyses can be used to find bundles of "most-related" papers, and address the issue of journal bifurcation (when is a new journal necessary, and what topics should it encompass). Similarly, neural networks can be trained to predict the optimal journal (within a society's collection) in which a newly submitted paper should be published. Comparable techniques could enable very powerful end-user tools for journal searches, all premised on the view of a paper as a data point in a multidimensional descriptor space, e.g.: "find papers most similar to the one I am reading", "build a personalized subscription service, based on the content of the papers I am interested in, rather than preselected keywords", "find suitable reviewers, based on the content of their own published works", etc. Such services may represent the next "quantum leap" beyond the rudimentary search interfaces currently provided to end-users, as well as a compelling value-added component needed to bridge the print-to-digital-medium gap, and help stabilize professional societies' revenue stream during the print-to-digital transition.

  14. A Study of Practical Proxy Reencryption with a Keyword Search Scheme considering Cloud Storage Structure

    PubMed Central

    Lee, Im-Yeong

    2014-01-01

    Data outsourcing services have emerged with the increasing use of digital information. They can be used to store data from various devices via networks that are easy to access. Unlike existing removable storage systems, storage outsourcing is available to many users because it has no storage limit and does not require a local storage medium. However, the reliability of storage outsourcing has become an important topic because many users employ it to store large volumes of data. To protect against unethical administrators and attackers, a variety of cryptography systems are used, such as searchable encryption and proxy reencryption. However, existing searchable encryption technology is inconvenient for use in storage outsourcing environments where users upload their data to be shared with others as necessary. In addition, some existing schemes are vulnerable to collusion attacks and have computing cost inefficiencies. In this paper, we analyze existing proxy re-encryption with keyword search. PMID:24693240

  15. A study of practical proxy reencryption with a keyword search scheme considering cloud storage structure.

    PubMed

    Lee, Sun-Ho; Lee, Im-Yeong

    2014-01-01

    Data outsourcing services have emerged with the increasing use of digital information. They can be used to store data from various devices via networks that are easy to access. Unlike existing removable storage systems, storage outsourcing is available to many users because it has no storage limit and does not require a local storage medium. However, the reliability of storage outsourcing has become an important topic because many users employ it to store large volumes of data. To protect against unethical administrators and attackers, a variety of cryptography systems are used, such as searchable encryption and proxy reencryption. However, existing searchable encryption technology is inconvenient for use in storage outsourcing environments where users upload their data to be shared with others as necessary. In addition, some existing schemes are vulnerable to collusion attacks and have computing cost inefficiencies. In this paper, we analyze existing proxy re-encryption with keyword search.

  16. End-to-End ASR-Free Keyword Search From Speech

    NASA Astrophysics Data System (ADS)

    Audhkhasi, Kartik; Rosenberg, Andrew; Sethy, Abhinav; Ramabhadran, Bhuvana; Kingsbury, Brian

    2017-12-01

    End-to-end (E2E) systems have achieved competitive results compared to conventional hybrid hidden Markov model (HMM)-deep neural network based automatic speech recognition (ASR) systems. Such E2E systems are attractive due to the lack of dependence on alignments between input acoustic and output grapheme or HMM state sequence during training. This paper explores the design of an ASR-free end-to-end system for text query-based keyword search (KWS) from speech trained with minimal supervision. Our E2E KWS system consists of three sub-systems. The first sub-system is a recurrent neural network (RNN)-based acoustic auto-encoder trained to reconstruct the audio through a finite-dimensional representation. The second sub-system is a character-level RNN language model using embeddings learned from a convolutional neural network. Since the acoustic and text query embeddings occupy different representation spaces, they are input to a third feed-forward neural network that predicts whether the query occurs in the acoustic utterance or not. This E2E ASR-free KWS system performs respectably despite lacking a conventional ASR system and trains much faster.

  17. Analysis of semantic search within the domains of uncertainty: using Keyword Effectiveness Indexing as an evaluation tool.

    PubMed

    Lorence, Daniel; Abraham, Joanna

    2006-01-01

    Medical and health-related searches pose a special case of risk when using the web as an information resource. Uninsured consumers, lacking access to a trained provider, will often rely on information from the internet for self-diagnosis and treatment. In areas where treatments are uncertain or controversial, most consumers lack the knowledge to make an informed decision. This exploratory technology assessment examines the use of Keyword Effectiveness Indexing (KEI) analysis as a potential tool for profiling information search and keyword retrieval patterns. Results demonstrate that the KEI methodology can be useful in identifying e-health search patterns, but is limited by semantic or text-based web environments.

  18. Chemical-text hybrid search engines.

    PubMed

    Zhou, Yingyao; Zhou, Bin; Jiang, Shumei; King, Frederick J

    2010-01-01

    As the amount of chemical literature increases, it is critical that researchers be enabled to accurately locate documents related to a particular aspect of a given compound. Existing solutions, based on text and chemical search engines alone, suffer from the inclusion of "false negative" and "false positive" results, and cannot accommodate diverse repertoire of formats currently available for chemical documents. To address these concerns, we developed an approach called Entity-Canonical Keyword Indexing (ECKI), which converts a chemical entity embedded in a data source into its canonical keyword representation prior to being indexed by text search engines. We implemented ECKI using Microsoft Office SharePoint Server Search, and the resultant hybrid search engine not only supported complex mixed chemical and keyword queries but also was applied to both intranet and Internet environments. We envision that the adoption of ECKI will empower researchers to pose more complex search questions that were not readily attainable previously and to obtain answers at much improved speed and accuracy.

  19. BIOMedical Search Engine Framework: Lightweight and customized implementation of domain-specific biomedical search engines.

    PubMed

    Jácome, Alberto G; Fdez-Riverola, Florentino; Lourenço, Anália

    2016-07-01

    Text mining and semantic analysis approaches can be applied to the construction of biomedical domain-specific search engines and provide an attractive alternative to create personalized and enhanced search experiences. Therefore, this work introduces the new open-source BIOMedical Search Engine Framework for the fast and lightweight development of domain-specific search engines. The rationale behind this framework is to incorporate core features typically available in search engine frameworks with flexible and extensible technologies to retrieve biomedical documents, annotate meaningful domain concepts, and develop highly customized Web search interfaces. The BIOMedical Search Engine Framework integrates taggers for major biomedical concepts, such as diseases, drugs, genes, proteins, compounds and organisms, and enables the use of domain-specific controlled vocabulary. Technologies from the Typesafe Reactive Platform, the AngularJS JavaScript framework and the Bootstrap HTML/CSS framework support the customization of the domain-oriented search application. Moreover, the RESTful API of the BIOMedical Search Engine Framework allows the integration of the search engine into existing systems or a complete web interface personalization. The construction of the Smart Drug Search is described as proof-of-concept of the BIOMedical Search Engine Framework. This public search engine catalogs scientific literature about antimicrobial resistance, microbial virulence and topics alike. The keyword-based queries of the users are transformed into concepts and search results are presented and ranked accordingly. The semantic graph view portraits all the concepts found in the results, and the researcher may look into the relevance of different concepts, the strength of direct relations, and non-trivial, indirect relations. The number of occurrences of the concept shows its importance to the query, and the frequency of concept co-occurrence is indicative of biological relations

  20. m2-ABKS: Attribute-Based Multi-Keyword Search over Encrypted Personal Health Records in Multi-Owner Setting.

    PubMed

    Miao, Yinbin; Ma, Jianfeng; Liu, Ximeng; Wei, Fushan; Liu, Zhiquan; Wang, Xu An

    2016-11-01

    Online personal health record (PHR) is more inclined to shift data storage and search operations to cloud server so as to enjoy the elastic resources and lessen computational burden in cloud storage. As multiple patients' data is always stored in the cloud server simultaneously, it is a challenge to guarantee the confidentiality of PHR data and allow data users to search encrypted data in an efficient and privacy-preserving way. To this end, we design a secure cryptographic primitive called as attribute-based multi-keyword search over encrypted personal health records in multi-owner setting to support both fine-grained access control and multi-keyword search via Ciphertext-Policy Attribute-Based Encryption. Formal security analysis proves our scheme is selectively secure against chosen-keyword attack. As a further contribution, we conduct empirical experiments over real-world dataset to show its feasibility and practicality in a broad range of actual scenarios without incurring additional computational burden.

  1. Information Retrieval for Education: Making Search Engines Language Aware

    ERIC Educational Resources Information Center

    Ott, Niels; Meurers, Detmar

    2010-01-01

    Search engines have been a major factor in making the web the successful and widely used information source it is today. Generally speaking, they make it possible to retrieve web pages on a topic specified by the keywords entered by the user. Yet web searching currently does not take into account which of the search results are comprehensible for…

  2. [Advanced online search techniques and dedicated search engines for physicians].

    PubMed

    Nahum, Yoav

    2008-02-01

    In recent years search engines have become an essential tool in the work of physicians. This article will review advanced search techniques from the world of information specialists, as well as some advanced search engine operators that may help physicians improve their online search capabilities, and maximize the yield of their searches. This article also reviews popular dedicated scientific and biomedical literature search engines.

  3. Clinician search behaviors may be influenced by search engine design.

    PubMed

    Lau, Annie Y S; Coiera, Enrico; Zrimec, Tatjana; Compton, Paul

    2010-06-30

    Searching the Web for documents using information retrieval systems plays an important part in clinicians' practice of evidence-based medicine. While much research focuses on the design of methods to retrieve documents, there has been little examination of the way different search engine capabilities influence clinician search behaviors. Previous studies have shown that use of task-based search engines allows for faster searches with no loss of decision accuracy compared with resource-based engines. We hypothesized that changes in search behaviors may explain these differences. In all, 75 clinicians (44 doctors and 31 clinical nurse consultants) were randomized to use either a resource-based or a task-based version of a clinical information retrieval system to answer questions about 8 clinical scenarios in a controlled setting in a university computer laboratory. Clinicians using the resource-based system could select 1 of 6 resources, such as PubMed; clinicians using the task-based system could select 1 of 6 clinical tasks, such as diagnosis. Clinicians in both systems could reformulate search queries. System logs unobtrusively capturing clinicians' interactions with the systems were coded and analyzed for clinicians' search actions and query reformulation strategies. The most frequent search action of clinicians using the resource-based system was to explore a new resource with the same query, that is, these clinicians exhibited a "breadth-first" search behaviour. Of 1398 search actions, clinicians using the resource-based system conducted 401 (28.7%, 95% confidence interval [CI] 26.37-31.11) in this way. In contrast, the majority of clinicians using the task-based system exhibited a "depth-first" search behavior in which they reformulated query keywords while keeping to the same task profiles. Of 585 search actions conducted by clinicians using the task-based system, 379 (64.8%, 95% CI 60.83-68.55) were conducted in this way. This study provides evidence that

  4. A Search Engine Features Comparison.

    ERIC Educational Resources Information Center

    Vorndran, Gerald

    Until recently, the World Wide Web (WWW) public access search engines have not included many of the advanced commands, options, and features commonly available with the for-profit online database user interfaces, such as DIALOG. This study evaluates the features and characteristics common to both types of search interfaces, examines the Web search…

  5. Next-Gen Search Engines

    ERIC Educational Resources Information Center

    Gupta, Amardeep

    2005-01-01

    Current search engines--even the constantly surprising Google--seem unable to leap the next big barrier in search: the trillions of bytes of dynamically generated data created by individual web sites around the world, or what some researchers call the "deep web." The challenge now is not information overload, but information overlook.…

  6. Where the bugs are: analyzing distributions of bacterial phyla by descriptor keyword search in the nucleotide database.

    PubMed

    Squartini, Andrea

    2011-07-26

    The associations between bacteria and environment underlie their preferential interactions with given physical or chemical conditions. Microbial ecology aims at extracting conserved patterns of occurrence of bacterial taxa in relation to defined habitats and contexts. In the present report the NCBI nucleotide sequence database is used as dataset to extract information relative to the distribution of each of the 24 phyla of the bacteria superkingdom and of the Archaea. Over two and a half million records are filtered in their cross-association with each of 48 sets of keywords, defined to cover natural or artificial habitats, interactions with plant, animal or human hosts, and physical-chemical conditions. The results are processed showing: (a) how the different descriptors enrich or deplete the proportions at which the phyla occur in the total database; (b) in which order of abundance do the different keywords score for each phylum (preferred habitats or conditions), and to which extent are phyla clustered to few descriptors (specific) or spread across many (cosmopolitan); (c) which keywords individuate the communities ranking highest for diversity and evenness. A number of cues emerge from the results, contributing to sharpen the picture on the functional systematic diversity of prokaryotes. Suggestions are given for a future automated service dedicated to refining and updating such kind of analyses via public bioinformatic engines.

  7. Specialized medical search-engines are no better than general search-engines in sourcing consumer information about androgen deficiency.

    PubMed

    Ilic, D; Bessell, T L; Silagy, C A; Green, S

    2003-03-01

    The Internet provides consumers with access to online health information; however, identifying relevant and valid information can be problematic. Our objectives were firstly to investigate the efficiency of search-engines, and then to assess the quality of online information pertaining to androgen deficiency in the ageing male (ADAM). Keyword searches were performed on nine search-engines (four general and five medical) to identify website information regarding ADAM. Search-engine efficiency was compared by percentage of relevant websites obtained via each search-engine. The quality of information published on each website was assessed using the DISCERN rating tool. Of 4927 websites searched, 47 (1.44%) and 10 (0.60%) relevant websites were identified by general and medical search-engines respectively. The overall quality of online information on ADAM was poor. The quality of websites retrieved using medical search-engines did not differ significantly from those retrieved by general search-engines. Despite the poor quality of online information relating to ADAM, it is evident that medical search-engines are no better than general search-engines in sourcing consumer information relevant to ADAM.

  8. Start Your Engines: Surfing with Search Engines for Kids.

    ERIC Educational Resources Information Center

    Byerly, Greg; Brodie, Carolyn S.

    1999-01-01

    Suggests that to be an effective educator and user of the Web it is essential to know the basics about search engines. Presents tips for using search engines. Describes several search engines for children and young adults, as well as some general filtered search engines for children. (AEF)

  9. Staleness Among Web Search Engines.

    ERIC Educational Resources Information Center

    Koehler, Wallace

    1998-01-01

    Describes a study of four major Web search engines that tested for staleness, a condition when a significant number of the hits it returns point to Web pages or server-level domains (SLD) that are no longer viable. Results of tests of URLs with AltaVista, HotBot, InfoSeek, and Open Text are discussed. (Author/LRW)

  10. [Development of domain specific search engines].

    PubMed

    Takai, T; Tokunaga, M; Maeda, K; Kaminuma, T

    2000-01-01

    As cyber space exploding in a pace that nobody has ever imagined, it becomes very important to search cyber space efficiently and effectively. One solution to this problem is search engines. Already a lot of commercial search engines have been put on the market. However these search engines respond with such cumbersome results that domain specific experts can not tolerate. Using a dedicate hardware and a commercial software called OpenText, we have tried to develop several domain specific search engines. These engines are for our institute's Web contents, drugs, chemical safety, endocrine disruptors, and emergent response for chemical hazard. These engines have been on our Web site for testing.

  11. Keyword: Placement

    ERIC Educational Resources Information Center

    Cassuto, Leonard

    2012-01-01

    The practical goal of graduate education is placement of graduates. But what does "placement" mean? Academics use the word without thinking much about it. "Placement" is a great keyword for the graduate-school enterprise. For one thing, its meaning certainly gives a purpose to graduate education. Furthermore, the word is a portal into the way of…

  12. Search Engines on the World Wide Web.

    ERIC Educational Resources Information Center

    Walster, Dian

    1997-01-01

    Discusses search engines and provides methods for determining what resources are searched, the quality of the information, and the algorithms used that will improve the use of search engines on the World Wide Web, online public access catalogs, and electronic encyclopedias. Lists strategies for conducting searches and for learning about the latest…

  13. Children's Search Engines from an Information Search Process Perspective.

    ERIC Educational Resources Information Center

    Broch, Elana

    2000-01-01

    Describes cognitive and affective characteristics of children and teenagers that may affect their Web searching behavior. Reviews literature on children's searching in online public access catalogs (OPACs) and using digital libraries. Profiles two Web search engines. Discusses some of the difficulties children have searching the Web, in the…

  14. Search Engines: Gateway to a New ``Panopticon''?

    NASA Astrophysics Data System (ADS)

    Kosta, Eleni; Kalloniatis, Christos; Mitrou, Lilian; Kavakli, Evangelia

    Nowadays, Internet users are depending on various search engines in order to be able to find requested information on the Web. Although most users feel that they are and remain anonymous when they place their search queries, reality proves otherwise. The increasing importance of search engines for the location of the desired information on the Internet usually leads to considerable inroads into the privacy of users. The scope of this paper is to study the main privacy issues with regard to search engines, such as the anonymisation of search logs and their retention period, and to examine the applicability of the European data protection legislation to non-EU search engine providers. Ixquick, a privacy-friendly meta search engine will be presented as an alternative to privacy intrusive existing practices of search engines.

  15. NASA Indexing Benchmarks: Evaluating Text Search Engines

    NASA Technical Reports Server (NTRS)

    Esler, Sandra L.; Nelson, Michael L.

    1997-01-01

    The current proliferation of on-line information resources underscores the requirement for the ability to index collections of information and search and retrieve them in a convenient manner. This study develops criteria for analytically comparing the index and search engines and presents results for a number of freely available search engines. A product of this research is a toolkit capable of automatically indexing, searching, and extracting performance statistics from each of the focused search engines. This toolkit is highly configurable and has the ability to run these benchmark tests against other engines as well. Results demonstrate that the tested search engines can be grouped into two levels. Level one engines are efficient on small to medium sized data collections, but show weaknesses when used for collections 100MB or larger. Level two search engines are recommended for data collections up to and beyond 100MB.

  16. Teen smoking cessation help via the Internet: a survey of search engines.

    PubMed

    Edwards, Christine C; Elliott, Sean P; Conway, Terry L; Woodruff, Susan I

    2003-07-01

    The objective of this study was to assess Web sites related to teen smoking cessation on the Internet. Seven Internet search engines were searched using the keywords teen quit smoking. The top 20 hits from each search engine were reviewed and categorized. The keywords teen quit smoking produced between 35 and 400,000 hits depending on the search engine. Of 140 potential hits, 62% were active, unique sites; 85% were listed by only one search engine; and 40% focused on cessation. Findings suggest that legitimate on-line smoking cessation help for teens is constrained by search engine choice and the amount of time teens spend looking through potential sites. Resource listings should be updated regularly. Smoking cessation Web sites need to be picked up on multiple search engine searches. Further evaluation of smoking cessation Web sites need to be conducted to identify the most effective help for teens.

  17. New generation of the multimedia search engines

    NASA Astrophysics Data System (ADS)

    Mijes Cruz, Mario Humberto; Soto Aldaco, Andrea; Maldonado Cano, Luis Alejandro; López Rodríguez, Mario; Rodríguez Vázqueza, Manuel Antonio; Amaya Reyes, Laura Mariel; Cano Martínez, Elizabeth; Pérez Rosas, Osvaldo Gerardo; Rodríguez Espejo, Luis; Flores Secundino, Jesús Abimelek; Rivera Martínez, José Luis; García Vázquez, Mireya Saraí; Zamudio Fuentes, Luis Miguel; Sánchez Valenzuela, Juan Carlos; Montoya Obeso, Abraham; Ramírez Acosta, Alejandro Álvaro

    2016-09-01

    Current search engines are based upon search methods that involve the combination of words (text-based search); which has been efficient until now. However, the Internet's growing demand indicates that there's more diversity on it with each passing day. Text-based searches are becoming limited, as most of the information on the Internet can be found in different types of content denominated multimedia content (images, audio files, video files). Indeed, what needs to be improved in current search engines is: search content, and precision; as well as an accurate display of expected search results by the user. Any search can be more precise if it uses more text parameters, but it doesn't help improve the content or speed of the search itself. One solution is to improve them through the characterization of the content for the search in multimedia files. In this article, an analysis of the new generation multimedia search engines is presented, focusing the needs according to new technologies. Multimedia content has become a central part of the flow of information in our daily life. This reflects the necessity of having multimedia search engines, as well as knowing the real tasks that it must comply. Through this analysis, it is shown that there are not many search engines that can perform content searches. The area of research of multimedia search engines of new generation is a multidisciplinary area that's in constant growth, generating tools that satisfy the different needs of new generation systems.

  18. Internet Search Engines - Fluctuations in Document Accessibility.

    ERIC Educational Resources Information Center

    Mettrop, Wouter; Nieuwenhuysen, Paul

    2001-01-01

    Reports an empirical investigation of the consistency of retrieval through Internet search engines. Evaluates 13 engines: AltaVista, EuroFerret, Excite, HotBot, InfoSeek, Lycos, MSN, NorthernLight, Snap, WebCrawler, and three national Dutch engines: Ilse, Search.nl and Vindex. The focus is on a characteristic related to size: the degree of…

  19. Search Engine Liability for Copyright Infringement

    NASA Astrophysics Data System (ADS)

    Fitzgerald, B.; O'Brien, D.; Fitzgerald, A.

    The chapter provides a broad overview to the topic of search engine liability for copyright infringement. In doing so, the chapter examines some of the key copyright law principles and their application to search engines. The chapter also provides a discussion of some of the most important cases to be decided within the courts of the United States, Australia, China and Europe regarding the liability of search engines for copyright infringement. Finally, the chapter will conclude with some thoughts for reform, including how copyright law can be amended in order to accommodate and realise the great informative power which search engines have to offer society.

  20. Database Search Engines: Paradigms, Challenges and Solutions.

    PubMed

    Verheggen, Kenneth; Martens, Lennart; Berven, Frode S; Barsnes, Harald; Vaudel, Marc

    2016-01-01

    The first step in identifying proteins from mass spectrometry based shotgun proteomics data is to infer peptides from tandem mass spectra, a task generally achieved using database search engines. In this chapter, the basic principles of database search engines are introduced with a focus on open source software, and the use of database search engines is demonstrated using the freely available SearchGUI interface. This chapter also discusses how to tackle general issues related to sequence database searching and shows how to minimize their impact.

  1. Health literacy and usability of clinical trial search engines.

    PubMed

    Utami, Dina; Bickmore, Timothy W; Barry, Barbara; Paasche-Orlow, Michael K

    2014-01-01

    Several web-based search engines have been developed to assist individuals to find clinical trials for which they may be interested in volunteering. However, these search engines may be difficult for individuals with low health and computer literacy to navigate. The authors present findings from a usability evaluation of clinical trial search tools with 41 participants across the health and computer literacy spectrum. The study consisted of 3 parts: (a) a usability study of an existing web-based clinical trial search tool; (b) a usability study of a keyword-based clinical trial search tool; and (c) an exploratory study investigating users' information needs when deciding among 2 or more candidate clinical trials. From the first 2 studies, the authors found that users with low health literacy have difficulty forming queries using keywords and have significantly more difficulty using a standard web-based clinical trial search tool compared with users with adequate health literacy. From the third study, the authors identified the search factors most important to individuals searching for clinical trials and how these varied by health literacy level.

  2. A Method for Search Engine Selection using Thesaurus for Selective Meta-Search Engine

    NASA Astrophysics Data System (ADS)

    Goto, Shoji; Ozono, Tadachika; Shintani, Toramatsu

    In this paper, we propose a new method for selecting search engines on WWW for selective meta-search engine. In selective meta-search engine, a method is needed that would enable selecting appropriate search engines for users' queries. Most existing methods use statistical data such as document frequency. These methods may select inappropriate search engines if a query contains polysemous words. In this paper, we describe an search engine selection method based on thesaurus. In our method, a thesaurus is constructed from documents in a search engine and is used as a source description of the search engine. The form of a particular thesaurus depends on the documents used for its construction. Our method enables search engine selection by considering relationship between terms and overcomes the problems caused by polysemous words. Further, our method does not have a centralized broker maintaining data, such as document frequency for all search engines. As a result, it is easy to add a new search engine, and meta-search engines become more scalable with our method compared to other existing methods.

  3. Human Flesh Search Engine and Online Privacy.

    PubMed

    Zhang, Yang; Gao, Hong

    2016-04-01

    Human flesh search engine can be a double-edged sword, bringing convenience on the one hand and leading to infringement of personal privacy on the other hand. This paper discusses the ethical problems brought about by the human flesh search engine, as well as possible solutions.

  4. Smart internet search engine through 6W

    NASA Astrophysics Data System (ADS)

    Goehler, Stephen; Cader, Masud; Szu, Harold

    2006-04-01

    Current Internet search engine technology is limited in its ability to display necessary relevant information to the user. Yahoo, Google and Microsoft use lookup tables or indexes which limits the ability of users to find their desired information. While these companies have improved their results over the years by enhancing their existing technology and algorithms with specialized heuristics such as PageRank, there is a need for a next generation smart search engine that can effectively interpret the relevance of user searches and provide the actual information requested. This paper explores whether a smarter Internet search engine can effectively fulfill a user's needs through the use of 6W representations.

  5. Modelling and Simulation of Search Engine

    NASA Astrophysics Data System (ADS)

    Nasution, Mahyuddin K. M.

    2017-01-01

    The best tool currently used to access information is a search engine. Meanwhile, the information space has its own behaviour. Systematically, an information space needs to be familiarized with mathematics so easily we identify the characteristics associated with it. This paper reveal some characteristics of search engine based on a model of document collection, which are then estimated the impact on the feasibility of information. We reveal some of characteristics of search engine on the lemma and theorem about singleton and doubleton, then computes statistically characteristic as simulating the possibility of using search engine. In this case, Google and Yahoo. There are differences in the behaviour of both search engines, although in theory based on the concept of documents collection.

  6. EMERSE: The Electronic Medical Record Search Engine

    PubMed Central

    Hanauer, David A.

    2006-01-01

    EMERSE (The Electronic Medical Record Search Engine) is an intuitive, powerful search engine for free-text documents in the electronic medical record. It offers multiple options for creating complex search queries yet has an interface that is easy enough to be used by those with minimal computer experience. EMERSE is ideal for retrospective chart reviews and data abstraction and may have potential for clinical care as well.

  7. EMERSE: The Electronic Medical Record Search Engine

    PubMed Central

    Hanauer, David A.

    2006-01-01

    EMERSE (The Electronic Medical Record Search Engine) is an intuitive, powerful search engine for free-text documents in the electronic medical record. It offers multiple options for creating complex search queries yet has an interface that is easy enough to be used by those with minimal computer experience. EMERSE is ideal for retrospective chart reviews and data abstraction and may have potential for clinical care as well. PMID:17238560

  8. Search Engines for Tomorrow's Scholars

    ERIC Educational Resources Information Center

    Fagan, Jody Condit

    2011-01-01

    Today's scholars face an outstanding array of choices when choosing search tools: Google Scholar, discipline-specific abstracts and index databases, library discovery tools, and more recently, Microsoft's re-launch of their academic search tool, now dubbed Microsoft Academic Search. What are these tools' strengths for the emerging needs of…

  9. [Pruritus in Germany-a Google search engine analysis].

    PubMed

    Zink, A; Rüth, M; Schuster, B; Darsow, U; Biedermann, T; Ständer, S

    2018-06-06

    Because affected persons often do not visit a doctor, the prevalence of chronic and acute pruritus in the general population is difficult to determine. The aim of this study is to estimate the frequency and the most common locations of pruritus in German internet users, who-with 62.4 million persons-represent a large majority of the German population, by analysing the Google search volume. Relevant keywords for the subject "pruritus" were identified and analysed using the Google AdWords Keyword Planner. The assessment period was January 2015 to December 2016. In total the Google AdWords Keyword Planner identified 701 keywords for the topic "Juckreiz" (German lay word for pruritus), resulting in 7,531,890 pruritus-related Google searches during the assessment period. Most common search terms were the German lay term for atopic eczema ("Neurodermitis", 23.7%), the German lay term for psoriasis ("Schuppenflechte", 17.8%) and "psoriasis" (13%). The German lay term for pruritus ("Juckreiz") was only the sixth most searched term (3%). Most searches (72%) focused on influencing factors for pruritus, especially on skin diseases and skin conditions. The most commonly searched location was pruritus on the whole body, followed by anal pruritus. Analysis of the temporal course showed a higher monthly search volume during winter. With its unconventional methodology, a Google search engine analysis, this study allows a rough estimation of the medical need of pruritus in the German general population, which seems to be higher than expected. Especially pruritus in the anal area was identified as an unmet medical need.

  10. Web Search Studies: Multidisciplinary Perspectives on Web Search Engines

    NASA Astrophysics Data System (ADS)

    Zimmer, Michael

    Perhaps the most significant tool of our internet age is the web search engine, providing a powerful interface for accessing the vast amount of information available on the world wide web and beyond. While still in its infancy compared to the knowledge tools that precede it - such as the dictionary or encyclopedia - the impact of web search engines on society and culture has already received considerable attention from a variety of academic disciplines and perspectives. This article aims to organize a meta-discipline of “web search studies,” centered around a nucleus of major research on web search engines from five key perspectives: technical foundations and evaluations; transaction log analyses; user studies; political, ethical, and cultural critiques; and legal and policy analyses.

  11. An ontology-based search engine for protein-protein interactions.

    PubMed

    Park, Byungkyu; Han, Kyungsook

    2010-01-18

    Keyword matching or ID matching is the most common searching method in a large database of protein-protein interactions. They are purely syntactic methods, and retrieve the records in the database that contain a keyword or ID specified in a query. Such syntactic search methods often retrieve too few search results or no results despite many potential matches present in the database. We have developed a new method for representing protein-protein interactions and the Gene Ontology (GO) using modified Gödel numbers. This representation is hidden from users but enables a search engine using the representation to efficiently search protein-protein interactions in a biologically meaningful way. Given a query protein with optional search conditions expressed in one or more GO terms, the search engine finds all the interaction partners of the query protein by unique prime factorization of the modified Gödel numbers representing the query protein and the search conditions. Representing the biological relations of proteins and their GO annotations by modified Gödel numbers makes a search engine efficiently find all protein-protein interactions by prime factorization of the numbers. Keyword matching or ID matching search methods often miss the interactions involving a protein that has no explicit annotations matching the search condition, but our search engine retrieves such interactions as well if they satisfy the search condition with a more specific term in the ontology.

  12. An ontology-based search engine for protein-protein interactions

    PubMed Central

    2010-01-01

    Background Keyword matching or ID matching is the most common searching method in a large database of protein-protein interactions. They are purely syntactic methods, and retrieve the records in the database that contain a keyword or ID specified in a query. Such syntactic search methods often retrieve too few search results or no results despite many potential matches present in the database. Results We have developed a new method for representing protein-protein interactions and the Gene Ontology (GO) using modified Gödel numbers. This representation is hidden from users but enables a search engine using the representation to efficiently search protein-protein interactions in a biologically meaningful way. Given a query protein with optional search conditions expressed in one or more GO terms, the search engine finds all the interaction partners of the query protein by unique prime factorization of the modified Gödel numbers representing the query protein and the search conditions. Conclusion Representing the biological relations of proteins and their GO annotations by modified Gödel numbers makes a search engine efficiently find all protein-protein interactions by prime factorization of the numbers. Keyword matching or ID matching search methods often miss the interactions involving a protein that has no explicit annotations matching the search condition, but our search engine retrieves such interactions as well if they satisfy the search condition with a more specific term in the ontology. PMID:20122195

  13. Semantic interpretation of search engine resultant

    NASA Astrophysics Data System (ADS)

    Nasution, M. K. M.

    2018-01-01

    In semantic, logical language can be interpreted in various forms, but the certainty of meaning is included in the uncertainty, which directly always influences the role of technology. One results of this uncertainty applies to search engines as user interfaces with information spaces such as the Web. Therefore, the behaviour of search engine results should be interpreted with certainty through semantic formulation as interpretation. Behaviour formulation shows there are various interpretations that can be done semantically either temporary, inclusion, or repeat.

  14. Google and Women's Health-Related Issues: What Does the Search Engine Data Reveal?

    PubMed

    Baazeem, Mazin; Abenhaim, Haim

    2014-01-01

    Identifying the gaps in public knowledge of women's health related issues has always been difficult. With the increasing number of Internet users in the United States, we sought to use the Internet as a tool to help us identify such gaps and to estimate women's most prevalent health concerns by examining commonly searched health-related keywords in Google search engine. We collected a large pool of possible search keywords from two independent practicing obstetrician/gynecologists and classified them into five main categories (obstetrics, gynecology, infertility, urogynecology/menopause and oncology), and measured the monthly average search volume within the United States for each keyword with all its possible combinations using Google AdWords tool. We found that pregnancy related keywords were less frequently searched in general compared to other categories with an average of 145,400 hits per month for the top twenty keywords. Among the most common pregnancy-related keywords was "pregnancy and sex' while pregnancy-related diseases were uncommonly searched. HPV alone was searched 305,400 times per month. Of the cancers affecting women, breast cancer was the most commonly searched with an average of 247,190 times per month, followed by cervical cancer then ovarian cancer. The commonly searched keywords are often issues that are not discussed in our daily practice as well as in public health messages. The search volume is relatively related to disease prevalence with the exception of ovarian cancer which could signify a public fear.

  15. Combining Search Engines for Comparative Proteomics

    PubMed Central

    Tabb, David

    2012-01-01

    Many proteomics laboratories have found spectral counting to be an ideal way to recognize biomarkers that differentiate cohorts of samples. This approach assumes that proteins that differ in quantity between samples will generate different numbers of identifiable tandem mass spectra. Increasingly, researchers are employing multiple search engines to maximize the identifications generated from data collections. This talk evaluates four strategies to combine information from multiple search engines in comparative proteomics. The “Count Sum” model pools the spectra across search engines. The “Vote Counting” model combines the judgments from each search engine by protein. Two other models employ parametric and non-parametric analyses of protein-specific p-values from different search engines. We evaluated the four strategies in two different data sets. The ABRF iPRG 2009 study generated five LC-MS/MS analyses of “red” E. coli and five analyses of “yellow” E. coli. NCI CPTAC Study 6 generated five concentrations of Sigma UPS1 spiked into a yeast background. All data were identified with X!Tandem, Sequest, MyriMatch, and TagRecon. For both sample types, “Vote Counting” appeared to manage the diverse identification sets most effectively, yielding heightened discrimination as more search engines were added.

  16. Document Clustering Approach for Meta Search Engine

    NASA Astrophysics Data System (ADS)

    Kumar, Naresh, Dr.

    2017-08-01

    The size of WWW is growing exponentially with ever change in technology. This results in huge amount of information with long list of URLs. Manually it is not possible to visit each page individually. So, if the page ranking algorithms are used properly then user search space can be restricted up to some pages of searched results. But available literatures show that no single search system can provide qualitative results from all the domains. This paper provides solution to this problem by introducing a new meta search engine that determine the relevancy of query corresponding to web page and cluster the results accordingly. The proposed approach reduces the user efforts, improves the quality of results and performance of the meta search engine.

  17. Using Internet search engines to obtain medical information: a comparative study.

    PubMed

    Wang, Liupu; Wang, Juexin; Wang, Michael; Li, Yong; Liang, Yanchun; Xu, Dong

    2012-05-16

    The Internet has become one of the most important means to obtain health and medical information. It is often the first step in checking for basic information about a disease and its treatment. The search results are often useful to general users. Various search engines such as Google, Yahoo!, Bing, and Ask.com can play an important role in obtaining medical information for both medical professionals and lay people. However, the usability and effectiveness of various search engines for medical information have not been comprehensively compared and evaluated. To compare major Internet search engines in their usability of obtaining medical and health information. We applied usability testing as a software engineering technique and a standard industry practice to compare the four major search engines (Google, Yahoo!, Bing, and Ask.com) in obtaining health and medical information. For this purpose, we searched the keyword breast cancer in Google, Yahoo!, Bing, and Ask.com and saved the results of the top 200 links from each search engine. We combined nonredundant links from the four search engines and gave them to volunteer users in an alphabetical order. The volunteer users evaluated the websites and scored each website from 0 to 10 (lowest to highest) based on the usefulness of the content relevant to breast cancer. A medical expert identified six well-known websites related to breast cancer in advance as standards. We also used five keywords associated with breast cancer defined in the latest release of Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) and analyzed their occurrence in the websites. Each search engine provided rich information related to breast cancer in the search results. All six standard websites were among the top 30 in search results of all four search engines. Google had the best search validity (in terms of whether a website could be opened), followed by Bing, Ask.com, and Yahoo!. The search results highly overlapped between the

  18. Adding a visualization feature to web search engines: it's time.

    PubMed

    Wong, Pak Chung

    2008-01-01

    It's widely recognized that all Web search engines today are almost identical in presentation layout and behavior. In fact, the same presentation approach has been applied to depicting search engine results pages (SERPs) since the first Web search engine launched in 1993. In this Visualization Viewpoints article, I propose to add a visualization feature to Web search engines and suggest that the new addition can improve search engines' performance and capabilities, which in turn lead to better Web search technology.

  19. In Search of Search Engine Marketing Strategy Amongst SME's in Ireland

    NASA Astrophysics Data System (ADS)

    Barry, Chris; Charleton, Debbie

    Researchers have identified the Web as a searchers first port of call for locating information. Search Engine Marketing (SEM) strategies have been noted as a key consideration when developing, maintaining and managing Websites. A study presented here of SEM practices of Irish small to medium enterprises (SMEs) reveals they plan to spend more resources on SEM in the future. Most firms utilize an informal SEM strategy, where Website optimization is perceived most effective in attracting traffic. Respondents cite the use of ‘keywords in title and description tags’ as the most used SEM technique, followed by the use of ‘keywords throughout the whole Website’; while ‘Pay for Placement’ was most widely used Paid Search technique. In concurrence with the literature, measuring SEM performance remains a significant challenge with many firms unsure if they measure it effectively. An encouraging finding is that Irish SMEs adopt a positive ethical posture when undertaking SEM.

  20. Evidence-based Medicine Search: a customizable federated search engine.

    PubMed

    Bracke, Paul J; Howse, David K; Keim, Samuel M

    2008-04-01

    This paper reports on the development of a tool by the Arizona Health Sciences Library (AHSL) for searching clinical evidence that can be customized for different user groups. The AHSL provides services to the University of Arizona's (UA's) health sciences programs and to the University Medical Center. Librarians at AHSL collaborated with UA College of Medicine faculty to create an innovative search engine, Evidence-based Medicine (EBM) Search, that provides users with a simple search interface to EBM resources and presents results organized according to an evidence pyramid. EBM Search was developed with a web-based configuration component that allows the tool to be customized for different specialties. Informal and anecdotal feedback from physicians indicates that EBM Search is a useful tool with potential in teaching evidence-based decision making. While formal evaluation is still being planned, a tool such as EBM Search, which can be configured for specific user populations, may help lower barriers to information resources in an academic health sciences center.

  1. Evidence-based Medicine Search: a customizable federated search engine

    PubMed Central

    Bracke, Paul J.; Howse, David K.; Keim, Samuel M.

    2008-01-01

    Purpose: This paper reports on the development of a tool by the Arizona Health Sciences Library (AHSL) for searching clinical evidence that can be customized for different user groups. Brief Description: The AHSL provides services to the University of Arizona's (UA's) health sciences programs and to the University Medical Center. Librarians at AHSL collaborated with UA College of Medicine faculty to create an innovative search engine, Evidence-based Medicine (EBM) Search, that provides users with a simple search interface to EBM resources and presents results organized according to an evidence pyramid. EBM Search was developed with a web-based configuration component that allows the tool to be customized for different specialties. Outcomes/Conclusion: Informal and anecdotal feedback from physicians indicates that EBM Search is a useful tool with potential in teaching evidence-based decision making. While formal evaluation is still being planned, a tool such as EBM Search, which can be configured for specific user populations, may help lower barriers to information resources in an academic health sciences center. PMID:18379665

  2. Evaluating a Federated Medical Search Engine

    PubMed Central

    Belden, J.; Williams, J.; Richardson, B.; Schuster, K.

    2014-01-01

    Summary Background Federated medical search engines are health information systems that provide a single access point to different types of information. Their efficiency as clinical decision support tools has been demonstrated through numerous evaluations. Despite their rigor, very few of these studies report holistic evaluations of medical search engines and even fewer base their evaluations on existing evaluation frameworks. Objectives To evaluate a federated medical search engine, MedSocket, for its potential net benefits in an established clinical setting. Methods This study applied the Human, Organization, and Technology (HOT-fit) evaluation framework in order to evaluate MedSocket. The hierarchical structure of the HOT-factors allowed for identification of a combination of efficiency metrics. Human fit was evaluated through user satisfaction and patterns of system use; technology fit was evaluated through the measurements of time-on-task and the accuracy of the found answers; and organization fit was evaluated from the perspective of system fit to the existing organizational structure. Results Evaluations produced mixed results and suggested several opportunities for system improvement. On average, participants were satisfied with MedSocket searches and confident in the accuracy of retrieved answers. However, MedSocket did not meet participants’ expectations in terms of download speed, access to information, and relevance of the search results. These mixed results made it necessary to conclude that in the case of MedSocket, technology fit had a significant influence on the human and organization fit. Hence, improving technological capabilities of the system is critical before its net benefits can become noticeable. Conclusions The HOT-fit evaluation framework was instrumental in tailoring the methodology for conducting a comprehensive evaluation of the search engine. Such multidimensional evaluation of the search engine resulted in recommendations for

  3. Quality Dimensions of Internet Search Engines.

    ERIC Educational Resources Information Center

    Xie, M.; Wang, H.; Goh, T. N.

    1998-01-01

    Reviews commonly used search engines (AltaVista, Excite, infoseek, Lycos, HotBot, WebCrawler), focusing on existing comparative studies; considers quality dimensions from the customer's point of view based on a SERVQUAL framework; and groups these quality expectations in five dimensions: tangibles, reliability, responsiveness, assurance, and…

  4. Research Trends with Cross Tabulation Search Engine

    ERIC Educational Resources Information Center

    Yin, Chengjiu; Hirokawa, Sachio; Yau, Jane Yin-Kim; Hashimoto, Kiyota; Tabata, Yoshiyuki; Nakatoh, Tetsuya

    2013-01-01

    To help researchers in building a knowledge foundation of their research fields which could be a time-consuming process, the authors have developed a Cross Tabulation Search Engine (CTSE). Its purpose is to assist researchers in 1) conducting research surveys, 2) efficiently and effectively retrieving information (such as important researchers,…

  5. An Exploratory Survey of Student Perspectives Regarding Search Engines

    ERIC Educational Resources Information Center

    Alshare, Khaled; Miller, Don; Wenger, James

    2005-01-01

    This study explored college students' perceptions regarding their use of search engines. The main objective was to determine how frequently students used various search engines, whether advanced search features were used, and how many search engines were used. Various factors that might influence student responses were examined. Results showed…

  6. The Use of Web Search Engines in Information Science Research.

    ERIC Educational Resources Information Center

    Bar-Ilan, Judit

    2004-01-01

    Reviews the literature on the use of Web search engines in information science research, including: ways users interact with Web search engines; social aspects of searching; structure and dynamic nature of the Web; link analysis; other bibliometric applications; characterizing information on the Web; search engine evaluation and improvement; and…

  7. Sexual information seeking on web search engines.

    PubMed

    Spink, Amanda; Koricich, Andrew; Jansen, B J; Cole, Charles

    2004-02-01

    Sexual information seeking is an important element within human information behavior. Seeking sexually related information on the Internet takes many forms and channels, including chat rooms discussions, accessing Websites or searching Web search engines for sexual materials. The study of sexual Web queries provides insight into sexually-related information-seeking behavior, of value to Web users and providers alike. We qualitatively analyzed queries from logs of 1,025,910 Alta Vista and AlltheWeb.com Web user queries from 2001. We compared the differences in sexually-related Web searching between Alta Vista and AlltheWeb.com users. Differences were found in session duration, query outcomes, and search term choices. Implications of the findings for sexual information seeking are discussed.

  8. Chemical Information in Scirus and BASE (Bielefeld Academic Search Engine)

    ERIC Educational Resources Information Center

    Bendig, Regina B.

    2009-01-01

    The author sought to determine to what extent the two search engines, Scirus and BASE (Bielefeld Academic Search Engines), would be useful to first-year university students as the first point of searching for chemical information. Five topics were searched and the first ten records of each search result were evaluated with regard to the type of…

  9. BioSearch: a semantic search engine for Bio2RDF

    PubMed Central

    Qiu, Honglei; Huang, Jiacheng

    2017-01-01

    Abstract Biomedical data are growing at an incredible pace and require substantial expertise to organize data in a manner that makes them easily findable, accessible, interoperable and reusable. Massive effort has been devoted to using Semantic Web standards and technologies to create a network of Linked Data for the life sciences, among others. However, while these data are accessible through programmatic means, effective user interfaces for non-experts to SPARQL endpoints are few and far between. Contributing to user frustrations is that data are not necessarily described using common vocabularies, thereby making it difficult to aggregate results, especially when distributed across multiple SPARQL endpoints. We propose BioSearch — a semantic search engine that uses ontologies to enhance federated query construction and organize search results. BioSearch also features a simplified query interface that allows users to optionally filter their keywords according to classes, properties and datasets. User evaluation demonstrated that BioSearch is more effective and usable than two state of the art search and browsing solutions. Database URL: http://ws.nju.edu.cn/biosearch/ PMID:29220451

  10. Google and Women’s Health-Related Issues: What Does the Search Engine Data Reveal?

    PubMed Central

    Baazeem, Mazin

    2014-01-01

    Objectives Identifying the gaps in public knowledge of women’s health related issues has always been difficult. With the increasing number of Internet users in the United States, we sought to use the Internet as a tool to help us identify such gaps and to estimate women’s most prevalent health concerns by examining commonly searched health-related keywords in Google search engine. Methods We collected a large pool of possible search keywords from two independent practicing obstetrician/gynecologists and classified them into five main categories (obstetrics, gynecology, infertility, urogynecology/menopause and oncology), and measured the monthly average search volume within the United States for each keyword with all its possible combinations using Google AdWords tool. Results We found that pregnancy related keywords were less frequently searched in general compared to other categories with an average of 145,400 hits per month for the top twenty keywords. Among the most common pregnancy-related keywords was “pregnancy and sex’ while pregnancy-related diseases were uncommonly searched. HPV alone was searched 305,400 times per month. Of the cancers affecting women, breast cancer was the most commonly searched with an average of 247,190 times per month, followed by cervical cancer then ovarian cancer. Conclusion The commonly searched keywords are often issues that are not discussed in our daily practice as well as in public health messages. The search volume is relatively related to disease prevalence with the exception of ovarian cancer which could signify a public fear. PMID:25422723

  11. HST Keyword Dictionary

    NASA Astrophysics Data System (ADS)

    Swade, D. A.; Gardner, L.; Hopkins, E.; Kimball, T.; Lezon, K.; Rose, J.; Shiao, B.

    STScI has undertaken a project to place all HST keyword information in one source, the keyword database, and to provide a mechanism for making this keyword information accessible to all HST users, the keyword dictionary, which is a WWW interface to the keyword database.

  12. Locality in Search Engine Queries and Its Implications for Caching

    DTIC Science & Technology

    2001-05-01

    in the question of whether caching might be effective for search engines as well. They study two real search engine traces by examining query...locality and its implications for caching. The two search engines studied are Vivisimo and Excite. Their trace analysis results show that queries have

  13. Taming the Information Jungle with WWW Search Engines.

    ERIC Educational Resources Information Center

    Repman, Judi; And Others

    1997-01-01

    Because searching the Web with different engines often produces different results, the best strategy is to learn how each engine works. Discusses comparing search engines; qualities to consider (ease of use, relevance of hits, and speed); and six of the most popular search tools (Yahoo, Magellan. InfoSeek, Alta Vista, Lycos, and Excite). Lists…

  14. Research on Agriculture Domain Meta-Search Engine System

    NASA Astrophysics Data System (ADS)

    Xie, Nengfu; Wang, Wensheng

    The rapid growth of agriculture web information brings a fact that search engine can not return a satisfied result for users’ queries. In this paper, we propose an agriculture domain search engine system, called ADSE, that can obtains results by an advance interface to several searches and aggregates them. We also discuss two key technologies: agriculture information determination and engine.

  15. [Biomedical information on the internet using search engines. A one-year trial].

    PubMed

    Corrao, Salvatore; Leone, Francesco; Arnone, Sabrina

    2004-01-01

    The internet is a communication medium and content distributor that provide information in the general sense but it could be of great utility regarding as the search and retrieval of biomedical information. Search engines represent a great deal to rapidly find information on the net. However, we do not know whether general search engines and meta-search ones are reliable in order to find useful and validated biomedical information. The aim of our study was to verify the reproducibility of a search by key-words (pediatric or evidence) using 9 international search engines and 1 meta-search engine at the baseline and after a one year period. We analysed the first 20 citations as output of each searching. We evaluated the formal quality of Web-sites and their domain extensions. Moreover, we compared the output of each search at the start of this study and after a one year period and we considered as a criterion of reliability the number of Web-sites cited again. We found some interesting results that are reported throughout the text. Our findings point out an extreme dynamicity of the information on the Web and, for this reason, we advice a great caution when someone want to use search and meta-search engines as a tool for searching and retrieve reliable biomedical information. On the other hand, some search and meta-search engines could be very useful as a first step searching for defining better a search and, moreover, for finding institutional Web-sites too. This paper allows to know a more conscious approach to the internet biomedical information universe.

  16. Predicting user click behaviour in search engine advertisements

    NASA Astrophysics Data System (ADS)

    Daryaie Zanjani, Mohammad; Khadivi, Shahram

    2015-10-01

    According to the specific requirements and interests of users, search engines select and display advertisements that match user needs and have higher probability of attracting users' attention based on their previous search history. New objects such as user, advertisement or query cause a deterioration of precision in targeted advertising due to their lack of history. This article surveys this challenge. In the case of new objects, we first extract similar observed objects to the new object and then we use their history as the history of new object. Similarity between objects is measured based on correlation, which is a relation between user and advertisement when the advertisement is displayed to the user. This method is used for all objects, so it has helped us to accurately select relevant advertisements for users' queries. In our proposed model, we assume that similar users behave in a similar manner. We find that users with few queries are similar to new users. We will show that correlation between users and advertisements' keywords is high. Thus, users who pay attention to advertisements' keywords, click similar advertisements. In addition, users who pay attention to specific brand names might have similar behaviours too.

  17. A Literature Review of Indexing and Searching Techniques Implementation in Educational Search Engines

    ERIC Educational Resources Information Center

    El Guemmat, Kamal; Ouahabi, Sara

    2018-01-01

    The objective of this article is to analyze the searching and indexing techniques of educational search engines' implementation while treating future challenges. Educational search engines could greatly help in the effectiveness of e-learning if used correctly. However, these engines have several gaps which influence the performance of e-learning…

  18. Semantic Clustering of Search Engine Results

    PubMed Central

    Soliman, Sara Saad; El-Sayed, Maged F.; Hassan, Yasser F.

    2015-01-01

    This paper presents a novel approach for search engine results clustering that relies on the semantics of the retrieved documents rather than the terms in those documents. The proposed approach takes into consideration both lexical and semantics similarities among documents and applies activation spreading technique in order to generate semantically meaningful clusters. This approach allows documents that are semantically similar to be clustered together rather than clustering documents based on similar terms. A prototype is implemented and several experiments are conducted to test the prospered solution. The result of the experiment confirmed that the proposed solution achieves remarkable results in terms of precision. PMID:26933673

  19. Search engines, news wires and digital epidemiology: Presumptions and facts.

    PubMed

    Kaveh-Yazdy, Fatemeh; Zareh-Bidoki, Ali-Mohammad

    2018-07-01

    Digital epidemiology tries to identify diseases dynamics and spread behaviors using digital traces collected via search engines logs and social media posts. However, the impacts of news on information-seeking behaviors have been remained unknown. Data employed in this research provided from two sources, (1) Parsijoo search engine query logs of 48 months, and (2) a set of documents of 28 months of Parsijoo's news service. Two classes of topics, i.e. macro-topics and micro-topics were selected to be tracked in query logs and news. Keywords of the macro-topics were automatically generated using web provided resources and exceeded 10k. Keyword set of micro-topics were limited to a numerable list including terms related to diseases and health-related activities. The tests are established in the form of three studies. Study A includes temporal analyses of 7 macro-topics in query logs. Study B considers analyzing seasonality of searching patterns of 9 micro-topics, and Study C assesses the impact of news media coverage on users' health-related information-seeking behaviors. Study A showed that the hourly distribution of various macro-topics followed the changes in social activity level. Conversely, the interestingness of macro-topics did not follow the regulation of topic distributions. Among macro-topics, "Pharmacotherapy" has highest interestingness level and wider time-window of popularity. In Study B, seasonality of a limited number of diseases and health-related activities were analyzed. Trends of infectious diseases, such as flu, mumps and chicken pox were seasonal. Due to seasonality of most of diseases covered in national vaccination plans, the trend belonging to "Immunization and Vaccination" was seasonal, as well. Cancer awareness events caused peaks in search trends of "Cancer" and "Screening" micro-topics in specific days of each year that mimic repeated patterns which may mistakenly be identified as seasonality. In study C, we assessed the co-integration and

  20. The effective use of search engines on the Internet.

    PubMed

    Younger, P

    This article explains how nurses can get the most out of researching information on the internet using the search engine Google. It also explores some of the other types of search engines that are available. Internet users are shown how to find text, images and reports and search within sites. Copyright issues are also discussed.

  1. SpEnD: Linked Data SPARQL Endpoints Discovery Using Search Engines

    NASA Astrophysics Data System (ADS)

    Yumusak, Semih; Dogdu, Erdogan; Kodaz, Halife; Kamilaris, Andreas; Vandenbussche, Pierre-Yves

    In this study, a novel metacrawling method is proposed for discovering and monitoring linked data sources on the Web. We implemented the method in a prototype system, named SPARQL Endpoints Discovery (SpEnD). SpEnD starts with a "search keyword" discovery process for finding relevant keywords for the linked data domain and specifically SPARQL endpoints. Then, these search keywords are utilized to find linked data sources via popular search engines (Google, Bing, Yahoo, Yandex). By using this method, most of the currently listed SPARQL endpoints in existing endpoint repositories, as well as a significant number of new SPARQL endpoints, have been discovered. Finally, we have developed a new SPARQL endpoint crawler (SpEC) for crawling and link analysis.

  2. Comparing Web search engine performance in searching consumer health information: evaluation and recommendations.

    PubMed Central

    Wu, G; Li, J

    1999-01-01

    Identifying and accessing reliable, relevant consumer health information rapidly on the Internet may challenge the health sciences librarian and layperson alike. In this study, seven search engines are compared using representative consumer health topics for their content relevancy, system features, and attributes. The paper discusses evaluation criteria; systematically compares relevant results; analyzes performance in terms of the strengths and weaknesses of the search engines; and illustrates effective search engine selection, search formulation, and strategies. PMID:10550031

  3. Using Internet Search Engines to Obtain Medical Information: A Comparative Study

    PubMed Central

    Wang, Liupu; Wang, Juexin; Wang, Michael; Li, Yong; Liang, Yanchun

    2012-01-01

    Background The Internet has become one of the most important means to obtain health and medical information. It is often the first step in checking for basic information about a disease and its treatment. The search results are often useful to general users. Various search engines such as Google, Yahoo!, Bing, and Ask.com can play an important role in obtaining medical information for both medical professionals and lay people. However, the usability and effectiveness of various search engines for medical information have not been comprehensively compared and evaluated. Objective To compare major Internet search engines in their usability of obtaining medical and health information. Methods We applied usability testing as a software engineering technique and a standard industry practice to compare the four major search engines (Google, Yahoo!, Bing, and Ask.com) in obtaining health and medical information. For this purpose, we searched the keyword breast cancer in Google, Yahoo!, Bing, and Ask.com and saved the results of the top 200 links from each search engine. We combined nonredundant links from the four search engines and gave them to volunteer users in an alphabetical order. The volunteer users evaluated the websites and scored each website from 0 to 10 (lowest to highest) based on the usefulness of the content relevant to breast cancer. A medical expert identified six well-known websites related to breast cancer in advance as standards. We also used five keywords associated with breast cancer defined in the latest release of Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) and analyzed their occurrence in the websites. Results Each search engine provided rich information related to breast cancer in the search results. All six standard websites were among the top 30 in search results of all four search engines. Google had the best search validity (in terms of whether a website could be opened), followed by Bing, Ask.com, and Yahoo!. The search

  4. Toward two-dimensional search engines

    NASA Astrophysics Data System (ADS)

    Ermann, L.; Chepelianskii, A. D.; Shepelyansky, D. L.

    2012-07-01

    We study the statistical properties of various directed networks using ranking of their nodes based on the dominant vectors of the Google matrix known as PageRank and CheiRank. On average PageRank orders nodes proportionally to a number of ingoing links, while CheiRank orders nodes proportionally to a number of outgoing links. In this way, the ranking of nodes becomes two dimensional which paves the way for the development of two-dimensional search engines of a new type. Statistical properties of information flow on the PageRank-CheiRank plane are analyzed for networks of British, French and Italian universities, Wikipedia, Linux Kernel, gene regulation and other networks. A special emphasis is done for British universities networks using the large database publicly available in the UK. Methods of spam links control are also analyzed.

  5. Utilization of a radiology-centric search engine.

    PubMed

    Sharpe, Richard E; Sharpe, Megan; Siegel, Eliot; Siddiqui, Khan

    2010-04-01

    Internet-based search engines have become a significant component of medical practice. Physicians increasingly rely on information available from search engines as a means to improve patient care, provide better education, and enhance research. Specialized search engines have emerged to more efficiently meet the needs of physicians. Details about the ways in which radiologists utilize search engines have not been documented. The authors categorized every 25th search query in a radiology-centric vertical search engine by radiologic subspecialty, imaging modality, geographic location of access, time of day, use of abbreviations, misspellings, and search language. Musculoskeletal and neurologic imagings were the most frequently searched subspecialties. The least frequently searched were breast imaging, pediatric imaging, and nuclear medicine. Magnetic resonance imaging and computed tomography were the most frequently searched modalities. A majority of searches were initiated in North America, but all continents were represented. Searches occurred 24 h/day in converted local times, with a majority occurring during the normal business day. Misspellings and abbreviations were common. Almost all searches were performed in English. Search engine utilization trends are likely to mirror trends in diagnostic imaging in the region from which searches originate. Internet searching appears to function as a real-time clinical decision-making tool, a research tool, and an educational resource. A more thorough understanding of search utilization patterns can be obtained by analyzing phrases as actually entered as well as the geographic location and time of origination. This knowledge may contribute to the development of more efficient and personalized search engines.

  6. RNA search engines empower the bacterial intranet.

    PubMed

    Dendooven, Tom; Luisi, Ben F

    2017-08-15

    RNA acts not only as an information bearer in the biogenesis of proteins from genes, but also as a regulator that participates in the control of gene expression. In bacteria, small RNA molecules (sRNAs) play controlling roles in numerous processes and help to orchestrate complex regulatory networks. Such processes include cell growth and development, response to stress and metabolic change, transcription termination, cell-to-cell communication, and the launching of programmes for host invasion. All these processes require recognition of target messenger RNAs by the sRNAs. This review summarizes recent results that have provided insights into how bacterial sRNAs are recruited into effector ribonucleoprotein complexes that can seek out and act upon target transcripts. The results hint at how sRNAs and their protein partners act as pattern-matching search engines that efficaciously regulate gene expression, by performing with specificity and speed while avoiding off-target effects. The requirements for efficient searches of RNA patterns appear to be common to all domains of life. © 2017 The Author(s).

  7. RNA search engines empower the bacterial intranet

    PubMed Central

    Dendooven, Tom

    2017-01-01

    RNA acts not only as an information bearer in the biogenesis of proteins from genes, but also as a regulator that participates in the control of gene expression. In bacteria, small RNA molecules (sRNAs) play controlling roles in numerous processes and help to orchestrate complex regulatory networks. Such processes include cell growth and development, response to stress and metabolic change, transcription termination, cell-to-cell communication, and the launching of programmes for host invasion. All these processes require recognition of target messenger RNAs by the sRNAs. This review summarizes recent results that have provided insights into how bacterial sRNAs are recruited into effector ribonucleoprotein complexes that can seek out and act upon target transcripts. The results hint at how sRNAs and their protein partners act as pattern-matching search engines that efficaciously regulate gene expression, by performing with specificity and speed while avoiding off-target effects. The requirements for efficient searches of RNA patterns appear to be common to all domains of life. PMID:28710287

  8. Getting to the top of Google: search engine optimization.

    PubMed

    Maley, Catherine; Baum, Neil

    2010-01-01

    Search engine optimization is the process of making your Web site appear at or near the top of popular search engines such as Google, Yahoo, and MSN. This is not done by luck or knowing someone working for the search engines but by understanding the process of how search engines select Web sites for placement on top or on the first page. This article will review the process and provide methods and techniques to use to have your site rated at the top or very near the top.

  9. Towards Identifying and Reducing the Bias of Disease Information Extracted from Search Engine Data

    PubMed Central

    Huang, Da-Cang; Wang, Jin-Feng; Huang, Ji-Xia; Sui, Daniel Z.; Zhang, Hong-Yan; Hu, Mao-Gui; Xu, Cheng-Dong

    2016-01-01

    The estimation of disease prevalence in online search engine data (e.g., Google Flu Trends (GFT)) has received a considerable amount of scholarly and public attention in recent years. While the utility of search engine data for disease surveillance has been demonstrated, the scientific community still seeks ways to identify and reduce biases that are embedded in search engine data. The primary goal of this study is to explore new ways of improving the accuracy of disease prevalence estimations by combining traditional disease data with search engine data. A novel method, Biased Sentinel Hospital-based Area Disease Estimation (B-SHADE), is introduced to reduce search engine data bias from a geographical perspective. To monitor search trends on Hand, Foot and Mouth Disease (HFMD) in Guangdong Province, China, we tested our approach by selecting 11 keywords from the Baidu index platform, a Chinese big data analyst similar to GFT. The correlation between the number of real cases and the composite index was 0.8. After decomposing the composite index at the city level, we found that only 10 cities presented a correlation of close to 0.8 or higher. These cities were found to be more stable with respect to search volume, and they were selected as sample cities in order to estimate the search volume of the entire province. After the estimation, the correlation improved from 0.8 to 0.864. After fitting the revised search volume with historical cases, the mean absolute error was 11.19% lower than it was when the original search volume and historical cases were combined. To our knowledge, this is the first study to reduce search engine data bias levels through the use of rigorous spatial sampling strategies. PMID:27271698

  10. Towards Identifying and Reducing the Bias of Disease Information Extracted from Search Engine Data.

    PubMed

    Huang, Da-Cang; Wang, Jin-Feng; Huang, Ji-Xia; Sui, Daniel Z; Zhang, Hong-Yan; Hu, Mao-Gui; Xu, Cheng-Dong

    2016-06-01

    The estimation of disease prevalence in online search engine data (e.g., Google Flu Trends (GFT)) has received a considerable amount of scholarly and public attention in recent years. While the utility of search engine data for disease surveillance has been demonstrated, the scientific community still seeks ways to identify and reduce biases that are embedded in search engine data. The primary goal of this study is to explore new ways of improving the accuracy of disease prevalence estimations by combining traditional disease data with search engine data. A novel method, Biased Sentinel Hospital-based Area Disease Estimation (B-SHADE), is introduced to reduce search engine data bias from a geographical perspective. To monitor search trends on Hand, Foot and Mouth Disease (HFMD) in Guangdong Province, China, we tested our approach by selecting 11 keywords from the Baidu index platform, a Chinese big data analyst similar to GFT. The correlation between the number of real cases and the composite index was 0.8. After decomposing the composite index at the city level, we found that only 10 cities presented a correlation of close to 0.8 or higher. These cities were found to be more stable with respect to search volume, and they were selected as sample cities in order to estimate the search volume of the entire province. After the estimation, the correlation improved from 0.8 to 0.864. After fitting the revised search volume with historical cases, the mean absolute error was 11.19% lower than it was when the original search volume and historical cases were combined. To our knowledge, this is the first study to reduce search engine data bias levels through the use of rigorous spatial sampling strategies.

  11. How To Do Field Searching in Web Search Engines: A Field Trip.

    ERIC Educational Resources Information Center

    Hock, Ran

    1998-01-01

    Describes the field search capabilities of selected Web search engines (AltaVista, HotBot, Infoseek, Lycos, Yahoo!) and includes a chart outlining what fields (date, title, URL, images, audio, video, links, page depth) are searchable, where to go on the page to search them, the syntax required (if any), and how field search queries are entered.…

  12. Combining results of multiple search engines in proteomics.

    PubMed

    Shteynberg, David; Nesvizhskii, Alexey I; Moritz, Robert L; Deutsch, Eric W

    2013-09-01

    A crucial component of the analysis of shotgun proteomics datasets is the search engine, an algorithm that attempts to identify the peptide sequence from the parent molecular ion that produced each fragment ion spectrum in the dataset. There are many different search engines, both commercial and open source, each employing a somewhat different technique for spectrum identification. The set of high-scoring peptide-spectrum matches for a defined set of input spectra differs markedly among the various search engine results; individual engines each provide unique correct identifications among a core set of correlative identifications. This has led to the approach of combining the results from multiple search engines to achieve improved analysis of each dataset. Here we review the techniques and available software for combining the results of multiple search engines and briefly compare the relative performance of these techniques.

  13. Combining Results of Multiple Search Engines in Proteomics*

    PubMed Central

    Shteynberg, David; Nesvizhskii, Alexey I.; Moritz, Robert L.; Deutsch, Eric W.

    2013-01-01

    A crucial component of the analysis of shotgun proteomics datasets is the search engine, an algorithm that attempts to identify the peptide sequence from the parent molecular ion that produced each fragment ion spectrum in the dataset. There are many different search engines, both commercial and open source, each employing a somewhat different technique for spectrum identification. The set of high-scoring peptide-spectrum matches for a defined set of input spectra differs markedly among the various search engine results; individual engines each provide unique correct identifications among a core set of correlative identifications. This has led to the approach of combining the results from multiple search engines to achieve improved analysis of each dataset. Here we review the techniques and available software for combining the results of multiple search engines and briefly compare the relative performance of these techniques. PMID:23720762

  14. Variability of patient spine education by Internet search engine.

    PubMed

    Ghobrial, George M; Mehdi, Angud; Maltenfort, Mitchell; Sharan, Ashwini D; Harrop, James S

    2014-03-01

    Patients are increasingly reliant upon the Internet as a primary source of medical information. The educational experience varies by search engine, search term, and changes daily. There are no tools for critical evaluation of spinal surgery websites. To highlight the variability between common search engines for the same search terms. To detect bias, by prevalence of specific kinds of websites for certain spinal disorders. Demonstrate a simple scoring system of spinal disorder website for patient use, to maximize the quality of information exposed to the patient. Ten common search terms were used to query three of the most common search engines. The top fifty results of each query were tabulated. A negative binomial regression was performed to highlight the variation across each search engine. Google was more likely than Bing and Yahoo search engines to return hospital ads (P=0.002) and more likely to return scholarly sites of peer-reviewed lite (P=0.003). Educational web sites, surgical group sites, and online web communities had a significantly higher likelihood of returning on any search, regardless of search engine, or search string (P=0.007). Likewise, professional websites, including hospital run, industry sponsored, legal, and peer-reviewed web pages were less likely to be found on a search overall, regardless of engine and search string (P=0.078). The Internet is a rapidly growing body of medical information which can serve as a useful tool for patient education. High quality information is readily available, provided that the patient uses a consistent, focused metric for evaluating online spine surgery information, as there is a clear variability in the way search engines present information to the patient. Published by Elsevier B.V.

  15. Web Feet Guide to Search Engines: Finding It on the Net.

    ERIC Educational Resources Information Center

    Web Feet, 2001

    2001-01-01

    This guide to search engines for the World Wide Web discusses selecting the right search engine; interpreting search results; major search engines; online tutorials and guides; search engines for kids; specialized search tools for various subjects; and other specialized engines and gateways. (LRW)

  16. The Keck keyword layer

    NASA Technical Reports Server (NTRS)

    Conrad, A. R.; Lupton, W. F.

    1992-01-01

    Each Keck instrument presents a consistent software view to the user interface programmer. The view consists of a small library of functions, which are identical for all instruments, and a large set of keywords, that vary from instrument to instrument. All knowledge of the underlying task structure is hidden from the application programmer by the keyword layer. Image capture software uses the same function library to collect data for the image header. Because the image capture software and the instrument control software are built on top of the same keyword layer, a given observation can be 'replayed' by extracting keyword-value pairs from the image header and passing them back to the control system. The keyword layer features non-blocking as well as blocking I/O. A non-blocking keyword write operation (such as setting a filter position) specifies a callback to be invoked when the operation is complete. A non-blocking keyword read operation specifies a callback to be invoked whenever the keyword changes state. The keyword-callback style meshes well with the widget-callback style commonly used in X window programs. The first keyword library was built for the two Keck optical instruments. More recently, keyword libraries have been developed for the infrared instruments and for telescope control. Although the underlying mechanisms used for inter-process communication by each of these systems vary widely (Lick MUSIC, Sun RPC, and direct socket I/O, respectively), a basic user interface has been written that can be used with any of these systems. Since the keyword libraries are bound to user interface programs dynamically at run time, only a single set of user interface executables is needed. For example, the same program, 'xshow', can be used to display continuously the telescope's position, the time left in an instrument's exposure, or both values simultaneously. Less generic tools that operate on specific keywords, for example an X display that controls optical

  17. Quantitative evaluation of recall and precision of CAT Crawler, a search engine specialized on retrieval of Critically Appraised Topics

    PubMed Central

    Dong, Peng; Wong, Ling Ling; Ng, Sarah; Loh, Marie; Mondry, Adrian

    2004-01-01

    Background Critically Appraised Topics (CATs) are a useful tool that helps physicians to make clinical decisions as the healthcare moves towards the practice of Evidence-Based Medicine (EBM). The fast growing World Wide Web has provided a place for physicians to share their appraised topics online, but an increasing amount of time is needed to find a particular topic within such a rich repository. Methods A web-based application, namely the CAT Crawler, was developed by Singapore's Bioinformatics Institute to allow physicians to adequately access available appraised topics on the Internet. A meta-search engine, as the core component of the application, finds relevant topics following keyword input. The primary objective of the work presented here is to evaluate the quantity and quality of search results obtained from the meta-search engine of the CAT Crawler by comparing them with those obtained from two individual CAT search engines. From the CAT libraries at these two sites, all possible keywords were extracted using a keyword extractor. Of those common to both libraries, ten were randomly chosen for evaluation. All ten were submitted to the two search engines individually, and through the meta-search engine of the CAT Crawler. Search results were evaluated for relevance both by medical amateurs and professionals, and the respective recall and precision were calculated. Results While achieving an identical recall, the meta-search engine showed a precision of 77.26% (±14.45) compared to the individual search engines' 52.65% (±12.0) (p < 0.001). Conclusion The results demonstrate the validity of the CAT Crawler meta-search engine approach. The improved precision due to inherent filters underlines the practical usefulness of this tool for clinicians. PMID:15588311

  18. Quantitative evaluation of recall and precision of CAT Crawler, a search engine specialized on retrieval of Critically Appraised Topics.

    PubMed

    Dong, Peng; Wong, Ling Ling; Ng, Sarah; Loh, Marie; Mondry, Adrian

    2004-12-10

    Critically Appraised Topics (CATs) are a useful tool that helps physicians to make clinical decisions as the healthcare moves towards the practice of Evidence-Based Medicine (EBM). The fast growing World Wide Web has provided a place for physicians to share their appraised topics online, but an increasing amount of time is needed to find a particular topic within such a rich repository. A web-based application, namely the CAT Crawler, was developed by Singapore's Bioinformatics Institute to allow physicians to adequately access available appraised topics on the Internet. A meta-search engine, as the core component of the application, finds relevant topics following keyword input. The primary objective of the work presented here is to evaluate the quantity and quality of search results obtained from the meta-search engine of the CAT Crawler by comparing them with those obtained from two individual CAT search engines. From the CAT libraries at these two sites, all possible keywords were extracted using a keyword extractor. Of those common to both libraries, ten were randomly chosen for evaluation. All ten were submitted to the two search engines individually, and through the meta-search engine of the CAT Crawler. Search results were evaluated for relevance both by medical amateurs and professionals, and the respective recall and precision were calculated. While achieving an identical recall, the meta-search engine showed a precision of 77.26% (+/-14.45) compared to the individual search engines' 52.65% (+/-12.0) (p < 0.001). The results demonstrate the validity of the CAT Crawler meta-search engine approach. The improved precision due to inherent filters underlines the practical usefulness of this tool for clinicians.

  19. LAILAPS: the plant science search engine.

    PubMed

    Esch, Maria; Chen, Jinbo; Colmsee, Christian; Klapperstück, Matthias; Grafahrend-Belau, Eva; Scholz, Uwe; Lange, Matthias

    2015-01-01

    With the number of sequenced plant genomes growing, the number of predicted genes and functional annotations is also increasing. The association between genes and phenotypic traits is currently of great interest. Unfortunately, the information available today is widely scattered over a number of different databases. Information retrieval (IR) has become an all-encompassing bioinformatics methodology for extracting knowledge from complex, heterogeneous and distributed databases, and therefore can be a useful tool for obtaining a comprehensive view of plant genomics, from genes to traits. Here we describe LAILAPS (http://lailaps.ipk-gatersleben.de), an IR system designed to link plant genomic data in the context of phenotypic attributes for a detailed forward genetic research. LAILAPS comprises around 65 million indexed documents, encompassing >13 major life science databases with around 80 million links to plant genomic resources. The LAILAPS search engine allows fuzzy querying for candidate genes linked to specific traits over a loosely integrated system of indexed and interlinked genome databases. Query assistance and an evidence-based annotation system enable time-efficient and comprehensive information retrieval. An artificial neural network incorporating user feedback and behavior tracking allows relevance sorting of results. We fully describe LAILAPS's functionality and capabilities by comparing this system's performance with other widely used systems and by reporting both a validation in maize and a knowledge discovery use-case focusing on candidate genes in barley. © The Author 2014. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.

  20. Brief Report: Consistency of Search Engine Rankings for Autism Websites

    ERIC Educational Resources Information Center

    Reichow, Brian; Naples, Adam; Steinhoff, Timothy; Halpern, Jason; Volkmar, Fred R.

    2012-01-01

    The World Wide Web is one of the most common methods used by parents to find information on autism spectrum disorders and most consumers find information through search engines such as Google or Bing. However, little is known about how the search engines operate or the consistency of the results that are returned over time. This study presents the…

  1. IntegromeDB: an integrated system and biological search engine.

    PubMed

    Baitaluk, Michael; Kozhenkov, Sergey; Dubinina, Yulia; Ponomarenko, Julia

    2012-01-19

    With the growth of biological data in volume and heterogeneity, web search engines become key tools for researchers. However, general-purpose search engines are not specialized for the search of biological data. Here, we present an approach at developing a biological web search engine based on the Semantic Web technologies and demonstrate its implementation for retrieving gene- and protein-centered knowledge. The engine is available at http://www.integromedb.org. The IntegromeDB search engine allows scanning data on gene regulation, gene expression, protein-protein interactions, pathways, metagenomics, mutations, diseases, and other gene- and protein-related data that are automatically retrieved from publicly available databases and web pages using biological ontologies. To perfect the resource design and usability, we welcome and encourage community feedback.

  2. IntegromeDB: an integrated system and biological search engine

    PubMed Central

    2012-01-01

    Background With the growth of biological data in volume and heterogeneity, web search engines become key tools for researchers. However, general-purpose search engines are not specialized for the search of biological data. Description Here, we present an approach at developing a biological web search engine based on the Semantic Web technologies and demonstrate its implementation for retrieving gene- and protein-centered knowledge. The engine is available at http://www.integromedb.org. Conclusions The IntegromeDB search engine allows scanning data on gene regulation, gene expression, protein-protein interactions, pathways, metagenomics, mutations, diseases, and other gene- and protein-related data that are automatically retrieved from publicly available databases and web pages using biological ontologies. To perfect the resource design and usability, we welcome and encourage community feedback. PMID:22260095

  3. E-Referencer: Transforming Boolean OPACs to Web Search Engines.

    ERIC Educational Resources Information Center

    Khoo, Christopher S. G.; Poo, Danny C. C.; Toh, Teck-Kang; Hong, Glenn

    E-Referencer is an expert intermediary system for searching library online public access catalogs (OPACs) on the World Wide Web. It is implemented as a proxy server that mediates the interaction between the user and Boolean OPACs. It transforms a Boolean OPAC into a retrieval system with many of the search capabilities of Web search engines.…

  4. Use of an Academic Library Web Site Search Engine.

    ERIC Educational Resources Information Center

    Fagan, Jody Condit

    2002-01-01

    Describes an analysis of the search engine logs of Southern Illinois University, Carbondale's library to determine how patrons used the site search. Discusses results that showed patrons did not understand the function of the search and explains improvements that were made in the Web site and in online reference services. (Author/LRW)

  5. ReSEARCH: A Requirements Search Engine: Progress Report 2

    DTIC Science & Technology

    2008-09-01

    and provides a convenient user interface for the search process. Ideally, the web application would be based on Tomcat, a free Java Servlet and JSP...Implementation issues Lucene Java is an Open Source project, available under the Apache License, which provides an accessible API for the development of...from the Apache Lucene website (Lucene- java Wiki , 2008). A search application developed with Lucene consists of the same two major com- ponents

  6. A fuzzy-match search engine for physician directories.

    PubMed

    Rastegar-Mojarad, Majid; Kadolph, Christopher; Ye, Zhan; Wall, Daniel; Murali, Narayana; Lin, Simon

    2014-11-04

    A search engine to find physicians' information is a basic but crucial function of a health care provider's website. Inefficient search engines, which return no results or incorrect results, can lead to patient frustration and potential customer loss. A search engine that can handle misspellings and spelling variations of names is needed, as the United States (US) has culturally, racially, and ethnically diverse names. The Marshfield Clinic website provides a search engine for users to search for physicians' names. The current search engine provides an auto-completion function, but it requires an exact match. We observed that 26% of all searches yielded no results. The goal was to design a fuzzy-match algorithm to aid users in finding physicians easier and faster. Instead of an exact match search, we used a fuzzy algorithm to find similar matches for searched terms. In the algorithm, we solved three types of search engine failures: "Typographic", "Phonetic spelling variation", and "Nickname". To solve these mismatches, we used a customized Levenshtein distance calculation that incorporated Soundex coding and a lookup table of nicknames derived from US census data. Using the "Challenge Data Set of Marshfield Physician Names," we evaluated the accuracy of fuzzy-match engine-top ten (90%) and compared it with exact match (0%), Soundex (24%), Levenshtein distance (59%), and fuzzy-match engine-top one (71%). We designed, created a reference implementation, and evaluated a fuzzy-match search engine for physician directories. The open-source code is available at the codeplex website and a reference implementation is available for demonstration at the datamarsh website.

  7. Ocean Drilling Program: Science Operator Search Engine

    Science.gov Websites

    and products Drilling services and tools Online Janus database Search the ODP/TAMU web site ODP's main -USIO site, plus IODP, ODP, and DSDP Publications, together or separately. ODP | Search | Database

  8. Search Engines for Tomorrow's Scholars, Part Two

    ERIC Educational Resources Information Center

    Fagan, Jody Condit

    2012-01-01

    This two-part article considers how well some of today's search tools support scholars' work. The first part of the article reviewed Google Scholar and Microsoft Academic Search using a modified version of Carole L. Palmer, Lauren C. Teffeau, and Carrier M. Pirmann's framework (2009). Microsoft Academic Search is a strong contender when…

  9. Design search and optimization in aerospace engineering.

    PubMed

    Keane, A J; Scanlan, J P

    2007-10-15

    In this paper, we take a design-led perspective on the use of computational tools in the aerospace sector. We briefly review the current state-of-the-art in design search and optimization (DSO) as applied to problems from aerospace engineering, focusing on those problems that make heavy use of computational fluid dynamics (CFD). This ranges over issues of representation, optimization problem formulation and computational modelling. We then follow this with a multi-objective, multi-disciplinary example of DSO applied to civil aircraft wing design, an area where this kind of approach is becoming essential for companies to maintain their competitive edge. Our example considers the structure and weight of a transonic civil transport wing, its aerodynamic performance at cruise speed and its manufacturing costs. The goals are low drag and cost while holding weight and structural performance at acceptable levels. The constraints and performance metrics are modelled by a linked series of analysis codes, the most expensive of which is a CFD analysis of the aerodynamics using an Euler code with coupled boundary layer model. Structural strength and weight are assessed using semi-empirical schemes based on typical airframe company practice. Costing is carried out using a newly developed generative approach based on a hierarchical decomposition of the key structural elements of a typical machined and bolted wing-box assembly. To carry out the DSO process in the face of multiple competing goals, a recently developed multi-objective probability of improvement formulation is invoked along with stochastic process response surface models (Krigs). This approach both mitigates the significant run times involved in CFD computation and also provides an elegant way of balancing competing goals while still allowing the deployment of the whole range of single objective optimizers commonly available to design teams.

  10. New Architectures for Presenting Search Results Based on Web Search Engines Users Experience

    ERIC Educational Resources Information Center

    Martinez, F. J.; Pastor, J. A.; Rodriguez, J. V.; Lopez, Rosana; Rodriguez, J. V., Jr.

    2011-01-01

    Introduction: The Internet is a dynamic environment which is continuously being updated. Search engines have been, currently are and in all probability will continue to be the most popular systems in this information cosmos. Method: In this work, special attention has been paid to the series of changes made to search engines up to this point,…

  11. Dermatological image search engines on the Internet: do they work?

    PubMed

    Cutrone, M; Grimalt, R

    2007-02-01

    Atlases on CD-ROM first substituted the use of paediatric dermatology atlases printed on paper. This permitted a faster search and a practical comparison of differential diagnoses. The third step in the evolution of clinical atlases was the onset of the online atlas. Many doctors now use the Internet image search engines to obtain clinical images directly. The aim of this study was to test the reliability of the image search engines compared to the online atlases. We tested seven Internet image search engines with three paediatric dermatology diseases. In general, the service offered by the search engines is good, and continues to be free of charge. The coincidence between what we searched for and what we found was generally excellent, and contained no advertisements. Most Internet search engines provided similar results but some were more user friendly than others. It is not necessary to repeat the same research with Picsearch, Lycos and MSN, as the response would be the same; there is a possibility that they might share software. Image search engines are a useful, free and precise method to obtain paediatric dermatology images for teaching purposes. There is still the matter of copyright to be resolved. What are the legal uses of these 'free' images? How do we define 'teaching purposes'? New watermark methods and encrypted electronic signatures might solve these problems and answer these questions.

  12. Social media networking: YouTube and search engine optimization.

    PubMed

    Jackson, Rem; Schneider, Andrew; Baum, Neil

    2011-01-01

    This is the third part of a three-part article on social media networking. This installment will focus on YouTube and search engine optimization. This article will explore the application of YouTube to the medical practice and how YouTube can help a practice retain its existing patients and attract new patients to the practice. The article will also describe the importance of search engine optimization and how to make your content appear on the first page of the search engines such as Google, Yahoo, and YouTube.

  13. PlateRunner: A Search Engine to Identify EMR Boilerplates.

    PubMed

    Divita, Guy; Workman, T Elizabeth; Carter, Marjorie E; Redd, Andrew; Samore, Matthew H; Gundlapalli, Adi V

    2016-01-01

    Medical text contains boilerplated content, an artifact of pull-down forms from EMRs. Boilerplated content is the source of challenges for concept extraction on clinical text. This paper introduces PlateRunner, a search engine on boilerplates from the US Department of Veterans Affairs (VA) EMR. Boilerplates containing concepts should be identified and reviewed to recognize challenging formats, identify high yield document titles, and fine tune section zoning. This search engine has the capability to filter negated and asserted concepts, save and search query results. This tool can save queries, search results, and documents found for later analysis.

  14. TokSearch: A search engine for fusion experimental data

    SciTech Connect

    Sammuli, Brian S.; Barr, Jayson L.; Eidietis, Nicholas W.

    At a typical fusion research site, experimental data is stored using archive technologies that deal with each discharge as an independent set of data. These technologies (e.g. MDSplus or HDF5) are typically supplemented with a database that aggregates metadata for multiple shots to allow for efficient querying of certain predefined quantities. Often, however, a researcher will need to extract information from the archives, possibly for many shots, that is not available in the metadata store or otherwise indexed for quick retrieval. To address this need, a new search tool called TokSearch has been added to the General Atomics TokSys controlmore » design and analysis suite [1]. This tool provides the ability to rapidly perform arbitrary, parallelized queries of archived tokamak shot data (both raw and analyzed) over large numbers of shots. The TokSearch query API borrows concepts from SQL, and users can choose to implement queries in either MatlabTM or Python.« less

  15. TokSearch: A search engine for fusion experimental data

    DOE PAGES

    Sammuli, Brian S.; Barr, Jayson L.; Eidietis, Nicholas W.; ...

    2018-04-01

    At a typical fusion research site, experimental data is stored using archive technologies that deal with each discharge as an independent set of data. These technologies (e.g. MDSplus or HDF5) are typically supplemented with a database that aggregates metadata for multiple shots to allow for efficient querying of certain predefined quantities. Often, however, a researcher will need to extract information from the archives, possibly for many shots, that is not available in the metadata store or otherwise indexed for quick retrieval. To address this need, a new search tool called TokSearch has been added to the General Atomics TokSys controlmore » design and analysis suite [1]. This tool provides the ability to rapidly perform arbitrary, parallelized queries of archived tokamak shot data (both raw and analyzed) over large numbers of shots. The TokSearch query API borrows concepts from SQL, and users can choose to implement queries in either MatlabTM or Python.« less

  16. Short-term Internet search using makes people rely on search engines when facing unknown issues.

    PubMed

    Wang, Yifan; Wu, Lingdan; Luo, Liang; Zhang, Yifen; Dong, Guangheng

    2017-01-01

    The Internet search engines, which have powerful search/sort functions and ease of use features, have become an indispensable tool for many individuals. The current study is to test whether the short-term Internet search training can make people more dependent on it. Thirty-one subjects out of forty subjects completed the search training study which included a pre-test, a six-day's training of Internet search, and a post-test. During the pre- and post- tests, subjects were asked to search online the answers to 40 unusual questions, remember the answers and recall them in the scanner. Un-learned questions were randomly presented at the recalling stage in order to elicited search impulse. Comparing to the pre-test, subjects in the post-test reported higher impulse to use search engines to answer un-learned questions. Consistently, subjects showed higher brain activations in dorsolateral prefrontal cortex and anterior cingulate cortex in the post-test than in the pre-test. In addition, there were significant positive correlations self-reported search impulse and brain responses in the frontal areas. The results suggest that a simple six-day's Internet search training can make people dependent on the search tools when facing unknown issues. People are easily dependent on the Internet search engines.

  17. Short-term Internet search using makes people rely on search engines when facing unknown issues

    PubMed Central

    Wang, Yifan; Wu, Lingdan; Luo, Liang; Zhang, Yifen

    2017-01-01

    The Internet search engines, which have powerful search/sort functions and ease of use features, have become an indispensable tool for many individuals. The current study is to test whether the short-term Internet search training can make people more dependent on it. Thirty-one subjects out of forty subjects completed the search training study which included a pre-test, a six-day’s training of Internet search, and a post-test. During the pre- and post- tests, subjects were asked to search online the answers to 40 unusual questions, remember the answers and recall them in the scanner. Un-learned questions were randomly presented at the recalling stage in order to elicited search impulse. Comparing to the pre-test, subjects in the post-test reported higher impulse to use search engines to answer un-learned questions. Consistently, subjects showed higher brain activations in dorsolateral prefrontal cortex and anterior cingulate cortex in the post-test than in the pre-test. In addition, there were significant positive correlations self-reported search impulse and brain responses in the frontal areas. The results suggest that a simple six-day’s Internet search training can make people dependent on the search tools when facing unknown issues. People are easily dependent on the Internet search engines. PMID:28441408

  18. GEMINI: a computationally-efficient search engine for large gene expression datasets.

    PubMed

    DeFreitas, Timothy; Saddiki, Hachem; Flaherty, Patrick

    2016-02-24

    Low-cost DNA sequencing allows organizations to accumulate massive amounts of genomic data and use that data to answer a diverse range of research questions. Presently, users must search for relevant genomic data using a keyword, accession number of meta-data tag. However, in this search paradigm the form of the query - a text-based string - is mismatched with the form of the target - a genomic profile. To improve access to massive genomic data resources, we have developed a fast search engine, GEMINI, that uses a genomic profile as a query to search for similar genomic profiles. GEMINI implements a nearest-neighbor search algorithm using a vantage-point tree to store a database of n profiles and in certain circumstances achieves an [Formula: see text] expected query time in the limit. We tested GEMINI on breast and ovarian cancer gene expression data from The Cancer Genome Atlas project and show that it achieves a query time that scales as the logarithm of the number of records in practice on genomic data. In a database with 10(5) samples, GEMINI identifies the nearest neighbor in 0.05 sec compared to a brute force search time of 0.6 sec. GEMINI is a fast search engine that uses a query genomic profile to search for similar profiles in a very large genomic database. It enables users to identify similar profiles independent of sample label, data origin or other meta-data information.

  19. Real-time earthquake monitoring using a search engine method.

    PubMed

    Zhang, Jie; Zhang, Haijiang; Chen, Enhong; Zheng, Yi; Kuang, Wenhuan; Zhang, Xiong

    2014-12-04

    When an earthquake occurs, seismologists want to use recorded seismograms to infer its location, magnitude and source-focal mechanism as quickly as possible. If such information could be determined immediately, timely evacuations and emergency actions could be undertaken to mitigate earthquake damage. Current advanced methods can report the initial location and magnitude of an earthquake within a few seconds, but estimating the source-focal mechanism may require minutes to hours. Here we present an earthquake search engine, similar to a web search engine, that we developed by applying a computer fast search method to a large seismogram database to find waveforms that best fit the input data. Our method is several thousand times faster than an exact search. For an Mw 5.9 earthquake on 8 March 2012 in Xinjiang, China, the search engine can infer the earthquake's parameters in <1 s after receiving the long-period surface wave data.

  20. Real-time earthquake monitoring using a search engine method

    PubMed Central

    Zhang, Jie; Zhang, Haijiang; Chen, Enhong; Zheng, Yi; Kuang, Wenhuan; Zhang, Xiong

    2014-01-01

    When an earthquake occurs, seismologists want to use recorded seismograms to infer its location, magnitude and source-focal mechanism as quickly as possible. If such information could be determined immediately, timely evacuations and emergency actions could be undertaken to mitigate earthquake damage. Current advanced methods can report the initial location and magnitude of an earthquake within a few seconds, but estimating the source-focal mechanism may require minutes to hours. Here we present an earthquake search engine, similar to a web search engine, that we developed by applying a computer fast search method to a large seismogram database to find waveforms that best fit the input data. Our method is several thousand times faster than an exact search. For an Mw 5.9 earthquake on 8 March 2012 in Xinjiang, China, the search engine can infer the earthquake’s parameters in <1 s after receiving the long-period surface wave data. PMID:25472861

  1. Searching for a New Way to Reach Patrons: A Search Engine Optimization Pilot Project at Binghamton University Libraries

    ERIC Educational Resources Information Center

    Rushton, Erin E.; Kelehan, Martha Daisy; Strong, Marcy A.

    2008-01-01

    Search engine use is one of the most popular online activities. According to a recent OCLC report, nearly all students start their electronic research using a search engine instead of the library Web site. Instead of viewing search engines as competition, however, librarians at Binghamton University Libraries decided to employ search engine…

  2. Practical and Efficient Searching in Proteomics: A Cross Engine Comparison

    PubMed Central

    Paulo, Joao A.

    2014-01-01

    Background Analysis of large datasets produced by mass spectrometry-based proteomics relies on database search algorithms to sequence peptides and identify proteins. Several such scoring methods are available, each based on different statistical foundations and thereby not producing identical results. Here, the aim is to compare peptide and protein identifications using multiple search engines and examine the additional proteins gained by increasing the number of technical replicate analyses. Methods A HeLa whole cell lysate was analyzed on an Orbitrap mass spectrometer for 10 technical replicates. The data were combined and searched using Mascot, SEQUEST, and Andromeda. Comparisons were made of peptide and protein identifications among the search engines. In addition, searches using each engine were performed with incrementing number of technical replicates. Results The number and identity of peptides and proteins differed across search engines. For all three search engines, the differences in proteins identifications were greater than the differences in peptide identifications indicating that the major source of the disparity may be at the protein inference grouping level. The data also revealed that analysis of 2 technical replicates can increase protein identifications by up to 10-15%, while a third replicate results in an additional 4-5%. Conclusions The data emphasize two practical methods of increasing the robustness of mass spectrometry data analysis. The data show that 1) using multiple search engines can expand the number of identified proteins (union) and validate protein identifications (intersection), and 2) analysis of 2 or 3 technical replicates can substantially expand protein identifications. Moreover, information can be extracted from a dataset by performing database searching with different engines and performing technical repeats, which requires no additional sample preparation and effectively utilizes research time and effort. PMID:25346847

  3. Practical and Efficient Searching in Proteomics: A Cross Engine Comparison.

    PubMed

    Paulo, Joao A

    2013-10-01

    Analysis of large datasets produced by mass spectrometry-based proteomics relies on database search algorithms to sequence peptides and identify proteins. Several such scoring methods are available, each based on different statistical foundations and thereby not producing identical results. Here, the aim is to compare peptide and protein identifications using multiple search engines and examine the additional proteins gained by increasing the number of technical replicate analyses. A HeLa whole cell lysate was analyzed on an Orbitrap mass spectrometer for 10 technical replicates. The data were combined and searched using Mascot, SEQUEST, and Andromeda. Comparisons were made of peptide and protein identifications among the search engines. In addition, searches using each engine were performed with incrementing number of technical replicates. The number and identity of peptides and proteins differed across search engines. For all three search engines, the differences in proteins identifications were greater than the differences in peptide identifications indicating that the major source of the disparity may be at the protein inference grouping level. The data also revealed that analysis of 2 technical replicates can increase protein identifications by up to 10-15%, while a third replicate results in an additional 4-5%. The data emphasize two practical methods of increasing the robustness of mass spectrometry data analysis. The data show that 1) using multiple search engines can expand the number of identified proteins (union) and validate protein identifications (intersection), and 2) analysis of 2 or 3 technical replicates can substantially expand protein identifications. Moreover, information can be extracted from a dataset by performing database searching with different engines and performing technical repeats, which requires no additional sample preparation and effectively utilizes research time and effort.

  4. Using Internet search engines to estimate word frequency.

    PubMed

    Blair, Irene V; Urland, Geoffrey R; Ma, Jennifer E

    2002-05-01

    The present research investigated Internet search engines as a rapid, cost-effective alternative for estimating word frequencies. Frequency estimates for 382 words were obtained and compared across four methods: (1) Internet search engines, (2) the Kucera and Francis (1967) analysis of a traditional linguistic corpus, (3) the CELEX English linguistic database (Baayen, Piepenbrock, & Gulikers, 1995), and (4) participant ratings of familiarity. The results showed that Internet search engines produced frequency estimates that were highly consistent with those reported by Kucera and Francis and those calculated from CELEX, highly consistent across search engines, and very reliable over a 6-month period of time. Additional results suggested that Internet search engines are an excellent option when traditional word frequency analyses do not contain the necessary data (e.g., estimates for forenames and slang). In contrast, participants' familiarity judgments did not correspond well with the more objective estimates of word frequency. Researchers are advised to use search engines with large databases (e.g., AltaVista) to ensure the greatest representativeness of the frequency estimates.

  5. Paying Your Way to the Top: Search Engine Advertising.

    ERIC Educational Resources Information Center

    Scott, David M.

    2003-01-01

    Explains how organizations can buy listings on major Web search engines, making it the fastest growing form of advertising. Highlights include two network models, Google and Overture; bidding on phrases to buy as links to use with ads; ad ranking; benefits for small businesses; and paid listings versus regular search results. (LRW)

  6. How Safe Are Kid-Safe Search Engines?

    ERIC Educational Resources Information Center

    Masterson-Krum, Hope

    2001-01-01

    Examines search tools available to elementary and secondary school students, both human-compiled and crawler-based, to help direct them to age-appropriate Web sites; analyzes the procedures of search engines labeled family-friendly or kid safe that use filters; and tests the effectiveness of these services to students in school libraries. (LRW)

  7. Can electronic search engines optimize screening of search results in systematic reviews: an empirical study.

    PubMed

    Sampson, Margaret; Barrowman, Nicholas J; Moher, David; Clifford, Tammy J; Platt, Robert W; Morrison, Andra; Klassen, Terry P; Zhang, Li

    2006-02-24

    Most electronic search efforts directed at identifying primary studies for inclusion in systematic reviews rely on the optimal Boolean search features of search interfaces such as DIALOG and Ovid. Our objective is to test the ability of an Ultraseek search engine to rank MEDLINE records of the included studies of Cochrane reviews within the top half of all the records retrieved by the Boolean MEDLINE search used by the reviewers. Collections were created using the MEDLINE bibliographic records of included and excluded studies listed in the review and all records retrieved by the MEDLINE search. Records were converted to individual HTML files. Collections of records were indexed and searched through a statistical search engine, Ultraseek, using review-specific search terms. Our data sources, systematic reviews published in the Cochrane library, were included if they reported using at least one phase of the Cochrane Highly Sensitive Search Strategy (HSSS), provided citations for both included and excluded studies and conducted a meta-analysis using a binary outcome measure. Reviews were selected if they yielded between 1000-6000 records when the MEDLINE search strategy was replicated. Nine Cochrane reviews were included. Included studies within the Cochrane reviews were found within the first 500 retrieved studies more often than would be expected by chance. Across all reviews, recall of included studies into the top 500 was 0.70. There was no statistically significant difference in ranking when comparing included studies with just the subset of excluded studies listed as excluded in the published review. The relevance ranking provided by the search engine was better than expected by chance and shows promise for the preliminary evaluation of large results from Boolean searches. A statistical search engine does not appear to be able to make fine discriminations concerning the relevance of bibliographic records that have been pre-screened by systematic reviewers.

  8. Grooker, KartOO, Addict-o-Matic and More: Really Different Search Engines

    ERIC Educational Resources Information Center

    Descy, Don E.

    2009-01-01

    There are hundreds of unique search engines in the United States and thousands of unique search engines around the world. If people get into search engines designed just to search particular web sites, the number is in the hundreds of thousands. This article looks at: (1) clustering search engines, such as KartOO (www.kartoo.com) and Grokker…

  9. FindZebra: a search engine for rare diseases.

    PubMed

    Dragusin, Radu; Petcu, Paula; Lioma, Christina; Larsen, Birger; Jørgensen, Henrik L; Cox, Ingemar J; Hansen, Lars Kai; Ingwersen, Peter; Winther, Ole

    2013-06-01

    The web has become a primary information resource about illnesses and treatments for both medical and non-medical users. Standard web search is by far the most common interface to this information. It is therefore of interest to find out how well web search engines work for diagnostic queries and what factors contribute to successes and failures. Among diseases, rare (or orphan) diseases represent an especially challenging and thus interesting class to diagnose as each is rare, diverse in symptoms and usually has scattered resources associated with it. We design an evaluation approach for web search engines for rare disease diagnosis which includes 56 real life diagnostic cases, performance measures, information resources and guidelines for customising Google Search to this task. In addition, we introduce FindZebra, a specialized (vertical) rare disease search engine. FindZebra is powered by open source search technology and uses curated freely available online medical information. FindZebra outperforms Google Search in both default set-up and customised to the resources used by FindZebra. We extend FindZebra with specialized functionalities exploiting medical ontological information and UMLS medical concepts to demonstrate different ways of displaying the retrieved results to medical experts. Our results indicate that a specialized search engine can improve the diagnostic quality without compromising the ease of use of the currently widely popular standard web search. The proposed evaluation approach can be valuable for future development and benchmarking. The FindZebra search engine is available at http://www.findzebra.com/. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  10. Video segmentation using keywords

    NASA Astrophysics Data System (ADS)

    Ton-That, Vinh; Vong, Chi-Tai; Nguyen-Dao, Xuan-Truong; Tran, Minh-Triet

    2018-04-01

    At DAVIS-2016 Challenge, many state-of-art video segmentation methods achieve potential results, but they still much depend on annotated frames to distinguish between background and foreground. It takes a lot of time and efforts to create these frames exactly. In this paper, we introduce a method to segment objects from video based on keywords given by user. First, we use a real-time object detection system - YOLOv2 to identify regions containing objects that have labels match with the given keywords in the first frame. Then, for each region identified from the previous step, we use Pyramid Scene Parsing Network to assign each pixel as foreground or background. These frames can be used as input frames for Object Flow algorithm to perform segmentation on entire video. We conduct experiments on a subset of DAVIS-2016 dataset in half the size of its original size, which shows that our method can handle many popular classes in PASCAL VOC 2012 dataset with acceptable accuracy, about 75.03%. We suggest widely testing by combining other methods to improve this result in the future.

  11. An advanced search engine for patent analytics in medicinal chemistry.

    PubMed

    Pasche, Emilie; Gobeill, Julien; Teodoro, Douglas; Gaudinat, Arnaud; Vishnykova, Dina; Lovis, Christian; Ruch, Patrick

    2012-01-01

    Patent collections contain an important amount of medical-related knowledge, but existing tools were reported to lack of useful functionalities. We present here the development of TWINC, an advanced search engine dedicated to patent retrieval in the domain of health and life sciences. Our tool embeds two search modes: an ad hoc search to retrieve relevant patents given a short query and a related patent search to retrieve similar patents given a patent. Both search modes rely on tuning experiments performed during several patent retrieval competitions. Moreover, TWINC is enhanced with interactive modules, such as chemical query expansion, which is of prior importance to cope with various ways of naming biomedical entities. While the related patent search showed promising performances, the ad-hoc search resulted in fairly contrasted results. Nonetheless, TWINC performed well during the Chemathlon task of the PatOlympics competition and experts appreciated its usability.

  12. GoWeb: a semantic search engine for the life science web.

    PubMed

    Dietze, Heiko; Schroeder, Michael

    2009-10-01

    Current search engines are keyword-based. Semantic technologies promise a next generation of semantic search engines, which will be able to answer questions. Current approaches either apply natural language processing to unstructured text or they assume the existence of structured statements over which they can reason. Here, we introduce a third approach, GoWeb, which combines classical keyword-based Web search with text-mining and ontologies to navigate large results sets and facilitate question answering. We evaluate GoWeb on three benchmarks of questions on genes and functions, on symptoms and diseases, and on proteins and diseases. The first benchmark is based on the BioCreAtivE 1 Task 2 and links 457 gene names with 1352 functions. GoWeb finds 58% of the functional GeneOntology annotations. The second benchmark is based on 26 case reports and links symptoms with diseases. GoWeb achieves 77% success rate improving an existing approach by nearly 20%. The third benchmark is based on 28 questions in the TREC genomics challenge and links proteins to diseases. GoWeb achieves a success rate of 79%. GoWeb's combination of classical Web search with text-mining and ontologies is a first step towards answering questions in the biomedical domain. GoWeb is online at: http://www.gopubmed.org/goweb.

  13. Variable neighborhood search for reverse engineering of gene regulatory networks.

    PubMed

    Nicholson, Charles; Goodwin, Leslie; Clark, Corey

    2017-01-01

    A new search heuristic, Divided Neighborhood Exploration Search, designed to be used with inference algorithms such as Bayesian networks to improve on the reverse engineering of gene regulatory networks is presented. The approach systematically moves through the search space to find topologies representative of gene regulatory networks that are more likely to explain microarray data. In empirical testing it is demonstrated that the novel method is superior to the widely employed greedy search techniques in both the quality of the inferred networks and computational time. Copyright © 2016 Elsevier Inc. All rights reserved.

  14. Engineering Your Job Search: A Job-Finding Resource for Engineering Professionals.

    ERIC Educational Resources Information Center

    1995

    This guide, which is intended for engineering professionals, explains how to use up-to-date job search techniques to design and conduct an effective job hunt. The first 11 chapters discuss the following steps in searching for a job: handling a job loss; managing time and financial resources while conducting a full-time job search; using objective…

  15. Building a better search engine for earth science data

    NASA Astrophysics Data System (ADS)

    Armstrong, E. M.; Yang, C. P.; Moroni, D. F.; McGibbney, L. J.; Jiang, Y.; Huang, T.; Greguska, F. R., III; Li, Y.; Finch, C. J.

    2017-12-01

    Free text data searching of earth science datasets has been implemented with varying degrees of success and completeness across the spectrum of the 12 NASA earth sciences data centers. At the JPL Physical Oceanography Distributed Active Archive Center (PO.DAAC) the search engine has been developed around the Solr/Lucene platform. Others have chosen other popular enterprise search platforms like Elasticsearch. Regardless, the default implementations of these search engines leveraging factors such as dataset popularity, term frequency and inverse document term frequency do not fully meet the needs of precise relevancy and ranking of earth science search results. For the PO.DAAC, this shortcoming has been identified for several years by its external User Working Group that has assigned several recommendations to improve the relevancy and discoverability of datasets related to remotely sensed sea surface temperature, ocean wind, waves, salinity, height and gravity that comprise a total count of over 500 public availability datasets. Recently, the PO.DAAC has teamed with an effort led by George Mason University to improve the improve the search and relevancy ranking of oceanographic data via a simple search interface and powerful backend services called MUDROD (Mining and Utilizing Dataset Relevancy from Oceanographic Datasets to Improve Data Discovery) funded by the NASA AIST program. MUDROD has mined and utilized the combination of PO.DAAC earth science dataset metadata, usage metrics, and user feedback and search history to objectively extract relevance for improved data discovery and access. In addition to improved dataset relevance and ranking, the MUDROD search engine also returns recommendations to related datasets and related user queries. This presentation will report on use cases that drove the architecture and development, and the success metrics and improvements on search precision and recall that MUDROD has demonstrated over the existing PO.DAAC search

  16. Comparison of Four Search Engines and their efficacy With Emphasis on Literature Research in Addiction (Prevention and Treatment).

    PubMed

    Samadzadeh, Gholam Reza; Rigi, Tahereh; Ganjali, Ali Reza

    2013-01-01

    Surveying valuable and most recent information from internet, has become vital for researchers and scholars, because every day, thousands and perhaps millions of scientific works are brought out as digital resources which represented by internet and researchers can't ignore this great resource to find related documents for their literature search, which may not be found in any library. With regard to variety of documents presented on the internet, search engines are one of the most effective search tools for finding information. The aim of this study is to evaluate the three criteria, recall, preciseness and importance of the four search engines which are PubMed, Science Direct, Google Scholar and federated search of Iranian National Medical Digital Library in addiction (prevention and treatment) to select the most effective search engine for offering the best literature research. This research was a cross-sectional study by which four popular search engines in medical sciences were evaluated. To select keywords, medical subject heading (Mesh) was used. We entered given keywords in the search engines and after searching, 10 first entries were evaluated. Direct observation was used as a mean for data collection and they were analyzed by descriptive statistics (number, percent number and mean) and inferential statistics, One way analysis of variance (ANOVA) and post hoc Tukey in Spss. 15 statistical software. P Value < 0.05 was considered statistically significant. Results have shown that the search engines had different operations with regard to the evaluated criteria. Since P Value was 0.004 < 0.05 for preciseness and was 0.002 < 0.05 for importance, it shows significant difference among search engines. PubMed, Science Direct and Google Scholar were the best in recall, preciseness and importance respectively. As literature research is one of the most important stages of research, it's better for researchers, especially Substance-Related Disorders scholars to use

  17. Comparison of Four Search Engines and their efficacy With Emphasis on Literature Research in Addiction (Prevention and Treatment)

    PubMed Central

    Samadzadeh, Gholam Reza; Rigi, Tahereh; Ganjali, Ali Reza

    2013-01-01

    Background Surveying valuable and most recent information from internet, has become vital for researchers and scholars, because every day, thousands and perhaps millions of scientific works are brought out as digital resources which represented by internet and researchers can’t ignore this great resource to find related documents for their literature search, which may not be found in any library. With regard to variety of documents presented on the internet, search engines are one of the most effective search tools for finding information. Objectives The aim of this study is to evaluate the three criteria, recall, preciseness and importance of the four search engines which are PubMed, Science Direct, Google Scholar and federated search of Iranian National Medical Digital Library in addiction (prevention and treatment) to select the most effective search engine for offering the best literature research. Materials and Methods This research was a cross-sectional study by which four popular search engines in medical sciences were evaluated. To select keywords, medical subject heading (Mesh) was used. We entered given keywords in the search engines and after searching, 10 first entries were evaluated. Direct observation was used as a mean for data collection and they were analyzed by descriptive statistics (number, percent number and mean) and inferential statistics, One way analysis of variance (ANOVA) and post hoc Tukey in Spss. 15 statistical software. P Value < 0.05 was considered statistically significant. Results Results have shown that the search engines had different operations with regard to the evaluated criteria. Since P Value was 0.004 < 0.05 for preciseness and was 0.002 < 0.05 for importance, it shows significant difference among search engines. PubMed, Science Direct and Google Scholar were the best in recall, preciseness and importance respectively. Conclusions As literature research is one of the most important stages of research, it's better for

  18. DRUMS: a human disease related unique gene mutation search engine.

    PubMed

    Li, Zuofeng; Liu, Xingnan; Wen, Jingran; Xu, Ye; Zhao, Xin; Li, Xuan; Liu, Lei; Zhang, Xiaoyan

    2011-10-01

    With the completion of the human genome project and the development of new methods for gene variant detection, the integration of mutation data and its phenotypic consequences has become more important than ever. Among all available resources, locus-specific databases (LSDBs) curate one or more specific genes' mutation data along with high-quality phenotypes. Although some genotype-phenotype data from LSDB have been integrated into central databases little effort has been made to integrate all these data by a search engine approach. In this work, we have developed disease related unique gene mutation search engine (DRUMS), a search engine for human disease related unique gene mutation as a convenient tool for biologists or physicians to retrieve gene variant and related phenotype information. Gene variant and phenotype information were stored in a gene-centred relational database. Moreover, the relationships between mutations and diseases were indexed by the uniform resource identifier from LSDB, or another central database. By querying DRUMS, users can access the most popular mutation databases under one interface. DRUMS could be treated as a domain specific search engine. By using web crawling, indexing, and searching technologies, it provides a competitively efficient interface for searching and retrieving mutation data and their relationships to diseases. The present system is freely accessible at http://www.scbit.org/glif/new/drums/index.html. © 2011 Wiley-Liss, Inc.

  19. Sundanese ancient manuscripts search engine using probability approach

    NASA Astrophysics Data System (ADS)

    Suryani, Mira; Hadi, Setiawan; Paulus, Erick; Nurma Yulita, Intan; Supriatna, Asep K.

    2017-10-01

    Today, Information and Communication Technology (ICT) has become a regular thing for every aspect of live include cultural and heritage aspect. Sundanese ancient manuscripts as Sundanese heritage are in damage condition and also the information that containing on it. So in order to preserve the information in Sundanese ancient manuscripts and make them easier to search, a search engine has been developed. The search engine must has good computing ability. In order to get the best computation in developed search engine, three types of probabilistic approaches: Bayesian Networks Model, Divergence from Randomness with PL2 distribution, and DFR-PL2F as derivative form DFR-PL2 have been compared in this study. The three probabilistic approaches supported by index of documents and three different weighting methods: term occurrence, term frequency, and TF-IDF. The experiment involved 12 Sundanese ancient manuscripts. From 12 manuscripts there are 474 distinct terms. The developed search engine tested by 50 random queries for three types of query. The experiment results showed that for the single query and multiple query, the best searching performance given by the combination of PL2F approach and TF-IDF weighting method. The performance has been evaluated using average time responds with value about 0.08 second and Mean Average Precision (MAP) about 0.33.

  20. Tags Extarction from Spatial Documents in Search Engines

    NASA Astrophysics Data System (ADS)

    Borhaninejad, S.; Hakimpour, F.; Hamzei, E.

    2015-12-01

    Nowadays the selective access to information on the Web is provided by search engines, but in the cases which the data includes spatial information the search task becomes more complex and search engines require special capabilities. The purpose of this study is to extract the information which lies in spatial documents. To that end, we implement and evaluate information extraction from GML documents and a retrieval method in an integrated approach. Our proposed system consists of three components: crawler, database and user interface. In crawler component, GML documents are discovered and their text is parsed for information extraction; storage. The database component is responsible for indexing of information which is collected by crawlers. Finally the user interface component provides the interaction between system and user. We have implemented this system as a pilot system on an Application Server as a simulation of Web. Our system as a spatial search engine provided searching capability throughout the GML documents and thus an important step to improve the efficiency of search engines has been taken.

  1. D-score: a search engine independent MD-score.

    PubMed

    Vaudel, Marc; Breiter, Daniela; Beck, Florian; Rahnenführer, Jörg; Martens, Lennart; Zahedi, René P

    2013-03-01

    While peptides carrying PTMs are routinely identified in gel-free MS, the localization of the PTMs onto the peptide sequences remains challenging. Search engine scores of secondary peptide matches have been used in different approaches in order to infer the quality of site inference, by penalizing the localization whenever the search engine similarly scored two candidate peptides with different site assignments. In the present work, we show how the estimation of posterior error probabilities for peptide candidates allows the estimation of a PTM score called the D-score, for multiple search engine studies. We demonstrate the applicability of this score to three popular search engines: Mascot, OMSSA, and X!Tandem, and evaluate its performance using an already published high resolution data set of synthetic phosphopeptides. For those peptides with phosphorylation site inference uncertainty, the number of spectrum matches with correctly localized phosphorylation increased by up to 25.7% when compared to using Mascot alone, although the actual increase depended on the fragmentation method used. Since this method relies only on search engine scores, it can be readily applied to the scoring of the localization of virtually any modification at no additional experimental or in silico cost. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  2. The Theory of Planned Behaviour Applied to Search Engines as a Learning Tool

    ERIC Educational Resources Information Center

    Liaw, Shu-Sheng

    2004-01-01

    Search engines have been developed for helping learners to seek online information. Based on theory of planned behaviour approach, this research intends to investigate the behaviour of using search engines as a learning tool. After factor analysis, the results suggest that perceived satisfaction of search engine, search engines as an information…

  3. LoyalTracker: Visualizing Loyalty Dynamics in Search Engines.

    PubMed

    Shi, Conglei; Wu, Yingcai; Liu, Shixia; Zhou, Hong; Qu, Huamin

    2014-12-01

    The huge amount of user log data collected by search engine providers creates new opportunities to understand user loyalty and defection behavior at an unprecedented scale. However, this also poses a great challenge to analyze the behavior and glean insights into the complex, large data. In this paper, we introduce LoyalTracker, a visual analytics system to track user loyalty and switching behavior towards multiple search engines from the vast amount of user log data. We propose a new interactive visualization technique (flow view) based on a flow metaphor, which conveys a proper visual summary of the dynamics of user loyalty of thousands of users over time. Two other visualization techniques, a density map and a word cloud, are integrated to enable analysts to gain further insights into the patterns identified by the flow view. Case studies and the interview with domain experts are conducted to demonstrate the usefulness of our technique in understanding user loyalty and switching behavior in search engines.

  4. Beyond Keyword Search: Discovering Relevant Scientific Literature

    DTIC Science & Technology

    2011-06-01

    Information genealogy: uncovering the flow of ideas in non-hyperlinked document databases. In KDD, 2007. 18 [44] Thomson Reuters Web of Knowledge. http...stimulation 2001 98 14386-14391 10123 In vivo evidence for a cochlear amplifier in the hair-cell bundle of lizards 2001 98 2826-2831 22 Table 1 (cont...2000 97 4856-4861 12134 Arabidopsis RelA/SpoT homologs implicate (p)ppGpp in plant signal- ing 2000 97 3747-3752 12176 Cochlear mechanisms from a

  5. Advances in the Kepler Transit Search Engine

    NASA Astrophysics Data System (ADS)

    Jenkins, Jon M.

    2016-10-01

    Twenty years ago, no planets were known outside our own solar system. Since then, the discoveries of ~1500 exoplanets have radically altered our views of planets and planetary systems. This revolution is due in no small part to the Kepler Mission, which has discovered >1000 of these planets and >4000 planet candidates. While Kepler has shown that small rocky planets and planetary systems are quite common, the quest to find Earth's closest cousins and characterize their atmospheres presses forward with missions such as NASA Explorer Program's Transiting Exoplanet Survey Satellite (TESS) slated for launch in 2017 and ESA's PLATO mission scheduled for launch in 2024. These future missions pose daunting data processing challenges in terms of the number of stars, the amount of data, and the difficulties in detecting weak signatures of transiting small planets against a roaring background. These complications include instrument noise and systematic effects as well as the intrinsic stellar variability of the subjects under scrutiny. In this paper we review recent developments in the Kepler transit search pipeline improving both the yield and reliability of detected transit signatures. Many of the phenomena in light curves that represent noise can also trigger transit detection algorithms. The Kepler Mission has expended great effort in suppressing false positives from its planetary candidate catalogs. Over 18,000 transit-like signatures can be identified for a search across 4 years of data. Most of these signatures are artifacts, not planets. Vetting all such signatures historically takes several months' effort by many individuals. We describe the application of machine learning approaches for the automated vetting and production of planet candidate catalogs. These algorithms can improve the efficiency of the human vetting effort as well as quantifying the likelihood that each candidate is truly a planet. This information is crucial for obtaining valid planet occurrence

  6. New Capabilities in the Astrophysics Multispectral Archive Search Engine

    NASA Astrophysics Data System (ADS)

    Cheung, C. Y.; Kelley, S.; Roussopoulos, N.

    The Astrophysics Multispectral Archive Search Engine (AMASE) uses object-oriented database techniques to provide a uniform multi-mission and multi-spectral interface to search for data in the distributed archives. We describe our experience of porting AMASE from Illustra object-relational DBMS to the Informix Universal Data Server. New capabilities and utilities have been developed, including a spatial datablade that supports Nearest Neighbor queries.

  7. An end user evaluation of query formulation and results review tools in three medical meta-search engines.

    PubMed

    Leroy, Gondy; Xu, Jennifer; Chung, Wingyan; Eggers, Shauna; Chen, Hsinchun

    2007-01-01

    Retrieving sufficient relevant information online is difficult for many people because they use too few keywords to search and search engines do not provide many support tools. To further complicate the search, users often ignore support tools when available. Our goal is to evaluate in a realistic setting when users use support tools and how they perceive these tools. We compared three medical search engines with support tools that require more or less effort from users to form a query and evaluate results. We carried out an end user study with 23 users who were asked to find information, i.e., subtopics and supporting abstracts, for a given theme. We used a balanced within-subjects design and report on the effectiveness, efficiency and usability of the support tools from the end user perspective. We found significant differences in efficiency but did not find significant differences in effectiveness between the three search engines. Dynamic user support tools requiring less effort led to higher efficiency. Fewer searches were needed and more documents were found per search when both query reformulation and result review tools dynamically adjust to the user query. The query reformulation tool that provided a long list of keywords, dynamically adjusted to the user query, was used most often and led to more subtopics. As hypothesized, the dynamic result review tools were used more often and led to more subtopics than static ones. These results were corroborated by the usability questionnaires, which showed that support tools that dynamically optimize output were preferred.

  8. BioCarian: search engine for exploratory searches in heterogeneous biological databases.

    PubMed

    Zaki, Nazar; Tennakoon, Chandana

    2017-10-02

    There are a large number of biological databases publicly available for scientists in the web. Also, there are many private databases generated in the course of research projects. These databases are in a wide variety of formats. Web standards have evolved in the recent times and semantic web technologies are now available to interconnect diverse and heterogeneous sources of data. Therefore, integration and querying of biological databases can be facilitated by techniques used in semantic web. Heterogeneous databases can be converted into Resource Description Format (RDF) and queried using SPARQL language. Searching for exact queries in these databases is trivial. However, exploratory searches need customized solutions, especially when multiple databases are involved. This process is cumbersome and time consuming for those without a sufficient background in computer science. In this context, a search engine facilitating exploratory searches of databases would be of great help to the scientific community. We present BioCarian, an efficient and user-friendly search engine for performing exploratory searches on biological databases. The search engine is an interface for SPARQL queries over RDF databases. We note that many of the databases can be converted to tabular form. We first convert the tabular databases to RDF. The search engine provides a graphical interface based on facets to explore the converted databases. The facet interface is more advanced than conventional facets. It allows complex queries to be constructed, and have additional features like ranking of facet values based on several criteria, visually indicating the relevance of a facet value and presenting the most important facet values when a large number of choices are available. For the advanced users, SPARQL queries can be run directly on the databases. Using this feature, users will be able to incorporate federated searches of SPARQL endpoints. We used the search engine to do an exploratory search

  9. Development of a Google-based search engine for data mining radiology reports.

    PubMed

    Erinjeri, Joseph P; Picus, Daniel; Prior, Fred W; Rubin, David A; Koppel, Paul

    2009-08-01

    The aim of this study is to develop a secure, Google-based data-mining tool for radiology reports using free and open source technologies and to explore its use within an academic radiology department. A Health Insurance Portability and Accountability Act (HIPAA)-compliant data repository, search engine and user interface were created to facilitate treatment, operations, and reviews preparatory to research. The Institutional Review Board waived review of the project, and informed consent was not required. Comprising 7.9 GB of disk space, 2.9 million text reports were downloaded from our radiology information system to a fileserver. Extensible markup language (XML) representations of the reports were indexed using Google Desktop Enterprise search engine software. A hypertext markup language (HTML) form allowed users to submit queries to Google Desktop, and Google's XML response was interpreted by a practical extraction and report language (PERL) script, presenting ranked results in a web browser window. The query, reason for search, results, and documents visited were logged to maintain HIPAA compliance. Indexing averaged approximately 25,000 reports per hour. Keyword search of a common term like "pneumothorax" yielded the first ten most relevant results of 705,550 total results in 1.36 s. Keyword search of a rare term like "hemangioendothelioma" yielded the first ten most relevant results of 167 total results in 0.23 s; retrieval of all 167 results took 0.26 s. Data mining tools for radiology reports will improve the productivity of academic radiologists in clinical, educational, research, and administrative tasks. By leveraging existing knowledge of Google's interface, radiologists can quickly perform useful searches.

  10. GGRNA: an ultrafast, transcript-oriented search engine for genes and transcripts

    PubMed Central

    Naito, Yuki; Bono, Hidemasa

    2012-01-01

    GGRNA (http://GGRNA.dbcls.jp/) is a Google-like, ultrafast search engine for genes and transcripts. The web server accepts arbitrary words and phrases, such as gene names, IDs, gene descriptions, annotations of gene and even nucleotide/amino acid sequences through one simple search box, and quickly returns relevant RefSeq transcripts. A typical search takes just a few seconds, which dramatically enhances the usability of routine searching. In particular, GGRNA can search sequences as short as 10 nt or 4 amino acids, which cannot be handled easily by popular sequence analysis tools. Nucleotide sequences can be searched allowing up to three mismatches, or the query sequences may contain degenerate nucleotide codes (e.g. N, R, Y, S). Furthermore, Gene Ontology annotations, Enzyme Commission numbers and probe sequences of catalog microarrays are also incorporated into GGRNA, which may help users to conduct searches by various types of keywords. GGRNA web server will provide a simple and powerful interface for finding genes and transcripts for a wide range of users. All services at GGRNA are provided free of charge to all users. PMID:22641850

  11. GGRNA: an ultrafast, transcript-oriented search engine for genes and transcripts.

    PubMed

    Naito, Yuki; Bono, Hidemasa

    2012-07-01

    GGRNA (http://GGRNA.dbcls.jp/) is a Google-like, ultrafast search engine for genes and transcripts. The web server accepts arbitrary words and phrases, such as gene names, IDs, gene descriptions, annotations of gene and even nucleotide/amino acid sequences through one simple search box, and quickly returns relevant RefSeq transcripts. A typical search takes just a few seconds, which dramatically enhances the usability of routine searching. In particular, GGRNA can search sequences as short as 10 nt or 4 amino acids, which cannot be handled easily by popular sequence analysis tools. Nucleotide sequences can be searched allowing up to three mismatches, or the query sequences may contain degenerate nucleotide codes (e.g. N, R, Y, S). Furthermore, Gene Ontology annotations, Enzyme Commission numbers and probe sequences of catalog microarrays are also incorporated into GGRNA, which may help users to conduct searches by various types of keywords. GGRNA web server will provide a simple and powerful interface for finding genes and transcripts for a wide range of users. All services at GGRNA are provided free of charge to all users.

  12. HOW DO RADIOLOGISTS USE THE HUMAN SEARCH ENGINE?

    PubMed

    Wolfe, Jeremy M; Evans, Karla K; Drew, Trafton; Aizenman, Avigael; Josephs, Emilie

    2016-06-01

    Radiologists perform many 'visual search tasks' in which they look for one or more instances of one or more types of target item in a medical image (e.g. cancer screening). To understand and improve how radiologists do such tasks, it must be understood how the human 'search engine' works. This article briefly reviews some of the relevant work into this aspect of medical image perception. Questions include how attention and the eyes are guided in radiologic search? How is global (image-wide) information used in search? How might properties of human vision and human cognition lead to errors in radiologic search? © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  13. Quality of Health Information on the Internet for Urolithiasis on the Google Search Engine.

    PubMed

    Chang, Dwayne T S; Abouassaly, Robert; Lawrentschuk, Nathan

    2016-01-01

    Purpose . To compare the quality of health information on the Internet for keywords related to urolithiasis, to assess for difference in information quality across four main Western languages, and to compare the source of sponsorship in these websites. Methods . Health On the Net (HON) Foundation principles were utilised to determine quality information. Fifteen keywords related to urolithiasis were searched on the Google search engine. The first 150 websites were assessed against the HON principles and the source of sponsorship determined. Results . A total of 8986 websites were analysed. A proportion of HON-accredited websites for individual search terms range between 2.5% and 12.0%. The first 50 websites were more likely to be HON-positive compared to websites 51-100 and 101-150. French websites searched were more likely to be HON-positive whereas German websites were less likely to be HON-positive than English websites. There was no statistically significant difference between the rate of HON-positive English and Spanish websites. The three main website sponsors were from government/educational sources (40.2%), followed by commercial (29.9%) and physician/surgeon sources (18.6%). Conclusions . Health information on most urolithiasis websites was not validated. Nearly one-third of websites in this study have commercial sponsorship. Doctors should recognise the need for more reliable health websites for their patients.

  14. Index Relativity and Patron Search Strategy.

    ERIC Educational Resources Information Center

    Allison, DeeAnn; Childers Scott

    2002-01-01

    Describes a study at the University of Nebraska-Lincoln that compared searches in two different keyword indexes with similar content where search results were dependent on search strategy quality, search engine execution, and content. Results showed search engine execution had an impact on the number of matches and that users ignored search help…

  15. SearchGUI: A Highly Adaptable Common Interface for Proteomics Search and de Novo Engines.

    PubMed

    Barsnes, Harald; Vaudel, Marc

    2018-05-25

    Mass-spectrometry-based proteomics has become the standard approach for identifying and quantifying proteins. A vital step consists of analyzing experimentally generated mass spectra to identify the underlying peptide sequences for later mapping to the originating proteins. We here present the latest developments in SearchGUI, a common open-source interface for the most frequently used freely available proteomics search and de novo engines that has evolved into a central component in numerous bioinformatics workflows.

  16. The Role of Exploratory Talk in Classroom Search Engine Tasks

    ERIC Educational Resources Information Center

    Knight, Simon; Mercer, Neil

    2015-01-01

    While search engines are commonly used by children to find information, and in classroom-based activities, children are not adept in their information seeking or evaluation of information sources. Prior work has explored such activities in isolated, individual contexts, failing to account for the collaborative, discourse-mediated nature of search…

  17. A Competitive and Experiential Assignment in Search Engine Optimization Strategy

    ERIC Educational Resources Information Center

    Clarke, Theresa B.; Clarke, Irvine, III

    2014-01-01

    Despite an increase in ad spending and demand for employees with expertise in search engine optimization (SEO), methods for teaching this important marketing strategy have received little coverage in the literature. Using Bloom's cognitive goals hierarchy as a framework, this experiential assignment provides a process for educators who may be new…

  18. Teaching Search Engine Marketing through the Google Ad Grants Program

    ERIC Educational Resources Information Center

    Clarke, Theresa B.; Murphy, Jamie; Wetsch, Lyle R.; Boeck, Harold

    2018-01-01

    Instructors may find it difficult to stay abreast of the rapidly changing nature of search engine marketing (SEM) and to incorporate hands-on, practical classroom experiences. One solution is Google Ad Grants, a nonprofit edition of Google AdWords that provides up to $10,000 monthly in free advertising. A quasi-experiment revealed no differences…

  19. PR Students' Perceptions and Readiness for Using Search Engine Optimization

    ERIC Educational Resources Information Center

    Moody, Mia; Bates, Elizabeth

    2013-01-01

    Enough evidence is available to support the idea that public relations professionals must possess search engine optimization (SEO) skills to assist clients in a full-service capacity; however, little research exists on how much college students know about the tactic and best practices for incorporating SEO into course curriculum. Furthermore, much…

  20. Through the Google Goggles: Sociopolitical Bias in Search Engine Design

    NASA Astrophysics Data System (ADS)

    Diaz, A.

    Search engines like Google are essential to navigating the Web's endless supply of news, political information, and citizen discourse. The mechanisms and conditions under which search results are selected should therefore be of considerable interest to media scholars, political theorists, and citizens alike. In this chapter, I adopt a "deliberative" ideal for search engines and examine whether Google exhibits the "same old" media biases of mainstreaming, hypercommercialism, and industry consolidation. In the end, serious objections to Google are raised: Google may favor popularity over richness; it provides advertising that competes directly with "editorial" content; it so overwhelmingly dominates the industry that users seldom get a second opinion, and this is unlikely to change. Ultimately, however, the results of this analysis may speak less about Google than about contradictions in the deliberative ideal and the so-called "inherently democratic" nature of the Web.

  1. Health search engine with e-document analysis for reliable search results.

    PubMed

    Gaudinat, Arnaud; Ruch, Patrick; Joubert, Michel; Uziel, Philippe; Strauss, Anne; Thonnet, Michèle; Baud, Robert; Spahni, Stéphane; Weber, Patrick; Bonal, Juan; Boyer, Celia; Fieschi, Marius; Geissbuhler, Antoine

    2006-01-01

    After a review of the existing practical solution available to the citizen to retrieve eHealth document, the paper describes an original specialized search engine WRAPIN. WRAPIN uses advanced cross lingual information retrieval technologies to check information quality by synthesizing medical concepts, conclusions and references contained in the health literature, to identify accurate, relevant sources. Thanks to MeSH terminology [1] (Medical Subject Headings from the U.S. National Library of Medicine) and advanced approaches such as conclusion extraction from structured document, reformulation of the query, WRAPIN offers to the user a privileged access to navigate through multilingual documents without language or medical prerequisites. The results of an evaluation conducted on the WRAPIN prototype show that results of the WRAPIN search engine are perceived as informative 65% (59% for a general-purpose search engine), reliable and trustworthy 72% (41% for the other engine) by users. But it leaves room for improvement such as the increase of database coverage, the explanation of the original functionalities and an audience adaptability. Thanks to evaluation outcomes, WRAPIN is now in exploitation on the HON web site (http://www.healthonnet.org), free of charge. Intended to the citizen it is a good alternative to general-purpose search engines when the user looks up trustworthy health and medical information or wants to check automatically a doubtful content of a Web page.

  2. Using Search Engine Data as a Tool to Predict Syphilis.

    PubMed

    Young, Sean D; Torrone, Elizabeth A; Urata, John; Aral, Sevgi O

    2018-07-01

    Researchers have suggested that social media and online search data might be used to monitor and predict syphilis and other sexually transmitted diseases. Because people at risk for syphilis might seek sexual health and risk-related information on the internet, we investigated associations between internet state-level search query data (e.g., Google Trends) and reported weekly syphilis cases. We obtained weekly counts of reported primary and secondary syphilis for 50 states from 2012 to 2014 from the US Centers for Disease Control and Prevention. We collected weekly internet search query data regarding 25 risk-related keywords from 2012 to 2014 for 50 states using Google Trends. We joined 155 weeks of Google Trends data with 1-week lag to weekly syphilis data for a total of 7750 data points. Using the least absolute shrinkage and selection operator, we trained three linear mixed models on the first 10 weeks of each year. We validated models for 2012 and 2014 for the following 52 weeks and the 2014 model for the following 42 weeks. The models, consisting of different sets of keyword predictors for each year, accurately predicted 144 weeks of primary and secondary syphilis counts for each state, with an overall average R of 0.9 and overall average root mean squared error of 4.9. We used Google Trends search data from the prior week to predict cases of syphilis in the following weeks for each state. Further research could explore how search data could be integrated into public health monitoring systems.

  3. A Search Engine That's Aware of Your Needs

    NASA Technical Reports Server (NTRS)

    2005-01-01

    Internet research can be compared to trying to drink from a firehose. Such a wealth of information is available that even the simplest inquiry can sometimes generate tens of thousands of leads, more information than most people can handle, and more burdensome than most can endure. Like everyone else, NASA scientists rely on the Internet as a primary search tool. Unlike the average user, though, NASA scientists perform some pretty sophisticated, involved research. To help manage the Internet and to allow researchers at NASA to gain better, more efficient access to the wealth of information, the Agency needed a search tool that was more refined and intelligent than the typical search engine. Partnership NASA funded Stottler Henke, Inc., of San Mateo, California, a cutting-edge software company, with a Small Business Innovation Research (SBIR) contract to develop the Aware software for searching through the vast stores of knowledge quickly and efficiently. The partnership was through NASA s Ames Research Center.

  4. Design implications for task-specific search utilities for retrieval and re-engineering of code

    NASA Astrophysics Data System (ADS)

    Iqbal, Rahat; Grzywaczewski, Adam; Halloran, John; Doctor, Faiyaz; Iqbal, Kashif

    2017-05-01

    The importance of information retrieval systems is unquestionable in the modern society and both individuals as well as enterprises recognise the benefits of being able to find information effectively. Current code-focused information retrieval systems such as Google Code Search, Codeplex or Koders produce results based on specific keywords. However, these systems do not take into account developers' context such as development language, technology framework, goal of the project, project complexity and developer's domain expertise. They also impose additional cognitive burden on users in switching between different interfaces and clicking through to find the relevant code. Hence, they are not used by software developers. In this paper, we discuss how software engineers interact with information and general-purpose information retrieval systems (e.g. Google, Yahoo!) and investigate to what extent domain-specific search and recommendation utilities can be developed in order to support their work-related activities. In order to investigate this, we conducted a user study and found that software engineers followed many identifiable and repeatable work tasks and behaviours. These behaviours can be used to develop implicit relevance feedback-based systems based on the observed retention actions. Moreover, we discuss the implications for the development of task-specific search and collaborative recommendation utilities embedded with the Google standard search engine and Microsoft IntelliSense for retrieval and re-engineering of code. Based on implicit relevance feedback, we have implemented a prototype of the proposed collaborative recommendation system, which was evaluated in a controlled environment simulating the real-world situation of professional software engineers. The evaluation has achieved promising initial results on the precision and recall performance of the system.

  5. Predicting Drug Recalls From Internet Search Engine Queries.

    PubMed

    Yom-Tov, Elad

    2017-01-01

    Batches of pharmaceuticals are sometimes recalled from the market when a safety issue or a defect is detected in specific production runs of a drug. Such problems are usually detected when patients or healthcare providers report abnormalities to medical authorities. Here, we test the hypothesis that defective production lots can be detected earlier by monitoring queries to Internet search engines. We extracted queries from the USA to the Bing search engine, which mentioned one of the 5195 pharmaceutical drugs during 2015 and all recall notifications issued by the Food and Drug Administration (FDA) during that year. By using attributes that quantify the change in query volume at the state level, we attempted to predict if a recall of a specific drug will be ordered by FDA in a time horizon ranging from 1 to 40 days in future. Our results show that future drug recalls can indeed be identified with an AUC of 0.791 and a lift at 5% of approximately 6 when predicting a recall occurring one day ahead. This performance degrades as prediction is made for longer periods ahead. The most indicative attributes for prediction are sudden spikes in query volume about a specific medicine in each state. Recalls of prescription drugs and those estimated to be of medium-risk are more likely to be identified using search query data. These findings suggest that aggregated Internet search engine data can be used to facilitate in early warning of faulty batches of medicines.

  6. Finding Business Information on the "Invisible Web": Search Utilities vs. Conventional Search Engines.

    ERIC Educational Resources Information Center

    Darrah, Brenda

    Researchers for small businesses, which may have no access to expensive databases or market research reports, must often rely on information found on the Internet, which can be difficult to find. Although current conventional Internet search engines are now able to index over on billion documents, there are many more documents existing in…

  7. Respiratory syncytial virus tracking using internet search engine data.

    PubMed

    Oren, Eyal; Frere, Justin; Yom-Tov, Eran; Yom-Tov, Elad

    2018-04-03

    Respiratory Syncytial Virus (RSV) is the leading cause of hospitalization in children less than 1 year of age in the United States. Internet search engine queries may provide high resolution temporal and spatial data to estimate and predict disease activity. After filtering an initial list of 613 symptoms using high-resolution Bing search logs, we used Google Trends data between 2004 and 2016 for a smaller list of 50 terms to build predictive models of RSV incidence for five states where long-term surveillance data was available. We then used domain adaptation to model RSV incidence for the 45 remaining US states. Surveillance data sources (hospitalization and laboratory reports) were highly correlated, as were laboratory reports with search engine data. The four terms which were most often statistically significantly correlated as time series with the surveillance data in the five state models were RSV, flu, pneumonia, and bronchiolitis. Using our models, we tracked the spread of RSV by observing the time of peak use of the search term in different states. In general, the RSV peak moved from south-east (Florida) to the north-west US. Our study represents the first time that RSV has been tracked using Internet data results and highlights successful use of search filters and domain adaptation techniques, using data at multiple resolutions. Our approach may assist in identifying spread of both local and more widespread RSV transmission and may be applicable to other seasonal conditions where comprehensive epidemiological data is difficult to collect or obtain.

  8. Next-Generation Search Engines for Information Retrieval

    SciTech Connect

    Devarakonda, Ranjeet; Hook, Leslie A; Palanisamy, Giri

    In the recent years, there have been significant advancements in the areas of scientific data management and retrieval techniques, particularly in terms of standards and protocols for archiving data and metadata. Scientific data is rich, and spread across different places. In order to integrate these pieces together, a data archive and associated metadata should be generated. Data should be stored in a format that can be retrievable and more importantly it should be in a format that will continue to be accessible as technology changes, such as XML. While general-purpose search engines (such as Google or Bing) are useful formore » finding many things on the Internet, they are often of limited usefulness for locating Earth Science data relevant (for example) to a specific spatiotemporal extent. By contrast, tools that search repositories of structured metadata can locate relevant datasets with fairly high precision, but the search is limited to that particular repository. Federated searches (such as Z39.50) have been used, but can be slow and the comprehensiveness can be limited by downtime in any search partner. An alternative approach to improve comprehensiveness is for a repository to harvest metadata from other repositories, possibly with limits based on subject matter or access permissions. Searches through harvested metadata can be extremely responsive, and the search tool can be customized with semantic augmentation appropriate to the community of practice being served. One such system, Mercury, a metadata harvesting, data discovery, and access system, built for researchers to search to, share and obtain spatiotemporal data used across a range of climate and ecological sciences. Mercury is open-source toolset, backend built on Java and search capability is supported by the some popular open source search libraries such as SOLR and LUCENE. Mercury harvests the structured metadata and key data from several data providing servers around the world and builds a

  9. ESIP Documentation Cluster Session: GCMD Keyword Update

    NASA Technical Reports Server (NTRS)

    Stevens, Tyler

    2018-01-01

    The Global Change Master Directory (GCMD) Keywords are a hierarchical set of controlled Earth Science vocabularies that help ensure Earth science data and services are described in a consistent and comprehensive manner and allow for the precise searching of collection-level metadata and subsequent retrieval of data and services. Initiated over twenty years ago, the GCMD Keywords are periodically analyzed for relevancy and will continue to be refined and expanded in response to user needs. This talk explores the current status of the GCMD keywords, the value and usage that the keywords bring to different tools/agencies as it relates to data discovery, and how the keywords relate to SWEET (Semantic Web for Earth and Environmental Terminology) Ontologies.

  10. The EBI search engine: EBI search as a service-making biological data accessible for all.

    PubMed

    Park, Young M; Squizzato, Silvano; Buso, Nicola; Gur, Tamer; Lopez, Rodrigo

    2017-07-03

    We present an update of the EBI Search engine, an easy-to-use fast text search and indexing system with powerful data navigation and retrieval capabilities. The interconnectivity that exists between data resources at EMBL-EBI provides easy, quick and precise navigation and a better understanding of the relationship between different data types that include nucleotide and protein sequences, genes, gene products, proteins, protein domains, protein families, enzymes and macromolecular structures, as well as the life science literature. EBI Search provides a powerful RESTful API that enables its integration into third-party portals, thus providing 'Search as a Service' capabilities, which are the main topic of this article. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. The EBI search engine: EBI search as a service—making biological data accessible for all

    PubMed Central

    Park, Young M.; Squizzato, Silvano; Buso, Nicola; Gur, Tamer

    2017-01-01

    Abstract We present an update of the EBI Search engine, an easy-to-use fast text search and indexing system with powerful data navigation and retrieval capabilities. The interconnectivity that exists between data resources at EMBL–EBI provides easy, quick and precise navigation and a better understanding of the relationship between different data types that include nucleotide and protein sequences, genes, gene products, proteins, protein domains, protein families, enzymes and macromolecular structures, as well as the life science literature. EBI Search provides a powerful RESTful API that enables its integration into third-party portals, thus providing ‘Search as a Service’ capabilities, which are the main topic of this article. PMID:28472374

  12. Query Log Analysis of an Electronic Health Record Search Engine

    PubMed Central

    Yang, Lei; Mei, Qiaozhu; Zheng, Kai; Hanauer, David A.

    2011-01-01

    We analyzed a longitudinal collection of query logs of a full-text search engine designed to facilitate information retrieval in electronic health records (EHR). The collection, 202,905 queries and 35,928 user sessions recorded over a course of 4 years, represents the information-seeking behavior of 533 medical professionals, including frontline practitioners, coding personnel, patient safety officers, and biomedical researchers for patient data stored in EHR systems. In this paper, we present descriptive statistics of the queries, a categorization of information needs manifested through the queries, as well as temporal patterns of the users’ information-seeking behavior. The results suggest that information needs in medical domain are substantially more sophisticated than those that general-purpose web search engines need to accommodate. Therefore, we envision there exists a significant challenge, along with significant opportunities, to provide intelligent query recommendations to facilitate information retrieval in EHR. PMID:22195150

  13. Cancer Internet search activity on a major search engine, United States 2001-2003.

    PubMed

    Cooper, Crystale Purvis; Mallon, Kenneth P; Leadbetter, Steven; Pollack, Lori A; Peipins, Lucy A

    2005-07-01

    To locate online health information, Internet users typically use a search engine, such as Yahoo! or Google. We studied Yahoo! search activity related to the 23 most common cancers in the United States. The objective was to test three potential correlates of Yahoo! cancer search activity--estimated cancer incidence, estimated cancer mortality, and the volume of cancer news coverage--and to study the periodicity of and peaks in Yahoo! cancer search activity. Yahoo! cancer search activity was obtained from a proprietary database called the Yahoo! Buzz Index. The American Cancer Society's estimates of cancer incidence and mortality were used. News reports associated with specific cancer types were identified using the LexisNexis "US News" database, which includes more than 400 national and regional newspapers and a variety of newswire services. The Yahoo! search activity associated with specific cancers correlated with their estimated incidence (Spearman rank correlation, rho = 0.50, P = .015), estimated mortality (rho = 0.66, P = .001), and volume of related news coverage (rho = 0.88, P < .001). Yahoo! cancer search activity tended to be higher on weekdays and during national cancer awareness months but lower during summer months; cancer news coverage also tended to follow these trends. Sharp increases in Yahoo! search activity scores from one day to the next appeared to be associated with increases in relevant news coverage. Media coverage appears to play a powerful role in prompting online searches for cancer information. Internet search activity offers an innovative tool for passive surveillance of health information-seeking behavior.

  14. Cancer Internet Search Activity on a Major Search Engine, United States 2001-2003

    PubMed Central

    Cooper, Crystale Purvis; Mallon, Kenneth P; Leadbetter, Steven; Peipins, Lucy A

    2005-01-01

    Background To locate online health information, Internet users typically use a search engine, such as Yahoo! or Google. We studied Yahoo! search activity related to the 23 most common cancers in the United States. Objective The objective was to test three potential correlates of Yahoo! cancer search activity—estimated cancer incidence, estimated cancer mortality, and the volume of cancer news coverage—and to study the periodicity of and peaks in Yahoo! cancer search activity. Methods Yahoo! cancer search activity was obtained from a proprietary database called the Yahoo! Buzz Index. The American Cancer Society's estimates of cancer incidence and mortality were used. News reports associated with specific cancer types were identified using the LexisNexis “US News” database, which includes more than 400 national and regional newspapers and a variety of newswire services. Results The Yahoo! search activity associated with specific cancers correlated with their estimated incidence (Spearman rank correlation, ρ = 0.50, P = .015), estimated mortality (ρ = 0.66, P = .001), and volume of related news coverage (ρ = 0.88, P < .001). Yahoo! cancer search activity tended to be higher on weekdays and during national cancer awareness months but lower during summer months; cancer news coverage also tended to follow these trends. Sharp increases in Yahoo! search activity scores from one day to the next appeared to be associated with increases in relevant news coverage. Conclusions Media coverage appears to play a powerful role in prompting online searches for cancer information. Internet search activity offers an innovative tool for passive surveillance of health information–seeking behavior. PMID:15998627

  15. Searching Choices: Quantifying Decision-Making Processes Using Search Engine Data.

    PubMed

    Moat, Helen Susannah; Olivola, Christopher Y; Chater, Nick; Preis, Tobias

    2016-07-01

    When making a decision, humans consider two types of information: information they have acquired through their prior experience of the world, and further information they gather to support the decision in question. Here, we present evidence that data from search engines such as Google can help us model both sources of information. We show that statistics from search engines on the frequency of content on the Internet can help us estimate the statistical structure of prior experience; and, specifically, we outline how such statistics can inform psychological theories concerning the valuation of human lives, or choices involving delayed outcomes. Turning to information gathering, we show that search query data might help measure human information gathering, and it may predict subsequent decisions. Such data enable us to compare information gathered across nations, where analyses suggest, for example, a greater focus on the future in countries with a higher per capita GDP. We conclude that search engine data constitute a valuable new resource for cognitive scientists, offering a fascinating new tool for understanding the human decision-making process. Copyright © 2016 The Authors. Topics in Cognitive Science published by Wiley Periodicals, Inc. on behalf of Cognitive Science Society.

  16. Modification site localization scoring integrated into a search engine.

    PubMed

    Baker, Peter R; Trinidad, Jonathan C; Chalkley, Robert J

    2011-07-01

    Large proteomic data sets identifying hundreds or thousands of modified peptides are becoming increasingly common in the literature. Several methods for assessing the reliability of peptide identifications both at the individual peptide or data set level have become established. However, tools for measuring the confidence of modification site assignments are sparse and are not often employed. A few tools for estimating phosphorylation site assignment reliabilities have been developed, but these are not integral to a search engine, so require a particular search engine output for a second step of processing. They may also require use of a particular fragmentation method and are mostly only applicable for phosphorylation analysis, rather than post-translational modifications analysis in general. In this study, we present the performance of site assignment scoring that is directly integrated into the search engine Protein Prospector, which allows site assignment reliability to be automatically reported for all modifications present in an identified peptide. It clearly indicates when a site assignment is ambiguous (and if so, between which residues), and reports an assignment score that can be translated into a reliability measure for individual site assignments.

  17. LIVIVO - the Vertical Search Engine for Life Sciences.

    PubMed

    Müller, Bernd; Poley, Christoph; Pössel, Jana; Hagelstein, Alexandra; Gübitz, Thomas

    2017-01-01

    The explosive growth of literature and data in the life sciences challenges researchers to keep track of current advancements in their disciplines. Novel approaches in the life science like the One Health paradigm require integrated methodologies in order to link and connect heterogeneous information from databases and literature resources. Current publications in the life sciences are increasingly characterized by the employment of trans-disciplinary methodologies comprising molecular and cell biology, genetics, genomic, epigenomic, transcriptional and proteomic high throughput technologies with data from humans, plants, and animals. The literature search engine LIVIVO empowers retrieval functionality by incorporating various literature resources from medicine, health, environment, agriculture and nutrition. LIVIVO is developed in-house by ZB MED - Information Centre for Life Sciences. It provides a user-friendly and usability-tested search interface with a corpus of 55 Million citations derived from 50 databases. Standardized application programming interfaces are available for data export and high throughput retrieval. The search functions allow for semantic retrieval with filtering options based on life science entities. The service oriented architecture of LIVIVO uses four different implementation layers to deliver search services. A Knowledge Environment is developed by ZB MED to deal with the heterogeneity of data as an integrative approach to model, store, and link semantic concepts within literature resources and databases. Future work will focus on the exploitation of life science ontologies and on the employment of NLP technologies in order to improve query expansion, filters in faceted search, and concept based relevancy rankings in LIVIVO.

  18. The Effectiveness of Web Search Engines to Index New Sites from Different Countries

    ERIC Educational Resources Information Center

    Pirkola, Ari

    2009-01-01

    Introduction: Investigates how effectively Web search engines index new sites from different countries. The primary interest is whether new sites are indexed equally or whether search engines are biased towards certain countries. If major search engines show biased coverage it can be considered a significant economic and political problem because…

  19. Searching for safety: addressing search engine, website, and provider accountability for illicit online drug sales.

    PubMed

    Liang, Bryan A; Mackey, Tim

    2009-01-01

    Online sales of pharmaceuticals are a rapidly growing phenomenon. Yet despite the dangers of purchasing drugs over the Internet, sales continue to escalate. These dangers include patient harm from fake or tainted drugs, lack of clinical oversight, and financial loss. Patients, and in particular vulnerable groups such as seniors and minorities, purchase drugs online either naïvely or because they lack the ability to access medications from other sources due to price considerations. Unfortunately, high risk online drug sources dominate the Internet, and virtually no accountability exists to ensure safety of purchased products. Importantly, search engines such as Google, Yahoo, and MSN, although purportedly requiring "verification" of Internet drug sellers using PharmacyChecker.com requirements, actually allow and profit from illicit drug sales from unverified websites. These search engines are not held accountable for facilitating clearly illegal activities. Both website drug seller anonymity and unethical physicians approving or writing prescriptions without seeing the patient contribute to rampant illegal online drug sales. Efforts in this country and around the world to stem the tide of these sales have had extremely limited effectiveness. Unfortunately, current congressional proposals are fractionated and do not address the key issues of demand by vulnerable patient populations, search engine accountability, and the ease with which financial transactions can be consummated to promote illegal online sales. To deal with the social scourge of illicit online drug sales, this article proposes a comprehensive statutory solution that creates a no-cost/low-cost national Drug Access Program to break the chain of demand from vulnerable patient populations and illicit online sellers, makes all Internet drug sales illegal unless the Internet pharmacy is licensed through a national Internet pharmacy licensing program, prohibits financial transactions for illegal online drug

  20. Search engine imaginary: Visions and values in the co-production of search technology and Europe.

    PubMed

    Mager, Astrid

    2017-04-01

    This article discusses the co-production of search technology and a European identity in the context of the EU data protection reform. The negotiations of the EU data protection legislation ran from 2012 until 2015 and resulted in a unified data protection legislation directly binding for all European member states. I employ a discourse analysis to examine EU policy documents and Austrian media materials related to the reform process. Using the concept 'sociotechnical imaginary', I show how a European imaginary of search engines is forming in the EU policy domain, how a European identity is constructed in the envisioned politics of control, and how national specificities contribute to the making and unmaking of a European identity. I discuss the roles that national technopolitical identities play in shaping both search technology and Europe, taking as an example Austria, a small country with a long history in data protection and a tradition of restrained technology politics.

  1. Cross-System Evaluation of Clinical Trial Search Engines

    PubMed Central

    Jiang, Silis Y.; Weng, Chunhua

    2014-01-01

    Clinical trials are fundamental to the advancement of medicine but constantly face recruitment difficulties. Various clinical trial search engines have been designed to help health consumers identify trials for which they may be eligible. Unfortunately, knowledge of the usefulness and usability of their designs remains scarce. In this study, we used mixed methods, including time-motion analysis, think-aloud protocol, and survey, to evaluate five popular clinical trial search engines with 11 users. Differences in user preferences and time spent on each system were observed and correlated with user characteristics. In general, searching for applicable trials using these systems is a cognitively demanding task. Our results show that user perceptions of these systems are multifactorial. The survey indicated eTACTS being the generally preferred system, but this finding did not persist among all mixed methods. This study confirms the value of mixed-methods for a comprehensive system evaluation. Future system designers must be aware that different users groups expect different functionalities. PMID:25954590

  2. Cross-system evaluation of clinical trial search engines.

    PubMed

    Jiang, Silis Y; Weng, Chunhua

    2014-01-01

    Clinical trials are fundamental to the advancement of medicine but constantly face recruitment difficulties. Various clinical trial search engines have been designed to help health consumers identify trials for which they may be eligible. Unfortunately, knowledge of the usefulness and usability of their designs remains scarce. In this study, we used mixed methods, including time-motion analysis, think-aloud protocol, and survey, to evaluate five popular clinical trial search engines with 11 users. Differences in user preferences and time spent on each system were observed and correlated with user characteristics. In general, searching for applicable trials using these systems is a cognitively demanding task. Our results show that user perceptions of these systems are multifactorial. The survey indicated eTACTS being the generally preferred system, but this finding did not persist among all mixed methods. This study confirms the value of mixed-methods for a comprehensive system evaluation. Future system designers must be aware that different users groups expect different functionalities.

  3. Web page sorting algorithm based on query keyword distance relation

    NASA Astrophysics Data System (ADS)

    Yang, Han; Cui, Hong Gang; Tang, Hao

    2017-08-01

    In order to optimize the problem of page sorting, according to the search keywords in the web page in the relationship between the characteristics of the proposed query keywords clustering ideas. And it is converted into the degree of aggregation of the search keywords in the web page. Based on the PageRank algorithm, the clustering degree factor of the query keyword is added to make it possible to participate in the quantitative calculation. This paper proposes an improved algorithm for PageRank based on the distance relation between search keywords. The experimental results show the feasibility and effectiveness of the method.

  4. iPixel: a visual content-based and semantic search engine for retrieving digitized mammograms by using collective intelligence.

    PubMed

    Alor-Hernández, Giner; Pérez-Gallardo, Yuliana; Posada-Gómez, Rubén; Cortes-Robles, Guillermo; Rodríguez-González, Alejandro; Aguilar-Laserre, Alberto A

    2012-09-01

    Nowadays, traditional search engines such as Google, Yahoo and Bing facilitate the retrieval of information in the format of images, but the results are not always useful for the users. This is mainly due to two problems: (1) the semantic keywords are not taken into consideration and (2) it is not always possible to establish a query using the image features. This issue has been covered in different domains in order to develop content-based image retrieval (CBIR) systems. The expert community has focussed their attention on the healthcare domain, where a lot of visual information for medical analysis is available. This paper provides a solution called iPixel Visual Search Engine, which involves semantics and content issues in order to search for digitized mammograms. iPixel offers the possibility of retrieving mammogram features using collective intelligence and implementing a CBIR algorithm. Our proposal compares not only features with similar semantic meaning, but also visual features. In this sense, the comparisons are made in different ways: by the number of regions per image, by maximum and minimum size of regions per image and by average intensity level of each region. iPixel Visual Search Engine supports the medical community in differential diagnoses related to the diseases of the breast. The iPixel Visual Search Engine has been validated by experts in the healthcare domain, such as radiologists, in addition to experts in digital image analysis.

  5. Noesis: Ontology based Scoped Search Engine and Resource Aggregator for Atmospheric Science

    NASA Astrophysics Data System (ADS)

    Ramachandran, R.; Movva, S.; Li, X.; Cherukuri, P.; Graves, S.

    2006-12-01

    The goal for search engines is to return results that are both accurate and complete. The search engines should find only what you really want and find everything you really want. Search engines (even meta search engines) lack semantics. The basis for search is simply based on string matching between the user's query term and the resource database and the semantics associated with the search string is not captured. For example, if an atmospheric scientist is searching for "pressure" related web resources, most search engines return inaccurate results such as web resources related to blood pressure. In this presentation Noesis, which is a meta-search engine and a resource aggregator that uses domain ontologies to provide scoped search capabilities will be described. Noesis uses domain ontologies to help the user scope the search query to ensure that the search results are both accurate and complete. The domain ontologies guide the user to refine their search query and thereby reduce the user's burden of experimenting with different search strings. Semantics are captured by refining the query terms to cover synonyms, specializations, generalizations and related concepts. Noesis also serves as a resource aggregator. It categorizes the search results from different online resources such as education materials, publications, datasets, web search engines that might be of interest to the user.

  6. The EBI Search engine: providing search and retrieval functionality for biological data from EMBL-EBI.

    PubMed

    Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Gur, Tamer; Cowley, Andrew; Li, Weizhong; Uludag, Mahmut; Pundir, Sangya; Cham, Jennifer A; McWilliam, Hamish; Lopez, Rodrigo

    2015-07-01

    The European Bioinformatics Institute (EMBL-EBI-https://www.ebi.ac.uk) provides free and unrestricted access to data across all major areas of biology and biomedicine. Searching and extracting knowledge across these domains requires a fast and scalable solution that addresses the requirements of domain experts as well as casual users. We present the EBI Search engine, referred to here as 'EBI Search', an easy-to-use fast text search and indexing system with powerful data navigation and retrieval capabilities. API integration provides access to analytical tools, allowing users to further investigate the results of their search. The interconnectivity that exists between data resources at EMBL-EBI provides easy, quick and precise navigation and a better understanding of the relationship between different data types including sequences, genes, gene products, proteins, protein domains, protein families, enzymes and macromolecular structures, together with relevant life science literature. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. Development of a One-Stop Data Search and Discovery Engine using Ontologies for Semantic Mappings (HydroSeek)

    NASA Astrophysics Data System (ADS)

    Piasecki, M.; Beran, B.

    2007-12-01

    Search engines have changed the way we see the Internet. The ability to find the information by just typing in keywords was a big contribution to the overall web experience. While the conventional search engine methodology worked well for textual documents, locating scientific data remains a problem since they are stored in databases not readily accessible by search engine bots. Considering different temporal, spatial and thematic coverage of different databases, especially for interdisciplinary research it is typically necessary to work with multiple data sources. These sources can be federal agencies which generally offer national coverage or regional sources which cover a smaller area with higher detail. However for a given geographic area of interest there often exists more than one database with relevant data. Thus being able to query multiple databases simultaneously is a desirable feature that would be tremendously useful for scientists. Development of such a search engine requires dealing with various heterogeneity issues. In scientific databases, systems often impose controlled vocabularies which ensure that they are generally homogeneous within themselves but are semantically heterogeneous when moving between different databases. This defines the boundaries of possible semantic related problems making it easier to solve than with the conventional search engines that deal with free text. We have developed a search engine that enables querying multiple data sources simultaneously and returns data in a standardized output despite the aforementioned heterogeneity issues between the underlying systems. This application relies mainly on metadata catalogs or indexing databases, ontologies and webservices with virtual globe and AJAX technologies for the graphical user interface. Users can trigger a search of dozens of different parameters over hundreds of thousands of stations from multiple agencies by providing a keyword, a spatial extent, i.e. a bounding box

  8. A Taxonomic Search Engine: Federating taxonomic databases using web services

    PubMed Central

    Page, Roderic DM

    2005-01-01

    Background The taxonomic name of an organism is a key link between different databases that store information on that organism. However, in the absence of a single, comprehensive database of organism names, individual databases lack an easy means of checking the correctness of a name. Furthermore, the same organism may have more than one name, and the same name may apply to more than one organism. Results The Taxonomic Search Engine (TSE) is a web application written in PHP that queries multiple taxonomic databases (ITIS, Index Fungorum, IPNI, NCBI, and uBIO) and summarises the results in a consistent format. It supports "drill-down" queries to retrieve a specific record. The TSE can optionally suggest alternative spellings the user can try. It also acts as a Life Science Identifier (LSID) authority for the source taxonomic databases, providing globally unique identifiers (and associated metadata) for each name. Conclusion The Taxonomic Search Engine is available at and provides a simple demonstration of the potential of the federated approach to providing access to taxonomic names. PMID:15757517

  9. General vs health specialized search engine: a blind comparative evaluation of top search results.

    PubMed

    Pletneva, Natalia; Ruiz de Castaneda, Rafael; Baroz, Frederic; Boyer, Celia

    2014-01-01

    This paper presents the results of a blind comparison of top ten search results retrieved by Google.ch (French) and Khresmoi for everyone, a health specialized search engine. Participants--students of the Faculty of Medicine of the University of Geneva had to complete three tasks and select their preferred results. The majority of the participants have largely preferred Google results while Khresmoi results showed potential to compete in specific topics. The coverage of the results seems to be one of the reasons. The second being that participants do not know how to select quality and transparent health web pages. More awareness, tools and education about the matter is required for the students of Medicine to be able to efficiently distinguish trustworthy online health information.

  10. Improving sensitivity in proteome studies by analysis of false discovery rates for multiple search engines.

    PubMed

    Jones, Andrew R; Siepen, Jennifer A; Hubbard, Simon J; Paton, Norman W

    2009-03-01

    LC-MS experiments can generate large quantities of data, for which a variety of database search engines are available to make peptide and protein identifications. Decoy databases are becoming widely used to place statistical confidence in result sets, allowing the false discovery rate (FDR) to be estimated. Different search engines produce different identification sets so employing more than one search engine could result in an increased number of peptides (and proteins) being identified, if an appropriate mechanism for combining data can be defined. We have developed a search engine independent score, based on FDR, which allows peptide identifications from different search engines to be combined, called the FDR Score. The results demonstrate that the observed FDR is significantly different when analysing the set of identifications made by all three search engines, by each pair of search engines or by a single search engine. Our algorithm assigns identifications to groups according to the set of search engines that have made the identification, and re-assigns the score (combined FDR Score). The combined FDR Score can differentiate between correct and incorrect peptide identifications with high accuracy, allowing on average 35% more peptide identifications to be made at a fixed FDR than using a single search engine.

  11. EIIS: An Educational Information Intelligent Search Engine Supported by Semantic Services

    ERIC Educational Resources Information Center

    Huang, Chang-Qin; Duan, Ru-Lin; Tang, Yong; Zhu, Zhi-Ting; Yan, Yong-Jian; Guo, Yu-Qing

    2011-01-01

    The semantic web brings a new opportunity for efficient information organization and search. To meet the special requirements of the educational field, this paper proposes an intelligent search engine enabled by educational semantic support service, where three kinds of searches are integrated into Educational Information Intelligent Search (EIIS)…

  12. Preliminary Comparison of Three Search Engines for Point of Care Access to MEDLINE® Citations

    PubMed Central

    Hauser, Susan E.; Demner-Fushman, Dina; Ford, Glenn M.; Jacobs, Joshua L.; Thoma, George

    2006-01-01

    Medical resident physicians used MD on Tap in real time to search for MEDLINE citations relevant to clinical questions using three search engines: Essie, Entrez and Google™, in order of performance. PMID:17238564

  13. Cumulative query method for influenza surveillance using search engine data.

    PubMed

    Seo, Dong-Woo; Jo, Min-Woo; Sohn, Chang Hwan; Shin, Soo-Yong; Lee, JaeHo; Yu, Maengsoo; Kim, Won Young; Lim, Kyoung Soo; Lee, Sang-Il

    2014-12-16

    Internet search queries have become an important data source in syndromic surveillance system. However, there is currently no syndromic surveillance system using Internet search query data in South Korea. The objective of this study was to examine correlations between our cumulative query method and national influenza surveillance data. Our study was based on the local search engine, Daum (approximately 25% market share), and influenza-like illness (ILI) data from the Korea Centers for Disease Control and Prevention. A quota sampling survey was conducted with 200 participants to obtain popular queries. We divided the study period into two sets: Set 1 (the 2009/10 epidemiological year for development set 1 and 2010/11 for validation set 1) and Set 2 (2010/11 for development Set 2 and 2011/12 for validation Set 2). Pearson's correlation coefficients were calculated between the Daum data and the ILI data for the development set. We selected the combined queries for which the correlation coefficients were .7 or higher and listed them in descending order. Then, we created a cumulative query method n representing the number of cumulative combined queries in descending order of the correlation coefficient. In validation set 1, 13 cumulative query methods were applied, and 8 had higher correlation coefficients (min=.916, max=.943) than that of the highest single combined query. Further, 11 of 13 cumulative query methods had an r value of ≥.7, but 4 of 13 combined queries had an r value of ≥.7. In validation set 2, 8 of 15 cumulative query methods showed higher correlation coefficients (min=.975, max=.987) than that of the highest single combined query. All 15 cumulative query methods had an r value of ≥.7, but 6 of 15 combined queries had an r value of ≥.7. Cumulative query method showed relatively higher correlation with national influenza surveillance data than combined queries in the development and validation set.

  14. An open-source, mobile-friendly search engine for public medical knowledge.

    PubMed

    Samwald, Matthias; Hanbury, Allan

    2014-01-01

    The World Wide Web has become an important source of information for medical practitioners. To complement the capabilities of currently available web search engines we developed FindMeEvidence, an open-source, mobile-friendly medical search engine. In a preliminary evaluation, the quality of results from FindMeEvidence proved to be competitive with those from TRIP Database, an established, closed-source search engine for evidence-based medicine.

  15. Keywords in the Australian Context

    ERIC Educational Resources Information Center

    Doecke, Brenton; Howie, Mark; Sawyer, Wayne

    2006-01-01

    Borrowing the title of Raymond Williams' famous study, the following reflections--sometimes collective and sometimes individual--are based on a series of "Keywords", specifically: "fear" "community" and "creativity". By reflecting on the meanings these words have for us today, we attempt to capture their…

  16. Dynamics of a macroscopic model characterizing mutualism of search engines and web sites

    NASA Astrophysics Data System (ADS)

    Wang, Yuanshi; Wu, Hong

    2006-05-01

    We present a model to describe the mutualism relationship between search engines and web sites. In the model, search engines and web sites benefit from each other while the search engines are derived products of the web sites and cannot survive independently. Our goal is to show strategies for the search engines to survive in the internet market. From mathematical analysis of the model, we show that mutualism does not always result in survival. We show various conditions under which the search engines would tend to extinction, persist or grow explosively. Then by the conditions, we deduce a series of strategies for the search engines to survive in the internet market. We present conditions under which the initial number of consumers of the search engines has little contribution to their persistence, which is in agreement with the results in previous works. Furthermore, we show novel conditions under which the initial value plays an important role in the persistence of the search engines and deduce new strategies. We also give suggestions for the web sites to cooperate with the search engines in order to form a win-win situation.

  17. Improving sensitivity in proteome studies by analysis of false discovery rates for multiple search engines

    PubMed Central

    Jones, Andrew R.; Siepen, Jennifer A.; Hubbard, Simon J.; Paton, Norman W.

    2010-01-01

    Tandem mass spectrometry, run in combination with liquid chromatography (LC-MS/MS), can generate large numbers of peptide and protein identifications, for which a variety of database search engines are available. Distinguishing correct identifications from false positives is far from trivial because all data sets are noisy, and tend to be too large for manual inspection, therefore probabilistic methods must be employed to balance the trade-off between sensitivity and specificity. Decoy databases are becoming widely used to place statistical confidence in results sets, allowing the false discovery rate (FDR) to be estimated. It has previously been demonstrated that different MS search engines produce different peptide identification sets, and as such, employing more than one search engine could result in an increased number of peptides being identified. However, such efforts are hindered by the lack of a single scoring framework employed by all search engines. We have developed a search engine independent scoring framework based on FDR which allows peptide identifications from different search engines to be combined, called the FDRScore. We observe that peptide identifications made by three search engines are infrequently false positives, and identifications made by only a single search engine, even with a strong score from the source search engine, are significantly more likely to be false positives. We have developed a second score based on the FDR within peptide identifications grouped according to the set of search engines that have made the identification, called the combined FDRScore. We demonstrate by searching large publicly available data sets that the combined FDRScore can differentiate between between correct and incorrect peptide identifications with high accuracy, allowing on average 35% more peptide identifications to be made at a fixed FDR than using a single search engine. PMID:19253293

  18. Surveillance Tools Emerging From Search Engines and Social Media Data for Determining Eye Disease Patterns.

    PubMed

    Deiner, Michael S; Lietman, Thomas M; McLeod, Stephen D; Chodosh, James; Porco, Travis C

    2016-09-01

    Internet-based search engine and social media data may provide a novel complementary source for better understanding the epidemiologic factors of infectious eye diseases, which could better inform eye health care and disease prevention. To assess whether data from internet-based social media and search engines are associated with objective clinic-based diagnoses of conjunctivitis. Data from encounters of 4143 patients diagnosed with conjunctivitis from June 3, 2012, to April 26, 2014, at the University of California San Francisco (UCSF) Medical Center, were analyzed using Spearman rank correlation of each weekly observation to compare demographics and seasonality of nonallergic conjunctivitis with allergic conjunctivitis. Data for patient encounters with diagnoses for glaucoma and influenza were also obtained for the same period and compared with conjunctivitis. Temporal patterns of Twitter and Google web search data, geolocated to the United States and associated with these clinical diagnoses, were compared with the clinical encounters. The a priori hypothesis was that weekly internet-based searches and social media posts about conjunctivitis may reflect the true weekly clinical occurrence of conjunctivitis. Weekly total clinical diagnoses at UCSF of nonallergic conjunctivitis, allergic conjunctivitis, glaucoma, and influenza were compared using Spearman rank correlation with equivalent weekly data on Tweets related to disease or disease-related keyword searches obtained from Google Trends. Seasonality of clinical diagnoses of nonallergic conjunctivitis among the 4143 patients (2364 females [57.1%] and 1776 males [42.9%]) with 5816 conjunctivitis encounters at UCSF correlated strongly with results of Google searches in the United States for the term pink eye (ρ, 0.68 [95% CI, 0.52 to 0.78]; P < .001) and correlated moderately with Twitter results about pink eye (ρ, 0.38 [95% CI, 0.16 to 0.56]; P < .001) and with clinical diagnosis of influenza (ρ, 0

  19. Surveillance Tools Emerging From Search Engines and Social Media Data for Determining Eye Disease Patterns

    PubMed Central

    Deiner, Michael S.; Lietman, Thomas M.; McLeod, Stephen D.; Chodosh, James; Porco, Travis C.

    2016-01-01

    IMPORTANCE Internet-based search engine and social media data may provide a novel complementary source for better understanding the epidemiologic factors of infectious eye diseases, which could better inform eye health care and disease prevention. OBJECTIVE To assess whether data from internet-based social media and search engines are associated with objective clinic-based diagnoses of conjunctivitis. DESIGN, SETTING, AND PARTICIPANTS Data from encounters of 4143 patients diagnosed with conjunctivitis from June 3, 2012, to April 26, 2014, at the University of California San Francisco (UCSF) Medical Center, were analyzed using Spearman rank correlation of each weekly observation to compare demographics and seasonality of nonallergic conjunctivitis with allergic conjunctivitis. Data for patient encounters with diagnoses for glaucoma and influenza were also obtained for the same period and compared with conjunctivitis. Temporal patterns of Twitter and Google web search data, geolocated to the United States and associated with these clinical diagnoses, were compared with the clinical encounters. The a priori hypothesis was that weekly internet-based searches and social media posts about conjunctivitis may reflect the true weekly clinical occurrence of conjunctivitis. MAIN OUTCOMES AND MEASURES Weekly total clinical diagnoses at UCSF of nonallergic conjunctivitis, allergic conjunctivitis, glaucoma, and influenza were compared using Spearman rank correlation with equivalent weekly data on Tweets related to disease or disease-related keyword searches obtained from Google Trends. RESULTS Seasonality of clinical diagnoses of nonallergic conjunctivitis among the 4143 patients (2364 females [57.1%] and 1776 males [42.9%]) with 5816 conjunctivitis encounters at UCSF correlated strongly with results of Google searches in the United States for the term pink eye (ρ, 0.68 [95%CI, 0.52 to 0.78]; P < .001) and correlated moderately with Twitter results about pink eye (ρ, 0.38 [95

  20. The Gaze of the Perfect Search Engine: Google as an Infrastructure of Dataveillance

    NASA Astrophysics Data System (ADS)

    Zimmer, M.

    Web search engines have emerged as a ubiquitous and vital tool for the successful navigation of the growing online informational sphere. The goal of the world's largest search engine, Google, is to "organize the world's information and make it universally accessible and useful" and to create the "perfect search engine" that provides only intuitive, personalized, and relevant results. While intended to enhance intellectual mobility in the online sphere, this chapter reveals that the quest for the perfect search engine requires the widespread monitoring and aggregation of a users' online personal and intellectual activities, threatening the values the perfect search engines were designed to sustain. It argues that these search-based infrastructures of dataveillance contribute to a rapidly emerging "soft cage" of everyday digital surveillance, where they, like other dataveillance technologies before them, contribute to the curtailing of individual freedom, affect users' sense of self, and present issues of deep discrimination and social justice.

  1. Keyword extraction by nonextensivity measure.

    PubMed

    Mehri, Ali; Darooneh, Amir H

    2011-05-01

    The presence of a long-range correlation in the spatial distribution of a relevant word type, in spite of random occurrences of an irrelevant word type, is an important feature of human-written texts. We classify the correlation between the occurrences of words by nonextensive statistical mechanics for the word-ranking process. In particular, we look at the nonextensivity parameter as an alternative metric to measure the spatial correlation in the text, from which the words may be ranked in terms of this measure. Finally, we compare different methods for keyword extraction. © 2011 American Physical Society

  2. Semantic similarity measure in biomedical domain leverage web search engine.

    PubMed

    Chen, Chi-Huang; Hsieh, Sheau-Ling; Weng, Yung-Ching; Chang, Wen-Yung; Lai, Feipei

    2010-01-01

    Semantic similarity measure plays an essential role in Information Retrieval and Natural Language Processing. In this paper we propose a page-count-based semantic similarity measure and apply it in biomedical domains. Previous researches in semantic web related applications have deployed various semantic similarity measures. Despite the usefulness of the measurements in those applications, measuring semantic similarity between two terms remains a challenge task. The proposed method exploits page counts returned by the Web Search Engine. We define various similarity scores for two given terms P and Q, using the page counts for querying P, Q and P AND Q. Moreover, we propose a novel approach to compute semantic similarity using lexico-syntactic patterns with page counts. These different similarity scores are integrated adapting support vector machines, to leverage the robustness of semantic similarity measures. Experimental results on two datasets achieve correlation coefficients of 0.798 on the dataset provided by A. Hliaoutakis, 0.705 on the dataset provide by T. Pedersen with physician scores and 0.496 on the dataset provided by T. Pedersen et al. with expert scores.

  3. Taking It to the Top: A Lesson in Search Engine Optimization

    ERIC Educational Resources Information Center

    Frydenberg, Mark; Miko, John S.

    2011-01-01

    Search engine optimization (SEO), the promoting of a Web site so it achieves optimal position with a search engine's rankings, is an important strategy for organizations and individuals in order to promote their brands online. Techniques for achieving SEO are relevant to students of marketing, computing, media arts, and other disciplines, and many…

  4. Index Compression and Efficient Query Processing in Large Web Search Engines

    ERIC Educational Resources Information Center

    Ding, Shuai

    2013-01-01

    The inverted index is the main data structure used by all the major search engines. Search engines build an inverted index on their collection to speed up query processing. As the size of the web grows, the length of the inverted list structures, which can easily grow to hundreds of MBs or even GBs for common terms (roughly linear in the size of…

  5. Finding Information on the World Wide Web: The Retrieval Effectiveness of Search Engines.

    ERIC Educational Resources Information Center

    Pathak, Praveen; Gordon, Michael

    1999-01-01

    Describes a study that examined the effectiveness of eight search engines for the World Wide Web. Calculated traditional information-retrieval measures of recall and precision at varying numbers of retrieved documents to use as the bases for statistical comparisons of retrieval effectiveness. Also examined the overlap between search engines.…

  6. Evaluation of Proteomic Search Engines for the Analysis of Histone Modifications

    PubMed Central

    2015-01-01

    Identification of histone post-translational modifications (PTMs) is challenging for proteomics search engines. Including many histone PTMs in one search increases the number of candidate peptides dramatically, leading to low search speed and fewer identified spectra. To evaluate database search engines on identifying histone PTMs, we present a method in which one kind of modification is searched each time, for example, unmodified, individually modified, and multimodified, each search result is filtered with false discovery rate less than 1%, and the identifications of multiple search engines are combined to obtain confident results. We apply this method for eight search engines on histone data sets. We find that two search engines, pFind and Mascot, identify most of the confident results at a reasonable speed, so we recommend using them to identify histone modifications. During the evaluation, we also find some important aspects for the analysis of histone modifications. Our evaluation of different search engines on identifying histone modifications will hopefully help those who are hoping to enter the histone proteomics field. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the data set identifier PXD001118. PMID:25167464

  7. Evaluation of proteomic search engines for the analysis of histone modifications.

    PubMed

    Yuan, Zuo-Fei; Lin, Shu; Molden, Rosalynn C; Garcia, Benjamin A

    2014-10-03

    Identification of histone post-translational modifications (PTMs) is challenging for proteomics search engines. Including many histone PTMs in one search increases the number of candidate peptides dramatically, leading to low search speed and fewer identified spectra. To evaluate database search engines on identifying histone PTMs, we present a method in which one kind of modification is searched each time, for example, unmodified, individually modified, and multimodified, each search result is filtered with false discovery rate less than 1%, and the identifications of multiple search engines are combined to obtain confident results. We apply this method for eight search engines on histone data sets. We find that two search engines, pFind and Mascot, identify most of the confident results at a reasonable speed, so we recommend using them to identify histone modifications. During the evaluation, we also find some important aspects for the analysis of histone modifications. Our evaluation of different search engines on identifying histone modifications will hopefully help those who are hoping to enter the histone proteomics field. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the data set identifier PXD001118.

  8. [Title, abstract and keywords: essential issues in medical bibliographic research].

    PubMed

    Bonciu, Carmen

    2005-01-01

    Medical information, conveyed either by books, journal articles, conference and congress papers or posters, represents the product, the result of the medical research. Note that the informational cycle can be shown schematically as Bibliographic information --> Medical research --> Research results --> Bibliographic information. The result of the scientific research (articles, posters, etc.) re-enters the informational cycle, as bibliographic information for a new medical research. The bibliographic research is still a time, and effort consuming activity, despite the explosive growth of information technology. It requires specific medical, information technology and bibliographic knowledge. The present work aims to emphasize the importance of title, keywords and abstract terms selection, to article writing and publication in medical journals, and the proper choice of meta-information in web pages. The bibliographic research was made using two databases with English language information about articles from international medical journals: MEDLINE (PUBMED) and PROQUEST MEDICAL LIBRARY. The results were compared with GOOGLE and YAHOO search. These searching engines are common now in all types of Internet users (including researchers, librarians, etc.). It is essential for the researchers to know the article registration mechanism in a database and the modalities of bibliographic investigation of online databases, so that the title, keyword and abstract terms are selected properly. The use of words not related to the subject, in title, keywords or abstract, results in ambiguities. The writing and the translation of scientific words must also be accurate, mainly when article authors are non-native English speakers: e.g., chimiotherapy (sic)--20 articles in Medline, 270 articles in Google; morphopathology (sic)-- 78 articles in Medline, and 294 in Google; morphopatology (sic)--2 articles in Medline, and 12 articles in Google.

  9. Developing a search engine for pharmacotherapeutic information that is not published in biomedical journals.

    PubMed

    Do Pazo-Oubiña, F; Calvo Pita, C; Puigventós Latorre, F; Periañez-Párraga, L; Ventayol Bosch, P

    2011-01-01

    To identify publishers of pharmacotherapeutic information not found in biomedical journals that focuses on evaluating and providing advice on medicines and to develop a search engine to access this information. Compiling web sites that publish information on the rational use of medicines and have no commercial interests. Free-access web sites in Spanish, Galician, Catalan or English. Designing a search engine using the Google "custom search" application. Overall 159 internet addresses were compiled and were classified into 9 labels. We were able to recover the information from the selected sources using a search engine, which is called "AlquimiA" and available from http://www.elcomprimido.com/FARHSD/AlquimiA.htm. The main sources of pharmacotherapeutic information not published in biomedical journals were identified. The search engine is a useful tool for searching and accessing "grey literature" on the internet. Copyright © 2010 SEFH. Published by Elsevier Espana. All rights reserved.

  10. An assessment of the visibility of MeSH-indexed medical web catalogs through search engines.

    PubMed

    Zweigenbaum, P; Darmoni, S J; Grabar, N; Douyère, M; Benichou, J

    2002-01-01

    Manually indexed Internet health catalogs such as CliniWeb or CISMeF provide resources for retrieving high-quality health information. Users of these quality-controlled subject gateways are most often referred to them by general search engines such as Google, AltaVista, etc. This raises several questions, among which the following: what is the relative visibility of medical Internet catalogs through search engines? This study addresses this issue by measuring and comparing the visibility of six major, MeSH-indexed health catalogs through four different search engines (AltaVista, Google, Lycos, Northern Light) in two languages (English and French). Over half a million queries were sent to the search engines; for most of these search engines, according to our measures at the time the queries were sent, the most visible catalog for English MeSH terms was CliniWeb and the most visible one for French MeSH terms was CISMeF.

  11. Evaluating Open-Source Full-Text Search Engines for Matching ICD-10 Codes.

    PubMed

    Jurcău, Daniel-Alexandru; Stoicu-Tivadar, Vasile

    2016-01-01

    This research presents the results of evaluating multiple free, open-source engines on matching ICD-10 diagnostic codes via full-text searches. The study investigates what it takes to get an accurate match when searching for a specific diagnostic code. For each code the evaluation starts by extracting the words that make up its text and continues with building full-text search queries from the combinations of these words. The queries are then run against all the ICD-10 codes until a match indicates the code in question as a match with the highest relative score. This method identifies the minimum number of words that must be provided in order for the search engines choose the desired entry. The engines analyzed include a popular Java-based full-text search engine, a lightweight engine written in JavaScript which can even execute on the user's browser, and two popular open-source relational database management systems.

  12. Curating the Web: Building a Google Custom Search Engine for the Arts

    ERIC Educational Resources Information Center

    Hennesy, Cody; Bowman, John

    2008-01-01

    Google's first foray onto the web made search simple and results relevant. With its Co-op platform, Google has taken another step toward dramatically increasing the relevancy of search results, further adapting the World Wide Web to local needs. Google Custom Search Engine, a tool on the Co-op platform, puts one in control of his or her own search…

  13. Balancing Efficiency and Effectiveness for Fusion-Based Search Engines in the "Big Data" Environment

    ERIC Educational Resources Information Center

    Li, Jieyu; Huang, Chunlan; Wang, Xiuhong; Wu, Shengli

    2016-01-01

    Introduction: In the big data age, we have to deal with a tremendous amount of information, which can be collected from various types of sources. For information search systems such as Web search engines or online digital libraries, the collection of documents becomes larger and larger. For some queries, an information search system needs to…

  14. GeoSearcher: Location-Based Ranking of Search Engine Results.

    ERIC Educational Resources Information Center

    Watters, Carolyn; Amoudi, Ghada

    2003-01-01

    Discussion of Web queries with geospatial dimensions focuses on an algorithm that assigns location coordinates dynamically to Web sites based on the URL. Describes a prototype search system that uses the algorithm to re-rank search engine results for queries with a geospatial dimension, thus providing an alternative ranking order for search engine…

  15. BioTCM-SE: a semantic search engine for the information retrieval of modern biology and traditional Chinese medicine.

    PubMed

    Chen, Xi; Chen, Huajun; Bi, Xuan; Gu, Peiqin; Chen, Jiaoyan; Wu, Zhaohui

    2014-01-01

    Understanding the functional mechanisms of the complex biological system as a whole is drawing more and more attention in global health care management. Traditional Chinese Medicine (TCM), essentially different from Western Medicine (WM), is gaining increasing attention due to its emphasis on individual wellness and natural herbal medicine, which satisfies the goal of integrative medicine. However, with the explosive growth of biomedical data on the Web, biomedical researchers are now confronted with the problem of large-scale data analysis and data query. Besides that, biomedical data also has a wide coverage which usually comes from multiple heterogeneous data sources and has different taxonomies, making it hard to integrate and query the big biomedical data. Embedded with domain knowledge from different disciplines all regarding human biological systems, the heterogeneous data repositories are implicitly connected by human expert knowledge. Traditional search engines cannot provide accurate and comprehensive search results for the semantically associated knowledge since they only support keywords-based searches. In this paper, we present BioTCM-SE, a semantic search engine for the information retrieval of modern biology and TCM, which provides biologists with a comprehensive and accurate associated knowledge query platform to greatly facilitate the implicit knowledge discovery between WM and TCM.

  16. BioTCM-SE: A Semantic Search Engine for the Information Retrieval of Modern Biology and Traditional Chinese Medicine

    PubMed Central

    Chen, Xi; Chen, Huajun; Bi, Xuan; Gu, Peiqin; Chen, Jiaoyan; Wu, Zhaohui

    2014-01-01

    Understanding the functional mechanisms of the complex biological system as a whole is drawing more and more attention in global health care management. Traditional Chinese Medicine (TCM), essentially different from Western Medicine (WM), is gaining increasing attention due to its emphasis on individual wellness and natural herbal medicine, which satisfies the goal of integrative medicine. However, with the explosive growth of biomedical data on the Web, biomedical researchers are now confronted with the problem of large-scale data analysis and data query. Besides that, biomedical data also has a wide coverage which usually comes from multiple heterogeneous data sources and has different taxonomies, making it hard to integrate and query the big biomedical data. Embedded with domain knowledge from different disciplines all regarding human biological systems, the heterogeneous data repositories are implicitly connected by human expert knowledge. Traditional search engines cannot provide accurate and comprehensive search results for the semantically associated knowledge since they only support keywords-based searches. In this paper, we present BioTCM-SE, a semantic search engine for the information retrieval of modern biology and TCM, which provides biologists with a comprehensive and accurate associated knowledge query platform to greatly facilitate the implicit knowledge discovery between WM and TCM. PMID:24772189

  17. A Full-Text-Based Search Engine for Finding Highly Matched Documents Across Multiple Categories

    NASA Technical Reports Server (NTRS)

    Nguyen, Hung D.; Steele, Gynelle C.

    2016-01-01

    This report demonstrates the full-text-based search engine that works on any Web-based mobile application. The engine has the capability to search databases across multiple categories based on a user's queries and identify the most relevant or similar. The search results presented here were found using an Android (Google Co.) mobile device; however, it is also compatible with other mobile phones.

  18. Social Work Literature Searching: Current Issues with Databases and Online Search Engines

    ERIC Educational Resources Information Center

    McGinn, Tony; Taylor, Brian; McColgan, Mary; McQuilkan, Janice

    2016-01-01

    Objectives: To compare the performance of a range of search facilities; and to illustrate the execution of a comprehensive literature search for qualitative evidence in social work. Context: Developments in literature search methods and comparisons of search facilities help facilitate access to the best available evidence for social workers.…

  19. Domain Knowledge, Search Behaviour, and Search Effectiveness of Engineering and Science Students: An Exploratory Study

    ERIC Educational Resources Information Center

    Zhang, Xiangmin; Anghelescu, Hermina G. B.; Yuan, Xiaojun

    2005-01-01

    Introduction: This study sought to answer three questions: 1) Would the level of domain knowledge significantly affect the user's search behaviour? 2) Would the level of domain knowledge significantly affect search effectiveness, and 3) What would be the relationship between search behaviour and search effectiveness? Method: Participants were…

  20. World Wide Web Search Engines: AltaVista and Yahoo.

    ERIC Educational Resources Information Center

    Machovec, George S., Ed.

    1996-01-01

    Examines the history, structure, and search capabilities of Internet search tools AltaVista and Yahoo. AltaVista provides relevance-ranked feedback on full-text searches. Yahoo indexes Web "citations" only but does organize information hierarchically into predefined categories. Yahoo has recently become a publicly held company and…

  1. Earth Science Keyword Stewardship: Access and Management through NASA's Global Change Master Directory (GCMD) Keyword Management System (KMS)

    NASA Astrophysics Data System (ADS)

    Stevens, T.; Olsen, L. M.; Ritz, S.; Morahan, M.; Aleman, A.; Cepero, L.; Gokey, C.; Holland, M.; Cordova, R.; Areu, S.; Cherry, T.; Tran-Ho, H.

    2012-12-01

    Discovering Earth science data can be complex if the catalog holding the data lacks structure. Controlled keyword vocabularies within metadata catalogues can improve data discovery. NASA's Global Change Master Directory's (GCMD) Keyword Management System (KMS) is a recently released a RESTful web service for managing and providing access to controlled keywords (science keywords, service keywords, platforms, instruments, providers, locations, projects, data resolution, etc.). The KMS introduces a completely new paradigm for the use and management of the keywords and allows access to these keywords as SKOS Concepts (RDF), OWL, standard XML, and CSV. A universally unique identifier (UUID) is automatically assigned to each keyword, which uniquely identifies each concept and its associated information. A component of the KMS is the keyword manager, an internal tool that allows GCMD science coordinators to manage concepts. This includes adding, modifying, and deleting broader, narrower, or related concepts and associated definitions. The controlled keyword vocabulary represents over 20 years of effort and collaboration with the Earth science community. The maintenance, stability, and ongoing vigilance in maintaining mutually exclusive and parallel keyword lists is important for a "normalized" search and discovery, and provides a unique advantage for the science community. Modifications and additions are made based on community suggestions and internal review. To help maintain keyword integrity, science keyword rules and procedures for modification of keywords were developed. This poster will highlight the use of the KMS as a beneficial service for the stewardship and access of the GCMD keywords. Users will learn how to access the KMS and utilize the keywords. Best practices for managing an extensive keyword hierarchy will also be discussed. Participants will learn the process for making keyword suggestions, which subsequently help in building a controlled keyword

  2. A unified architecture for biomedical search engines based on semantic web technologies.

    PubMed

    Jalali, Vahid; Matash Borujerdi, Mohammad Reza

    2011-04-01

    There is a huge growth in the volume of published biomedical research in recent years. Many medical search engines are designed and developed to address the over growing information needs of biomedical experts and curators. Significant progress has been made in utilizing the knowledge embedded in medical ontologies and controlled vocabularies to assist these engines. However, the lack of common architecture for utilized ontologies and overall retrieval process, hampers evaluating different search engines and interoperability between them under unified conditions. In this paper, a unified architecture for medical search engines is introduced. Proposed model contains standard schemas declared in semantic web languages for ontologies and documents used by search engines. Unified models for annotation and retrieval processes are other parts of introduced architecture. A sample search engine is also designed and implemented based on the proposed architecture in this paper. The search engine is evaluated using two test collections and results are reported in terms of precision vs. recall and mean average precision for different approaches used by this search engine.

  3. An Improved Forensic Science Information Search.

    PubMed

    Teitelbaum, J

    2015-01-01

    Although thousands of search engines and databases are available online, finding answers to specific forensic science questions can be a challenge even to experienced Internet users. Because there is no central repository for forensic science information, and because of the sheer number of disciplines under the forensic science umbrella, forensic scientists are often unable to locate material that is relevant to their needs. The author contends that using six publicly accessible search engines and databases can produce high-quality search results. The six resources are Google, PubMed, Google Scholar, Google Books, WorldCat, and the National Criminal Justice Reference Service. Carefully selected keywords and keyword combinations, designating a keyword phrase so that the search engine will search on the phrase and not individual keywords, and prompting search engines to retrieve PDF files are among the techniques discussed. Copyright © 2015 Central Police University.

  4. MSblender: A probabilistic approach for integrating peptide identifications from multiple database search engines.

    PubMed

    Kwon, Taejoon; Choi, Hyungwon; Vogel, Christine; Nesvizhskii, Alexey I; Marcotte, Edward M

    2011-07-01

    Shotgun proteomics using mass spectrometry is a powerful method for protein identification but suffers limited sensitivity in complex samples. Integrating peptide identifications from multiple database search engines is a promising strategy to increase the number of peptide identifications and reduce the volume of unassigned tandem mass spectra. Existing methods pool statistical significance scores such as p-values or posterior probabilities of peptide-spectrum matches (PSMs) from multiple search engines after high scoring peptides have been assigned to spectra, but these methods lack reliable control of identification error rates as data are integrated from different search engines. We developed a statistically coherent method for integrative analysis, termed MSblender. MSblender converts raw search scores from search engines into a probability score for every possible PSM and properly accounts for the correlation between search scores. The method reliably estimates false discovery rates and identifies more PSMs than any single search engine at the same false discovery rate. Increased identifications increment spectral counts for most proteins and allow quantification of proteins that would not have been quantified by individual search engines. We also demonstrate that enhanced quantification contributes to improve sensitivity in differential expression analyses.

  5. MSblender: a probabilistic approach for integrating peptide identifications from multiple database search engines

    PubMed Central

    Kwon, Taejoon; Choi, Hyungwon; Vogel, Christine; Nesvizhskii, Alexey I.; Marcotte, Edward M.

    2011-01-01

    Shotgun proteomics using mass spectrometry is a powerful method for protein identification but suffers limited sensitivity in complex samples. Integrating peptide identifications from multiple database search engines is a promising strategy to increase the number of peptide identifications and reduce the volume of unassigned tandem mass spectra. Existing methods pool statistical significance scores such as p-values or posterior probabilities of peptide-spectrum matches (PSMs) from multiple search engines after high scoring peptides have been assigned to spectra, but these methods lack reliable control of identification error rates as data are integrated from different search engines. We developed a statistically coherent method for integrative analysis, termed MSblender. MSblender converts raw search scores from search engines into a probability score for all possible PSMs and properly accounts for the correlation between search scores. The method reliably estimates false discovery rates and identifies more PSMs than any single search engine at the same false discovery rate. Increased identifications increment spectral counts for all detected proteins and allow quantification of proteins that would not have been quantified by individual search engines. We also demonstrate that enhanced quantification contributes to improve sensitivity in differential expression analyses. PMID:21488652

  6. An approach in building a chemical compound search engine in oracle database.

    PubMed

    Wang, H; Volarath, P; Harrison, R

    2005-01-01

    A searching or identifying of chemical compounds is an important process in drug design and in chemistry research. An efficient search engine involves a close coupling of the search algorithm and database implementation. The database must process chemical structures, which demands the approaches to represent, store, and retrieve structures in a database system. In this paper, a general database framework for working as a chemical compound search engine in Oracle database is described. The framework is devoted to eliminate data type constrains for potential search algorithms, which is a crucial step toward building a domain specific query language on top of SQL. A search engine implementation based on the database framework is also demonstrated. The convenience of the implementation emphasizes the efficiency and simplicity of the framework.

  7. Comparing image search behaviour in the ARRS GoldMiner search engine and a clinical PACS/RIS.

    PubMed

    De-Arteaga, Maria; Eggel, Ivan; Do, Bao; Rubin, Daniel; Kahn, Charles E; Müller, Henning

    2015-08-01

    Information search has changed the way we manage knowledge and the ubiquity of information access has made search a frequent activity, whether via Internet search engines or increasingly via mobile devices. Medical information search is in this respect no different and much research has been devoted to analyzing the way in which physicians aim to access information. Medical image search is a much smaller domain but has gained much attention as it has different characteristics than search for text documents. While web search log files have been analysed many times to better understand user behaviour, the log files of hospital internal systems for search in a PACS/RIS (Picture Archival and Communication System, Radiology Information System) have rarely been analysed. Such a comparison between a hospital PACS/RIS search and a web system for searching images of the biomedical literature is the goal of this paper. Objectives are to identify similarities and differences in search behaviour of the two systems, which could then be used to optimize existing systems and build new search engines. Log files of the ARRS GoldMiner medical image search engine (freely accessible on the Internet) containing 222,005 queries, and log files of Stanford's internal PACS/RIS search called radTF containing 18,068 queries were analysed. Each query was preprocessed and all query terms were mapped to the RadLex (Radiology Lexicon) terminology, a comprehensive lexicon of radiology terms created and maintained by the Radiological Society of North America, so the semantic content in the queries and the links between terms could be analysed, and synonyms for the same concept could be detected. RadLex was mainly created for the use in radiology reports, to aid structured reporting and the preparation of educational material (Lanlotz, 2006) [1]. In standard medical vocabularies such as MeSH (Medical Subject Headings) and UMLS (Unified Medical Language System) specific terms of radiology are often

  8. Using Search Engines to Investigate Shared Migraine Experiences.

    PubMed

    Burns, Sara M; Turner, Dana P; Sexton, Katherine E; Deng, Hao; Houle, Timothy T

    2017-09-01

    To investigate migraine patterns in the United States using Google search data and utilize this information to better understand societal-level trends. Additionally, we aimed to evaluate time-series relationships between migraines and social factors. Extensive research has been done on clinical factors associated with migraines, yet population-level social factors have not been widely explored. Migraine internet search data may provide insight into migraine trends beyond information that can be gleaned from other sources. In this longitudinal analysis of open access data, we performed a time-series analysis in which about 12 years of Google Trends data (January 1, 2004 to August 15, 2016) were assessed. Data points were captured at a daily level and Google's 0-100 adjusted scale was used as the primary outcome to enable the comparison of relative popularity in the migraine search term. We hypothesized that the volume of relative migraine Google searches would be affected by societal aspects such as day of the week, holidays, and novel social events. Several recurrent social factors that drive migraine searches were identified. Of these, day of the week had the most significant impact on the volume of Google migraine searches. On average, Mondays accumulated 13.31 higher relative search volume than Fridays (95% CI: 11.12-15.51, P ≤ .001). Surprisingly, holidays were associated with lower relative migraine search volumes. Christmas Day had 13.84 lower relative search volumes (95% CI: 6.26-21.43, P ≤ .001) and Thanks giving had 20.18 lower relative search volumes (95% CI: 12.55-27.82, P ≤ .001) than days that were not holidays. Certain novel social events and extreme weather also appear to be associated with relative migraine Google search volume. Social factors play a crucial role in explaining population level migraine patterns, and thus, warrant further exploration. © 2017 American Headache Society.

  9. When Every Search Engine Knows Your Name. Online Treasures

    ERIC Educational Resources Information Center

    Balas, Janet L.

    2005-01-01

    This article explores personalized search technologies showing that various vendors are trying a variety of approaches. Brief descriptions are given of some of beta projects in effort to assist librarians seeking to offer services that meet their patrons' individual needs by exploring how personal search technologies are being used on the Web in…

  10. The Extreme Searcher's Guide to Web Search Engines: A Handbook for the Serious Searcher. 2nd Edition.

    ERIC Educational Resources Information Center

    Hock, Randolph

    This book aims to facilitate more effective and efficient use of World Wide Web search engines by helping the reader: know the basic structure of the major search engines; become acquainted with those attributes (features, benefits, options, content, etc.) that search engines have in common and where they differ; know the main strengths and…

  11. Research on the optimization strategy of web search engine based on data mining

    NASA Astrophysics Data System (ADS)

    Chen, Ronghua

    2018-04-01

    With the wide application of search engines, web site information has become an important way for people to obtain information. People have found that they are growing in an increasingly explosive manner. Web site information is verydifficult to find the information they need, and now the search engine can not meet the need, so there is an urgent need for the network to provide website personalized information service, data mining technology for this new challenge is to find a breakthrough. In order to improve people's accuracy of finding information from websites, a website search engine optimization strategy based on data mining is proposed, and verified by website search engine optimization experiment. The results show that the proposed strategy improves the accuracy of the people to find information, and reduces the time for people to find information. It has an important practical value.

  12. The History of the Internet Search Engine: Navigational Media and the Traffic Commodity

    NASA Astrophysics Data System (ADS)

    van Couvering, E.

    This chapter traces the economic development of the search engine industry over time, beginning with the earliest Web search engines and ending with the domination of the market by Google, Yahoo! and MSN. Specifically, it focuses on the ways in which search engines are similar to and different from traditional media institutions, and how the relations between traditional and Internet media have changed over time. In addition to its historical overview, a core contribution of this chapter is the analysis of the industry using a media value chain based on audiences rather than on content, and the development of traffic as the core unit of exchange. It shows that traditional media companies failed when they attempted to create vertically integrated portals in the late 1990s, based on the idea of controlling Internet content, while search engines succeeded in creating huge "virtually integrated" networks based on control of Internet traffic rather than Internet content.

  13. Optimizing Online Suicide Prevention: A Search Engine-Based Tailored Approach.

    PubMed

    Arendt, Florian; Scherr, Sebastian

    2017-11-01

    Search engines are increasingly used to seek suicide-related information online, which can serve both harmful and helpful purposes. Google acknowledges this fact and presents a suicide-prevention result for particular search terms. Unfortunately, the result is only presented to a limited number of visitors. Hence, Google is missing the opportunity to provide help to vulnerable people. We propose a two-step approach to a tailored optimization: First, research will identify the risk factors. Second, search engines will reweight algorithms according to the risk factors. In this study, we show that the query share of the search term "poisoning" on Google shows substantial peaks corresponding to peaks in actual suicidal behavior. Accordingly, thresholds for showing the suicide-prevention result should be set to the lowest levels during the spring, on Sundays and Mondays, on New Year's Day, and on Saturdays following Thanksgiving. Search engines can help to save lives globally by utilizing a more tailored approach to suicide prevention.

  14. Development and evaluation of a biomedical search engine using a predicate-based vector space model.

    PubMed

    Kwak, Myungjae; Leroy, Gondy; Martinez, Jesse D; Harwell, Jeffrey

    2013-10-01

    Although biomedical information available in articles and patents is increasing exponentially, we continue to rely on the same information retrieval methods and use very few keywords to search millions of documents. We are developing a fundamentally different approach for finding much more precise and complete information with a single query using predicates instead of keywords for both query and document representation. Predicates are triples that are more complex datastructures than keywords and contain more structured information. To make optimal use of them, we developed a new predicate-based vector space model and query-document similarity function with adjusted tf-idf and boost function. Using a test bed of 107,367 PubMed abstracts, we evaluated the first essential function: retrieving information. Cancer researchers provided 20 realistic queries, for which the top 15 abstracts were retrieved using a predicate-based (new) and keyword-based (baseline) approach. Each abstract was evaluated, double-blind, by cancer researchers on a 0-5 point scale to calculate precision (0 versus higher) and relevance (0-5 score). Precision was significantly higher (p<.001) for the predicate-based (80%) than for the keyword-based (71%) approach. Relevance was almost doubled with the predicate-based approach-2.1 versus 1.6 without rank order adjustment (p<.001) and 1.34 versus 0.98 with rank order adjustment (p<.001) for predicate--versus keyword-based approach respectively. Predicates can support more precise searching than keywords, laying the foundation for rich and sophisticated information search. Copyright © 2013 Elsevier Inc. All rights reserved.

  15. A natural language based search engine for ICD10 diagnosis encoding.

    PubMed

    Baud, Robert

    2004-01-01

    We have developed a multiple step process for implementing an ICD10 search engine. The complexity of the task has been shown and we recommend collecting adequate expertise before starting any implementation. Underestimation of the expert time and inadequate data resources are probable reasons for failure. We also claim that when all conditions are met in term of resource and availability of the expertise, the benefits of a responsive ICD10 search engine will be present and the investment will be successful.

  16. Determination of geographic variance in stroke prevalence using Internet search engine analytics.

    PubMed

    Walcott, Brian P; Nahed, Brian V; Kahle, Kristopher T; Redjal, Navid; Coumans, Jean-Valery

    2011-06-01

    Previous methods to determine stroke prevalence, such as nationwide surveys, are labor-intensive endeavors. Recent advances in search engine query analytics have led to a new metric for disease surveillance to evaluate symptomatic phenomenon, such as influenza. The authors hypothesized that the use of search engine query data can determine the prevalence of stroke. The Google Insights for Search database was accessed to analyze anonymized search engine query data. The authors' search strategy utilized common search queries used when attempting either to identify the signs and symptoms of a stroke or to perform stroke education. The search logic was as follows: (stroke signs + stroke symptoms + mini stroke--heat) from January 1, 2005, to December 31, 2010. The relative number of searches performed (the interest level) for this search logic was established for all 50 states and the District of Columbia. A Pearson product-moment correlation coefficient was calculated from the statespecific stroke prevalence data previously reported. Web search engine interest level was available for all 50 states and the District of Columbia over the time period for January 1, 2005-December 31, 2010. The interest level was highest in Alabama and Tennessee (100 and 96, respectively) and lowest in California and Virginia (58 and 53, respectively). The Pearson correlation coefficient (r) was calculated to be 0.47 (p = 0.0005, 2-tailed). Search engine query data analysis allows for the determination of relative stroke prevalence. Further investigation will reveal the reliability of this metric to determine temporal pattern analysis and prevalence in this and other symptomatic diseases.

  17. Enhanced Identification of Eligibility for Depression Research Using an Electronic Medical Record Search Engine

    PubMed Central

    Seyfried, Lisa; Hanauer, David; Nease, Donald; Albeiruti, Rashad; Kavanagh, Janet; Kales, Helen C.

    2009-01-01

    Purpose Electronic medical records (EMR) have become part of daily practice for many physicians. Attempts have been made to apply electronic search engine technology to speed EMR review. This was a prospective, observational study to compare the speed and accuracy of electronic search engine vs. manual review of the EMR. Methods Three raters reviewed 49 cases in the EMR to screen for eligibility in a depression study using the electronic search engine (EMERSE). One week later raters received a scrambled set of the same patients including 9 distractor cases, and used manual EMR review to determine eligibility. For both methods, accuracy was assessed for the original 49 cases by comparison with a gold standard rater. Results Use of EMERSE resulted in considerable time savings; chart reviews using EMERSE were significantly faster than traditional manual review (p=0.03). The percent agreement of raters with the gold standard (e.g. concurrent validity) using either EMERSE or manual review was not significantly different. Conclusions Using a search engine optimized for finding clinical information in the free-text sections of the EMR can provide significant time savings while preserving reliability. The major power of this search engine is not from a more advanced and sophisticated search algorithm, but rather from a user interface designed explicitly to help users search the entire medical record in a way that protects health information. PMID:19560962

  18. Enhanced identification of eligibility for depression research using an electronic medical record search engine.

    PubMed

    Seyfried, Lisa; Hanauer, David A; Nease, Donald; Albeiruti, Rashad; Kavanagh, Janet; Kales, Helen C

    2009-12-01

    Electronic medical records (EMRs) have become part of daily practice for many physicians. Attempts have been made to apply electronic search engine technology to speed EMR review. This was a prospective, observational study to compare the speed and clinical accuracy of a medical record search engine vs. manual review of the EMR. Three raters reviewed 49 cases in the EMR to screen for eligibility in a depression study using the electronic medical record search engine (EMERSE). One week later raters received a scrambled set of the same patients including 9 distractor cases, and used manual EMR review to determine eligibility. For both methods, accuracy was assessed for the original 49 cases by comparison with a gold standard rater. Use of EMERSE resulted in considerable time savings; chart reviews using EMERSE were significantly faster than traditional manual review (p=0.03). The percent agreement of raters with the gold standard (e.g. concurrent validity) using either EMERSE or manual review was not significantly different. Using a search engine optimized for finding clinical information in the free-text sections of the EMR can provide significant time savings while preserving clinical accuracy. The major power of this search engine is not from a more advanced and sophisticated search algorithm, but rather from a user interface designed explicitly to help users search the entire medical record in a way that protects health information.

  19. Development and tuning of an original search engine for patent libraries in medicinal chemistry

    PubMed Central

    2014-01-01

    Background The large increase in the size of patent collections has led to the need of efficient search strategies. But the development of advanced text-mining applications dedicated to patents of the biomedical field remains rare, in particular to address the needs of the pharmaceutical & biotech industry, which intensively uses patent libraries for competitive intelligence and drug development. Methods We describe here the development of an advanced retrieval engine to search information in patent collections in the field of medicinal chemistry. We investigate and combine different strategies and evaluate their respective impact on the performance of the search engine applied to various search tasks, which covers the putatively most frequent search behaviours of intellectual property officers in medical chemistry: 1) a prior art search task; 2) a technical survey task; and 3) a variant of the technical survey task, sometimes called known-item search task, where a single patent is targeted. Results The optimal tuning of our engine resulted in a top-precision of 6.76% for the prior art search task, 23.28% for the technical survey task and 46.02% for the variant of the technical survey task. We observed that co-citation boosting was an appropriate strategy to improve prior art search tasks, while IPC classification of queries was improving retrieval effectiveness for technical survey tasks. Surprisingly, the use of the full body of the patent was always detrimental for search effectiveness. It was also observed that normalizing biomedical entities using curated dictionaries had simply no impact on the search tasks we evaluate. The search engine was finally implemented as a web-application within Novartis Pharma. The application is briefly described in the report. Conclusions We have presented the development of a search engine dedicated to patent search, based on state of the art methods applied to patent corpora. We have shown that a proper tuning of the system to

  20. Development and tuning of an original search engine for patent libraries in medicinal chemistry.

    PubMed

    Pasche, Emilie; Gobeill, Julien; Kreim, Olivier; Oezdemir-Zaech, Fatma; Vachon, Therese; Lovis, Christian; Ruch, Patrick

    2014-01-01

    The large increase in the size of patent collections has led to the need of efficient search strategies. But the development of advanced text-mining applications dedicated to patents of the biomedical field remains rare, in particular to address the needs of the pharmaceutical & biotech industry, which intensively uses patent libraries for competitive intelligence and drug development. We describe here the development of an advanced retrieval engine to search information in patent collections in the field of medicinal chemistry. We investigate and combine different strategies and evaluate their respective impact on the performance of the search engine applied to various search tasks, which covers the putatively most frequent search behaviours of intellectual property officers in medical chemistry: 1) a prior art search task; 2) a technical survey task; and 3) a variant of the technical survey task, sometimes called known-item search task, where a single patent is targeted. The optimal tuning of our engine resulted in a top-precision of 6.76% for the prior art search task, 23.28% for the technical survey task and 46.02% for the variant of the technical survey task. We observed that co-citation boosting was an appropriate strategy to improve prior art search tasks, while IPC classification of queries was improving retrieval effectiveness for technical survey tasks. Surprisingly, the use of the full body of the patent was always detrimental for search effectiveness. It was also observed that normalizing biomedical entities using curated dictionaries had simply no impact on the search tasks we evaluate. The search engine was finally implemented as a web-application within Novartis Pharma. The application is briefly described in the report. We have presented the development of a search engine dedicated to patent search, based on state of the art methods applied to patent corpora. We have shown that a proper tuning of the system to adapt to the various search tasks

  1. Use of Web Search Engines and Personalisation in Information Searching for Educational Purposes

    ERIC Educational Resources Information Center

    Salehi, Sara; Du, Jia Tina; Ashman, Helen

    2018-01-01

    Introduction: Students increasingly depend on Web search for educational purposes. This causes concerns among education providers as some evidence indicates that in higher education, the disadvantages of Web search and personalised information are not justified by the benefits. Method: One hundred and twenty university students were surveyed about…

  2. Inefficiency and Bias of Search Engines in Retrieving References Containing Scientific Names of Fossil Amphibians

    ERIC Educational Resources Information Center

    Brown, Lauren E.; Dubois, Alain; Shepard, Donald B.

    2008-01-01

    Retrieval efficiencies of paper-based references in journals and other serials containing 10 scientific names of fossil amphibians were determined for seven major search engines. Retrievals were compared to the number of references obtained covering the period 1895-2006 by a Comprehensive Search. The latter was primarily a traditional…

  3. Impact of Internet Search Engines on OPAC Users: A Study of Punjabi University, Patiala (India)

    ERIC Educational Resources Information Center

    Kumar, Shiv

    2012-01-01

    Purpose: The aim of this paper is to study the impact of internet search engine usage with special reference to OPAC searches in the Punjabi University Library, Patiala, Punjab (India). Design/methodology/approach: The primary data were collected from 352 users comprising faculty, research scholars and postgraduate students of the university. A…

  4. Search Engines: A Primer on Finding Information on the World Wide Web.

    ERIC Educational Resources Information Center

    Maddux, Cleborne

    1996-01-01

    Presents an annotated list of several World Wide Web search engines, including Yahoo, Infoseek, Alta Vista, Magellan, Lycos, Webcrawler, Excite, Deja News, and the LISZT Directory of discussion groups. Uniform Resource Locators (URLs) are included. Discussion assesses performance and describes rules and syntax for refining or limiting a search.…

  5. 'Sciencenet'--towards a global search and share engine for all scientific knowledge.

    PubMed

    Lütjohann, Dominic S; Shah, Asmi H; Christen, Michael P; Richter, Florian; Knese, Karsten; Liebel, Urban

    2011-06-15

    Modern biological experiments create vast amounts of data which are geographically distributed. These datasets consist of petabytes of raw data and billions of documents. Yet to the best of our knowledge, a search engine technology that searches and cross-links all different data types in life sciences does not exist. We have developed a prototype distributed scientific search engine technology, 'Sciencenet', which facilitates rapid searching over this large data space. By 'bringing the search engine to the data', we do not require server farms. This platform also allows users to contribute to the search index and publish their large-scale data to support e-Science. Furthermore, a community-driven method guarantees that only scientific content is crawled and presented. Our peer-to-peer approach is sufficiently scalable for the science web without performance or capacity tradeoff. The free to use search portal web page and the downloadable client are accessible at: http://sciencenet.kit.edu. The web portal for index administration is implemented in ASP.NET, the 'AskMe' experiment publisher is written in Python 2.7, and the backend 'YaCy' search engine is based on Java 1.6.

  6. ‘Sciencenet’—towards a global search and share engine for all scientific knowledge

    PubMed Central

    Lütjohann, Dominic S.; Shah, Asmi H.; Christen, Michael P.; Richter, Florian; Knese, Karsten; Liebel, Urban

    2011-01-01

    Summary: Modern biological experiments create vast amounts of data which are geographically distributed. These datasets consist of petabytes of raw data and billions of documents. Yet to the best of our knowledge, a search engine technology that searches and cross-links all different data types in life sciences does not exist. We have developed a prototype distributed scientific search engine technology, ‘Sciencenet’, which facilitates rapid searching over this large data space. By ‘bringing the search engine to the data’, we do not require server farms. This platform also allows users to contribute to the search index and publish their large-scale data to support e-Science. Furthermore, a community-driven method guarantees that only scientific content is crawled and presented. Our peer-to-peer approach is sufficiently scalable for the science web without performance or capacity tradeoff. Availability and Implementation: The free to use search portal web page and the downloadable client are accessible at: http://sciencenet.kit.edu. The web portal for index administration is implemented in ASP.NET, the ‘AskMe’ experiment publisher is written in Python 2.7, and the backend ‘YaCy’ search engine is based on Java 1.6. Contact: urban.liebel@kit.edu Supplementary Material: Detailed instructions and descriptions can be found on the project homepage: http://sciencenet.kit.edu. PMID:21493657

  7. Development and Evaluation of Thesauri-Based Bibliographic Biomedical Search Engine

    ERIC Educational Resources Information Center

    Alghoson, Abdullah

    2017-01-01

    Due to the large volume and exponential growth of biomedical documents (e.g., books, journal articles), it has become increasingly challenging for biomedical search engines to retrieve relevant documents based on users' search queries. Part of the challenge is the matching mechanism of free-text indexing that performs matching based on…

  8. Supporting inter-topic entity search for biomedical Linked Data based on heterogeneous relationships.

    PubMed

    Zong, Nansu; Lee, Sungin; Ahn, Jinhyun; Kim, Hong-Gee

    2017-08-01

    The keyword-based entity search restricts search space based on the preference of search. When given keywords and preferences are not related to the same biomedical topic, existing biomedical Linked Data search engines fail to deliver satisfactory results. This research aims to tackle this issue by supporting an inter-topic search-improving search with inputs, keywords and preferences, under different topics. This study developed an effective algorithm in which the relations between biomedical entities were used in tandem with a keyword-based entity search, Siren. The algorithm, PERank, which is an adaptation of Personalized PageRank (PPR), uses a pair of input: (1) search preferences, and (2) entities from a keyword-based entity search with a keyword query, to formalize the search results on-the-fly based on the index of the precomputed Individual Personalized PageRank Vectors (IPPVs). Our experiments were performed over ten linked life datasets for two query sets, one with keyword-preference topic correspondence (intra-topic search), and the other without (inter-topic search). The experiments showed that the proposed method achieved better search results, for example a 14% increase in precision for the inter-topic search than the baseline keyword-based search engine. The proposed method improved the keyword-based biomedical entity search by supporting the inter-topic search without affecting the intra-topic search based on the relations between different entities. Copyright © 2017 Elsevier Ltd. All rights reserved.

  9. Reconsidering the Rhizome: A Textual Analysis of Web Search Engines as Gatekeepers of the Internet

    NASA Astrophysics Data System (ADS)

    Hess, A.

    Critical theorists have often drawn from Deleuze and Guattari's notion of the rhizome when discussing the potential of the Internet. While the Internet may structurally appear as a rhizome, its day-to-day usage by millions via search engines precludes experiencing the random interconnectedness and potential democratizing function. Through a textual analysis of four search engines, I argue that Web searching has grown hierarchies, or "trees," that organize data in tracts of knowledge and place users in marketing niches rather than assist in the development of new knowledge.

  10. ClinicalKey: a point-of-care search engine.

    PubMed

    Vardell, Emily

    2013-01-01

    ClinicalKey is a new point-of-care resource for health care professionals. Through controlled vocabulary, ClinicalKey offers a cross section of resources on diseases and procedures, from journals to e-books and practice guidelines to patient education. A sample search was conducted to demonstrate the features of the database, and a comparison with similar tools is presented.

  11. Essie: A Concept-based Search Engine for Structured Biomedical Text

    PubMed Central

    Ide, Nicholas C.; Loane, Russell F.; Demner-Fushman, Dina

    2007-01-01

    This article describes the algorithms implemented in the Essie search engine that is currently serving several Web sites at the National Library of Medicine. Essie is a phrase-based search engine with term and concept query expansion and probabilistic relevancy ranking. Essie’s design is motivated by an observation that query terms are often conceptually related to terms in a document, without actually occurring in the document text. Essie’s performance was evaluated using data and standard evaluation methods from the 2003 and 2006 Text REtrieval Conference (TREC) Genomics track. Essie was the best-performing search engine in the 2003 TREC Genomics track and achieved results comparable to those of the highest-ranking systems on the 2006 TREC Genomics track task. Essie shows that a judicious combination of exploiting document structure, phrase searching, and concept based query expansion is a useful approach for information retrieval in the biomedical domain. PMID:17329729

  12. FOAMSearch.net: A custom search engine for emergency medicine and critical care.

    PubMed

    Raine, Todd; Thoma, Brent; Chan, Teresa M; Lin, Michelle

    2015-08-01

    The number of online resources read by and pertinent to clinicians has increased dramatically. However, most healthcare professionals still use mainstream search engines as their primary port of entry to the resources on the Internet. These search engines use algorithms that do not make it easy to find clinician-oriented resources. FOAMSearch, a custom search engine (CSE), was developed to find relevant, high-quality online resources for emergency medicine and critical care (EMCC) clinicians. Using Google™ algorithms, it searches a vetted list of >300 blogs, podcasts, wikis, knowledge translation tools, clinical decision support tools and medical journals. Utilisation has increased progressively to >3000 users/month since its launch in 2011. Further study of the role of CSEs to find medical resources is needed, and it might be possible to develop similar CSEs for other areas of medicine. © 2015 Australasian College for Emergency Medicine and Australasian Society for Emergency Medicine.

  13. Agreement between Medline searches using the Medline-CD-Rom and Internet Pubmed, BioMedNet, Medscape and Gateway search-engines.

    PubMed

    Caro-Rojas, Rosa Angela; Eslava-Schmalbach, Javier H

    2005-01-01

    To compare the information obtained from the Medline database using Internet commercial search engines with that obtained from a compact disc (Medline-CD). An agreement study was carried out based on 101 clinical scenarios provided by specialists in internal medicine, pharmacy, gynaecology-obstetrics, surgery and paediatrics. 175 search strategies were employed using the connector AND plus text within quotation marks. The search was limited to 1991-1999. Internet search-engines were selected by common criteria. Identical search strategies were independently applied to and masked from Internet search engines, as well as the Medline-CD. 3,488 articles were obtained using 129 search strategies. Agreement with the Medline-CD was 54% for PubMed, 57% for Gateway, 54% for Medscape and 65% for BioMedNet. The highest agreement rate for a given speciality (paediatrics) was 78.1% for BioMedNet, having greater -/- than +/+ agreement. Even though free access to Medline has encouraged the boom and growth of evidence-based medicine, these results must be considered within the context of which search engine was selected for doing the searches. The Internet search engines studied showed a poor agreement with the Medline-CD, the rate of agreement differing according to speciality, thus significantly affecting searches and their reproducibility. Software designed for conducting Medline database searches, including the Medline-CD, must be standardised and validated.

  14. Complex dynamics of our economic life on different scales: insights from search engine query data.

    PubMed

    Preis, Tobias; Reith, Daniel; Stanley, H Eugene

    2010-12-28

    Search engine query data deliver insight into the behaviour of individuals who are the smallest possible scale of our economic life. Individuals are submitting several hundred million search engine queries around the world each day. We study weekly search volume data for various search terms from 2004 to 2010 that are offered by the search engine Google for scientific use, providing information about our economic life on an aggregated collective level. We ask the question whether there is a link between search volume data and financial market fluctuations on a weekly time scale. Both collective 'swarm intelligence' of Internet users and the group of financial market participants can be regarded as a complex system of many interacting subunits that react quickly to external changes. We find clear evidence that weekly transaction volumes of S&P 500 companies are correlated with weekly search volume of corresponding company names. Furthermore, we apply a recently introduced method for quantifying complex correlations in time series with which we find a clear tendency that search volume time series and transaction volume time series show recurring patterns.

  15. An autonomous organic reaction search engine for chemical reactivity

    NASA Astrophysics Data System (ADS)

    Dragone, Vincenza; Sans, Victor; Henson, Alon B.; Granda, Jaroslaw M.; Cronin, Leroy

    2017-06-01

    The exploration of chemical space for new reactivity, reactions and molecules is limited by the need for separate work-up-separation steps searching for molecules rather than reactivity. Herein we present a system that can autonomously evaluate chemical reactivity within a network of 64 possible reaction combinations and aims for new reactivity, rather than a predefined set of targets. The robotic system combines chemical handling, in-line spectroscopy and real-time feedback and analysis with an algorithm that is able to distinguish and select the most reactive pathways, generating a reaction selection index (RSI) without need for separate work-up or purification steps. This allows the automatic navigation of a chemical network, leading to previously unreported molecules while needing only to do a fraction of the total possible reactions without any prior knowledge of the chemistry. We show the RSI correlates with reactivity and is able to search chemical space using the most reactive pathways.

  16. An autonomous organic reaction search engine for chemical reactivity

    PubMed Central

    Dragone, Vincenza; Sans, Victor; Henson, Alon B.; Granda, Jaroslaw M.; Cronin, Leroy

    2017-01-01

    The exploration of chemical space for new reactivity, reactions and molecules is limited by the need for separate work-up-separation steps searching for molecules rather than reactivity. Herein we present a system that can autonomously evaluate chemical reactivity within a network of 64 possible reaction combinations and aims for new reactivity, rather than a predefined set of targets. The robotic system combines chemical handling, in-line spectroscopy and real-time feedback and analysis with an algorithm that is able to distinguish and select the most reactive pathways, generating a reaction selection index (RSI) without need for separate work-up or purification steps. This allows the automatic navigation of a chemical network, leading to previously unreported molecules while needing only to do a fraction of the total possible reactions without any prior knowledge of the chemistry. We show the RSI correlates with reactivity and is able to search chemical space using the most reactive pathways. PMID:28598440

  17. An autonomous organic reaction search engine for chemical reactivity.

    PubMed

    Dragone, Vincenza; Sans, Victor; Henson, Alon B; Granda, Jaroslaw M; Cronin, Leroy

    2017-06-09

    The exploration of chemical space for new reactivity, reactions and molecules is limited by the need for separate work-up-separation steps searching for molecules rather than reactivity. Herein we present a system that can autonomously evaluate chemical reactivity within a network of 64 possible reaction combinations and aims for new reactivity, rather than a predefined set of targets. The robotic system combines chemical handling, in-line spectroscopy and real-time feedback and analysis with an algorithm that is able to distinguish and select the most reactive pathways, generating a reaction selection index (RSI) without need for separate work-up or purification steps. This allows the automatic navigation of a chemical network, leading to previously unreported molecules while needing only to do a fraction of the total possible reactions without any prior knowledge of the chemistry. We show the RSI correlates with reactivity and is able to search chemical space using the most reactive pathways.

  18. CellAtlasSearch: a scalable search engine for single cells.

    PubMed

    Srivastava, Divyanshu; Iyer, Arvind; Kumar, Vibhor; Sengupta, Debarka

    2018-05-21

    Owing to the advent of high throughput single cell transcriptomics, past few years have seen exponential growth in production of gene expression data. Recently efforts have been made by various research groups to homogenize and store single cell expression from a large number of studies. The true value of this ever increasing data deluge can be unlocked by making it searchable. To this end, we propose CellAtlasSearch, a novel search architecture for high dimensional expression data, which is massively parallel as well as light-weight, thus infinitely scalable. In CellAtlasSearch, we use a Graphical Processing Unit (GPU) friendly version of Locality Sensitive Hashing (LSH) for unmatched speedup in data processing and query. Currently, CellAtlasSearch features over 300 000 reference expression profiles including both bulk and single-cell data. It enables the user query individual single cell transcriptomes and finds matching samples from the database along with necessary meta information. CellAtlasSearch aims to assist researchers and clinicians in characterizing unannotated single cells. It also facilitates noise free, low dimensional representation of single-cell expression profiles by projecting them on a wide variety of reference samples. The web-server is accessible at: http://www.cellatlassearch.com.

  19. `Googling' Terrorists: Are Northern Irish Terrorists Visible on Internet Search Engines?

    NASA Astrophysics Data System (ADS)

    Reilly, P.

    In this chapter, the analysis suggests that Northern Irish terrorists are not visible on Web search engines when net users employ conventional Internet search techniques. Editors of mass media organisations traditionally have had the ability to decide whether a terrorist atrocity is `newsworthy,' controlling the `oxygen' supply that sustains all forms of terrorism. This process, also known as `gatekeeping,' is often influenced by the norms of social responsibility, or alternatively, with regard to the interests of the advertisers and corporate sponsors that sustain mass media organisations. The analysis presented in this chapter suggests that Internet search engines can also be characterised as `gatekeepers,' albeit without the ability to shape the content of Websites before it reaches net users. Instead, Internet search engines give priority retrieval to certain Websites within their directory, pointing net users towards these Websites rather than others on the Internet. Net users are more likely to click on links to the more `visible' Websites on Internet search engine directories, these sites invariably being the highest `ranked' in response to a particular search query. A number of factors including the design of the Website and the number of links to external sites determine the `visibility' of a Website on Internet search engines. The study suggests that Northern Irish terrorists and their sympathisers are unlikely to achieve a greater degree of `visibility' online than they enjoy in the conventional mass media through the perpetration of atrocities. Although these groups may have a greater degree of freedom on the Internet to publicise their ideologies, they are still likely to be speaking to the converted or members of the press. Although it is easier to locate Northern Irish terrorist organisations on Internet search engines by linking in via ideology, ideological description searches, such as `Irish Republican' and `Ulster Loyalist,' are more likely to

  20. Development of Health Information Search Engine Based on Metadata and Ontology

    PubMed Central

    Song, Tae-Min; Jin, Dal-Lae

    2014-01-01

    Objectives The aim of the study was to develop a metadata and ontology-based health information search engine ensuring semantic interoperability to collect and provide health information using different application programs. Methods Health information metadata ontology was developed using a distributed semantic Web content publishing model based on vocabularies used to index the contents generated by the information producers as well as those used to search the contents by the users. Vocabulary for health information ontology was mapped to the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), and a list of about 1,500 terms was proposed. The metadata schema used in this study was developed by adding an element describing the target audience to the Dublin Core Metadata Element Set. Results A metadata schema and an ontology ensuring interoperability of health information available on the internet were developed. The metadata and ontology-based health information search engine developed in this study produced a better search result compared to existing search engines. Conclusions Health information search engine based on metadata and ontology will provide reliable health information to both information producer and information consumers. PMID:24872907

  1. Development of health information search engine based on metadata and ontology.

    PubMed

    Song, Tae-Min; Park, Hyeoun-Ae; Jin, Dal-Lae

    2014-04-01

    The aim of the study was to develop a metadata and ontology-based health information search engine ensuring semantic interoperability to collect and provide health information using different application programs. Health information metadata ontology was developed using a distributed semantic Web content publishing model based on vocabularies used to index the contents generated by the information producers as well as those used to search the contents by the users. Vocabulary for health information ontology was mapped to the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), and a list of about 1,500 terms was proposed. The metadata schema used in this study was developed by adding an element describing the target audience to the Dublin Core Metadata Element Set. A metadata schema and an ontology ensuring interoperability of health information available on the internet were developed. The metadata and ontology-based health information search engine developed in this study produced a better search result compared to existing search engines. Health information search engine based on metadata and ontology will provide reliable health information to both information producer and information consumers.

  2. An assessment of the visibility of MeSH-indexed medical web catalogs through search engines.

    PubMed Central

    Zweigenbaum, P.; Darmoni, S. J.; Grabar, N.; Douyère, M.; Benichou, J.

    2002-01-01

    Manually indexed Internet health catalogs such as CliniWeb or CISMeF provide resources for retrieving high-quality health information. Users of these quality-controlled subject gateways are most often referred to them by general search engines such as Google, AltaVista, etc. This raises several questions, among which the following: what is the relative visibility of medical Internet catalogs through search engines? This study addresses this issue by measuring and comparing the visibility of six major, MeSH-indexed health catalogs through four different search engines (AltaVista, Google, Lycos, Northern Light) in two languages (English and French). Over half a million queries were sent to the search engines; for most of these search engines, according to our measures at the time the queries were sent, the most visible catalog for English MeSH terms was CliniWeb and the most visible one for French MeSH terms was CISMeF. PMID:12463965

  3. A Real-Time All-Atom Structural Search Engine for Proteins

    PubMed Central

    Gonzalez, Gabriel; Hannigan, Brett; DeGrado, William F.

    2014-01-01

    Protein designers use a wide variety of software tools for de novo design, yet their repertoire still lacks a fast and interactive all-atom search engine. To solve this, we have built the Suns program: a real-time, atomic search engine integrated into the PyMOL molecular visualization system. Users build atomic-level structural search queries within PyMOL and receive a stream of search results aligned to their query within a few seconds. This instant feedback cycle enables a new “designability”-inspired approach to protein design where the designer searches for and interactively incorporates native-like fragments from proven protein structures. We demonstrate the use of Suns to interactively build protein motifs, tertiary interactions, and to identify scaffolds compatible with hot-spot residues. The official web site and installer are located at http://www.degradolab.org/suns/ and the source code is hosted at https://github.com/godotgildor/Suns (PyMOL plugin, BSD license), https://github.com/Gabriel439/suns-cmd (command line client, BSD license), and https://github.com/Gabriel439/suns-search (search engine server, GPLv2 license). PMID:25079944

  4. The search engine manipulation effect (SEME) and its possible impact on the outcomes of elections

    PubMed Central

    Epstein, Robert; Robertson, Ronald E.

    2015-01-01

    Internet search rankings have a significant impact on consumer choices, mainly because users trust and choose higher-ranked results more than lower-ranked results. Given the apparent power of search rankings, we asked whether they could be manipulated to alter the preferences of undecided voters in democratic elections. Here we report the results of five relevant double-blind, randomized controlled experiments, using a total of 4,556 undecided voters representing diverse demographic characteristics of the voting populations of the United States and India. The fifth experiment is especially notable in that it was conducted with eligible voters throughout India in the midst of India’s 2014 Lok Sabha elections just before the final votes were cast. The results of these experiments demonstrate that (i) biased search rankings can shift the voting preferences of undecided voters by 20% or more, (ii) the shift can be much higher in some demographic groups, and (iii) search ranking bias can be masked so that people show no awareness of the manipulation. We call this type of influence, which might be applicable to a variety of attitudes and beliefs, the search engine manipulation effect. Given that many elections are won by small margins, our results suggest that a search engine company has the power to influence the results of a substantial number of elections with impunity. The impact of such manipulations would be especially large in countries dominated by a single search engine company. PMID:26243876

  5. The search engine manipulation effect (SEME) and its possible impact on the outcomes of elections.

    PubMed

    Epstein, Robert; Robertson, Ronald E

    2015-08-18

    Internet search rankings have a significant impact on consumer choices, mainly because users trust and choose higher-ranked results more than lower-ranked results. Given the apparent power of search rankings, we asked whether they could be manipulated to alter the preferences of undecided voters in democratic elections. Here we report the results of five relevant double-blind, randomized controlled experiments, using a total of 4,556 undecided voters representing diverse demographic characteristics of the voting populations of the United States and India. The fifth experiment is especially notable in that it was conducted with eligible voters throughout India in the midst of India's 2014 Lok Sabha elections just before the final votes were cast. The results of these experiments demonstrate that (i) biased search rankings can shift the voting preferences of undecided voters by 20% or more, (ii) the shift can be much higher in some demographic groups, and (iii) search ranking bias can be masked so that people show no awareness of the manipulation. We call this type of influence, which might be applicable to a variety of attitudes and beliefs, the search engine manipulation effect. Given that many elections are won by small margins, our results suggest that a search engine company has the power to influence the results of a substantial number of elections with impunity. The impact of such manipulations would be especially large in countries dominated by a single search engine company.

  6. IdentiPy: An Extensible Search Engine for Protein Identification in Shotgun Proteomics.

    PubMed

    Levitsky, Lev I; Ivanov, Mark V; Lobas, Anna A; Bubis, Julia A; Tarasova, Irina A; Solovyeva, Elizaveta M; Pridatchenko, Marina L; Gorshkov, Mikhail V

    2018-06-18

    We present an open-source, extensible search engine for shotgun proteomics. Implemented in Python programming language, IdentiPy shows competitive processing speed and sensitivity compared with the state-of-the-art search engines. It is equipped with a user-friendly web interface, IdentiPy Server, enabling the use of a single server installation accessed from multiple workstations. Using a simplified version of X!Tandem scoring algorithm and its novel "autotune" feature, IdentiPy outperforms the popular alternatives on high-resolution data sets. Autotune adjusts the search parameters for the particular data set, resulting in improved search efficiency and simplifying the user experience. IdentiPy with the autotune feature shows higher sensitivity compared with the evaluated search engines. IdentiPy Server has built-in postprocessing and protein inference procedures and provides graphic visualization of the statistical properties of the data set and the search results. It is open-source and can be freely extended to use third-party scoring functions or processing algorithms and allows customization of the search workflow for specialized applications.

  7. A real-time all-atom structural search engine for proteins.

    PubMed

    Gonzalez, Gabriel; Hannigan, Brett; DeGrado, William F

    2014-07-01

    Protein designers use a wide variety of software tools for de novo design, yet their repertoire still lacks a fast and interactive all-atom search engine. To solve this, we have built the Suns program: a real-time, atomic search engine integrated into the PyMOL molecular visualization system. Users build atomic-level structural search queries within PyMOL and receive a stream of search results aligned to their query within a few seconds. This instant feedback cycle enables a new "designability"-inspired approach to protein design where the designer searches for and interactively incorporates native-like fragments from proven protein structures. We demonstrate the use of Suns to interactively build protein motifs, tertiary interactions, and to identify scaffolds compatible with hot-spot residues. The official web site and installer are located at http://www.degradolab.org/suns/ and the source code is hosted at https://github.com/godotgildor/Suns (PyMOL plugin, BSD license), https://github.com/Gabriel439/suns-cmd (command line client, BSD license), and https://github.com/Gabriel439/suns-search (search engine server, GPLv2 license).

  8. Features: Real-Time Adaptive Feature and Document Learning for Web Search.

    ERIC Educational Resources Information Center

    Chen, Zhixiang; Meng, Xiannong; Fowler, Richard H.; Zhu, Binhai

    2001-01-01

    Describes Features, an intelligent Web search engine that is able to perform real-time adaptive feature (i.e., keyword) and document learning. Explains how Features learns from users' document relevance feedback and automatically extracts and suggests indexing keywords relevant to a search query, and learns from users' keyword relevance feedback…

  9. Broadening the search for minority science and engineering doctoral starts

    NASA Astrophysics Data System (ADS)

    Brazziel, William E.; Brazziel, Marian E.

    1995-06-01

    This analysis looked at doctorate completion in science and engineering (S&E) by underrepresented minorities: blacks, Hispanics and Indian Americans. These are the groups we must increasingly depend upon to make up for shortfalls in science and engineering doctorate production among American citizens. These shortfalls derive from truncated birth rates among white people, for the most part. The analysis answered several questions officials will need to know the answers to if we are to plan effectively to develop the talents of these individuals. Specifically, the National Science Foundation asked us to look at the feasibility of involving nontraditional minority science and engineering graduates (baccalaureates at 25+) as doctoral starts, along with minority S&E graduates who had taken jobs with corporations to pay off student loans and military personnel involved in S&E study and S&E work (see NSF report of research under grant SED-9107756). We found that nontraditional minority S&E doctorate recipients matched their traditional counterparts in elapsed time to degree and similar indicators. They had less in the way of support for doctoral study, however. We found that minority S&E graduates who took jobs in corporations were keenly interested in returning to campus to complete degrees. We also found that many bright minority youngsters are studying S&E subjects in the Community College of the Air Force and in U.S. Army SOC colleges. Some have enrolled in baccalaureate programs on university campuses and plan to continue on to the PhD. We concluded that money is important in tapping these talent pools to make up for the demographically driven shortfalls discussed above.

  10. Querying archetype-based EHRs by search ontology-based XPath engineering.

    PubMed

    Kropf, Stefan; Uciteli, Alexandr; Schierle, Katrin; Krücken, Peter; Denecke, Kerstin; Herre, Heinrich

    2018-05-11

    Legacy data and new structured data can be stored in a standardized format as XML-based EHRs on XML databases. Querying documents on these databases is crucial for answering research questions. Instead of using free text searches, that lead to false positive results, the precision can be increased by constraining the search to certain parts of documents. A search ontology-based specification of queries on XML documents defines search concepts and relates them to parts in the XML document structure. Such query specification method is practically introduced and evaluated by applying concrete research questions formulated in natural language on a data collection for information retrieval purposes. The search is performed by search ontology-based XPath engineering that reuses ontologies and XML-related W3C standards. The key result is that the specification of research questions can be supported by the usage of search ontology-based XPath engineering. A deeper recognition of entities and a semantic understanding of the content is necessary for a further improvement of precision and recall. Key limitation is that the application of the introduced process requires skills in ontology and software development. In future, the time consuming ontology development could be overcome by implementing a new clinical role: the clinical ontologist. The introduced Search Ontology XML extension connects Search Terms to certain parts in XML documents and enables an ontology-based definition of queries. Search ontology-based XPath engineering can support research question answering by the specification of complex XPath expressions without deep syntax knowledge about XPaths.

  11. Seasonal trends in sleep-disordered breathing: evidence from Internet search engine query data.

    PubMed

    Ingram, David G; Matthews, Camilla K; Plante, David T

    2015-03-01

    The primary aim of the current study was to test the hypothesis that there is a seasonal component to snoring and obstructive sleep apnea (OSA) through the use of Google search engine query data. Internet search engine query data were retrieved from Google Trends from January 2006 to December 2012. Monthly normalized search volume was obtained over that 7-year period in the USA and Australia for the following search terms: "snoring" and "sleep apnea". Seasonal effects were investigated by fitting cosinor regression models. In addition, the search terms "snoring children" and "sleep apnea children" were evaluated to examine seasonal effects in pediatric populations. Statistically significant seasonal effects were found using cosinor analysis in both USA and Australia for "snoring" (p < 0.00001 for both countries). Similarly, seasonal patterns were observed for "sleep apnea" in the USA (p = 0.001); however, cosinor analysis was not significant for this search term in Australia (p = 0.13). Seasonal patterns for "snoring children" and "sleep apnea children" were observed in the USA (p = 0.002 and p < 0.00001, respectively), with insufficient search volume to examine these search terms in Australia. All searches peaked in the winter or early spring in both countries, with the magnitude of seasonal effect ranging from 5 to 50 %. Our findings indicate that there are significant seasonal trends for both snoring and sleep apnea internet search engine queries, with a peak in the winter and early spring. Further research is indicated to determine the mechanisms underlying these findings, whether they have clinical impact, and if they are associated with other comorbid medical conditions that have similar patterns of seasonal exacerbation.

  12. GEOmetadb: powerful alternative search engine for the Gene Expression Omnibus

    PubMed Central

    Zhu, Yuelin; Davis, Sean; Stephens, Robert; Meltzer, Paul S.; Chen, Yidong

    2008-01-01

    The NCBI Gene Expression Omnibus (GEO) represents the largest public repository of microarray data. However, finding data in GEO can be challenging. We have developed GEOmetadb in an attempt to make querying the GEO metadata both easier and more powerful. All GEO metadata records as well as the relationships between them are parsed and stored in a local MySQL database. A powerful, flexible web search interface with several convenient utilities provides query capabilities not available via NCBI tools. In addition, a Bioconductor package, GEOmetadb that utilizes a SQLite export of the entire GEOmetadb database is also available, rendering the entire GEO database accessible with full power of SQL-based queries from within R. Availability: The web interface and SQLite databases available at http://gbnci.abcc.ncifcrf.gov/geo/. The Bioconductor package is available via the Bioconductor project. The corresponding MATLAB implementation is also available at the same website. Contact: yidong@mail.nih.gov PMID:18842599

  13. Start Your Search Engines. Part One: Taming Google--and Other Tips to Master Web Searches

    ERIC Educational Resources Information Center

    Adam, Anna; Mowers, Helen

    2008-01-01

    There are a lot of useful tools on the Web, all those social applications, and the like. Still most people go online for one thing--to perform a basic search. For most fact-finding missions, the Web is there. But--as media specialists well know--the sheer wealth of online information can hamper efforts to focus on a few reliable references.…

  14. Global polar geospatial information service retrieval based on search engine and ontology reasoning

    USGS Publications Warehouse

    Chen, Nengcheng; E, Dongcheng; Di, Liping; Gong, Jianya; Chen, Zeqiang

    2007-01-01

    In order to improve the access precision of polar geospatial information service on web, a new methodology for retrieving global spatial information services based on geospatial service search and ontology reasoning is proposed, the geospatial service search is implemented to find the coarse service from web, the ontology reasoning is designed to find the refined service from the coarse service. The proposed framework includes standardized distributed geospatial web services, a geospatial service search engine, an extended UDDI registry, and a multi-protocol geospatial information service client. Some key technologies addressed include service discovery based on search engine and service ontology modeling and reasoning in the Antarctic geospatial context. Finally, an Antarctica multi protocol OWS portal prototype based on the proposed methodology is introduced.

  15. Resonance Search for a Heavy Photon in the 2015 Engineering Run Data of the Heavy Photon Search Experiment

    NASA Astrophysics Data System (ADS)

    Moreno, Omar; Heavy Photon Search Collaboration

    2017-01-01

    The Heavy Photon Search (HPS) experiment at Jefferson Lab is searching for a new U(1) vector boson (``heavy photon'',``dark photon'' or A') in the mass range of 20-500 MeV/c2. An A' in this mass range is theoretically favorable and may also mediate dark matter interactions. The A' couples to the ordinary photon through kinetic mixing, which induces their coupling to electric charge. Since heavy photons couple to electrons, they can be produced through a process analogous to bremsstrahlung, subsequently decaying to an e+e- , which can be observed as a narrow resonance above the dominant QED trident background. For suitably small couplings, heavy photons travel detectable distances before decaying, providing a second signature. Using the CEBAF electron beam at Jefferson Lab incident on a thin tungsten target, along with a compact, large acceptance forward spectrometer consisting of a silicon vertex tracker and lead tungstate electromagnetic calorimeter, HPS is accessing unexplored regions in the mass-coupling phase space. The HPS engineering run took place in spring of 2015 using a 1.056 GeV, 50 nA beam and collected 1165 nb-1 (7.29 mC) of data. This talk will present the results of a resonance search for a heavy photon using the engineering run data.

  16. Evaluating a federated medical search engine: tailoring the methodology and reporting the evaluation outcomes.

    PubMed

    Saparova, D; Belden, J; Williams, J; Richardson, B; Schuster, K

    2014-01-01

    Federated medical search engines are health information systems that provide a single access point to different types of information. Their efficiency as clinical decision support tools has been demonstrated through numerous evaluations. Despite their rigor, very few of these studies report holistic evaluations of medical search engines and even fewer base their evaluations on existing evaluation frameworks. To evaluate a federated medical search engine, MedSocket, for its potential net benefits in an established clinical setting. This study applied the Human, Organization, and Technology (HOT-fit) evaluation framework in order to evaluate MedSocket. The hierarchical structure of the HOT-factors allowed for identification of a combination of efficiency metrics. Human fit was evaluated through user satisfaction and patterns of system use; technology fit was evaluated through the measurements of time-on-task and the accuracy of the found answers; and organization fit was evaluated from the perspective of system fit to the existing organizational structure. Evaluations produced mixed results and suggested several opportunities for system improvement. On average, participants were satisfied with MedSocket searches and confident in the accuracy of retrieved answers. However, MedSocket did not meet participants' expectations in terms of download speed, access to information, and relevance of the search results. These mixed results made it necessary to conclude that in the case of MedSocket, technology fit had a significant influence on the human and organization fit. Hence, improving technological capabilities of the system is critical before its net benefits can become noticeable. The HOT-fit evaluation framework was instrumental in tailoring the methodology for conducting a comprehensive evaluation of the search engine. Such multidimensional evaluation of the search engine resulted in recommendations for system improvement.

  17. What Major Search Engines Like Google, Yahoo and Bing Need to Know about Teachers in the UK?

    ERIC Educational Resources Information Center

    Seyedarabi, Faezeh

    2014-01-01

    This article briefly outlines the current major search engines' approach to teachers' web searching. The aim of this article is to make Web searching easier for teachers when searching for relevant online teaching materials, in general, and UK teacher practitioners at primary, secondary and post-compulsory levels, in particular. Therefore, major…

  18. Global Change Master Directory (GCMD) Keywords and Their Applications in Earth Science Data Discovery

    NASA Astrophysics Data System (ADS)

    Aleman, A.

    2017-12-01

    This presentation will provide an overview and discussion of the Global Change Master Directory (GCMD) Keywords and their applications in Earth science data discovery. The GCMD Keywords are a hierarchical set of controlled keywords covering the Earth science disciplines, including: science keywords, service keywords, data centers, projects, location, data resolution, instruments and platforms. Controlled vocabularies (keywords) help users accurately, consistently and comprehensively categorize their data and also allow for the precise search and subsequent retrieval of data. The GCMD Keywords are a community resource and are developed collaboratively with input from various stakeholders, including GCMD staff, keyword users and metadata providers. The GCMD Keyword Landing Page and GCMD Keyword Community Forum provide access to keyword resources and an area for discussion of topics related to the GCMD Keywords. See https://earthdata.nasa.gov/about/gcmd/global-change-master-directory-gcmd-keywords

  19. Using internet search engines and library catalogs to locate toxicology information.

    PubMed

    Wukovitz, L D

    2001-01-12

    The increasing importance of the Internet demands that toxicologists become aquainted with its resources. To find information, researchers must be able to effectively use Internet search engines, directories, subject-oriented websites, and library catalogs. The article will explain these resources, explore their benefits and weaknesses, and identify skills that help the researcher to improve search results and critically evaluate sources for their relevancy, validity, accuracy, and timeliness.

  20. Estimating search engine index size variability: a 9-year longitudinal study.

    PubMed

    van den Bosch, Antal; Bogers, Toine; de Kunder, Maurice

    One of the determining factors of the quality of Web search engines is the size of their index. In addition to its influence on search result quality, the size of the indexed Web can also tell us something about which parts of the WWW are directly accessible to the everyday user. We propose a novel method of estimating the size of a Web search engine's index by extrapolating from document frequencies of words observed in a large static corpus of Web pages. In addition, we provide a unique longitudinal perspective on the size of Google and Bing's indices over a nine-year period, from March 2006 until January 2015. We find that index size estimates of these two search engines tend to vary dramatically over time, with Google generally possessing a larger index than Bing. This result raises doubts about the reliability of previous one-off estimates of the size of the indexed Web. We find that much, if not all of this variability can be explained by changes in the indexing and ranking infrastructure of Google and Bing. This casts further doubt on whether Web search engines can be used reliably for cross-sectional webometric studies.

  1. Web Spam, Social Propaganda and the Evolution of Search Engine Rankings

    NASA Astrophysics Data System (ADS)

    Metaxas, Panagiotis Takis

    Search Engines have greatly influenced the way we experience the web. Since the early days of the web, users have been relying on them to get informed and make decisions. When the web was relatively small, web directories were built and maintained using human experts to screen and categorize pages according to their characteristics. By the mid 1990's, however, it was apparent that the human expert model of categorizing web pages does not scale. The first search engines appeared and they have been evolving ever since, taking over the role that web directories used to play.

  2. Collection of Medical Original Data with Search Engine for Decision Support.

    PubMed

    Orthuber, Wolfgang

    2016-01-01

    Medicine is becoming more and more complex and humans can capture total medical knowledge only partially. For specific access a high resolution search engine is demonstrated, which allows besides conventional text search also search of precise quantitative data of medical findings, therapies and results. Users can define metric spaces ("Domain Spaces", DSs) with all searchable quantitative data ("Domain Vectors", DSs). An implementation of the search engine is online in http://numericsearch.com. In future medicine the doctor could make first a rough diagnosis and check which fine diagnostics (quantitative data) colleagues had collected in such a case. Then the doctor decides about fine diagnostics and results are sent (half automatically) to the search engine which filters a group of patients which best fits to these data. In this specific group variable therapies can be checked with associated therapeutic results, like in an individual scientific study for the current patient. The statistical (anonymous) results could be used for specific decision support. Reversely the therapeutic decision (in the best case with later results) could be used to enhance the collection of precise pseudonymous medical original data which is used for better and better statistical (anonymous) search results.

  3. Beyond Keyword Search: Representations and Models for Personalization

    DTIC Science & Technology

    2013-01-29

    model of information flow in the blogosphere. Blogscope is intended to be an analysis and visualization tool for the blogosphere. Unlike us, they are...Compressive nonlinearity in the hair bundle’s active response to mechanical stimulation 2001 98 14386-14391 10123 In vivo evidence for a cochlear ...p)ppGpp in plant signaling 2000 97 3747-3752 12176 Cochlear mechanisms from a phylogenetic viewpoint 2000 97 11736-11743 12270 Putting ion channels

  4. Beyond Keyword Search: Representations and Models for Personalization

    ERIC Educational Resources Information Center

    El-Arini, Khalid

    2013-01-01

    We live in an era of information overload. From online news to online shopping to scholarly research, we are inundated with a torrent of information on a daily basis. With our limited time, money and attention, we often struggle to extract actionable knowledge from this deluge of data. A common approach for addressing this challenge is…

  5. A Multimodal Search Engine for Medical Imaging Studies.

    PubMed

    Pinho, Eduardo; Godinho, Tiago; Valente, Frederico; Costa, Carlos

    2017-02-01

    The use of digital medical imaging systems in healthcare institutions has increased significantly, and the large amounts of data in these systems have led to the conception of powerful support tools: recent studies on content-based image retrieval (CBIR) and multimodal information retrieval in the field hold great potential in decision support, as well as for addressing multiple challenges in healthcare systems, such as computer-aided diagnosis (CAD). However, the subject is still under heavy research, and very few solutions have become part of Picture Archiving and Communication Systems (PACS) in hospitals and clinics. This paper proposes an extensible platform for multimodal medical image retrieval, integrated in an open-source PACS software with profile-based CBIR capabilities. In this article, we detail a technical approach to the problem by describing its main architecture and each sub-component, as well as the available web interfaces and the multimodal query techniques applied. Finally, we assess our implementation of the engine with computational performance benchmarks.

  6. Automatic Keyword Extraction from Individual Documents

    SciTech Connect

    Rose, Stuart J.; Engel, David W.; Cramer, Nicholas O.

    2010-05-03

    This paper introduces a novel and domain-independent method for automatically extracting keywords, as sequences of one or more words, from individual documents. We describe the method’s configuration parameters and algorithm, and present an evaluation on a benchmark corpus of technical abstracts. We also present a method for generating lists of stop words for specific corpora and domains, and evaluate its ability to improve keyword extraction on the benchmark corpus. Finally, we apply our method of automatic keyword extraction to a corpus of news articles and define metrics for characterizing the exclusivity, essentiality, and generality of extracted keywords within a corpus.

  7. Which Search Engine Is the Most Used One among University Students?

    ERIC Educational Resources Information Center

    Cavus, Nadire; Alpan, Kezban

    2010-01-01

    The importance of information is increasing in the information age that we are living in with internet becoming the major information resource for people with rapidly increasing number of documents. This situation makes finding information on the internet without web search engines impossible. The aim of the study is revealing most widely used…

  8. A cognitive evaluation of four online search engines for answering definitional questions posed by physicians.

    PubMed

    Yu, Hong; Kaufman, David

    2007-01-01

    The Internet is having a profound impact on physicians' medical decision making. One recent survey of 277 physicians showed that 72% of physicians regularly used the Internet to research medical information and 51% admitted that information from web sites influenced their clinical decisions. This paper describes the first cognitive evaluation of four state-of-the-art Internet search engines: Google (i.e., Google and Scholar.Google), MedQA, Onelook, and PubMed for answering definitional questions (i.e., questions with the format of "What is X?") posed by physicians. Onelook is a portal for online definitions, and MedQA is a question answering system that automatically generates short texts to answer specific biomedical questions. Our evaluation criteria include quality of answer, ease of use, time spent, and number of actions taken. Our results show that MedQA outperforms Onelook and PubMed in most of the criteria, and that MedQA surpasses Google in time spent and number of actions, two important efficiency criteria. Our results show that Google is the best system for quality of answer and ease of use. We conclude that Google is an effective search engine for medical definitions, and that MedQA exceeds the other search engines in that it provides users direct answers to their questions; while the users of the other search engines have to visit several sites before finding all of the pertinent information.

  9. A study of medical and health queries to web search engines.

    PubMed

    Spink, Amanda; Yang, Yin; Jansen, Jim; Nykanen, Pirrko; Lorence, Daniel P; Ozmutlu, Seda; Ozmutlu, H Cenk

    2004-03-01

    This paper reports findings from an analysis of medical or health queries to different web search engines. We report results: (i). comparing samples of 10000 web queries taken randomly from 1.2 million query logs from the AlltheWeb.com and Excite.com commercial web search engines in 2001 for medical or health queries, (ii). comparing the 2001 findings from Excite and AlltheWeb.com users with results from a previous analysis of medical and health related queries from the Excite Web search engine for 1997 and 1999, and (iii). medical or health advice-seeking queries beginning with the word 'should'. Findings suggest: (i). a small percentage of web queries are medical or health related, (ii). the top five categories of medical or health queries were: general health, weight issues, reproductive health and puberty, pregnancy/obstetrics, and human relationships, and (iii). over time, the medical and health queries may have declined as a proportion of all web queries, as the use of specialized medical/health websites and e-commerce-related queries has increased. Findings provide insights into medical and health-related web querying and suggests some implications for the use of the general web search engines when seeking medical/health information.

  10. A novel algorithm for validating peptide identification from a shotgun proteomics search engine.

    PubMed

    Jian, Ling; Niu, Xinnan; Xia, Zhonghang; Samir, Parimal; Sumanasekera, Chiranthani; Mu, Zheng; Jennings, Jennifer L; Hoek, Kristen L; Allos, Tara; Howard, Leigh M; Edwards, Kathryn M; Weil, P Anthony; Link, Andrew J

    2013-03-01

    Liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS) has revolutionized the proteomics analysis of complexes, cells, and tissues. In a typical proteomic analysis, the tandem mass spectra from a LC-MS/MS experiment are assigned to a peptide by a search engine that compares the experimental MS/MS peptide data to theoretical peptide sequences in a protein database. The peptide spectra matches are then used to infer a list of identified proteins in the original sample. However, the search engines often fail to distinguish between correct and incorrect peptides assignments. In this study, we designed and implemented a novel algorithm called De-Noise to reduce the number of incorrect peptide matches and maximize the number of correct peptides at a fixed false discovery rate using a minimal number of scoring outputs from the SEQUEST search engine. The novel algorithm uses a three-step process: data cleaning, data refining through a SVM-based decision function, and a final data refining step based on proteolytic peptide patterns. Using proteomics data generated on different types of mass spectrometers, we optimized the De-Noise algorithm on the basis of the resolution and mass accuracy of the mass spectrometer employed in the LC-MS/MS experiment. Our results demonstrate De-Noise improves peptide identification compared to other methods used to process the peptide sequence matches assigned by SEQUEST. Because De-Noise uses a limited number of scoring attributes, it can be easily implemented with other search engines.

  11. Uncovering the Hidden Web, Part I: Finding What the Search Engines Don't. ERIC Digest.

    ERIC Educational Resources Information Center

    Mardis, Marcia

    Currently, the World Wide Web contains an estimated 7.4 million sites (OCLC, 2001). Yet even the most experienced searcher, using the most robust search engines, can access only about 16% of these pages (Dahn, 2001). The other 84% of the publicly available information on the Web is referred to as the "hidden,""invisible," or…

  12. A New Archive and Internet Search Engine May Change the Nature of On-Line Research.

    ERIC Educational Resources Information Center

    Selingo, Jeffrey

    1998-01-01

    In the process of trying to preserve Internet history by archiving it, a company has developed a powerful Internet search engine that provides information on Web site usage patterns, which can act as a relatively objective source of information about information sources and can link sources that a researcher might otherwise miss. However, issues…

  13. Web Search Engines: Key To Locating Information for All Users or Only the Cognoscenti?

    ERIC Educational Resources Information Center

    Tomaiuolo, Nicholas G.; Packer, Joan G.

    This paper describes a study that attempted to ascertain the degree of success that undergraduates and graduate students, with varying levels of experience using the World Wide Web and Web search engines, and without librarian instruction or intervention, had in locating relevant material on specific topics furnished by the investigators. Because…

  14. Andromeda: a peptide search engine integrated into the MaxQuant environment.

    PubMed

    Cox, Jürgen; Neuhauser, Nadin; Michalski, Annette; Scheltema, Richard A; Olsen, Jesper V; Mann, Matthias

    2011-04-01

    A key step in mass spectrometry (MS)-based proteomics is the identification of peptides in sequence databases by their fragmentation spectra. Here we describe Andromeda, a novel peptide search engine using a probabilistic scoring model. On proteome data, Andromeda performs as well as Mascot, a widely used commercial search engine, as judged by sensitivity and specificity analysis based on target decoy searches. Furthermore, it can handle data with arbitrarily high fragment mass accuracy, is able to assign and score complex patterns of post-translational modifications, such as highly phosphorylated peptides, and accommodates extremely large databases. The algorithms of Andromeda are provided. Andromeda can function independently or as an integrated search engine of the widely used MaxQuant computational proteomics platform and both are freely available at www.maxquant.org. The combination enables analysis of large data sets in a simple analysis workflow on a desktop computer. For searching individual spectra Andromeda is also accessible via a web server. We demonstrate the flexibility of the system by implementing the capability to identify cofragmented peptides, significantly improving the total number of identified peptides.

  15. Adding a Visualization Feature to Web Search Engines: It’s Time

    SciTech Connect

    Wong, Pak C.

    Since the first world wide web (WWW) search engine quietly entered our lives in 1994, the “information need” behind web searching has rapidly grown into a multi-billion dollar business that dominates the internet landscape, drives e-commerce traffic, propels global economy, and affects the lives of the whole human race. Today’s search engines are faster, smarter, and more powerful than those released just a few years ago. With the vast investment pouring into research and development by leading web technology providers and the intense emotion behind corporate slogans such as “win the web” or “take back the web,” I can’t helpmore » but ask why are we still using the very same “text-only” interface that was used 13 years ago to browse our search engine results pages (SERPs)? Why has the SERP interface technology lagged so far behind in the web evolution when the corresponding search technology has advanced so rapidly? In this article I explore some current SERP interface issues, suggest a simple but practical visual-based interface design approach, and argue why a visual approach can be a strong candidate for tomorrow’s SERP interface.« less

  16. Global trends in the awareness of sepsis: insights from search engine data between 2012 and 2017.

    PubMed

    Jabaley, Craig S; Blum, James M; Groff, Robert F; O'Reilly-Shah, Vikas N

    2018-01-17

    Sepsis is an established global health priority with high mortality that can be curtailed through early recognition and intervention; as such, efforts to raise awareness are potentially impactful and increasingly common. We sought to characterize trends in the awareness of sepsis by examining temporal, geographic, and other changes in search engine utilization for sepsis information-seeking online. Using time series analyses and mixed descriptive methods, we retrospectively analyzed publicly available global usage data reported by Google Trends (Google, Palo Alto, CA, USA) concerning web searches for the topic of sepsis between 24 June 2012 and 24 June 2017. Google Trends reports aggregated and de-identified usage data for its search products, including interest over time, interest by region, and details concerning the popularity of related queries where applicable. Outlying epochs of search activity were identified using autoregressive integrated moving average modeling with transfer functions. We then identified awareness campaigns and news media coverage that correlated with epochs of significantly heightened search activity. A second-order autoregressive model with transfer functions was specified following preliminary outlier analysis. Nineteen significant outlying epochs above the modeled baseline were identified in the final analysis that correlated with 14 awareness and news media events. Our model demonstrated that the baseline level of search activity increased in a nonlinear fashion. A recurrent cyclic increase in search volume beginning in 2012 was observed that correlates with World Sepsis Day. Numerous other awareness and media events were correlated with outlying epochs. The average worldwide search volume for sepsis was less than that of influenza, myocardial infarction, and stroke. Analyzing aggregate search engine utilization data has promise as a mechanism to measure the impact of awareness efforts. Heightened information-seeking about sepsis

  17. Eczema, Atopic Dermatitis, or Atopic Eczema: Analysis of Global Search Engine Trends.

    PubMed

    Xu, Shuai; Thyssen, Jacob P; Paller, Amy S; Silverberg, Jonathan I

    The lack of standardized nomenclature for atopic dermatitis (AD) creates challenges for scientific communication, patient education, and advocacy. We sought to determine the relative popularity of the terms eczema, AD, and atopic eczema (AE) using global search engine volumes. A retrospective analysis of average monthly search volumes from 2014 to 2016 of Google, Bing/Yahoo, and Baidu was performed for eczema, AD, and AE in English and 37 other languages. Google Trends was used to determine the relative search popularity of each term from 2006 to 2016 in English and the top foreign languages, German, Turkish, Russian, and Japanese. Overall, eczema accounted for 1.5 million monthly searches (84%) compared with 247 000 searches for AD (14%) and 44 000 searches for AE (2%). For English language, eczema accounted for 93% of searches compared with 6% for AD and 1% for AE. Search popularity for eczema increased from 2006 to 2016 but remained stable for AD and AE. Given the ambiguity of the term eczema, we recommend the universal use of the next most popular term, AD.

  18. GeneView: a comprehensive semantic search engine for PubMed.

    PubMed

    Thomas, Philippe; Starlinger, Johannes; Vowinkel, Alexander; Arzt, Sebastian; Leser, Ulf

    2012-07-01

    Research results are primarily published in scientific literature and curation efforts cannot keep up with the rapid growth of published literature. The plethora of knowledge remains hidden in large text repositories like MEDLINE. Consequently, life scientists have to spend a great amount of time searching for specific information. The enormous ambiguity among most names of biomedical objects such as genes, chemicals and diseases often produces too large and unspecific search results. We present GeneView, a semantic search engine for biomedical knowledge. GeneView is built upon a comprehensively annotated version of PubMed abstracts and openly available PubMed Central full texts. This semi-structured representation of biomedical texts enables a number of features extending classical search engines. For instance, users may search for entities using unique database identifiers or they may rank documents by the number of specific mentions they contain. Annotation is performed by a multitude of state-of-the-art text-mining tools for recognizing mentions from 10 entity classes and for identifying protein-protein interactions. GeneView currently contains annotations for >194 million entities from 10 classes for ∼21 million citations with 271,000 full text bodies. GeneView can be searched at http://bc3.informatik.hu-berlin.de/.

  19. GeNemo: a search engine for web-based functional genomic data.

    PubMed

    Zhang, Yongqing; Cao, Xiaoyi; Zhong, Sheng

    2016-07-08

    A set of new data types emerged from functional genomic assays, including ChIP-seq, DNase-seq, FAIRE-seq and others. The results are typically stored as genome-wide intensities (WIG/bigWig files) or functional genomic regions (peak/BED files). These data types present new challenges to big data science. Here, we present GeNemo, a web-based search engine for functional genomic data. GeNemo searches user-input data against online functional genomic datasets, including the entire collection of ENCODE and mouse ENCODE datasets. Unlike text-based search engines, GeNemo's searches are based on pattern matching of functional genomic regions. This distinguishes GeNemo from text or DNA sequence searches. The user can input any complete or partial functional genomic dataset, for example, a binding intensity file (bigWig) or a peak file. GeNemo reports any genomic regions, ranging from hundred bases to hundred thousand bases, from any of the online ENCODE datasets that share similar functional (binding, modification, accessibility) patterns. This is enabled by a Markov Chain Monte Carlo-based maximization process, executed on up to 24 parallel computing threads. By clicking on a search result, the user can visually compare her/his data with the found datasets and navigate the identified genomic regions. GeNemo is available at www.genemo.org. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. Whiplash Syndrome Reloaded: Digital Echoes of Whiplash Syndrome in the European Internet Search Engine Context.

    PubMed

    Noll-Hussong, Michael

    2017-03-27

    In many Western countries, after a motor vehicle collision, those involved seek health care for the assessment of injuries and for insurance documentation purposes. In contrast, in many less wealthy countries, there may be limited access to care and no insurance or compensation system. The purpose of this infodemiology study was to investigate the global pattern of evolving Internet usage in countries with and without insurance and the corresponding compensation systems for whiplash injury. We used the Internet search engine analytics via Google Trends to study the health information-seeking behavior concerning whiplash injury at national population levels in Europe. We found that the search for "whiplash" is strikingly and consistently often associated with the search for "compensation" in countries or cultures with a tort system. Frequent or traumatic painful injuries; diseases or disorders such as arthritis, headache, radius, and hip fracture; depressive disorders; and fibromyalgia were not associated similarly with searches on "compensation." In this study, we present evidence from the evolving viewpoint of naturalistic Internet search engine analytics that the expectations for receiving compensation may influence Internet search behavior in relation to whiplash injury. ©Michael Noll-Hussong. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 27.03.2017.

  1. Whiplash Syndrome Reloaded: Digital Echoes of Whiplash Syndrome in the European Internet Search Engine Context

    PubMed Central

    2017-01-01

    Background In many Western countries, after a motor vehicle collision, those involved seek health care for the assessment of injuries and for insurance documentation purposes. In contrast, in many less wealthy countries, there may be limited access to care and no insurance or compensation system. Objective The purpose of this infodemiology study was to investigate the global pattern of evolving Internet usage in countries with and without insurance and the corresponding compensation systems for whiplash injury. Methods We used the Internet search engine analytics via Google Trends to study the health information-seeking behavior concerning whiplash injury at national population levels in Europe. Results We found that the search for “whiplash” is strikingly and consistently often associated with the search for “compensation” in countries or cultures with a tort system. Frequent or traumatic painful injuries; diseases or disorders such as arthritis, headache, radius, and hip fracture; depressive disorders; and fibromyalgia were not associated similarly with searches on “compensation.” Conclusions In this study, we present evidence from the evolving viewpoint of naturalistic Internet search engine analytics that the expectations for receiving compensation may influence Internet search behavior in relation to whiplash injury. PMID:28347974

  2. The LAILAPS search engine: a feature model for relevance ranking in life science databases.

    PubMed

    Lange, Matthias; Spies, Karl; Colmsee, Christian; Flemming, Steffen; Klapperstück, Matthias; Scholz, Uwe

    2010-03-25

    Efficient and effective information retrieval in life sciences is one of the most pressing challenge in bioinformatics. The incredible growth of life science databases to a vast network of interconnected information systems is to the same extent a big challenge and a great chance for life science research. The knowledge found in the Web, in particular in life-science databases, are a valuable major resource. In order to bring it to the scientist desktop, it is essential to have well performing search engines. Thereby, not the response time nor the number of results is important. The most crucial factor for millions of query results is the relevance ranking. In this paper, we present a feature model for relevance ranking in life science databases and its implementation in the LAILAPS search engine. Motivated by the observation of user behavior during their inspection of search engine result, we condensed a set of 9 relevance discriminating features. These features are intuitively used by scientists, who briefly screen database entries for potential relevance. The features are both sufficient to estimate the potential relevance, and efficiently quantifiable. The derivation of a relevance prediction function that computes the relevance from this features constitutes a regression problem. To solve this problem, we used artificial neural networks that have been trained with a reference set of relevant database entries for 19 protein queries. Supporting a flexible text index and a simple data import format, this concepts are implemented in the LAILAPS search engine. It can easily be used both as search engine for comprehensive integrated life science databases and for small in-house project databases. LAILAPS is publicly available for SWISSPROT data at http://lailaps.ipk-gatersleben.de.

  3. Limitations of the mnemonic-keyword method.

    PubMed

    Campos, Alfredo; González, María Angeles; Amor, Angeles

    2003-10-01

    The effectiveness of the mnemonic-keyword method was investigated in 4 experiments in which participants were required to learn the 1st-language (L1, Spanish) equivalents of a list of 30 2nd-language words (L2, Latin). Experiments 1 (adolescents) and 2 (adults) were designed to assess whether the keyword method was more effective than the rote method; the researcher supplied the keyword, and the participants were allowed to pace themselves through the list. Experiments 3 (adolescents) and 4 (adults) were similar to Experiments 1 and 2 except that the participants were also supplied with a drawing that illustrated the relationship between the keyword and the L1 target word. All the experiments were performed with groups of participants in their classrooms (i.e., not in a laboratory context). In all experiments, the rote method was significantly more effective than was the keyword method.

  4. PubMed vs. HighWire Press: a head-to-head comparison of two medical literature search engines.

    PubMed

    Vanhecke, Thomas E; Barnes, Michael A; Zimmerman, Janet; Shoichet, Sandor

    2007-09-01

    PubMed and HighWire Press are both useful medical literature search engines available for free to anyone on the internet. We measured retrieval accuracy, number of results generated, retrieval speed, features and search tools on HighWire Press and PubMed using the quick search features of each. We found that using HighWire Press resulted in a higher likelihood of retrieving the desired article and higher number of search results than the same search on PubMed. PubMed was faster than HighWire Press in delivering search results regardless of search settings. There are considerable differences in search features between these two search engines.

  5. Visibiome: an efficient microbiome search engine based on a scalable, distributed architecture.

    PubMed

    Azman, Syafiq Kamarul; Anwar, Muhammad Zohaib; Henschel, Andreas

    2017-07-24

    Given the current influx of 16S rRNA profiles of microbiota samples, it is conceivable that large amounts of them eventually are available for search, comparison and contextualization with respect to novel samples. This process facilitates the identification of similar compositional features in microbiota elsewhere and therefore can help to understand driving factors for microbial community assembly. We present Visibiome, a microbiome search engine that can perform exhaustive, phylogeny based similarity search and contextualization of user-provided samples against a comprehensive dataset of 16S rRNA profiles environments, while tackling several computational challenges. In order to scale to high demands, we developed a distributed system that combines web framework technology, task queueing and scheduling, cloud computing and a dedicated database server. To further ensure speed and efficiency, we have deployed Nearest Neighbor search algorithms, capable of sublinear searches in high-dimensional metric spaces in combination with an optimized Earth Mover Distance based implementation of weighted UniFrac. The search also incorporates pairwise (adaptive) rarefaction and optionally, 16S rRNA copy number correction. The result of a query microbiome sample is the contextualization against a comprehensive database of microbiome samples from a diverse range of environments, visualized through a rich set of interactive figures and diagrams, including barchart-based compositional comparisons and ranking of the closest matches in the database. Visibiome is a convenient, scalable and efficient framework to search microbiomes against a comprehensive database of environmental samples. The search engine leverages a popular but computationally expensive, phylogeny based distance metric, while providing numerous advantages over the current state of the art tool.

  6. Developing a distributed HTML5-based search engine for geospatial resource discovery

    NASA Astrophysics Data System (ADS)

    ZHOU, N.; XIA, J.; Nebert, D.; Yang, C.; Gui, Z.; Liu, K.

    2013-12-01

    With explosive growth of data, Geospatial Cyberinfrastructure(GCI) components are developed to manage geospatial resources, such as data discovery and data publishing. However, the efficiency of geospatial resources discovery is still challenging in that: (1) existing GCIs are usually developed for users of specific domains. Users may have to visit a number of GCIs to find appropriate resources; (2) The complexity of decentralized network environment usually results in slow response and pool user experience; (3) Users who use different browsers and devices may have very different user experiences because of the diversity of front-end platforms (e.g. Silverlight, Flash or HTML). To address these issues, we developed a distributed and HTML5-based search engine. Specifically, (1)the search engine adopts a brokering approach to retrieve geospatial metadata from various and distributed GCIs; (2) the asynchronous record retrieval mode enhances the search performance and user interactivity; (3) the search engine based on HTML5 is able to provide unified access capabilities for users with different devices (e.g. tablet and smartphone).

  7. Start Your Search Engines. Part 2: When Image is Everything, Here are Some Great Ways to Find One

    ERIC Educational Resources Information Center

    Adam, Anna; Mowers, Helen

    2008-01-01

    There is no doubt that Google is great for finding images. Simply head to its home page, click the "Images" link, enter criteria in the search box, and--voila! In this article, the authors share some of their other favorite search engines for finding images. To make sure the desired images are available for educational use, consider searching for…

  8. Refining comparative proteomics by spectral counting to account for shared peptides and multiple search engines

    PubMed Central

    Chen, Yao-Yi; Dasari, Surendra; Ma, Ze-Qiang; Vega-Montoto, Lorenzo J.; Li, Ming

    2013-01-01

    Spectral counting has become a widely used approach for measuring and comparing protein abundance in label-free shotgun proteomics. However, when analyzing complex samples, the ambiguity of matching between peptides and proteins greatly affects the assessment of peptide and protein inventories, differentiation, and quantification. Meanwhile, the configuration of database searching algorithms that assign peptides to MS/MS spectra may produce different results in comparative proteomic analysis. Here, we present three strategies to improve comparative proteomics through spectral counting. We show that comparing spectral counts for peptide groups rather than for protein groups forestalls problems introduced by shared peptides. We demonstrate the advantage and flexibility of this new method in two datasets. We present four models to combine four popular search engines that lead to significant gains in spectral counting differentiation. Among these models, we demonstrate a powerful vote counting model that scales well for multiple search engines. We also show that semi-tryptic searching outperforms tryptic searching for comparative proteomics. Overall, these techniques considerably improve protein differentiation on the basis of spectral count tables. PMID:22552787

  9. Refining comparative proteomics by spectral counting to account for shared peptides and multiple search engines.

    PubMed

    Chen, Yao-Yi; Dasari, Surendra; Ma, Ze-Qiang; Vega-Montoto, Lorenzo J; Li, Ming; Tabb, David L

    2012-09-01

    Spectral counting has become a widely used approach for measuring and comparing protein abundance in label-free shotgun proteomics. However, when analyzing complex samples, the ambiguity of matching between peptides and proteins greatly affects the assessment of peptide and protein inventories, differentiation, and quantification. Meanwhile, the configuration of database searching algorithms that assign peptides to MS/MS spectra may produce different results in comparative proteomic analysis. Here, we present three strategies to improve comparative proteomics through spectral counting. We show that comparing spectral counts for peptide groups rather than for protein groups forestalls problems introduced by shared peptides. We demonstrate the advantage and flexibility of this new method in two datasets. We present four models to combine four popular search engines that lead to significant gains in spectral counting differentiation. Among these models, we demonstrate a powerful vote counting model that scales well for multiple search engines. We also show that semi-tryptic searching outperforms tryptic searching for comparative proteomics. Overall, these techniques considerably improve protein differentiation on the basis of spectral count tables.

  10. Towards a portal and search engine to facilitate academic and research collaboration in engineering and education

    NASA Astrophysics Data System (ADS)

    Bonilla Villarreal, Isaura Nathaly

    While international academic and research collaborations are of great importance at this time, it is not easy to find researchers in the engineering field that publish in languages other than English. Because of this disconnect, there exists a need for a portal to find Who's Who in Engineering Education in the Americas. The objective of this thesis is to built an object-oriented architecture for this proposed portal. The Unified Modeling Language (UML) model developed in this thesis incorporates the basic structure of a social network for academic purposes. Reverse engineering of three social networks portals yielded important aspects of their structures that have been incorporated in the proposed UML model. Furthermore, the present work includes a pattern for academic social networks..

  11. Impact of Commercial Search Engines and International Databases on Engineering Teaching and Research

    ERIC Educational Resources Information Center

    Chanson, Hubert

    2007-01-01

    For the last three decades, the engineering higher education and professional environments have been completely transformed by the "electronic/digital information revolution" that has included the introduction of personal computer, the development of email and world wide web, and broadband Internet connections at home. Herein the writer compares…

  12. Formalizing An Approach to Curate the Global Change Master Directory (GCMD)'s Controlled Vocabularies (Keywords) Through a Keyword Governance Process and Community Involvement

    NASA Astrophysics Data System (ADS)

    Stevens, T.

    2016-12-01

    NASA's Global Change Master Directory (GCMD) curates a hierarchical set of controlled vocabularies (keywords) covering Earth sciences and associated information (data centers, projects, platforms, and instruments). The purpose of the keywords is to describe Earth science data and services in a consistent and comprehensive manner, allowing for precise metadata search and subsequent retrieval of data and services. The keywords are accessible in a standardized SKOS/RDF/OWL representation and are used as an authoritative taxonomy, as a source for developing ontologies, and to search and access Earth Science data within online metadata catalogs. The keyword curation approach involves: (1) receiving community suggestions; (2) triaging community suggestions; (3) evaluating keywords against a set of criteria coordinated by the NASA Earth Science Data and Information System (ESDIS) Standards Office; (4) implementing the keywords; and (5) publication/notification of keyword changes. This approach emphasizes community input, which helps ensure a high quality, normalized, and relevant keyword structure that will evolve with users' changing needs. The Keyword Community Forum, which promotes a responsive, open, and transparent process, is an area where users can discuss keyword topics and make suggestions for new keywords. Others could potentially use this formalized approach as a model for keyword curation.

  13. Crescendo: A Protein Sequence Database Search Engine for Tandem Mass Spectra.

    PubMed

    Wang, Jianqi; Zhang, Yajie; Yu, Yonghao

    2015-07-01

    A search engine that discovers more peptides reliably is essential to the progress of the computational proteomics. We propose two new scoring functions (L- and P-scores), which aim to capture similar characteristics of a peptide-spectrum match (PSM) as Sequest and Comet do. Crescendo, introduced here, is a software program that implements these two scores for peptide identification. We applied Crescendo to test datasets and compared its performance with widely used search engines, including Mascot, Sequest, and Comet. The results indicate that Crescendo identifies a similar or larger number of peptides at various predefined false discovery rates (FDR). Importantly, it also provides a better separation between the true and decoy PSMs, warranting the future development of a companion post-processing filtering algorithm.

  14. Design and Implementation of a Threaded Search Engine for Tour Recommendation Systems

    NASA Astrophysics Data System (ADS)

    Lee, Junghoon; Park, Gyung-Leen; Ko, Jin-Hee; Shin, In-Hye; Kang, Mikyung

    This paper implements a threaded scan engine for the O(n!) search space and measures its performance, aiming at providing a responsive tour recommendation and scheduling service. As a preliminary step of integrating POI ontology, mobile object database, and personalization profile for the development of new vehicular telematics services, this implementation can give a useful guideline to design a challenging and computation-intensive vehicular telematics service. The implemented engine allocates the subtree to the respective threads and makes them run concurrently exploiting the primitives provided by the operating system and the underlying multiprocessor architecture. It also makes it easy to add a variety of constraints, for example, the search tree is pruned if the cost of partial allocation already exceeds the current best. The performance measurement result shows that the service can run even in the low-power telematics device when the number of destinations does not exceed 15, with an appropriate constraint processing.

  15. Origin of Disagreements in Tandem Mass Spectra Interpretation by Search Engines.

    PubMed

    Tessier, Dominique; Lollier, Virginie; Larré, Colette; Rogniaux, Hélène

    2016-10-07

    Several proteomic database search engines that interpret LC-MS/MS data do not identify the same set of peptides. These disagreements occur even when the scores of the peptide-to-spectrum matches suggest good confidence in the interpretation. Our study shows that these disagreements observed for the interpretations of a given spectrum are almost exclusively due to the variation of what we call the "peptide space", i.e., the set of peptides that are actually compared to the experimental spectra. We discuss the potential difficulties of precisely defining the "peptide space." Indeed, although several parameters that are generally reported in publications can easily be set to the same values, many additional parameters-with much less straightforward user access-might impact the "peptide space" used by each program. Moreover, in a configuration where each search engine identifies the same candidates for each spectrum, the inference of the proteins may remain quite different depending on the false discovery rate selected.

  16. FPS-RAM: Fast Prefix Search RAM-Based Hardware for Forwarding Engine

    NASA Astrophysics Data System (ADS)

    Zaitsu, Kazuya; Yamamoto, Koji; Kuroda, Yasuto; Inoue, Kazunari; Ata, Shingo; Oka, Ikuo

    Ternary content addressable memory (TCAM) is becoming very popular for designing high-throughput forwarding engines on routers. However, TCAM has potential problems in terms of hardware and power costs, which limits its ability to deploy large amounts of capacity in IP routers. In this paper, we propose new hardware architecture for fast forwarding engines, called fast prefix search RAM-based hardware (FPS-RAM). We designed FPS-RAM hardware with the intent of maintaining the same search performance and physical user interface as TCAM because our objective is to replace the TCAM in the market. Our RAM-based hardware architecture is completely different from that of TCAM and has dramatically reduced the costs and power consumption to 62% and 52%, respectively. We implemented FPS-RAM on an FPGA to examine its lookup operation.

  17. Seasonal trends in hypertension in Poland: evidence from Google search engine query data.

    PubMed

    Płatek, Anna E; Sierdziński, Janusz; Krzowski, Bartosz; Szymański, Filip M

    2018-01-01

    Various conditions, including arterial hypertension, exhibit seasonal trends in their occurrence and magnitude. Those trends correspond to an interest exhibited in the number of Internet searches for the specific conditions per month. The aim of the study was to show seasonal trends in the hypertension prevalence in Poland relate to the data from the Google Trends tool. Internet search engine query data were retrieved from Google Trends from January 2008 to November 2017. Data were calculated as a monthly normalised search volume from the nine-year period. Data was presented for specific geographic regions, including Poland, the United States of America, Australia, and worldwide for the following search terms: "arterial hypertension (pol. nadciśnienie tętnicze)", "hypertension (pol. nadciśnienie)" and "hypertension medical condition". Seasonal effects were calculated using regression models and presented graphically. In Poland the search volume is the highest between November and May, while patients exhibit the least interest in arterial hypertension during summer holidays (p < 0.05). Seasonal variations are comparable in the United States of America representing a Northern hemisphere country, while in Australia (Southern hemisphere) they exhibit a contrary trend. In conclusion, arterial hypertension is more likely to occur during winter months, which correlates with increased interest in the search phrase "hypertension" in Google.

  18. Search Tips

    MedlinePlus

    ... do not need to use AND because the search engine automatically finds resources containing all of your search ... Use as a wildcard when you want the search engine to fill in the blank for you; you ...

  19. GIGGLE: a search engine for large-scale integrated genome analysis.

    PubMed

    Layer, Ryan M; Pedersen, Brent S; DiSera, Tonya; Marth, Gabor T; Gertz, Jason; Quinlan, Aaron R

    2018-02-01

    GIGGLE is a genomics search engine that identifies and ranks the significance of genomic loci shared between query features and thousands of genome interval files. GIGGLE (https://github.com/ryanlayer/giggle) scales to billions of intervals and is over three orders of magnitude faster than existing methods. Its speed extends the accessibility and utility of resources such as ENCODE, Roadmap Epigenomics, and GTEx by facilitating data integration and hypothesis generation.

  20. GIGGLE: a search engine for large-scale integrated genome analysis

    PubMed Central

    Layer, Ryan M; Pedersen, Brent S; DiSera, Tonya; Marth, Gabor T; Gertz, Jason; Quinlan, Aaron R

    2018-01-01

    GIGGLE is a genomics search engine that identifies and ranks the significance of genomic loci shared between query features and thousands of genome interval files. GIGGLE (https://github.com/ryanlayer/giggle) scales to billions of intervals and is over three orders of magnitude faster than existing methods. Its speed extends the accessibility and utility of resources such as ENCODE, Roadmap Epigenomics, and GTEx by facilitating data integration and hypothesis generation. PMID:29309061

  1. Usability Evaluation of NLP-PIER: A Clinical Document Search Engine for Researchers.

    PubMed

    Hultman, Gretchen; McEwan, Reed; Pakhomov, Serguei; Lindemann, Elizabeth; Skube, Steven; Melton, Genevieve B

    2017-01-01

    NLP-PIER (Natural Language Processing - Patient Information Extraction for Research) is a self-service platform with a search engine for clinical researchers to perform natural language processing (NLP) queries using clinical notes. We conducted user-centered testing of NLP-PIER's usability to inform future design decisions. Quantitative and qualitative data were analyzed. Our findings will be used to improve the usability of NLP-PIER.

  2. Engineering Bacteria to Search for Specific Concentrations of Molecules by a Systematic Synthetic Biology Design Method

    PubMed Central

    Chen, Bor-Sen

    2016-01-01

    Bacteria navigate environments full of various chemicals to seek favorable places for survival by controlling the flagella’s rotation using a complicated signal transduction pathway. By influencing the pathway, bacteria can be engineered to search for specific molecules, which has great potential for application to biomedicine and bioremediation. In this study, genetic circuits were constructed to make bacteria search for a specific molecule at particular concentrations in their environment through a synthetic biology method. In addition, by replacing the “brake component” in the synthetic circuit with some specific sensitivities, the bacteria can be engineered to locate areas containing specific concentrations of the molecule. Measured by the swarm assay qualitatively and microfluidic techniques quantitatively, the characteristics of each “brake component” were identified and represented by a mathematical model. Furthermore, we established another mathematical model to anticipate the characteristics of the “brake component”. Based on this model, an abundant component library can be established to provide adequate component selection for different searching conditions without identifying all components individually. Finally, a systematic design procedure was proposed. Following this systematic procedure, one can design a genetic circuit for bacteria to rapidly search for and locate different concentrations of particular molecules by selecting the most adequate “brake component” in the library. Moreover, following simple procedures, one can also establish an exclusive component library suitable for other cultivated environments, promoter systems, or bacterial strains. PMID:27096615

  3. Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework.

    PubMed

    Lewis, Steven; Csordas, Attila; Killcoyne, Sarah; Hermjakob, Henning; Hoopmann, Michael R; Moritz, Robert L; Deutsch, Eric W; Boyle, John

    2012-12-05

    For shotgun mass spectrometry based proteomics the most computationally expensive step is in matching the spectra against an increasingly large database of sequences and their post-translational modifications with known masses. Each mass spectrometer can generate data at an astonishingly high rate, and the scope of what is searched for is continually increasing. Therefore solutions for improving our ability to perform these searches are needed. We present a sequence database search engine that is specifically designed to run efficiently on the Hadoop MapReduce distributed computing framework. The search engine implements the K-score algorithm, generating comparable output for the same input files as the original implementation. The scalability of the system is shown, and the architecture required for the development of such distributed processing is discussed. The software is scalable in its ability to handle a large peptide database, numerous modifications and large numbers of spectra. Performance scales with the number of processors in the cluster, allowing throughput to expand with the available resources.

  4. Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework

    PubMed Central

    2012-01-01

    Background For shotgun mass spectrometry based proteomics the most computationally expensive step is in matching the spectra against an increasingly large database of sequences and their post-translational modifications with known masses. Each mass spectrometer can generate data at an astonishingly high rate, and the scope of what is searched for is continually increasing. Therefore solutions for improving our ability to perform these searches are needed. Results We present a sequence database search engine that is specifically designed to run efficiently on the Hadoop MapReduce distributed computing framework. The search engine implements the K-score algorithm, generating comparable output for the same input files as the original implementation. The scalability of the system is shown, and the architecture required for the development of such distributed processing is discussed. Conclusion The software is scalable in its ability to handle a large peptide database, numerous modifications and large numbers of spectra. Performance scales with the number of processors in the cluster, allowing throughput to expand with the available resources. PMID:23216909

  5. Estimating Influenza Outbreaks Using Both Search Engine Query Data and Social Media Data in South Korea.

    PubMed

    Woo, Hyekyung; Cho, Youngtae; Shim, Eunyoung; Lee, Jong-Koo; Lee, Chang-Gun; Kim, Seong Hwan

    2016-07-04

    As suggested as early as in 2006, logs of queries submitted to search engines seeking information could be a source for detection of emerging influenza epidemics if changes in the volume of search queries are monitored (infodemiology). However, selecting queries that are most likely to be associated with influenza epidemics is a particular challenge when it comes to generating better predictions. In this study, we describe a methodological extension for detecting influenza outbreaks using search query data; we provide a new approach for query selection through the exploration of contextual information gleaned from social media data. Additionally, we evaluate whether it is possible to use these queries for monitoring and predicting influenza epidemics in South Korea. Our study was based on freely available weekly influenza incidence data and query data originating from the search engine on the Korean website Daum between April 3, 2011 and April 5, 2014. To select queries related to influenza epidemics, several approaches were applied: (1) exploring influenza-related words in social media data, (2) identifying the chief concerns related to influenza, and (3) using Web query recommendations. Optimal feature selection by least absolute shrinkage and selection operator (Lasso) and support vector machine for regression (SVR) were used to construct a model predicting influenza epidemics. In total, 146 queries related to influenza were generated through our initial query selection approach. A considerable proportion of optimal features for final models were derived from queries with reference to the social media data. The SVR model performed well: the prediction values were highly correlated with the recent observed influenza-like illness (r=.956; P<.001) and virological incidence rate (r=.963; P<.001). These results demonstrate the feasibility of using search queries to enhance influenza surveillance in South Korea. In addition, an approach for query selection using

  6. Estimating Influenza Outbreaks Using Both Search Engine Query Data and Social Media Data in South Korea

    PubMed Central

    Woo, Hyekyung; Shim, Eunyoung; Lee, Jong-Koo; Lee, Chang-Gun; Kim, Seong Hwan

    2016-01-01

    Background As suggested as early as in 2006, logs of queries submitted to search engines seeking information could be a source for detection of emerging influenza epidemics if changes in the volume of search queries are monitored (infodemiology). However, selecting queries that are most likely to be associated with influenza epidemics is a particular challenge when it comes to generating better predictions. Objective In this study, we describe a methodological extension for detecting influenza outbreaks using search query data; we provide a new approach for query selection through the exploration of contextual information gleaned from social media data. Additionally, we evaluate whether it is possible to use these queries for monitoring and predicting influenza epidemics in South Korea. Methods Our study was based on freely available weekly influenza incidence data and query data originating from the search engine on the Korean website Daum between April 3, 2011 and April 5, 2014. To select queries related to influenza epidemics, several approaches were applied: (1) exploring influenza-related words in social media data, (2) identifying the chief concerns related to influenza, and (3) using Web query recommendations. Optimal feature selection by least absolute shrinkage and selection operator (Lasso) and support vector machine for regression (SVR) were used to construct a model predicting influenza epidemics. Results In total, 146 queries related to influenza were generated through our initial query selection approach. A considerable proportion of optimal features for final models were derived from queries with reference to the social media data. The SVR model performed well: the prediction values were highly correlated with the recent observed influenza-like illness (r=.956; P<.001) and virological incidence rate (r=.963; P<.001). Conclusions These results demonstrate the feasibility of using search queries to enhance influenza surveillance in South Korea. In

  7. Search engine as a diagnostic tool in difficult immunological and allergologic cases: is Google useful?

    PubMed

    Lombardi, C; Griffiths, E; McLeod, B; Caviglia, A; Penagos, M

    2009-07-01

    Web search engines are an important tool in communication and diffusion of knowledge. Among these, Google appears to be the most popular one: in August 2008, it accounted for 87% of all web searches in the UK, compared with Yahoo's 3.3%. Google's value as a diagnostic guide in general medicine was recently reported. The aim of this comparative cross-sectional study was to evaluate whether searching Google with disease-related terms was effective in the identification and diagnosis of complex immunological and allergic cases. Forty-five case reports were randomly selected by an independent observer from peer-reviewed medical journals. Clinical data were presented separately to three investigators, blinded to the final diagnoses. Investigator A was a Consultant with an expert knowledge in Internal Medicine and Allergy (IM&A) and basic computing skills. Investigator B was a Registrar in IM&A. Investigator C was a Research Nurse. Both Investigators B and C were familiar with computers and search engines. For every clinical case presented, each investigator independently carried out an Internet search using Google to provide a final diagnosis. Their results were then compared with the published diagnoses. Correct diagnoses were provided in 30/45 (66%) cases, 39/45 (86%) cases, and in 29/45 (64%) cases by investigator A, B, and C, respectively. All of the three investigators achieved the correct diagnosis in 19 cases (42%), and all of them failed in two cases. This Google-based search was useful to identify an appropriate diagnosis in complex immunological and allergic cases. Computing skills may help to get better results.

  8. Gene annotation from scientific literature using mappings between keyword systems.

    PubMed

    Pérez, Antonio J; Perez-Iratxeta, Carolina; Bork, Peer; Thode, Guillermo; Andrade, Miguel A

    2004-09-01

    The description of genes in databases by keywords helps the non-specialist to quickly grasp the properties of a gene and increases the efficiency of computational tools that are applied to gene data (e.g. searching a gene database for sequences related to a particular biological process). However, the association of keywords to genes or protein sequences is a difficult process that ultimately implies examination of the literature related to a gene. To support this task, we present a procedure to derive keywords from the set of scientific abstracts related to a gene. Our system is based on the automated extraction of mappings between related terms from different databases using a model of fuzzy associations that can be applied with all generality to any pair of linked databases. We tested the system by annotating genes of the SWISS-PROT database with keywords derived from the abstracts linked to their entries (stored in the MEDLINE database of scientific references). The performance of the annotation procedure was much better for SWISS-PROT keywords (recall of 47%, precision of 68%) than for Gene Ontology terms (recall of 8%, precision of 67%). The algorithm can be publicly accessed and used for the annotation of sequences through a web server at http://www.bork.embl.de/kat

  9. Surfing for suicide methods and help: content analysis of websites retrieved with search engines in Austria and the United States.

    PubMed

    Till, Benedikt; Niederkrotenthaler, Thomas

    2014-08-01

    The Internet provides a variety of resources for individuals searching for suicide-related information. Structured content-analytic approaches to assess intercultural differences in web contents retrieved with method-related and help-related searches are scarce. We used the 2 most popular search engines (Google and Yahoo/Bing) to retrieve US-American and Austrian search results for the term suicide, method-related search terms (e.g., suicide methods, how to kill yourself, painless suicide, how to hang yourself), and help-related terms (e.g., suicidal thoughts, suicide help) on February 11, 2013. In total, 396 websites retrieved with US search engines and 335 websites from Austrian searches were analyzed with content analysis on the basis of current media guidelines for suicide reporting. We assessed the quality of websites and compared findings across search terms and between the United States and Austria. In both countries, protective outweighed harmful website characteristics by approximately 2:1. Websites retrieved with method-related search terms (e.g., how to hang yourself) contained more harmful (United States: P < .001, Austria: P < .05) and fewer protective characteristics (United States: P < .001, Austria: P < .001) compared to the term suicide. Help-related search terms (e.g., suicidal thoughts) yielded more websites with protective characteristics (United States: P = .07, Austria: P < .01). Websites retrieved with U.S. search engines generally had more protective characteristics (P < .001) than searches with Austrian search engines. Resources with harmful characteristics were better ranked than those with protective characteristics (United States: P < .01, Austria: P < .05). The quality of suicide-related websites obtained depends on the search terms used. Preventive efforts to improve the ranking of preventive web content, particularly regarding method-related search terms, seem necessary. © Copyright 2014 Physicians Postgraduate Press, Inc.

  10. Search and rescue in collapsed structures: engineering and social science aspects.

    PubMed

    El-Tawil, Sherif; Aguirre, Benigno

    2010-10-01

    This paper discusses the social science and engineering dimensions of search and rescue (SAR) in collapsed buildings. First, existing information is presented on factors that influence the behaviour of trapped victims, particularly human, physical, socioeconomic and circumstantial factors. Trapped victims are most often discussed in the context of structural collapse and injuries sustained. Most studies in this area focus on earthquakes as the type of disaster that produces the most extensive structural damage. Second, information is set out on the engineering aspects of urban search and rescue (USAR) in the United States, including the role of structural engineers in USAR operations, training and certification of structural specialists, and safety and general procedures. The use of computational simulation to link the engineering and social science aspects of USAR is discussed. This could supplement training of local SAR groups and USAR teams, allowing them to understand better the collapse process and how voids form in a rubble pile. A preliminary simulation tool developed for this purpose is described. © 2010 The Author(s). Journal compilation © Overseas Development Institute, 2010.

  11. Creative Engineering Based Education with Autonomous Robots Considering Job Search Support

    NASA Astrophysics Data System (ADS)

    Takezawa, Satoshi; Nagamatsu, Masao; Takashima, Akihiko; Nakamura, Kaeko; Ohtake, Hideo; Yoshida, Kanou

    The Robotics Course in our Mechanical Systems Engineering Department offers “Robotics Exercise Lessons” as one of its Problem-Solution Based Specialized Subjects. This is intended to motivate students learning and to help them acquire fundamental items and skills on mechanical engineering and improve understanding of Robotics Basic Theory. Our current curriculum was established to accomplish this objective based on two pieces of research in 2005: an evaluation questionnaire on the education of our Mechanical Systems Engineering Department for graduates and a survey on the kind of human resources which companies are seeking and their expectations for our department. This paper reports the academic results and reflections of job search support in recent years as inherited and developed from the previous curriculum.

  12. Tracking search engine queries for suicide in the United Kingdom, 2004-2013.

    PubMed

    Arora, V S; Stuckler, D; McKee, M

    2016-08-01

    First, to determine if a cyclical trend is observed for search activity of suicide and three common suicide risk factors in the United Kingdom: depression, unemployment, and marital strain. Second, to test the validity of suicide search data as a potential marker of suicide risk by evaluating whether web searches for suicide associate with suicide rates among those of different ages and genders in the United Kingdom. Cross-sectional. Search engine data was obtained from Google Trends, a publicly available repository of information of trends and patterns of user searches on Google. The following phrases were entered into Google Trends to analyse relative search volume for suicide, depression, job loss, and divorce, respectively: 'suicide'; 'depression + depressed + hopeless'; 'unemployed + lost job'; 'divorce'. Spearman's rank correlation coefficient was employed to test bivariate associations between suicide search activity and official suicide rates from the Office of National Statistics (ONS). Cyclical trends were observed in search activity for suicide and depression-related search activity, with peaks in autumn and winter months, and a trough in summer months. A positive, non-significant association was found between suicide-related search activity and suicide rates in the general working-age population (15-64 years) (ρ = 0.164; P = 0.652). This association is stronger in younger age groups, particularly for those 25-34 years of age (ρ = 0.848; P = 0.002). We give credence to a link between search activity for suicide and suicide rates in the United Kingdom from 2004 to 2013 for high risk sub-populations (i.e. male youth and young professionals). There remains a need for further research on how Google Trends can be used in other areas of disease surveillance and for work to provide greater geographical precision, as well as research on ways of mitigating the risk of internet use leading to suicide ideation in youth. Copyright © 2015 The Royal

  13. Evaluation of an Automated Keywording System.

    ERIC Educational Resources Information Center

    Malone, Linda C.; And Others

    1990-01-01

    Discussion of automated indexing techniques focuses on ways to statistically document improvements in the development of an automated keywording system over time. The system developed by the Joint Chiefs of Staff to automate the storage, categorization, and retrieval of information from military exercises is explained, and performance measures are…

  14. Keyword Extraction from Arabic Legal Texts

    ERIC Educational Resources Information Center

    Rammal, Mahmoud; Bahsoun, Zeinab; Al Achkar Jabbour, Mona

    2015-01-01

    Purpose: The purpose of this paper is to apply local grammar (LG) to develop an indexing system which automatically extracts keywords from titles of Lebanese official journals. Design/methodology/approach: To build LG for our system, the first word that plays the determinant role in understanding the meaning of a title is analyzed and grouped as…

  15. Seasonal trends in tinnitus symptomatology: evidence from Internet search engine query data.

    PubMed

    Plante, David T; Ingram, David G

    2015-10-01

    The primary aim of this study was to test the hypothesis that the symptom of tinnitus demonstrates a seasonal pattern with worsening in the winter relative to the summer using Internet search engine query data. Normalized search volume for the term 'tinnitus' from January 2004 through December 2013 was retrieved from Google Trends. Seasonal effects were evaluated using cosinor regression models. Primary countries of interest were the United States and Australia. Secondary exploratory analyses were also performed using data from Germany, the United Kingdom, Canada, Sweden, and Switzerland. Significant seasonal effects for 'tinnitus' search queries were found in the United States and Australia (p < 0.00001 for both countries), with peaks in the winter and troughs in the summer. Secondary analyses demonstrated similarly significant seasonal effects for Germany (p < 0.00001), Canada (p < 0.00001), and Sweden (p = 0.0008), again with increased search volume in the winter relative to the summer. Our findings indicate that there are significant seasonal trends for Internet search queries for tinnitus, with a zenith in winter months. Further research is indicated to determine the biological mechanisms underlying these findings, as they may provide insights into the pathophysiology of this common and debilitating medical symptom.

  16. OrChem - An open source chemistry search engine for Oracle(R).

    PubMed

    Rijnbeek, Mark; Steinbeck, Christoph

    2009-10-22

    Registration, indexing and searching of chemical structures in relational databases is one of the core areas of cheminformatics. However, little detail has been published on the inner workings of search engines and their development has been mostly closed-source. We decided to develop an open source chemistry extension for Oracle, the de facto database platform in the commercial world. Here we present OrChem, an extension for the Oracle 11G database that adds registration and indexing of chemical structures to support fast substructure and similarity searching. The cheminformatics functionality is provided by the Chemistry Development Kit. OrChem provides similarity searching with response times in the order of seconds for databases with millions of compounds, depending on a given similarity cut-off. For substructure searching, it can make use of multiple processor cores on today's powerful database servers to provide fast response times in equally large data sets. OrChem is free software and can be redistributed and/or modified under the terms of the GNU Lesser General Public License as published by the Free Software Foundation. All software is available via http://orchem.sourceforge.net.

  17. OrChem - An open source chemistry search engine for Oracle®

    PubMed Central

    2009-01-01

    Background Registration, indexing and searching of chemical structures in relational databases is one of the core areas of cheminformatics. However, little detail has been published on the inner workings of search engines and their development has been mostly closed-source. We decided to develop an open source chemistry extension for Oracle, the de facto database platform in the commercial world. Results Here we present OrChem, an extension for the Oracle 11G database that adds registration and indexing of chemical structures to support fast substructure and similarity searching. The cheminformatics functionality is provided by the Chemistry Development Kit. OrChem provides similarity searching with response times in the order of seconds for databases with millions of compounds, depending on a given similarity cut-off. For substructure searching, it can make use of multiple processor cores on today's powerful database servers to provide fast response times in equally large data sets. Availability OrChem is free software and can be redistributed and/or modified under the terms of the GNU Lesser General Public License as published by the Free Software Foundation. All software is available via http://orchem.sourceforge.net. PMID:20298521

  18. Keywords image retrieval in historical handwritten Arabic documents

    NASA Astrophysics Data System (ADS)

    Saabni, Raid; El-Sana, Jihad

    2013-01-01

    A system is presented for spotting and searching keywords in handwritten Arabic documents. A slightly modified dynamic time warping algorithm is used to measure similarities between words. Two sets of features are generated from the outer contour of the words/word-parts. The first set is based on the angles between nodes on the contour and the second set is based on the shape context features taken from the outer contour. To recognize a given word, the segmentation-free approach is partially adopted, i.e., continuous word parts are used as the basic alphabet, instead of individual characters or complete words. Additional strokes, such as dots and detached short segments, are classified and used in a postprocessing step to determine the final comparison decision. The search for a keyword is performed by the search for its word parts given in the correct order. The performance of the presented system was very encouraging in terms of efficiency and match rates. To evaluate the presented system its performance is compared to three different systems. Unfortunately, there are no publicly available standard datasets with ground truth for testing Arabic key word searching systems. Therefore, a private set of images partially taken from Juma'a Al-Majid Center in Dubai for evaluation is used, while using a slightly modified version of the IFN/ENIT database for training.

  19. Search engine ranking, quality, and content of webpages that are critical vs noncritical of HPV vaccine

    PubMed Central

    Fu, Linda Y.; Zook, Kathleen; Spoehr-Labutta, Zachary; Hu, Pamela; Joseph, Jill G.

    2015-01-01

    Purpose Online information can influence attitudes toward vaccination. The aim of the present study is to provide a systematic evaluation of the search engine ranking, quality, and content of webpages that are critical versus noncritical of HPV vaccination. Methods We identified HPV vaccine-related webpages with the Google search engine by entering 20 terms. We then assessed each webpage for critical versus noncritical bias as well as for the following quality indicators: authorship disclosure, source disclosure, attribution of at least one reference, currency, exclusion of testimonial accounts, and readability level less than 9th grade. We also determined webpage comprehensiveness in terms of mention of 14 HPV vaccine relevant topics. Results Twenty searches yielded 116 unique webpages. HPV vaccine-critical webpages comprised roughly a third of the top, top 5 and top 10-ranking webpages. The prevalence of HPV vaccine-critical webpages was higher for queries that included term modifiers in addition to root terms. Compared with noncritical webpages, webpages critical of HPV vaccine overall had a lower quality score than those with a noncritical bias (p<.01) and covered fewer important HPV-related topics (p<.001). Critical webpages required viewers to have higher reading skills, were less likely to include an author byline, and were more likely to include testimonial accounts. They also were more likely to raise unsubstantiated concerns about vaccination. Conclusion Webpages critical of HPV vaccine may be frequently returned and highly ranked by search engine queries despite being of lower quality and less comprehensive than noncritical webpages. PMID:26559742

  20. Adverse Reactions Associated With Cannabis Consumption as Evident From Search Engine Queries.

    PubMed

    Yom-Tov, Elad; Lev-Ran, Shaul

    2017-10-26

    Cannabis is one of the most widely used psychoactive substances worldwide, but adverse drug reactions (ADRs) associated with its use are difficult to study because of its prohibited status in many countries. Internet search engine queries have been used to investigate ADRs in pharmaceutical drugs. In this proof-of-concept study, we tested whether these queries can be used to detect the adverse reactions of cannabis use. We analyzed anonymized queries from US-based users of Bing, a widely used search engine, made over a period of 6 months and compared the results with the prevalence of cannabis use as reported in the US National Survey on Drug Use in the Household (NSDUH) and with ADRs reported in the Food and Drug Administration's Adverse Drug Reporting System. Predicted prevalence of cannabis use was estimated from the fraction of people making queries about cannabis, marijuana, and 121 additional synonyms. Predicted ADRs were estimated from queries containing layperson descriptions to 195 ICD-10 symptoms list. Our results indicated that the predicted prevalence of cannabis use at the US census regional level reaches an R 2 of .71 NSDUH data. Queries for ADRs made by people who also searched for cannabis reveal many of the known adverse effects of cannabis (eg, cough and psychotic symptoms), as well as plausible unknown reactions (eg, pyrexia). These results indicate that search engine queries can serve as an important tool for the study of adverse reactions of illicit drugs, which are difficult to study in other settings. ©Elad Yom-Tov, Shaul Lev-Ran. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 26.10.2017.

  1. Adverse Reactions Associated With Cannabis Consumption as Evident From Search Engine Queries

    PubMed Central

    Lev-Ran, Shaul

    2017-01-01

    Background Cannabis is one of the most widely used psychoactive substances worldwide, but adverse drug reactions (ADRs) associated with its use are difficult to study because of its prohibited status in many countries. Objective Internet search engine queries have been used to investigate ADRs in pharmaceutical drugs. In this proof-of-concept study, we tested whether these queries can be used to detect the adverse reactions of cannabis use. Methods We analyzed anonymized queries from US-based users of Bing, a widely used search engine, made over a period of 6 months and compared the results with the prevalence of cannabis use as reported in the US National Survey on Drug Use in the Household (NSDUH) and with ADRs reported in the Food and Drug Administration’s Adverse Drug Reporting System. Predicted prevalence of cannabis use was estimated from the fraction of people making queries about cannabis, marijuana, and 121 additional synonyms. Predicted ADRs were estimated from queries containing layperson descriptions to 195 ICD-10 symptoms list. Results Our results indicated that the predicted prevalence of cannabis use at the US census regional level reaches an R2 of .71 NSDUH data. Queries for ADRs made by people who also searched for cannabis reveal many of the known adverse effects of cannabis (eg, cough and psychotic symptoms), as well as plausible unknown reactions (eg, pyrexia). Conclusions These results indicate that search engine queries can serve as an important tool for the study of adverse reactions of illicit drugs, which are difficult to study in other settings. PMID:29074469

  2. What is the prevalence of health-related searches on the World Wide Web? Qualitative and quantitative analysis of search engine queries on the Internet

    PubMed Central

    Eysenbach, G.; Kohler, Ch.

    2003-01-01

    While health information is often said to be the most sought after information on the web, empirical data on the actual frequency of health-related searches on the web are missing. In the present study we aimed to determine the prevalence of health-related searches on the web by analyzing search terms entered by people into popular search engines. We also made some preliminary attempts in qualitatively describing and classifying these searches. Occasional difficulties in determining what constitutes a “health-related” search led us to propose and validate a simple method to automatically classify a search string as “health-related”. This method is based on determining the proportion of pages on the web containing the search string and the word “health”, as a proportion of the total number of pages with the search string alone. Using human codings as gold standard we plotted a ROC curve and determined empirically that if this “co-occurance rate” is larger than 35%, the search string can be said to be health-related (sensitivity: 85.2%, specificity 80.4%). The results of our “human” codings of search queries determined that about 4.5% of all searches are “health-related”. We estimate that globally a minimum of 6.75 Million health-related searches are being conducted on the web every day, which is roughly the same number of searches that have been conducted on the NLM Medlars system in 1996 in a full year. PMID:14728167

  3. Anatomy and evolution of database search engines-a central component of mass spectrometry based proteomic workflows.

    PubMed

    Verheggen, Kenneth; Raeder, Helge; Berven, Frode S; Martens, Lennart; Barsnes, Harald; Vaudel, Marc

    2017-09-13

    Sequence database search engines are bioinformatics algorithms that identify peptides from tandem mass spectra using a reference protein sequence database. Two decades of development, notably driven by advances in mass spectrometry, have provided scientists with more than 30 published search engines, each with its own properties. In this review, we present the common paradigm behind the different implementations, and its limitations for modern mass spectrometry datasets. We also detail how the search engines attempt to alleviate these limitations, and provide an overview of the different software frameworks available to the researcher. Finally, we highlight alternative approaches for the identification of proteomic mass spectrometry datasets, either as a replacement for, or as a complement to, sequence database search engines. © 2017 Wiley Periodicals, Inc.

  4. Image search engine with selective filtering and feature-element-based classification

    NASA Astrophysics Data System (ADS)

    Li, Qing; Zhang, Yujin; Dai, Shengyang

    2001-12-01

    With the growth of Internet and storage capability in recent years, image has become a widespread information format in World Wide Web. However, it has become increasingly harder to search for images of interest, and effective image search engine for the WWW needs to be developed. We propose in this paper a selective filtering process and a novel approach for image classification based on feature element in the image search engine we developed for the WWW. First a selective filtering process is embedded in a general web crawler to filter out the meaningless images with GIF format. Two parameters that can be obtained easily are used in the filtering process. Our classification approach first extract feature elements from images instead of feature vectors. Compared with feature vectors, feature elements can better capture visual meanings of the image according to subjective perception of human beings. Different from traditional image classification method, our classification approach based on feature element doesn't calculate the distance between two vectors in the feature space, while trying to find associations between feature element and class attribute of the image. Experiments are presented to show the efficiency of the proposed approach.

  5. Facilitated Protein Association via Engineered Target Search Pathways Visualized by Paramagnetic NMR Spectroscopy.

    PubMed

    An, So Young; Kim, Eun-Hee; Suh, Jeong-Yong

    2018-06-05

    Proteins assemble to form functional complexes via the progressive evolution of nonspecific complexes formed by transient encounters. This target search process generally involves multiple routes that lead the initial encounters to the final complex. In this study, we have employed NMR paramagnetic relaxation enhancement to visualize the encounter complexes between histidine-containing phosphocarrier protein and the N-terminal domain of enzyme I and demonstrate that protein association can be significantly enhanced by engineering on-pathways. Specifically, mutations in surface charges away from the binding interface can elicit new on-pathway encounter complexes, increasing their binding affinity by an order of magnitude. The structure of these encounter complexes indicates that such on-pathways extend the built-in target search process of the native protein complex. Furthermore, blocking on-pathways by countering mutations reverts their binding affinity. Our study thus illustrates that protein interactions can be engineered by rewiring the target search process. Copyright © 2018 Elsevier Ltd. All rights reserved.

  6. The accuracy of Internet search engines to predict diagnoses from symptoms can be assessed with a validated scoring system.

    PubMed

    Shenker, Bennett S

    2014-02-01

    To validate a scoring system that evaluates the ability of Internet search engines to correctly predict diagnoses when symptoms are used as search terms. We developed a five point scoring system to evaluate the diagnostic accuracy of Internet search engines. We identified twenty diagnoses common to a primary care setting to validate the scoring system. One investigator entered the symptoms for each diagnosis into three Internet search engines (Google, Bing, and Ask) and saved the first five webpages from each search. Other investigators reviewed the webpages and assigned a diagnostic accuracy score. They rescored a random sample of webpages two weeks later. To validate the five point scoring system, we calculated convergent validity and test-retest reliability using Kendall's W and Spearman's rho, respectively. We used the Kruskal-Wallis test to look for differences in accuracy scores for the three Internet search engines. A total of 600 webpages were reviewed. Kendall's W for the raters was 0.71 (p<0.0001). Spearman's rho for test-retest reliability was 0.72 (p<0.0001). There was no difference in scores based on Internet search engine. We found a significant difference in scores based on the webpage's order on the Internet search engine webpage (p=0.007). Pairwise comparisons revealed higher scores in the first webpages vs. the fourth (corr p=0.009) and fifth (corr p=0.017). However, this significance was lost when creating composite scores. The five point scoring system to assess diagnostic accuracy of Internet search engines is a valid and reliable instrument. The scoring system may be used in future Internet research. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  7. Knowledge-based personalized search engine for the Web-based Human Musculoskeletal System Resources (HMSR) in biomechanics.

    PubMed

    Dao, Tien Tuan; Hoang, Tuan Nha; Ta, Xuan Hien; Tho, Marie Christine Ho Ba

    2013-02-01

    Human musculoskeletal system resources of the human body are valuable for the learning and medical purposes. Internet-based information from conventional search engines such as Google or Yahoo cannot response to the need of useful, accurate, reliable and good-quality human musculoskeletal resources related to medical processes, pathological knowledge and practical expertise. In this present work, an advanced knowledge-based personalized search engine was developed. Our search engine was based on a client-server multi-layer multi-agent architecture and the principle of semantic web services to acquire dynamically accurate and reliable HMSR information by a semantic processing and visualization approach. A security-enhanced mechanism was applied to protect the medical information. A multi-agent crawler was implemented to develop a content-based database of HMSR information. A new semantic-based PageRank score with related mathematical formulas were also defined and implemented. As the results, semantic web service descriptions were presented in OWL, WSDL and OWL-S formats. Operational scenarios with related web-based interfaces for personal computers and mobile devices were presented and analyzed. Functional comparison between our knowledge-based search engine, a conventional search engine and a semantic search engine showed the originality and the robustness of our knowledge-based personalized search engine. In fact, our knowledge-based personalized search engine allows different users such as orthopedic patient and experts or healthcare system managers or medical students to access remotely into useful, accurate, reliable and good-quality HMSR information for their learning and medical purposes. Copyright © 2012 Elsevier Inc. All rights reserved.

  8. The importance of the keyword-generation method in keyword mnemonics.

    PubMed

    Campos, Alfredo; Amor, Angeles; González, María Angeles

    2004-01-01

    Keyword mnemonics is under certain conditions an effective approach for learning foreign-language vocabulary. It appears to be effective for words with high image vividness but not for words with low image vividness. In this study, two experiments were performed to assess the efficacy of a new keyword-generation procedure (peer generation). In Experiment 1, a sample of 363 high-school students was randomly into four groups. The subjects were required to learn L1 equivalents of a list of 16 Latin words (8 with high image vividness, 8 with low image vividness), using a) the rote method, or the keyword method with b) keywords and images generated and supplied by the experimenter, c) keywords and images generated by themselves, or d) keywords and images previously generated by peers (i.e., subjects with similar sociodemographic characteristics). Recall was tested immediately and one week later. For high-vivideness words, recall was significantly better in the keyword groups than the rote method group. For low-vividness words, learning method had no significant effect. Experiment 2 was basically identical, except that the word lists comprised 32 words (16 high-vividness, 16 low-vividness). In this experiment, the peer-generated-keyword group showed significantly better recall of high-vividness words than the rote method groups and the subject generated keyword group; again, however, learning method had no significant effect on recall of low-vividness words.

  9. Using Search Engine Query Data to Explore the Epidemiology of Common Gastrointestinal Symptoms.

    PubMed

    Hassid, Benjamin G; Day, Lukejohn W; Awad, Mohannad A; Sewell, Justin L; Osterberg, E Charles; Breyer, Benjamin N

    2017-03-01

    Internet searches are an increasingly used tool in medical research. To date, no studies have examined Google search data in relation to common gastrointestinal symptoms. The aim of this study was to compare trends in Internet search volume with clinical datasets for common gastrointestinal symptoms. Using Google Trends, we recorded relative changes in volume of searches related to dysphagia, vomiting, and diarrhea in the USA between January 2008 and January 2011. We queried the National Inpatient Sample (NIS) and the National Hospital Ambulatory Medical Care Survey (NHAMCS) during this time period and identified cases related to these symptoms. We assessed the correlation between Google Trends and these two clinical datasets, as well as examined seasonal variation trends. Changes to Google search volume for all three symptoms correlated significantly with changes to NIS output (dysphagia: r = 0.5, P = 0.002; diarrhea: r = 0.79, P < 0.001; vomiting: r = 0.76, P < 0.001). Both Google and NIS data showed that the prevalence of all three symptoms rose during the time period studied. On the other hand, the NHAMCS data trends during this time period did not correlate well with either the NIS or the Google data for any of the three symptoms studied. Both the NIS and Google data showed modest seasonal variation. Changes to the population burden of chronic GI symptoms may be tracked by monitoring changes to Google search engine query volume over time. These data demonstrate that the prevalence of common GI symptoms is rising over time.

  10. Using search engine query data to track pharmaceutical utilization: a study of statins.

    PubMed

    Schuster, Nathaniel M; Rogers, Mary A M; McMahon, Laurence F

    2010-08-01

    To examine temporal and geographic associations between Google queries for health information and healthcare utilization benchmarks. Retrospective longitudinal study. Using Google Trends and Google Insights for Search data, the search terms Lipitor (atorvastatin calcium; Pfizer, Ann Arbor, MI) and simvastatin were evaluated for change over time and for association with Lipitor revenues. The relationship between query data and community-based resource use per Medicare beneficiary was assessed for 35 US metropolitan areas. Google queries for Lipitor significantly decreased from January 2004 through June 2009 and queries for simvastatin significantly increased (P <.001 for both), particularly after Lipitor came off patent (P <.001 for change in slope). The mean number of Google queries for Lipitor correlated (r = 0.98) with the percentage change in Lipitor global revenues from 2004 to 2008 (P <.001). Query preference for Lipitor over simvastatin was positively associated (r = 0.40) with a community's use of Medicare services. For every 1% increase in utilization of Medicare services in a community, there was a 0.2-unit increase in the ratio of Lipitor queries to simvastatin queries in that community (P = .02). Specific search engine queries for medical information correlate with pharmaceutical revenue and with overall healthcare utilization in a community. This suggests that search query data can track community-wide characteristics in healthcare utilization and have the potential for informing payers and policy makers regarding trends in utilization.

  11. Is It "Writing on Water" or "Strike It Rich?" The Experiences of Prospective Teachers in Using Search Engines

    ERIC Educational Resources Information Center

    Sahin, Abdurrahman; Cermik, Hulya; Dogan, Birsen

    2010-01-01

    Information searching skills have become increasingly important for prospective teachers with the exponential growth of learning materials on the web. This study is an attempt to understand the experiences of prospective teachers with search engines through metaphoric images and to further investigate whether their experiences are related to the…

  12. Characterization of the human plasma phosphoproteome using linear ion trap mass spectrometry and multiple search engines.

    PubMed

    Carrascal, Montserrat; Gay, Marina; Ovelleiro, David; Casas, Vanessa; Gelpí, Emilio; Abian, Joaquin

    2010-02-05

    Major plasma protein families play different roles in blood physiology and hemostasis and in immunodefense. Other proteins in plasma can be involved in signaling as chemical messengers or constitute biological markers of the status of distant tissues. In this respect, the plasma phosphoproteome holds potentially relevant information on the mechanisms modulating these processes through the regulation of protein activity. In this work we describe for the first time a collection of phosphopeptides identified in human plasma using immunoaffinity separation of the seven major serum protein families from other plasma proteins, SCX fractionation, and TiO(2) purification prior to LC-MS/MS analysis. One-hundred and twenty-seven phosphosites in 138 phosphopeptides mapping 70 phosphoproteins were identified with FDR < 1%. A high-confidence collection of phosphosites was obtained using a combined search with the OMSSA, SEQUEST, and Phenyx search engines.

  13. A two-level cache for distributed information retrieval in search engines.

    PubMed

    Zhang, Weizhe; He, Hui; Ye, Jianwei

    2013-01-01

    To improve the performance of distributed information retrieval in search engines, we propose a two-level cache structure based on the queries of the users' logs. We extract the highest rank queries of users from the static cache, in which the queries are the most popular. We adopt the dynamic cache as an auxiliary to optimize the distribution of the cache data. We propose a distribution strategy of the cache data. The experiments prove that the hit rate, the efficiency, and the time consumption of the two-level cache have advantages compared with other structures of cache.

  14. A Two-Level Cache for Distributed Information Retrieval in Search Engines

    PubMed Central

    Zhang, Weizhe; He, Hui; Ye, Jianwei

    2013-01-01

    To improve the performance of distributed information retrieval in search engines, we propose a two-level cache structure based on the queries of the users' logs. We extract the highest rank queries of users from the static cache, in which the queries are the most popular. We adopt the dynamic cache as an auxiliary to optimize the distribution of the cache data. We propose a distribution strategy of the cache data. The experiments prove that the hit rate, the efficiency, and the time consumption of the two-level cache have advantages compared with other structures of cache. PMID:24363621

  15. Internet Search Engines: Copyright’s Fair Use in Reproduction and Public Display Rights

    DTIC Science & Technology

    2007-07-12

    asks, in other words, whether and to what extent the new work is transformative.” Campbell v. Acuff-Rose Music , Inc., 510 U.S. 569, 579 (1994). 10...purposes. It cited the U.S. Supreme Court’s decision in Sony Corp. v. Universal Studios, Inc.22 and Kelly, supra, as examples where copying of an...image may have been created originally to serve an entertainment , aesthetic, or informative function, a search engine transforms the image into a pointer

  16. Decision making in family medicine: randomized trial of the effects of the InfoClinique and Trip database search engines.

    PubMed

    Labrecque, Michel; Ratté, Stéphane; Frémont, Pierre; Cauchon, Michel; Ouellet, Jérôme; Hogg, William; McGowan, Jessie; Gagnon, Marie-Pierre; Njoya, Merlin; Légaré, France

    2013-10-01

    To compare the ability of users of 2 medical search engines, InfoClinique and the Trip database, to provide correct answers to clinical questions and to explore the perceived effects of the tools on the clinical decision-making process. Randomized trial. Three family medicine units of the family medicine program of the Faculty of Medicine at Laval University in Quebec city, Que. Fifteen second-year family medicine residents. Residents generated 30 structured questions about therapy or preventive treatment (2 questions per resident) based on clinical encounters. Using an Internet platform designed for the trial, each resident answered 20 of these questions (their own 2, plus 18 of the questions formulated by other residents, selected randomly) before and after searching for information with 1 of the 2 search engines. For each question, 5 residents were randomly assigned to begin their search with InfoClinique and 5 with the Trip database. The ability of residents to provide correct answers to clinical questions using the search engines, as determined by third-party evaluation. After answering each question, participants completed a questionnaire to assess their perception of the engine's effect on the decision-making process in clinical practice. Of 300 possible pairs of answers (1 answer before and 1 after the initial search), 254 (85%) were produced by 14 residents. Of these, 132 (52%) and 122 (48%) pairs of answers concerned questions that had been assigned an initial search with InfoClinique and the Trip database, respectively. Both engines produced an important and similar absolute increase in the proportion of correct answers after searching (26% to 62% for InfoClinique, for an increase of 36%; 24% to 63% for the Trip database, for an increase of 39%; P = .68). For all 30 clinical questions, at least 1 resident produced the correct answer after searching with either search engine. The mean (SD) time of the initial search for each question was 23.5 (7

  17. Semantic technologies improving the recall and precision of the Mercury metadata search engine

    NASA Astrophysics Data System (ADS)

    Pouchard, L. C.; Cook, R. B.; Green, J.; Palanisamy, G.; Noy, N.

    2011-12-01

    The Mercury federated metadata system [1] was developed at the Oak Ridge National Laboratory Distributed Active Archive Center (ORNL DAAC), a NASA-sponsored effort holding datasets about biogeochemical dynamics, ecological data, and environmental processes. Mercury currently indexes over 100,000 records from several data providers conforming to community standards, e.g. EML, FGDC, FGDC Biological Profile, ISO 19115 and DIF. With the breadth of sciences represented in Mercury, the potential exists to address some key interdisciplinary scientific challenges related to climate change, its environmental and ecological impacts, and mitigation of these impacts. However, this wealth of metadata also hinders pinpointing datasets relevant to a particular inquiry. We implemented a semantic solution after concluding that traditional search approaches cannot improve the accuracy of the search results in this domain because: a) unlike everyday queries, scientific queries seek to return specific datasets with numerous parameters that may or may not be exposed to search (Deep Web queries); b) the relevance of a dataset cannot be judged by its popularity, as each scientific inquiry tends to be unique; and c)each domain science has its own terminology, more or less curated, consensual, and standardized depending on the domain. The same terms may refer to different concepts across domains (homonyms), but different terms mean the same thing (synonyms). Interdisciplinary research is arduous because an expert in a domain must become fluent in the language of another, just to find relevant datasets. Thus, we decided to use scientific ontologies because they can provide a context for a free-text search, in a way that string-based keywords never will. With added context, relevant datasets are more easily discoverable. To enable search and programmatic access to ontology entities in Mercury, we are using an instance of the BioPortal ontology repository. Mercury accesses ontology entities

  18. Robust keyword retrieval method for OCRed text

    NASA Astrophysics Data System (ADS)

    Fujii, Yusaku; Takebe, Hiroaki; Tanaka, Hiroshi; Hotta, Yoshinobu

    2011-01-01

    Document management systems have become important because of the growing popularity of electronic filing of documents and scanning of books, magazines, manuals, etc., through a scanner or a digital camera, for storage or reading on a PC or an electronic book. Text information acquired by optical character recognition (OCR) is usually added to the electronic documents for document retrieval. Since texts generated by OCR generally include character recognition errors, robust retrieval methods have been introduced to overcome this problem. In this paper, we propose a retrieval method that is robust against both character segmentation and recognition errors. In the proposed method, the insertion of noise characters and dropping of characters in the keyword retrieval enables robustness against character segmentation errors, and character substitution in the keyword of the recognition candidate for each character in OCR or any other character enables robustness against character recognition errors. The recall rate of the proposed method was 15% higher than that of the conventional method. However, the precision rate was 64% lower.

  19. [On the seasonality of dermatoses: a retrospective analysis of search engine query data depending on the season].

    PubMed

    Köhler, M J; Springer, S; Kaatz, M

    2014-09-01

    The volume of search engine queries about disease-relevant items reflects public interest and correlates with disease prevalence as proven by the example of flu (influenza). Other influences include media attention or holidays. The present work investigates if the seasonality of prevalence or symptom severity of dermatoses correlates with search engine query data. The relative weekly volume of dermatological relevant search terms was assessed by the online tool Google Trends for the years 2009-2013. For each item, the degree of seasonality was calculated via frequency analysis and a geometric approach. Many dermatoses show a marked seasonality, reflected by search engine query volumes. Unexpected seasonal variations of these queries suggest a previously unknown variability of the respective disease prevalence. Furthermore, using the example of allergic rhinitis, a close correlation of search engine query data with actual pollen count can be demonstrated. In many cases, search engine query data are appropriate to estimate seasonal variability in prevalence of common dermatoses. This finding may be useful for real-time analysis and formation of hypotheses concerning pathogenetic or symptom aggravating mechanisms and may thus contribute to improvement of diagnostics and prevention of skin diseases.

  20. Maximizing the sensitivity and reliability of peptide identification in large-scale proteomic experiments by harnessing multiple search engines.

    PubMed

    Yu, Wen; Taylor, J Alex; Davis, Michael T; Bonilla, Leo E; Lee, Kimberly A; Auger, Paul L; Farnsworth, Chris C; Welcher, Andrew A; Patterson, Scott D

    2010-03-01

    Despite recent advances in qualitative proteomics, the automatic identification of peptides with optimal sensitivity and accuracy remains a difficult goal. To address this deficiency, a novel algorithm, Multiple Search Engines, Normalization and Consensus is described. The method employs six search engines and a re-scoring engine to search MS/MS spectra against protein and decoy sequences. After the peptide hits from each engine are normalized to error rates estimated from the decoy hits, peptide assignments are then deduced using a minimum consensus model. These assignments are produced in a series of progressively relaxed false-discovery rates, thus enabling a comprehensive interpretation of the data set. Additionally, the estimated false-discovery rate was found to have good concordance with the observed false-positive rate calculated from known identities. Benchmarking against standard proteins data sets (ISBv1, sPRG2006) and their published analysis, demonstrated that the Multiple Search Engines, Normalization and Consensus algorithm consistently achieved significantly higher sensitivity in peptide identifications, which led to increased or more robust protein identifications in all data sets compared with prior methods. The sensitivity and the false-positive rate of peptide identification exhibit an inverse-proportional and linear relationship with the number of participating search engines.

  1. Discovery of gigantic molecular nanostructures using a flow reaction array as a search engine.

    PubMed

    Zang, Hong-Ying; de la Oliva, Andreu Ruiz; Miras, Haralampos N; Long, De-Liang; McBurney, Roy T; Cronin, Leroy

    2014-04-28

    The discovery of gigantic molecular nanostructures like coordination and polyoxometalate clusters is extremely time-consuming since a vast combinatorial space needs to be searched, and even a systematic and exhaustive exploration of the available synthetic parameters relies on a great deal of serendipity. Here we present a synthetic methodology that combines a flow reaction array and algorithmic control to give a chemical 'real-space' search engine leading to the discovery and isolation of a range of new molecular nanoclusters based on [Mo(2)O(2)S(2)](2+)-based building blocks with either fourfold (C4) or fivefold (C5) symmetry templates and linkers. This engine leads us to isolate six new nanoscale cluster compounds: 1, {Mo(10)(C5)}; 2, {Mo(14)(C4)4(C5)2}; 3, {Mo(60)(C4)10}; 4, {Mo(48)(C4)6}; 5, {Mo(34)(C4)4}; 6, {Mo(18)(C4)9}; in only 200 automated experiments from a parameter space spanning ~5 million possible combinations.

  2. Evaluation of Quality and Readability of Health Information Websites Identified through India's Major Search Engines.

    PubMed

    Raj, S; Sharma, V L; Singh, A J; Goel, S

    2016-01-01

    Background. The available health information on websites should be reliable and accurate in order to make informed decisions by community. This study was done to assess the quality and readability of health information websites on World Wide Web in India. Methods. This cross-sectional study was carried out in June 2014. The key words "Health" and "Information" were used on search engines "Google" and "Yahoo." Out of 50 websites (25 from each search engines), after exclusion, 32 websites were evaluated. LIDA tool was used to assess the quality whereas the readability was assessed using Flesch Reading Ease Score (FRES), Flesch-Kincaid Grade Level (FKGL), and SMOG. Results. Forty percent of websites (n = 13) were sponsored by government. Health On the Net Code of Conduct (HONcode) certification was present on 50% (n = 16) of websites. The mean LIDA score (74.31) was average. Only 3 websites scored high on LIDA score. Only five had readability scores at recommended sixth-grade level. Conclusion. Most health information websites had average quality especially in terms of usability and reliability and were written at high readability levels. Efforts are needed to develop the health information websites which can help general population in informed decision making.

  3. Evaluation of Quality and Readability of Health Information Websites Identified through India's Major Search Engines

    PubMed Central

    Raj, S.; Sharma, V. L.; Singh, A. J.; Goel, S.

    2016-01-01

    Background. The available health information on websites should be reliable and accurate in order to make informed decisions by community. This study was done to assess the quality and readability of health information websites on World Wide Web in India. Methods. This cross-sectional study was carried out in June 2014. The key words “Health” and “Information” were used on search engines “Google” and “Yahoo.” Out of 50 websites (25 from each search engines), after exclusion, 32 websites were evaluated. LIDA tool was used to assess the quality whereas the readability was assessed using Flesch Reading Ease Score (FRES), Flesch-Kincaid Grade Level (FKGL), and SMOG. Results. Forty percent of websites (n = 13) were sponsored by government. Health On the Net Code of Conduct (HONcode) certification was present on 50% (n = 16) of websites. The mean LIDA score (74.31) was average. Only 3 websites scored high on LIDA score. Only five had readability scores at recommended sixth-grade level. Conclusion. Most health information websites had average quality especially in terms of usability and reliability and were written at high readability levels. Efforts are needed to develop the health information websites which can help general population in informed decision making. PMID:27119025

  4. Characterizing interdisciplinarity of researchers and research topics using web search engines.

    PubMed

    Sayama, Hiroki; Akaishi, Jin

    2012-01-01

    Researchers' networks have been subject to active modeling and analysis. Earlier literature mostly focused on citation or co-authorship networks reconstructed from annotated scientific publication databases, which have several limitations. Recently, general-purpose web search engines have also been utilized to collect information about social networks. Here we reconstructed, using web search engines, a network representing the relatedness of researchers to their peers as well as to various research topics. Relatedness between researchers and research topics was characterized by visibility boost-increase of a researcher's visibility by focusing on a particular topic. It was observed that researchers who had high visibility boosts by the same research topic tended to be close to each other in their network. We calculated correlations between visibility boosts by research topics and researchers' interdisciplinarity at the individual level (diversity of topics related to the researcher) and at the social level (his/her centrality in the researchers' network). We found that visibility boosts by certain research topics were positively correlated with researchers' individual-level interdisciplinarity despite their negative correlations with the general popularity of researchers. It was also found that visibility boosts by network-related topics had positive correlations with researchers' social-level interdisciplinarity. Research topics' correlations with researchers' individual- and social-level interdisciplinarities were found to be nearly independent from each other. These findings suggest that the notion of "interdisciplinarity" of a researcher should be understood as a multi-dimensional concept that should be evaluated using multiple assessment means.

  5. Discovery of gigantic molecular nanostructures using a flow reaction array as a search engine

    PubMed Central

    Zang, Hong-Ying; de la Oliva, Andreu Ruiz; Miras, Haralampos N.; Long, De-Liang; McBurney, Roy T.; Cronin, Leroy

    2014-01-01

    The discovery of gigantic molecular nanostructures like coordination and polyoxometalate clusters is extremely time-consuming since a vast combinatorial space needs to be searched, and even a systematic and exhaustive exploration of the available synthetic parameters relies on a great deal of serendipity. Here we present a synthetic methodology that combines a flow reaction array and algorithmic control to give a chemical ‘real-space’ search engine leading to the discovery and isolation of a range of new molecular nanoclusters based on [Mo2O2S2]2+-based building blocks with either fourfold (C4) or fivefold (C5) symmetry templates and linkers. This engine leads us to isolate six new nanoscale cluster compounds: 1, {Mo10(C5)}; 2, {Mo14(C4)4(C5)2}; 3, {Mo60(C4)10}; 4, {Mo48(C4)6}; 5, {Mo34(C4)4}; 6, {Mo18(C4)9}; in only 200 automated experiments from a parameter space spanning ~5 million possible combinations. PMID:24770632

  6. Relevance of Google-customized search engine vs. CISMeF quality-controlled health gateway.

    PubMed

    Gehanno, Jean-François; Kerdelhue, Gaétan; Sakji, Saoussen; Massari, Philippe; Joubert, Michel; Darmoni, Stéfan J

    2009-01-01

    CISMeF (acronym for Catalog and Index of French Language Health Resources on the Internet) is a quality-controlled health gateway conceived to catalog and index the most important and quality-controlled sources of institutional health information in French. The goal of this study is to compare the relevance of results provided by this gateway from a small set of documents selected and described by human experts to those provided by a search engine from a large set of automatically indexed and ranked resources. The Google-Customized search engine (CSE) was used. The evaluation was made using the 10th first results of 15 queries and two blinded physician evaluators. There was no significant difference between the relevance of information retrieval in CISMeF and Google CSE. In conclusion, automatic indexing does not lead to lower relevance than a manual MeSH indexing and may help to cope with the increasing number of references to be indexed in a controlled health quality gateway.

  7. Usability evaluation of an experimental text summarization system and three search engines: implications for the reengineering of health care interfaces.

    PubMed

    Kushniruk, Andre W; Kan, Min-Yem; McKeown, Kathleen; Klavans, Judith; Jordan, Desmond; LaFlamme, Mark; Patel, Vimia L

    2002-01-01

    This paper describes the comparative evaluation of an experimental automated text summarization system, Centrifuser and three conventional search engines - Google, Yahoo and About.com. Centrifuser provides information to patients and families relevant to their questions about specific health conditions. It then produces a multidocument summary of articles retrieved by a standard search engine, tailored to the user's question. Subjects, consisting of friends or family of hospitalized patients, were asked to "think aloud" as they interacted with the four systems. The evaluation involved audio- and video recording of subject interactions with the interfaces in situ at a hospital. Results of the evaluation show that subjects found Centrifuser's summarization capability useful and easy to understand. In comparing Centrifuser to the three search engines, subjects' ratings varied; however, specific interface features were deemed useful across interfaces. We conclude with a discussion of the implications for engineering Web-based retrieval systems.

  8. Abyss or Shelter? On the Relevance of Web Search Engines' Search Results When People Google for Suicide.

    PubMed

    Haim, Mario; Arendt, Florian; Scherr, Sebastian

    2017-02-01

    Despite evidence that suicide rates can increase after suicides are widely reported in the media, appropriate depictions of suicide in the media can help people to overcome suicidal crises and can thus elicit preventive effects. We argue on the level of individual media users that a similar ambivalence can be postulated for search results on online suicide-related search queries. Importantly, the filter bubble hypothesis (Pariser, 2011) states that search results are biased by algorithms based on a person's previous search behavior. In this study, we investigated whether suicide-related search queries, including either potentially suicide-preventive or -facilitative terms, influence subsequent search results. This might thus protect or harm suicidal Internet users. We utilized a 3 (search history: suicide-related harmful, suicide-related helpful, and suicide-unrelated) × 2 (reactive: clicking the top-most result link and no clicking) experimental design applying agent-based testing. While findings show no influences either of search histories or of reactivity on search results in a subsequent situation, the presentation of a helpline offer raises concerns about possible detrimental algorithmic decision-making: Algorithms "decided" whether or not to present a helpline, and this automated decision, then, followed the agent throughout the rest of the observation period. Implications for policy-making and search providers are discussed.

  9. Multimedia explorer: image database, image proxy-server and search-engine.

    PubMed

    Frankewitsch, T; Prokosch, U

    1999-01-01

    Multimedia plays a major role in medicine. Databases containing images, movies or other types of multimedia objects are increasing in number, especially on the WWW. However, no good retrieval mechanism or search engine currently exists to efficiently track down such multimedia sources in the vast of information provided by the WWW. Secondly, the tools for searching databases are usually not adapted to the properties of images. HTML pages do not allow complex searches. Therefore establishing a more comfortable retrieval involves the use of a higher programming level like JAVA. With this platform independent language it is possible to create extensions to commonly used web browsers. These applets offer a graphical user interface for high level navigation. We implemented a database using JAVA objects as the primary storage container which are then stored by a JAVA controlled ORACLE8 database. Navigation depends on a structured vocabulary enhanced by a semantic network. With this approach multimedia objects can be encapsulated within a logical module for quick data retrieval.

  10. Multimedia explorer: image database, image proxy-server and search-engine.

    PubMed Central

    Frankewitsch, T.; Prokosch, U.

    1999-01-01

    Multimedia plays a major role in medicine. Databases containing images, movies or other types of multimedia objects are increasing in number, especially on the WWW. However, no good retrieval mechanism or search engine currently exists to efficiently track down such multimedia sources in the vast of information provided by the WWW. Secondly, the tools for searching databases are usually not adapted to the properties of images. HTML pages do not allow complex searches. Therefore establishing a more comfortable retrieval involves the use of a higher programming level like JAVA. With this platform independent language it is possible to create extensions to commonly used web browsers. These applets offer a graphical user interface for high level navigation. We implemented a database using JAVA objects as the primary storage container which are then stored by a JAVA controlled ORACLE8 database. Navigation depends on a structured vocabulary enhanced by a semantic network. With this approach multimedia objects can be encapsulated within a logical module for quick data retrieval. PMID:10566463

  11. Sagace: A web-based search engine for biomedical databases in Japan

    PubMed Central

    2012-01-01

    Background In the big data era, biomedical research continues to generate a large amount of data, and the generated information is often stored in a database and made publicly available. Although combining data from multiple databases should accelerate further studies, the current number of life sciences databases is too large to grasp features and contents of each database. Findings We have developed Sagace, a web-based search engine that enables users to retrieve information from a range of biological databases (such as gene expression profiles and proteomics data) and biological resource banks (such as mouse models of disease and cell lines). With Sagace, users can search more than 300 databases in Japan. Sagace offers features tailored to biomedical research, including manually tuned ranking, a faceted navigation to refine search results, and rich snippets constructed with retrieved metadata for each database entry. Conclusions Sagace will be valuable for experts who are involved in biomedical research and drug development in both academia and industry. Sagace is freely available at http://sagace.nibio.go.jp/en/. PMID:23110816

  12. Community Involvement in Enhancing the Global Change Master Directory (GCMD) Controlled Vocabularies (Keywords)

    NASA Technical Reports Server (NTRS)

    Stevens, T.; Ritz, S.; Aleman, A.; Genazzio, M.; Morahan, M.; Wharton, S.

    2016-01-01

    NASA's Global Change Master Directory (GCMD) develops and expands a hierarchical set of controlled vocabularies (keywords) covering the Earth sciences and associated information (data centers, projects, platforms, instruments, etc.). The purpose of the keywords is to describe Earth science data and services in a consistent and comprehensive manner, allowing for the precise searching of metadata and subsequent retrieval of data and services. The keywords are accessible in a standardized SKOSRDFOWL representation and are used as an authoritative taxonomy, as a source for developing ontologies, and to search and access Earth Science data within online metadata catalogues. The keyword development approach involves: (1) receiving community suggestions, (2) triaging community suggestions, (3) evaluating the keywords against a set of criteria coordinated by the NASA ESDIS Standards Office, and (4) publication/notification of the keyword changes. This approach emphasizes community input, which helps ensure a high quality, normalized, and relevant keyword structure that will evolve with users changing needs. The Keyword Community Forum, which promotes a responsive, open, and transparent processes, is an area where users can discuss keyword topics and make suggestions for new keywords. The formalized approach could potentially be used as a model for keyword development.

  13. Keyword Extraction from Multiple Words for Report Recommendations in Media Wiki

    NASA Astrophysics Data System (ADS)

    Elakiya, K.; Sahayadhas, Arun

    2017-03-01

    This paper addresses the problem of multiple words search, with the goal of using these multiple word search to retrieve, relevant wiki page which will be recommended to end user. However, the existing system provides a link to wiki page for only a single keyword only which is available in Wikipedia. Therefore it is difficult to get the correct result when search input has multiple keywords or a sentence. We have introduced a ‘FastStringSearch’ technique which will provide option for efficient search with multiple key words and which will increase the flexibility for the end user to get his expected content easily.

  14. Marine Planning and Service Platform: specific ontology based semantic search engine serving data management and sustainable development

    NASA Astrophysics Data System (ADS)

    Manzella, Giuseppe M. R.; Bartolini, Andrea; Bustaffa, Franco; D'Angelo, Paolo; De Mattei, Maurizio; Frontini, Francesca; Maltese, Maurizio; Medone, Daniele; Monachini, Monica; Novellino, Antonio; Spada, Andrea

    2016-04-01

    The MAPS (Marine Planning and Service Platform) project is aiming at building a computer platform supporting a Marine Information and Knowledge System. One of the main objective of the project is to develop a repository that should gather, classify and structure marine scientific literature and data thus guaranteeing their accessibility to researchers and institutions by means of standard protocols. In oceanography the cost related to data collection is very high and the new paradigm is based on the concept to collect once and re-use many times (for re-analysis, marine environment assessment, studies on trends, etc). This concept requires the access to quality controlled data and to information that is provided in reports (grey literature) and/or in relevant scientific literature. Hence, creation of new technology is needed by integrating several disciplines such as data management, information systems, knowledge management. In one of the most important EC projects on data management, namely SeaDataNet (www.seadatanet.org), an initial example of knowledge management is provided through the Common Data Index, that is providing links to data and (eventually) to papers. There are efforts to develop search engines to find author's contributions to scientific literature or publications. This implies the use of persistent identifiers (such as DOI), as is done in ORCID. However very few efforts are dedicated to link publications to the data cited or used or that can be of importance for the published studies. This is the objective of MAPS. Full-text technologies are often unsuccessful since they assume the presence of specific keywords in the text; in order to fix this problem, the MAPS project suggests to use different semantic technologies for retrieving the text and data and thus getting much more complying results. The main parts of our design of the search engine are: • Syntactic parser - This module is responsible for the extraction of "rich words" from the text

  15. L1000CDS2: LINCS L1000 characteristic direction signatures search engine.

    PubMed

    Duan, Qiaonan; Reid, St Patrick; Clark, Neil R; Wang, Zichen; Fernandez, Nicolas F; Rouillard, Andrew D; Readhead, Ben; Tritsch, Sarah R; Hodos, Rachel; Hafner, Marc; Niepel, Mario; Sorger, Peter K; Dudley, Joel T; Bavari, Sina; Panchal, Rekha G; Ma'ayan, Avi

    2016-01-01

    The library of integrated network-based cellular signatures (LINCS) L1000 data set currently comprises of over a million gene expression profiles of chemically perturbed human cell lines. Through unique several intrinsic and extrinsic benchmarking schemes, we demonstrate that processing the L1000 data with the characteristic direction (CD) method significantly improves signal to noise compared with the MODZ method currently used to compute L1000 signatures. The CD processed L1000 signatures are served through a state-of-the-art web-based search engine application called L1000CDS 2 . The L1000CDS 2 search engine provides prioritization of thousands of small-molecule signatures, and their pairwise combinations, predicted to either mimic or reverse an input gene expression signature using two methods. The L1000CDS 2 search engine also predicts drug targets for all the small molecules profiled by the L1000 assay that we processed. Targets are predicted by computing the cosine similarity between the L1000 small-molecule signatures and a large collection of signatures extracted from the gene expression omnibus (GEO) for single-gene perturbations in mammalian cells. We applied L1000CDS 2 to prioritize small molecules that are predicted to reverse expression in 670 disease signatures also extracted from GEO, and prioritized small molecules that can mimic expression of 22 endogenous ligand signatures profiled by the L1000 assay. As a case study, to further demonstrate the utility of L1000CDS 2 , we collected expression signatures from human cells infected with Ebola virus at 30, 60 and 120 min. Querying these signatures with L1000CDS 2 we identified kenpaullone, a GSK3B/CDK2 inhibitor that we show, in subsequent experiments, has a dose-dependent efficacy in inhibiting Ebola infection in vitro without causing cellular toxicity in human cell lines. In summary, the L1000CDS 2 tool can be applied in many biological and biomedical settings, while improving the extraction of

  16. L1000CDS2: LINCS L1000 characteristic direction signatures search engine

    PubMed Central

    Duan, Qiaonan; Reid, St Patrick; Clark, Neil R; Wang, Zichen; Fernandez, Nicolas F; Rouillard, Andrew D; Readhead, Ben; Tritsch, Sarah R; Hodos, Rachel; Hafner, Marc; Niepel, Mario; Sorger, Peter K; Dudley, Joel T; Bavari, Sina; Panchal, Rekha G; Ma’ayan, Avi

    2016-01-01

    The library of integrated network-based cellular signatures (LINCS) L1000 data set currently comprises of over a million gene expression profiles of chemically perturbed human cell lines. Through unique several intrinsic and extrinsic benchmarking schemes, we demonstrate that processing the L1000 data with the characteristic direction (CD) method significantly improves signal to noise compared with the MODZ method currently used to compute L1000 signatures. The CD processed L1000 signatures are served through a state-of-the-art web-based search engine application called L1000CDS2. The L1000CDS2 search engine provides prioritization of thousands of small-molecule signatures, and their pairwise combinations, predicted to either mimic or reverse an input gene expression signature using two methods. The L1000CDS2 search engine also predicts drug targets for all the small molecules profiled by the L1000 assay that we processed. Targets are predicted by computing the cosine similarity between the L1000 small-molecule signatures and a large collection of signatures extracted from the gene expression omnibus (GEO) for single-gene perturbations in mammalian cells. We applied L1000CDS2 to prioritize small molecules that are predicted to reverse expression in 670 disease signatures also extracted from GEO, and prioritized small molecules that can mimic expression of 22 endogenous ligand signatures profiled by the L1000 assay. As a case study, to further demonstrate the utility of L1000CDS2, we collected expression signatures from human cells infected with Ebola virus at 30, 60 and 120 min. Querying these signatures with L1000CDS2 we identified kenpaullone, a GSK3B/CDK2 inhibitor that we show, in subsequent experiments, has a dose-dependent efficacy in inhibiting Ebola infection in vitro without causing cellular toxicity in human cell lines. In summary, the L1000CDS2 tool can be applied in many biological and biomedical settings, while improving the extraction of knowledge

  17. Rapid automatic keyword extraction for information retrieval and analysis

    DOEpatents

    Rose, Stuart J [Richland, WA; Cowley,; E, Wendy [Richland, WA; Crow, Vernon L [Richland, WA; Cramer, Nicholas O [Richland, WA

    2012-03-06

    Methods and systems for rapid automatic keyword extraction for information retrieval and analysis. Embodiments can include parsing words in an individual document by delimiters, stop words, or both in order to identify candidate keywords. Word scores for each word within the candidate keywords are then calculated based on a function of co-occurrence degree, co-occurrence frequency, or both. Based on a function of the word scores for words within the candidate keyword, a keyword score is calculated for each of the candidate keywords. A portion of the candidate keywords are then extracted as keywords based, at least in part, on the candidate keywords having the highest keyword scores.

  18. Feeling lucky? Using search engines to assess perceptions of urban sustainability

    SciTech Connect

    Keirstead, James

    2009-02-15

    The sustainability of urban environments is an important issue at both local and international scales. Indicators are frequently used by decision-makers seeking to improve urban performance but these metrics can be dependent on sparse quantitative data. This paper explores the potential of an alternative approach, using an internet search engine to quickly gather qualitative data on the key attributes of cities. The method is applied to 21 world cities and the results indicate that, while the technique does shed light on direct and indirect aspects of sustainability, the validity of derived metrics as objective indicators of long-term sustainability is questionable.more » However the method's ability to provide subjective short-term assessments is more promising and it could therefore play an important role in participatory policy exercises such as public consultations. A number of promising technical improvements to the method's performance are also highlighted.« less

  19. World Wide Web Based Image Search Engine Using Text and Image Content Features

    NASA Astrophysics Data System (ADS)

    Luo, Bo; Wang, Xiaogang; Tang, Xiaoou

    2003-01-01

    Using both text and image content features, a hybrid image retrieval system for Word Wide Web is developed in this paper. We first use a text-based image meta-search engine to retrieve images from the Web based on the text information on the image host pages to provide an initial image set. Because of the high-speed and low cost nature of the text-based approach, we can easily retrieve a broad coverage of images with a high recall rate and a relatively low precision. An image content based ordering is then performed on the initial image set. All the images are clustered into different folders based on the image content features. In addition, the images can be re-ranked by the content features according to the user feedback. Such a design makes it truly practical to use both text and image content for image retrieval over the Internet. Experimental results confirm the efficiency of the system.

  20. Characterizing Interdisciplinarity of Researchers and Research Topics Using Web Search Engines

    PubMed Central

    Sayama, Hiroki; Akaishi, Jin

    2012-01-01

    Researchers' networks have been subject to active modeling and analysis. Earlier literature mostly focused on citation or co-authorship networks reconstructed from annotated scientific publication databases, which have several limitations. Recently, general-purpose web search engines have also been utilized to collect information about social networks. Here we reconstructed, using web search engines, a network representing the relatedness of researchers to their peers as well as to various research topics. Relatedness between researchers and research topics was characterized by visibility boost—increase of a researcher's visibility by focusing on a particular topic. It was observed that researchers who had high visibility boosts by the same research topic tended to be close to each other in their network. We calculated correlations between visibility boosts by research topics and researchers' interdisciplinarity at the individual level (diversity of topics related to the researcher) and at the social level (his/her centrality in the researchers' network). We found that visibility boosts by certain research topics were positively correlated with researchers' individual-level interdisciplinarity despite their negative correlations with the general popularity of researchers. It was also found that visibility boosts by network-related topics had positive correlations with researchers' social-level interdisciplinarity. Research topics' correlations with researchers' individual- and social-level interdisciplinarities were found to be nearly independent from each other. These findings suggest that the notion of “interdisciplinarity" of a researcher should be understood as a multi-dimensional concept that should be evaluated using multiple assessment means. PMID:22719935

  1. IPeak: An open source tool to combine results from multiple MS/MS search engines.

    PubMed

    Wen, Bo; Du, Chaoqin; Li, Guilin; Ghali, Fawaz; Jones, Andrew R; Käll, Lukas; Xu, Shaohang; Zhou, Ruo; Ren, Zhe; Feng, Qiang; Xu, Xun; Wang, Jun

    2015-09-01

    Liquid chromatography coupled tandem mass spectrometry (LC-MS/MS) is an important technique for detecting peptides in proteomics studies. Here, we present an open source software tool, termed IPeak, a peptide identification pipeline that is designed to combine the Percolator post-processing algorithm and multi-search strategy to enhance the sensitivity of peptide identifications without compromising accuracy. IPeak provides a graphical user interface (GUI) as well as a command-line interface, which is implemented in JAVA and can work on all three major operating system platforms: Windows, Linux/Unix and OS X. IPeak has been designed to work with the mzIdentML standard from the Proteomics Standards Initiative (PSI) as an input and output, and also been fully integrated into the associated mzidLibrary project, providing access to the overall pipeline, as well as modules for calling Percolator on individual search engine result files. The integration thus enables IPeak (and Percolator) to be used in conjunction with any software packages implementing the mzIdentML data standard. IPeak is freely available and can be downloaded under an Apache 2.0 license at https://code.google.com/p/mzidentml-lib/. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  2. Towards iconic language for patient records, drug monographs, guidelines and medical search engines.

    PubMed

    Lamy, Jean-Baptiste; Duclos, Catherine; Hamek, Saliha; Beuscart-Zéphir, Marie-Catherine; Kerdelhué, Gaetan; Darmoni, Stefan; Favre, Madeleine; Falcoff, Hector; Simon, Christian; Pereira, Suzanne; Serrot, Elisabeth; Mitouard, Thierry; Hardouin, Etienne; Kergosien, Yannick; Venot, Alain

    2010-01-01

    Practicing physicians have limited time for consulting medical knowledge and records. We have previously shown that using icons instead of text to present drug monographs may allow contraindications and adverse effects to be identified more rapidly and more accurately. These findings were based on the use of an iconic language designed for drug knowledge, providing icons for many medical concepts, including diseases, antecedents, drug classes and tests. In this paper, we describe a new project aimed at extending this iconic language, and exploring the possible applications of these icons in medicine. Based on evaluators' comments, focus groups of physicians and opinions of academic, industrial and associative partners, we propose iconic applications related to patient records, for example summarizing patient conditions, searching for specific clinical documents and helping to code structured data. Other applications involve the presentation of clinical practice guidelines and improving the interface of medical search engines. These new applications could use the same iconic language that was designed for drug knowledge, with a few additional items that respect the logic of the language.

  3. Heart research advances using database search engines, Human Protein Atlas and the Sydney Heart Bank.

    PubMed

    Li, Amy; Estigoy, Colleen; Raftery, Mark; Cameron, Darryl; Odeberg, Jacob; Pontén, Fredrik; Lal, Sean; Dos Remedios, Cristobal G

    2013-10-01

    This Methodological Review is intended as a guide for research students who may have just discovered a human "novel" cardiac protein, but it may also help hard-pressed reviewers of journal submissions on a "novel" protein reported in an animal model of human heart failure. Whether you are an expert or not, you may know little or nothing about this particular protein of interest. In this review we provide a strategic guide on how to proceed. We ask: How do you discover what has been published (even in an abstract or research report) about this protein? Everyone knows how to undertake literature searches using PubMed and Medline but these are usually encyclopaedic, often producing long lists of papers, most of which are either irrelevant or only vaguely relevant to your query. Relatively few will be aware of more advanced search engines such as Google Scholar and even fewer will know about Quertle. Next, we provide a strategy for discovering if your "novel" protein is expressed in the normal, healthy human heart, and if it is, we show you how to investigate its subcellular location. This can usually be achieved by visiting the website "Human Protein Atlas" without doing a single experiment. Finally, we provide a pathway to discovering if your protein of interest changes its expression level with heart failure/disease or with ageing. Crown Copyright © 2013. Published by Elsevier B.V. All rights reserved.

  4. Objective and automated protocols for the evaluation of biomedical search engines using No Title Evaluation protocols.

    PubMed

    Campagne, Fabien

    2008-02-29

    The evaluation of information retrieval techniques has traditionally relied on human judges to determine which documents are relevant to a query and which are not. This protocol is used in the Text Retrieval Evaluation Conference (TREC), organized annually for the past 15 years, to support the unbiased evaluation of novel information retrieval approaches. The TREC Genomics Track has recently been introduced to measure the performance of information retrieval for biomedical applications. We describe two protocols for evaluating biomedical information retrieval techniques without human relevance judgments. We call these protocols No Title Evaluation (NT Evaluation). The first protocol measures performance for focused searches, where only one relevant document exists for each query. The second protocol measures performance for queries expected to have potentially many relevant documents per query (high-recall searches). Both protocols take advantage of the clear separation of titles and abstracts found in Medline. We compare the performance obtained with these evaluation protocols to results obtained by reusing the relevance judgments produced in the 2004 and 2005 TREC Genomics Track and observe significant correlations between performance rankings generated by our approach and TREC. Spearman's correlation coefficients in the range of 0.79-0.92 are observed comparing bpref measured with NT Evaluation or with TREC evaluations. For comparison, coefficients in the range 0.86-0.94 can be observed when evaluating the same set of methods with data from two independent TREC Genomics Track evaluations. We discuss the advantages of NT Evaluation over the TRels and the data fusion evaluation protocols introduced recently. Our results suggest that the NT Evaluation protocols described here could be used to optimize some search engine parameters before human evaluation. Further research is needed to determine if NT Evaluation or variants of these protocols can fully substitute

  5. SeqWare Query Engine: storing and searching sequence data in the cloud.

    PubMed

    O'Connor, Brian D; Merriman, Barry; Nelson, Stanley F

    2010-12-21

    analytical tools. The range of data types supported, the ease of querying and integrating with existing tools, and the robust scalability of the underlying cloud-based technologies make SeqWare Query Engine a nature fit for storing and searching ever-growing genome sequence datasets.

  6. SeqWare Query Engine: storing and searching sequence data in the cloud

    PubMed Central

    2010-01-01

    interface to simplify development of analytical tools. The range of data types supported, the ease of querying and integrating with existing tools, and the robust scalability of the underlying cloud-based technologies make SeqWare Query Engine a nature fit for storing and searching ever-growing genome sequence datasets. PMID:21210981

  7. What We've Learned From Doing Usability Testing on OpenURL Resolvers and Federated Search Engines

    ERIC Educational Resources Information Center

    Cervone, Frank

    2005-01-01

    OpenURL resolvers and federated search engines are important new services in the library field. For some librarians, these services may seem "old hat" by now, but for the majority these services are still in the early stages of implementation or planning. In many cases, these two services are offered as a seamlessly integrated whole.…

  8. Medical record search engines, using pseudonymised patient identity: an alternative to centralised medical records.

    PubMed

    Quantin, Catherine; Jaquet-Chiffelle, David-Olivier; Coatrieux, Gouenou; Benzenine, Eric; Allaert, François-André

    2011-02-01

    The purpose of our multidisciplinary study was to define a pragmatic and secure alternative to the creation of a national centralised medical record which could gather together the different parts of the medical record of a patient scattered in the different hospitals where he was hospitalised without any risk of breaching confidentiality. We first analyse the reasons for the failure and the dangers of centralisation (i.e. difficulty to define a European patients' identifier, to reach a common standard for the contents of the medical record, for data protection) and then propose an alternative that uses the existing available data on the basis that setting up a safe though imperfect system could be better than continuing a quest for a mythical perfect information system that we have still not found after a search that has lasted two decades. We describe the functioning of Medical Record Search Engines (MRSEs), using pseudonymisation of patients' identity. The MRSE will be able to retrieve and to provide upon an MD's request all the available information concerning a patient who has been hospitalised in different hospitals without ever having access to the patient's identity. The drawback of this system is that the medical practitioner then has to read all of the information and to create his own synthesis and eventually to reject extra data. Faced with the difficulties and the risks of setting up a centralised medical record system, a system that gathers all of the available information concerning a patient could be of great interest. This low-cost pragmatic alternative which could be developed quickly should be taken into consideration by health authorities. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.

  9. Semantic similarity measures in the biomedical domain by leveraging a web search engine.

    PubMed

    Hsieh, Sheau-Ling; Chang, Wen-Yung; Chen, Chi-Huang; Weng, Yung-Ching

    2013-07-01

    Various researches in web related semantic similarity measures have been deployed. However, measuring semantic similarity between two terms remains a challenging task. The traditional ontology-based methodologies have a limitation that both concepts must be resided in the same ontology tree(s). Unfortunately, in practice, the assumption is not always applicable. On the other hand, if the corpus is sufficiently adequate, the corpus-based methodologies can overcome the limitation. Now, the web is a continuous and enormous growth corpus. Therefore, a method of estimating semantic similarity is proposed via exploiting the page counts of two biomedical concepts returned by Google AJAX web search engine. The features are extracted as the co-occurrence patterns of two given terms P and Q, by querying P, Q, as well as P AND Q, and the web search hit counts of the defined lexico-syntactic patterns. These similarity scores of different patterns are evaluated, by adapting support vector machines for classification, to leverage the robustness of semantic similarity measures. Experimental results validating against two datasets: dataset 1 provided by A. Hliaoutakis; dataset 2 provided by T. Pedersen, are presented and discussed. In dataset 1, the proposed approach achieves the best correlation coefficient (0.802) under SNOMED-CT. In dataset 2, the proposed method obtains the best correlation coefficient (SNOMED-CT: 0.705; MeSH: 0.723) with physician scores comparing with measures of other methods. However, the correlation coefficients (SNOMED-CT: 0.496; MeSH: 0.539) with coder scores received opposite outcomes. In conclusion, the semantic similarity findings of the proposed method are close to those of physicians' ratings. Furthermore, the study provides a cornerstone investigation for extracting fully relevant information from digitizing, free-text medical records in the National Taiwan University Hospital database.

  10. The impact of search engine selection and sorting criteria on vaccination beliefs and attitudes: two experiments manipulating Google output.

    PubMed

    Allam, Ahmed; Schulz, Peter Johannes; Nakamoto, Kent

    2014-04-02

    During the past 2 decades, the Internet has evolved to become a necessity in our daily lives. The selection and sorting algorithms of search engines exert tremendous influence over the global spread of information and other communication processes. This study is concerned with demonstrating the influence of selection and sorting/ranking criteria operating in search engines on users' knowledge, beliefs, and attitudes of websites about vaccination. In particular, it is to compare the effects of search engines that deliver websites emphasizing on the pro side of vaccination with those focusing on the con side and with normal Google as a control group. We conducted 2 online experiments using manipulated search engines. A pilot study was to verify the existence of dangerous health literacy in connection with searching and using health information on the Internet by exploring the effect of 2 manipulated search engines that yielded either pro or con vaccination sites only, with a group receiving normal Google as control. A pre-post test design was used; participants were American marketing students enrolled in a study-abroad program in Lugano, Switzerland. The second experiment manipulated the search engine by applying different ratios of con versus pro vaccination webpages displayed in the search results. Participants were recruited from Amazon's Mechanical Turk platform where it was published as a human intelligence task (HIT). Both experiments showed knowledge highest in the group offered only pro vaccination sites (Z=-2.088, P=.03; Kruskal-Wallis H test [H₅]=11.30, P=.04). They acknowledged the importance/benefits (Z=-2.326, P=.02; H5=11.34, P=.04) and effectiveness (Z=-2.230, P=.03) of vaccination more, whereas groups offered antivaccination sites only showed increased concern about effects (Z=-2.582, P=.01; H₅=16.88, P=.005) and harmful health outcomes (Z=-2.200, P=.02) of vaccination. Normal Google users perceived information quality to be positive despite a

  11. The Impact of Search Engine Selection and Sorting Criteria on Vaccination Beliefs and Attitudes: Two Experiments Manipulating Google Output

    PubMed Central

    Schulz, Peter Johannes; Nakamoto, Kent

    2014-01-01

    Background During the past 2 decades, the Internet has evolved to become a necessity in our daily lives. The selection and sorting algorithms of search engines exert tremendous influence over the global spread of information and other communication processes. Objective This study is concerned with demonstrating the influence of selection and sorting/ranking criteria operating in search engines on users’ knowledge, beliefs, and attitudes of websites about vaccination. In particular, it is to compare the effects of search engines that deliver websites emphasizing on the pro side of vaccination with those focusing on the con side and with normal Google as a control group. Method We conducted 2 online experiments using manipulated search engines. A pilot study was to verify the existence of dangerous health literacy in connection with searching and using health information on the Internet by exploring the effect of 2 manipulated search engines that yielded either pro or con vaccination sites only, with a group receiving normal Google as control. A pre-post test design was used; participants were American marketing students enrolled in a study-abroad program in Lugano, Switzerland. The second experiment manipulated the search engine by applying different ratios of con versus pro vaccination webpages displayed in the search results. Participants were recruited from Amazon’s Mechanical Turk platform where it was published as a human intelligence task (HIT). Results Both experiments showed knowledge highest in the group offered only pro vaccination sites (Z=–2.088, P=.03; Kruskal-Wallis H test [H5]=11.30, P=.04). They acknowledged the importance/benefits (Z=–2.326, P=.02; H5=11.34, P=.04) and effectiveness (Z=–2.230, P=.03) of vaccination more, whereas groups offered antivaccination sites only showed increased concern about effects (Z=–2.582, P=.01; H5=16.88, P=.005) and harmful health outcomes (Z=–2.200, P=.02) of vaccination. Normal Google users perceived

  12. Development and empirical user-centered evaluation of semantically-based query recommendation for an electronic health record search engine.

    PubMed

    Hanauer, David A; Wu, Danny T Y; Yang, Lei; Mei, Qiaozhu; Murkowski-Steffy, Katherine B; Vydiswaran, V G Vinod; Zheng, Kai

    2017-03-01

    The utility of biomedical information retrieval environments can be severely limited when users lack expertise in constructing effective search queries. To address this issue, we developed a computer-based query recommendation algorithm that suggests semantically interchangeable terms based on an initial user-entered query. In this study, we assessed the value of this approach, which has broad applicability in biomedical information retrieval, by demonstrating its application as part of a search engine that facilitates retrieval of information from electronic health records (EHRs). The query recommendation algorithm utilizes MetaMap to identify medical concepts from search queries and indexed EHR documents. Synonym variants from UMLS are used to expand the concepts along with a synonym set curated from historical EHR search logs. The empirical study involved 33 clinicians and staff who evaluated the system through a set of simulated EHR search tasks. User acceptance was assessed using the widely used technology acceptance model. The search engine's performance was rated consistently higher with the query recommendation feature turned on vs. off. The relevance of computer-recommended search terms was also rated high, and in most cases the participants had not thought of these terms on their own. The questions on perceived usefulness and perceived ease of use received overwhelmingly positive responses. A vast majority of the participants wanted the query recommendation feature to be available to assist in their day-to-day EHR search tasks. Challenges persist for users to construct effective search queries when retrieving information from biomedical documents including those from EHRs. This study demonstrates that semantically-based query recommendation is a viable solution to addressing this challenge. Published by Elsevier Inc.

  13. Enhanced Missing Proteins Detection in NCI60 Cell Lines Using an Integrative Search Engine Approach.

    PubMed

    Guruceaga, Elizabeth; Garin-Muga, Alba; Prieto, Gorka; Bejarano, Bartolomé; Marcilla, Miguel; Marín-Vicente, Consuelo; Perez-Riverol, Yasset; Casal, J Ignacio; Vizcaíno, Juan Antonio; Corrales, Fernando J; Segura, Victor

    2017-12-01

    The Human Proteome Project (HPP) aims deciphering the complete map of the human proteome. In the past few years, significant efforts of the HPP teams have been dedicated to the experimental detection of the missing proteins, which lack reliable mass spectrometry evidence of their existence. In this endeavor, an in depth analysis of shotgun experiments might represent a valuable resource to select a biological matrix in design validation experiments. In this work, we used all the proteomic experiments from the NCI60 cell lines and applied an integrative approach based on the results obtained from Comet, Mascot, OMSSA, and X!Tandem. This workflow benefits from the complementarity of these search engines to increase the proteome coverage. Five missing proteins C-HPP guidelines compliant were identified, although further validation is needed. Moreover, 165 missing proteins were detected with only one unique peptide, and their functional analysis supported their participation in cellular pathways as was also proposed in other studies. Finally, we performed a combined analysis of the gene expression levels and the proteomic identifications from the common cell lines between the NCI60 and the CCLE project to suggest alternatives for further validation of missing protein observations.

  14. Enhanced Missing Proteins Detection in NCI60 Cell Lines Using an Integrative Search Engine Approach

    PubMed Central

    2017-01-01

    The Human Proteome Project (HPP) aims deciphering the complete map of the human proteome. In the past few years, significant efforts of the HPP teams have been dedicated to the experimental detection of the missing proteins, which lack reliable mass spectrometry evidence of their existence. In this endeavor, an in depth analysis of shotgun experiments might represent a valuable resource to select a biological matrix in design validation experiments. In this work, we used all the proteomic experiments from the NCI60 cell lines and applied an integrative approach based on the results obtained from Comet, Mascot, OMSSA, and X!Tandem. This workflow benefits from the complementarity of these search engines to increase the proteome coverage. Five missing proteins C-HPP guidelines compliant were identified, although further validation is needed. Moreover, 165 missing proteins were detected with only one unique peptide, and their functional analysis supported their participation in cellular pathways as was also proposed in other studies. Finally, we performed a combined analysis of the gene expression levels and the proteomic identifications from the common cell lines between the NCI60 and the CCLE project to suggest alternatives for further validation of missing protein observations. PMID:28960077

  15. A survey investigation of UK physiotherapists' use of online search engines for continuing professional development.

    PubMed

    Harland, Nicholas; Drew, Benjamin T

    2013-09-01

    The purpose of this study was to discover the frequency and type of use of online resources for continuing professional development displayed by physiotherapists in the UK. Therapists' skills, needs and frustrations using these resources were explored. With the relatively recent release and saturated use of the internet the potential presence of a skills gap between therapists at different stages of their career was also investigated. National online survey study. The online survey was carried out using the international online service 'Survey Monkey'. 774 physiotherapists from students to band 8c completed the survey. The online survey was advertised through Frontline, the Interactive Chartered Society of Physiotherapy, Journal of Physiotherapy Pain Association and cascade email through research and other networks. Most physiotherapists reported using the internet for professional purposes daily (40%) or 2 to 4 times a week (37%), with only 8% of respondents using it less than once a week. Overall the results suggest band 6 and 7 physiotherapists had the least skills and most frustrations when using online search engines. History and the nature of rapid technological advancement, specifically of the internet, appears to have created a generational skills gap within the largest group of the physiotherapy workforce band 6 and 7 therapists. Students, band 5 and band 8a therapists appear to most successfully use online resources and the reasons for this are explored. Copyright © 2012 Chartered Society of Physiotherapy. Published by Elsevier Ltd. All rights reserved.

  16. Monitoring hand, foot and mouth disease by combining search engine query data and meteorological factors.

    PubMed

    Huang, Da-Cang; Wang, Jin-Feng

    2018-01-15

    Hand, foot and mouth disease (HFMD) has been recognized as a significant public health threat and poses a tremendous challenge to disease control departments. To date, the relationship between meteorological factors and HFMD has been documented, and public interest of disease has been proven to be trackable from the Internet. However, no study has explored the combination of these two factors in the monitoring of HFMD. Therefore, the main aim of this study was to develop an effective monitoring model of HFMD in Guangzhou, China by utilizing historical HFMD cases, Internet-based search engine query data and meteorological factors. To this end, a case study was conducted in Guangzhou, using a network-based generalized additive model (GAM) including all factors related to HFMD. Three other models were also constructed using some of the variables for comparison. The results suggested that the model showed the best estimating ability when considering all of the related factors. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. Real Time Search Algorithm for Observation Outliers During Monitoring Engineering Constructions

    NASA Astrophysics Data System (ADS)

    Latos, Dorota; Kolanowski, Bogdan; Pachelski, Wojciech; Sołoducha, Ryszard

    2017-12-01

    Real time monitoring of engineering structures in case of an emergency of disaster requires collection of a large amount of data to be processed by specific analytical techniques. A quick and accurate assessment of the state of the object is crucial for a probable rescue action. One of the more significant evaluation methods of large sets of data, either collected during a specified interval of time or permanently, is the time series analysis. In this paper presented is a search algorithm for those time series elements which deviate from their values expected during monitoring. Quick and proper detection of observations indicating anomalous behavior of the structure allows to take a variety of preventive actions. In the algorithm, the mathematical formulae used provide maximal sensitivity to detect even minimal changes in the object's behavior. The sensitivity analyses were conducted for the algorithm of moving average as well as for the Douglas-Peucker algorithm used in generalization of linear objects in GIS. In addition to determining the size of deviations from the average it was used the so-called Hausdorff distance. The carried out simulation and verification of laboratory survey data showed that the approach provides sufficient sensitivity for automatic real time analysis of large amount of data obtained from different and various sensors (total stations, leveling, camera, radar).

  18. Modified Backtracking Search Optimization Algorithm Inspired by Simulated Annealing for Constrained Engineering Optimization Problems

    PubMed Central

    Wang, Hailong; Sun, Yuqiu; Su, Qinghua; Xia, Xuewen

    2018-01-01

    The backtracking search optimization algorithm (BSA) is a population-based evolutionary algorithm for numerical optimization problems. BSA has a powerful global exploration capacity while its local exploitation capability is relatively poor. This affects the convergence speed of the algorithm. In this paper, we propose a modified BSA inspired by simulated annealing (BSAISA) to overcome the deficiency of BSA. In the BSAISA, the amplitude control factor (F) is modified based on the Metropolis criterion in simulated annealing. The redesigned F could be adaptively decreased as the number of iterations increases and it does not introduce extra parameters. A self-adaptive ε-constrained method is used to handle the strict constraints. We compared the performance of the proposed BSAISA with BSA and other well-known algorithms when solving thirteen constrained benchmarks and five engineering design problems. The simulation results demonstrated that BSAISA is more effective than BSA and more competitive with other well-known algorithms in terms of convergence speed. PMID:29666635

  19. Multi-lingual search engine to access PubMed monolingual subsets: a feasibility study.

    PubMed

    Darmoni, Stéfan J; Soualmia, Lina F; Griffon, Nicolas; Grosjean, Julien; Kerdelhué, Gaétan; Kergourlay, Ivan; Dahamna, Badisse

    2013-01-01

    PubMed contains many articles in languages other than English but it is difficult to find them using the English version of the Medical Subject Headings (MeSH) Thesaurus. The aim of this work is to propose a tool allowing access to a PubMed subset in one language, and to evaluate its performance. Translations of MeSH were enriched and gathered in the information system. PubMed subsets in main European languages were also added in our database, using a dedicated parser. The CISMeF generic semantic search engine was evaluated on the response time for simple queries. MeSH descriptors are currently available in 11 languages in the information system. All the 654,000 PubMed citations in French were integrated into CISMeF database. None of the response times exceed the threshold defined for usability (2 seconds). It is now possible to freely access biomedical literature in French using a tool in French; health professionals and lay people with a low English language may find it useful. It will be expended to several European languages: German, Spanish, Norwegian and Portuguese.

  20. The sources and popularity of online drug information: an analysis of top search engine results and web page views.

    PubMed

    Law, Michael R; Mintzes, Barbara; Morgan, Steven G

    2011-03-01

    The Internet has become a popular source of health information. However, there is little information on what drug information and which Web sites are being searched. To investigate the sources of online information about prescription drugs by assessing the most common Web sites returned in online drug searches and to assess the comparative popularity of Web pages for particular drugs. This was a cross-sectional study of search results for the most commonly dispensed drugs in the US (n=278 active ingredients) on 4 popular search engines: Bing, Google (both US and Canada), and Yahoo. We determined the number of times a Web site appeared as the first result. A linked retrospective analysis counted Wikipedia page hits for each of these drugs in 2008 and 2009. About three quarters of the first result on Google USA for both brand and generic names linked to the National Library of Medicine. In contrast, Wikipedia was the first result for approximately 80% of generic name searches on the other 3 sites. On these other sites, over two thirds of brand name searches led to industry-sponsored sites. The Wikipedia pages with the highest number of hits were mainly for opiates, benzodiazepines, antibiotics, and antidepressants. Wikipedia and the National Library of Medicine rank highly in online drug searches. Further, our results suggest that patients most often seek information on drugs with the potential for dependence, for stigmatized conditions, that have received media attention, and for episodic treatments. Quality improvement efforts should focus on these drugs.

  1. Where Am I? Location Archetype Keyword Extraction from Urban Mobility Patterns

    PubMed Central

    Kostakos, Vassilis; Juntunen, Tomi; Goncalves, Jorge; Hosio, Simo; Ojala, Timo

    2013-01-01

    Can online behaviour be used as a proxy for studying urban mobility? The increasing availability of digital mobility traces has provided new insights into collective human behaviour. Mobility datasets have been shown to be an accurate proxy for daily behaviour and social patterns, and behavioural data from Twitter has been used to predict real world phenomena such as cinema ticket sale volumes, stock prices, and disease outbreaks. In this paper we correlate city-scale urban traffic patterns with online search trends to uncover keywords describing the pedestrian traffic location. By analysing a 3-year mobility dataset we show that our approach, called Location Archetype Keyword Extraction (LAKE), is capable of uncovering semantically relevant keywords for describing a location. Our findings demonstrate an overarching relationship between online and offline collective behaviour, and allow for advancing analysis of community-level behaviour by using online search keywords as a practical behaviour proxy. PMID:23704964

  2. Where am I? Location archetype keyword extraction from urban mobility patterns.

    PubMed

    Kostakos, Vassilis; Juntunen, Tomi; Goncalves, Jorge; Hosio, Simo; Ojala, Timo

    2013-01-01

    Can online behaviour be used as a proxy for studying urban mobility? The increasing availability of digital mobility traces has provided new insights into collective human behaviour. Mobility datasets have been shown to be an accurate proxy for daily behaviour and social patterns, and behavioural data from Twitter has been used to predict real world phenomena such as cinema ticket sale volumes, stock prices, and disease outbreaks. In this paper we correlate city-scale urban traffic patterns with online search trends to uncover keywords describing the pedestrian traffic location. By analysing a 3-year mobility dataset we show that our approach, called Location Archetype Keyword Extraction (LAKE), is capable of uncovering semantically relevant keywords for describing a location. Our findings demonstrate an overarching relationship between online and offline collective behaviour, and allow for advancing analysis of community-level behaviour by using online search keywords as a practical behaviour proxy.

  3. Engineering Leadership Education--The Search for Definition and a Curricular Approach

    ERIC Educational Resources Information Center

    Schuhmann, Richard J.

    2010-01-01

    While industry and academia agree that leadership skills are critical for engineering graduates, there exists no consensus regarding the definition of "engineering leadership". The engineering leadership development program at Penn State University has a decade-long experience in teaching leadership to engineering undergraduates. In…

  4. Ethnography of Novices' First Use of Web Search Engines: Affective Control in Cognitive Processing.

    ERIC Educational Resources Information Center

    Nahl, Diane

    1998-01-01

    This study of 18 novice Internet users employed a structured self-report method to investigate affective and cognitive operations in the following phases of World Wide Web searching: presearch formulation, search statement formulation, search strategy, and evaluation of results. Users also rated their self-confidence as searchers and satisfaction…

  5. Seismic Search Engine: A distributed database for mining large scale seismic data

    NASA Astrophysics Data System (ADS)

    Liu, Y.; Vaidya, S.; Kuzma, H. A.

    2009-12-01

    The International Monitoring System (IMS) of the CTBTO collects terabytes worth of seismic measurements from many receiver stations situated around the earth with the goal of detecting underground nuclear testing events and distinguishing them from other benign, but more common events such as earthquakes and mine blasts. The International Data Center (IDC) processes and analyzes these measurements, as they are collected by the IMS, to summarize event detections in daily bulletins. Thereafter, the data measurements are archived into a large format database. Our proposed Seismic Search Engine (SSE) will facilitate a framework for data exploration of the seismic database as well as the development of seismic data mining algorithms. Analogous to GenBank, the annotated genetic sequence database maintained by NIH, through SSE, we intend to provide public access to seismic data and a set of processing and analysis tools, along with community-generated annotations and statistical models to help interpret the data. SSE will implement queries as user-defined functions composed from standard tools and models. Each query is compiled and executed over the database internally before reporting results back to the user. Since queries are expressed with standard tools and models, users can easily reproduce published results within this framework for peer-review and making metric comparisons. As an illustration, an example query is “what are the best receiver stations in East Asia for detecting events in the Middle East?” Evaluating this query involves listing all receiver stations in East Asia, characterizing known seismic events in that region, and constructing a profile for each receiver station to determine how effective its measurements are at predicting each event. The results of this query can be used to help prioritize how data is collected, identify defective instruments, and guide future sensor placements.

  6. Random Drift versus Selection in Academic Vocabulary: An Evolutionary Analysis of Published Keywords

    PubMed Central

    Bentley, R. Alexander

    2008-01-01

    The evolution of vocabulary in academic publishing is characterized via keyword frequencies recorded in the ISI Web of Science citations database. In four distinct case-studies, evolutionary analysis of keyword frequency change through time is compared to a model of random copying used as the null hypothesis, such that selection may be identified against it. The case studies from the physical sciences indicate greater selection in keyword choice than in the social sciences. Similar evolutionary analyses can be applied to a wide range of phenomena; wherever the popularity of multiple items through time has been recorded, as with web searches, or sales of popular music and books, for example. PMID:18728786

  7. Random drift versus selection in academic vocabulary: an evolutionary analysis of published keywords.

    PubMed

    Bentley, R Alexander

    2008-08-27

    The evolution of vocabulary in academic publishing is characterized via keyword frequencies recorded in the ISI Web of Science citations database. In four distinct case-studies, evolutionary analysis of keyword frequency change through time is compared to a model of random copying used as the null hypothesis, such that selection may be identified against it. The case studies from the physical sciences indicate greater selection in keyword choice than in the social sciences. Similar evolutionary analyses can be applied to a wide range of phenomena; wherever the popularity of multiple items through time has been recorded, as with web searches, or sales of popular music and books, for example.

  8. Linguistic measures of chemical diversity and the "keywords" of molecular collections.

    PubMed

    Woźniak, Michał; Wołos, Agnieszka; Modrzyk, Urszula; Górski, Rafał L; Winkowski, Jan; Bajczyk, Michał; Szymkuć, Sara; Grzybowski, Bartosz A; Eder, Maciej

    2018-05-15

    Computerized linguistic analyses have proven of immense value in comparing and searching through large text collections ("corpora"), including those deposited on the Internet - indeed, it would nowadays be hard to imagine browsing the Web without, for instance, search algorithms extracting most appropriate keywords from documents. This paper describes how such corpus-linguistic concepts can be extended to chemistry based on characteristic "chemical words" that span more than traditional functional groups and, instead, look at common structural fragments molecules share. Using these words, it is possible to quantify the diversity of chemical collections/databases in new ways and to define molecular "keywords" by which such collections are best characterized and annotated.

  9. Are cannabis prevalence estimates comparable across countries and regions? A cross-cultural validation using search engine query data.

    PubMed

    Steppan, Martin; Kraus, Ludwig; Piontek, Daniela; Siciliano, Valeria

    2013-01-01

    Prevalence estimation of cannabis use is usually based on self-report data. Although there is evidence on the reliability of this data source, its cross-cultural validity is still a major concern. External objective criteria are needed for this purpose. In this study, cannabis-related search engine query data are used as an external criterion. Data on cannabis use were taken from the 2007 European School Survey Project on Alcohol and Other Drugs (ESPAD). Provincial data came from three Italian nation-wide studies using the same methodology (2006-2008; ESPAD-Italia). Information on cannabis-related search engine query data was based on Google search volume indices (GSI). (1) Reliability analysis was conducted for GSI. (2) Latent measurement models of "true" cannabis prevalence were tested using perceived availability, web-based cannabis searches and self-reported prevalence as indicators. (3) Structure models were set up to test the influences of response tendencies and geographical position (latitude, longitude). In order to test the stability of the models, analyses were conducted on country level (Europe, US) and on provincial level in Italy. Cannabis-related GSI were found to be highly reliable and constant over time. The overall measurement model was highly significant in both data sets. On country level, no significant effects of response bias indicators and geographical position on perceived availability, web-based cannabis searches and self-reported prevalence were found. On provincial level, latitude had a significant positive effect on availability indicating that perceived availability of cannabis in northern Italy was higher than expected from the other indicators. Although GSI showed weaker associations with cannabis use than perceived availability, the findings underline the external validity and usefulness of search engine query data as external criteria. The findings suggest an acceptable relative comparability of national (provincial) prevalence

  10. Writing for the Robot: How Employer Search Tools Have Influenced Resume Rhetoric and Ethics

    ERIC Educational Resources Information Center

    Amare, Nicole; Manning, Alan

    2009-01-01

    To date, business communication scholars and textbook writers have encouraged resume rhetoric that accommodates technology, for example, recommending keyword-enhancing techniques to attract the attention of searchbots: customized search engines that allow companies to automatically scan resumes for relevant keywords. However, few scholars have…

  11. P185-M Protein Identification and Validation of Results in Workflows that Integrate over Various Instruments, Datasets, Search Engines

    PubMed Central

    Hufnagel, P.; Glandorf, J.; Körting, G.; Jabs, W.; Schweiger-Hufnagel, U.; Hahner, S.; Lubeck, M.; Suckau, D.

    2007-01-01

    Analysis of complex proteomes often results in long protein lists, but falls short in measuring the validity of identification and quantification results on a greater number of proteins. Biological and technical replicates are mandatory, as is the combination of the MS data from various workflows (gels, 1D-LC, 2D-LC), instruments (TOF/TOF, trap, qTOF or FTMS), and search engines. We describe a database-driven study that combines two workflows, two mass spectrometers, and four search engines with protein identification following a decoy database strategy. The sample was a tryptically digested lysate (10,000 cells) of a human colorectal cancer cell line. Data from two LC-MALDI-TOF/TOF runs and a 2D-LC-ESI-trap run using capillary and nano-LC columns were submitted to the proteomics software platform ProteinScape. The combined MALDI data and the ESI data were searched using Mascot (Matrix Science), Phenyx (GeneBio), ProteinSolver (Bruker and Protagen), and Sequest (Thermo) against a decoy database generated from IPI-human in order to obtain one protein list across all workflows and search engines at a defined maximum false-positive rate of 5%. ProteinScape combined the data to one LC-MALDI and one LC-ESI dataset. The initial separate searches from the two combined datasets generated eight independent peptide lists. These were compiled into an integrated protein list using the ProteinExtractor algorithm. An initial evaluation of the generated data led to the identification of approximately 1200 proteins. Result integration on a peptide level allowed discrimination of protein isoforms that would not have been possible with a mere combination of protein lists.

  12. The Keyword Method of Vocabulary Acquisition: An Experimental Evaluation.

    ERIC Educational Resources Information Center

    Griffith, Douglas

    The keyword method of vocabulary acquisition is a two-step mnemonic technique for learning vocabulary terms. The first step, the acoustic link, generates a keyword based on the sound of the foreign word. The second step, the imagery link, ties the keyword to the meaning of the item to be learned, via an interactive visual image or other…

  13. FDRAnalysis: a tool for the integrated analysis of tandem mass spectrometry identification results from multiple search engines.

    PubMed

    Wedge, David C; Krishna, Ritesh; Blackhurst, Paul; Siepen, Jennifer A; Jones, Andrew R; Hubbard, Simon J

    2011-04-01

    Confident identification of peptides via tandem mass spectrometry underpins modern high-throughput proteomics. This has motivated considerable recent interest in the postprocessing of search engine results to increase confidence and calculate robust statistical measures, for example through the use of decoy databases to calculate false discovery rates (FDR). FDR-based analyses allow for multiple testing and can assign a single confidence value for both sets and individual peptide spectrum matches (PSMs). We recently developed an algorithm for combining the results from multiple search engines, integrating FDRs for sets of PSMs made by different search engine combinations. Here we describe a web-server and a downloadable application that makes this routinely available to the proteomics community. The web server offers a range of outputs including informative graphics to assess the confidence of the PSMs and any potential biases. The underlying pipeline also provides a basic protein inference step, integrating PSMs into protein ambiguity groups where peptides can be matched to more than one protein. Importantly, we have also implemented full support for the mzIdentML data standard, recently released by the Proteomics Standards Initiative, providing users with the ability to convert native formats to mzIdentML files, which are available to download.

  14. FDRAnalysis: A tool for the integrated analysis of tandem mass spectrometry identification results from multiple search engines

    PubMed Central

    Wedge, David C; Krishna, Ritesh; Blackhurst, Paul; Siepen, Jennifer A; Jones, Andrew R.; Hubbard, Simon J.

    2013-01-01

    Confident identification of peptides via tandem mass spectrometry underpins modern high-throughput proteomics. This has motivated considerable recent interest in the post-processing of search engine results to increase confidence and calculate robust statistical measures, for example through the use of decoy databases to calculate false discovery rates (FDR). FDR-based analyses allow for multiple testing and can assign a single confidence value for both sets and individual peptide spectrum matches (PSMs). We recently developed an algorithm for combining the results from multiple search engines, integrating FDRs for sets of PSMs made by different search engine combinations. Here we describe a web-server, and a downloadable application, which makes this routinely available to the proteomics community. The web server offers a range of outputs including informative graphics to assess the confidence of the PSMs and any potential biases. The underlying pipeline provides a basic protein inference step, integrating PSMs into protein ambiguity groups where peptides can be matched to more than one protein. Importantly, we have also implemented full support for the mzIdentML data standard, recently released by the Proteomics Standards Initiative, providing users with the ability to convert native formats to mzIdentML files, which are available to download. PMID:21222473

  15. Using Google Scholar to Search for Online Availability of a Cited Article in Engineering Disciplines

    ERIC Educational Resources Information Center

    Baldwin, Virginia A.

    2009-01-01

    Many published studies examine the effectiveness of Google Scholar (Scholar) as an index for scholarly articles. This paper analyzes the value of Scholar in finding and labeling online full text of articles using titles from the citations of engineering faculty publications. For the fields of engineering and the engineering colleges in the study,…

  16. A search engine to access PubMed monolingual subsets: proof of concept and evaluation in French.

    PubMed

    Griffon, Nicolas; Schuers, Matthieu; Soualmia, Lina Fatima; Grosjean, Julien; Kerdelhué, Gaétan; Kergourlay, Ivan; Dahamna, Badisse; Darmoni, Stéfan Jacques

    2014-12-01

    PubMed contains numerous articles in languages other than English. However, existing solutions to access these articles in the language in which they were written remain unconvincing. The aim of this study was to propose a practical search engine, called Multilingual PubMed, which will permit access to a PubMed subset in 1 language and to evaluate the precision and coverage for the French version (Multilingual PubMed-French). To create this tool, translations of MeSH were enriched (eg, adding synonyms and translations in French) and integrated into a terminology portal. PubMed subsets in several European languages were also added to our database using a dedicated parser. The response time for the generic semantic search engine was evaluated for simple queries. BabelMeSH, Multilingual PubMed-French, and 3 different PubMed strategies were compared by searching for literature in French. Precision and coverage were measured for 20 randomly selected queries. The results were evaluated as relevant to title and abstract, the evaluator being blind to search strategy. More than 650,000 PubMed citations in French were integrated into the Multilingual PubMed-French information system. The response times were all below the threshold defined for usability (2 seconds). Two search strategies (Multilingual PubMed-French and 1 PubMed strategy) showed high precision (0.93 and 0.97, respectively), but coverage was 4 times higher for Multilingual PubMed-French. It is now possible to freely access biomedical literature using a practical search tool in French. This tool will be of particular interest for health professionals and other end users who do not read or query sufficiently in English. The information system is theoretically well suited to expand the approach to other European languages, such as German, Spanish, Norwegian, and Portuguese.

  17. A Search Engine to Access PubMed Monolingual Subsets: Proof of Concept and Evaluation in French

    PubMed Central

    Schuers, Matthieu; Soualmia, Lina Fatima; Grosjean, Julien; Kerdelhué, Gaétan; Kergourlay, Ivan; Dahamna, Badisse; Darmoni, Stéfan Jacques

    2014-01-01

    Background PubMed contains numerous articles in languages other than English. However, existing solutions to access these articles in the language in which they were written remain unconvincing. Objective The aim of this study was to propose a practical search engine, called Multilingual PubMed, which will permit access to a PubMed subset in 1 language and to evaluate the precision and coverage for the French version (Multilingual PubMed-French). Methods To create this tool, translations of MeSH were enriched (eg, adding synonyms and translations in French) and integrated into a terminology portal. PubMed subsets in several European languages were also added to our database using a dedicated parser. The response time for the generic semantic search engine was evaluated for simple queries. BabelMeSH, Multilingual PubMed-French, and 3 different PubMed strategies were compared by searching for literature in French. Precision and coverage were measured for 20 randomly selected queries. The results were evaluated as relevant to title and abstract, the evaluator being blind to search strategy. Results More than 650,000 PubMed citations in French were integrated into the Multilingual PubMed-French information system. The response times were all below the threshold defined for usability (2 seconds). Two search strategies (Multilingual PubMed-French and 1 PubMed strategy) showed high precision (0.93 and 0.97, respectively), but coverage was 4 times higher for Multilingual PubMed-French. Conclusions It is now possible to freely access biomedical literature using a practical search tool in French. This tool will be of particular interest for health professionals and other end users who do not read or query sufficiently in English. The information system is theoretically well suited to expand the approach to other European languages, such as German, Spanish, Norwegian, and Portuguese. PMID:25448528

  18. Suicide rates and information seeking via search engines: A cross-national correlational approach.

    PubMed

    Arendt, Florian

    2018-09-01

    The volume of Google searches for suicide-related terms is positively associated with suicide rates, but previous studies used data from specific, restricted geographical contexts, thus, limiting the generalizability of this finding. We investigated the correlation between suicide-related search volume and suicide rates of 50 nations from five continents. We found a positive correlation between suicide rates and search volume, even after controlling for the level of industrialization. Results give credence to the global existence of a correlation. However, the reason why suicide-related search volume is higher in countries with higher suicide rates is still unclear and up to future research.

  19. Two-stage approach to keyword spotting in handwritten documents

    NASA Astrophysics Data System (ADS)

    Haji, Mehdi; Ameri, Mohammad R.; Bui, Tien D.; Suen, Ching Y.; Ponson, Dominique

    2013-12-01

    Separation of keywords from non-keywords is the main problem in keyword spotting systems which has traditionally been approached by simplistic methods, such as thresholding of recognition scores. In this paper, we analyze this problem from a machine learning perspective, and we study several standard machine learning algorithms specifically in the context of non-keyword rejection. We propose a two-stage approach to keyword spotting and provide a theoretical analysis of the performance of the system which gives insights on how to design the classifier in order to maximize the overall performance in terms of F-measure.

  20. PosMed-plus: an intelligent search engine that inferentially integrates cross-species information resources for molecular breeding of plants.

    PubMed

    Makita, Yuko; Kobayashi, Norio; Mochizuki, Yoshiki; Yoshida, Yuko; Asano, Satomi; Heida, Naohiko; Deshpande, Mrinalini; Bhatia, Rinki; Matsushima, Akihiro; Ishii, Manabu; Kawaguchi, Shuji; Iida, Kei; Hanada, Kosuke; Kuromori, Takashi; Seki, Motoaki; Shinozaki, Kazuo; Toyoda, Tetsuro

    2009-07-01

    Molecular breeding of crops is an efficient way to upgrade plant functions useful to mankind. A key step is forward genetics or positional cloning to identify the genes that confer useful functions. In order to accelerate the whole research process, we have developed an integrated database system powered by an intelligent data-retrieval engine termed PosMed-plus (Positional Medline for plant upgrading science), allowing us to prioritize highly promising candidate genes in a given chromosomal interval(s) of Arabidopsis thaliana and rice, Oryza sativa. By inferentially integrating cross-species information resources including genomes, transcriptomes, proteomes, localizomes, phenomes and literature, the system compares a user's query, such as phenotypic or functional keywords, with the literature associated with the relevant genes located within the interval. By utilizing orthologous and paralogous correspondences, PosMed-plus efficiently integrates cross-species information to facilitate the ranking of rice candidate genes based on evidence from other model species such as Arabidopsis. PosMed-plus is a plant science version of the PosMed system widely used by mammalian researchers, and provides both a powerful integrative search function and a rich integrative display of the integrated databases. PosMed-plus is the first cross-species integrated database that inferentially prioritizes candidate genes for forward genetics approaches in plant science, and will be expanded for wider use in plant upgrading in many species.

  1. In-depth analysis of protein inference algorithms using multiple search engines and well-defined metrics.

    PubMed

    Audain, Enrique; Uszkoreit, Julian; Sachsenberg, Timo; Pfeuffer, Julianus; Liang, Xiao; Hermjakob, Henning; Sanchez, Aniel; Eisenacher, Martin; Reinert, Knut; Tabb, David L; Kohlbacher, Oliver; Perez-Riverol, Yasset

    2017-01-06

    In mass spectrometry-based shotgun proteomics, protein identifications are usually the desired result. However, most of the analytical methods are based on the identification of reliable peptides and not the direct identification of intact proteins. Thus, assembling peptides identified from tandem mass spectra into a list of proteins, referred to as protein inference, is a critical step in proteomics research. Currently, different protein inference algorithms and tools are available for the proteomics community. Here, we evaluated five software tools for protein inference (PIA, ProteinProphet, Fido, ProteinLP, MSBayesPro) using three popular database search engines: Mascot, X!Tandem, and MS-GF+. All the algorithms were evaluated using a highly customizable KNIME workflow using four different public datasets with varying complexities (different sample preparation, species and analytical instruments). We defined a set of quality control metrics to evaluate the performance of each combination of search engines, protein inference algorithm, and parameters on each dataset. We show that the results for complex samples vary not only regarding the actual numbers of reported protein groups but also concerning the actual composition of groups. Furthermore, the robustness of reported proteins when using databases of differing complexities is strongly dependant on the applied inference algorithm. Finally, merging the identifications of multiple search engines does not necessarily increase the number of reported proteins, but does increase the number of peptides per protein and thus can generally be recommended. Protein inference is one of the major challenges in MS-based proteomics nowadays. Currently, there are a vast number of protein inference algorithms and implementations available for the proteomics community. Protein assembly impacts in the final results of the research, the quantitation values and the final claims in the research manuscript. Even though protein

  2. Search Engine Ranking, Quality, and Content of Web Pages That Are Critical Versus Noncritical of Human Papillomavirus Vaccine.

    PubMed

    Fu, Linda Y; Zook, Kathleen; Spoehr-Labutta, Zachary; Hu, Pamela; Joseph, Jill G

    2016-01-01

    Online information can influence attitudes toward vaccination. The aim of the present study was to provide a systematic evaluation of the search engine ranking, quality, and content of Web pages that are critical versus noncritical of human papillomavirus (HPV) vaccination. We identified HPV vaccine-related Web pages with the Google search engine by entering 20 terms. We then assessed each Web page for critical versus noncritical bias and for the following quality indicators: authorship disclosure, source disclosure, attribution of at least one reference, currency, exclusion of testimonial accounts, and readability level less than ninth grade. We also determined Web page comprehensiveness in terms of mention of 14 HPV vaccine-relevant topics. Twenty searches yielded 116 unique Web pages. HPV vaccine-critical Web pages comprised roughly a third of the top, top 5- and top 10-ranking Web pages. The prevalence of HPV vaccine-critical Web pages was higher for queries that included term modifiers in addition to root terms. Compared with noncritical Web pages, Web pages critical of HPV vaccine overall had a lower quality score than those with a noncritical bias (p < .01) and covered fewer important HPV-related topics (p < .001). Critical Web pages required viewers to have higher reading skills, were less likely to include an author byline, and were more likely to include testimonial accounts. They also were more likely to raise unsubstantiated concerns about vaccination. Web pages critical of HPV vaccine may be frequently returned and highly ranked by search engine queries despite being of lower quality and less comprehensive than noncritical Web pages. Copyright © 2016 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.

  3. ClinicalKey 2.0: Upgrades in a Point-of-Care Search Engine.

    PubMed

    Huslig, Mary Ann; Vardell, Emily

    2015-01-01

    ClinicalKey 2.0, launched September 23, 2014, offers a mobile-friendly design with a search history feature for targeting point-of-care resources for health care professionals. Browsing is improved with searchable, filterable listings of sources highlighting new resources. ClinicalKey 2.0 improvements include more than 1,400 new Topic Pages for quick access to point-of-care content. A sample search details some of the upgrades and content options.

  4. RxnFinder: biochemical reaction search engines using molecular structures, molecular fragments and reaction similarity.

    PubMed

    Hu, Qian-Nan; Deng, Zhe; Hu, Huanan; Cao, Dong-Sheng; Liang, Yi-Zeng

    2011-09-01

    Biochemical reactions play a key role to help sustain life and allow cells to grow. RxnFinder was developed to search biochemical reactions from KEGG reaction database using three search criteria: molecular structures, molecular fragments and reaction similarity. RxnFinder is helpful to get reference reactions for biosynthesis and xenobiotics metabolism. RxnFinder is freely available via: http://sdd.whu.edu.cn/rxnfinder. qnhu@whu.edu.cn.

  5. A unified framework for image retrieval using keyword and visual features.

    PubMed

    Jing, Feng; Li, Mingling; Zhang, Hong-Jiang; Zhang, Bo

    2005-07-01

    In this paper, a unified image retrieval framework based on both keyword annotations and visual features is proposed. In this framework, a set of statistical models are built based on visual features of a small set of manually labeled images to represent semantic concepts and used to propagate keywords to other unlabeled images. These models are updated periodically when more images implicitly labeled by users become available through relevance feedback. In this sense, the keyword models serve the function of accumulation and memorization of knowledge learned from user-provided relevance feedback. Furthermore, two sets of effective and efficient similarity measures and relevance feedback schemes are proposed for query by keyword scenario and query by image example scenario, respectively. Keyword models are combined with visual features in these schemes. In particular, a new, entropy-based active learning strategy is introduced to improve the efficiency of relevance feedback for query by keyword. Furthermore, a new algorithm is proposed to estimate the keyword features of the search concept for query by image example. It is shown to be more appropriate than two existing relevance feedback algorithms. Experimental results demonstrate the effectiveness of the proposed framework.

  6. Preliminary comparison of the Essie and PubMed search engines for answering clinical questions using MD on Tap, a PDA-based program for accessing biomedical literature

    PubMed Central

    Sutton, Victoria R.; Hauser, Susan E.

    2005-01-01

    MD on Tap, a PDA application that searches and retrieves biomedical literature, is specifically designed for use by mobile healthcare professionals. With the goal of improving the usability of the application, a preliminary comparison was made of two search engines (PubMed and Essie) to determine which provided most efficient path to the desired clinically-relevant information. PMID:16779415

  7. Preliminary comparison of the Essie and PubMed search engines for answering clinical questions using MD on Tap, a PDA-based program for accessing biomedical literature.

    PubMed

    Sutton, Victoria R; Hauser, Susan E

    2005-01-01

    MD on Tap, a PDA application that searches and retrieves biomedical literature, is specifically designed for use by mobile healthcare professionals. With the goal of improving the usability of the application, a preliminary comparison was made of two search engines (PubMed and Essie) to determine which provided most efficient path to the desired clinically-relevant information.

  8. Do economic equality and generalized trust inhibit academic dishonesty? Evidence from state-level search-engine queries.

    PubMed

    Neville, Lukas

    2012-04-01

    What effect does economic inequality have on academic integrity? Using data from search-engine queries made between 2003 and 2011 on Google and state-level measures of income inequality and generalized trust, I found that academically dishonest searches (queries seeking term-paper mills and help with cheating) were more likely to come from states with higher income inequality and lower levels of generalized trust. These relations persisted even when controlling for contextual variables, such as average income and the number of colleges per capita. The relation between income inequality and academic dishonesty was fully mediated by generalized trust. When there is higher economic inequality, people are less likely to view one another as trustworthy. This lower generalized trust, in turn, is associated with a greater prevalence of academic dishonesty. These results might explain previous findings on the effectiveness of honor codes.

  9. Internal combustion engine fuel controls. (Latest citations from the US Patent database). Published Search

    SciTech Connect

    Not Available

    1992-12-01

    The bibliography contains citations of selected patents concerning fuel control devices and methods for use in internal combustion engines. Patents describe air-fuel ratio control, fuel injection systems, evaporative fuel control, and surge-corrected fuel control. Citations also discuss electronic and feedback control, methods for engine protection, and fuel conservation. (Contains a minimum of 232 citations and includes a subject term index and title list.)

  10. A keyword spotting model using perceptually significant energy features

    NASA Astrophysics Data System (ADS)

    Umakanthan, Padmalochini

    The task of a keyword recognition system is to detect the presence of certain words in a conversation based on the linguistic information present in human speech. Such keyword spotting systems have applications in homeland security, telephone surveillance and human-computer interfacing. General procedure of a keyword spotting system involves feature generation and matching. In this work, new set of features that are based on the psycho-acoustic masking nature of human speech are proposed. After developing these features a time aligned pattern matching process was implemented to locate the words in a set of unknown words. A word boundary detection technique based on frame classification using the nonlinear characteristics of speech is also addressed in this work. Validation of this keyword spotting model was done using widely acclaimed Cepstral features. The experimental results indicate the viability of using these perceptually significant features as an augmented feature set in keyword spotting.

  11. [Efficacy of the keyword mnemonic method in adults].

    PubMed

    Campos, Alfredo; Pérez-Fabello, María José; Camino, Estefanía

    2010-11-01

    Two experiments were used to assess the efficacy of the keyword mnemonic method in adults. In Experiment 1, immediate and delayed recall (at a one-day interval) were assessed by comparing the results obtained by a group of adults using the keyword mnemonic method in contrast to a group using the repetition method. The mean age of the sample under study was 59.35 years. Subjects were required to learn a list of 16 words translated from Latin into Spanish. Participants who used keyword mnemonics that had been devised by other experimental participants of the same characteristics, obtained significantly higher immediate and delayed recall scores than participants in the repetition method. In Experiment 2, other participants had to learn a list of 24 Latin words translated into Spanish by using the keyword mnemonic method reinforced with pictures. Immediate and delayed recall were significantly greater in the keyword mnemonic method group than in the repetition method group.

  12. RecA: Regulation and Mechanism of a Molecular Search Engine.

    PubMed

    Bell, Jason C; Kowalczykowski, Stephen C

    2016-06-01

    Homologous recombination maintains genomic integrity by repairing broken chromosomes. The broken chromosome is partially resected to produce single-stranded DNA (ssDNA) that is used to search for homologous double-stranded DNA (dsDNA). This homology driven 'search and rescue' is catalyzed by a class of DNA strand exchange proteins that are defined in relation to Escherichia coli RecA, which forms a filament on ssDNA. Here, we review the regulation of RecA filament assembly and the mechanism by which RecA quickly and efficiently searches for and identifies a unique homologous sequence among a vast excess of heterologous DNA. Given that RecA is the prototypic DNA strand exchange protein, its behavior affords insight into the actions of eukaryotic RAD51 orthologs and their regulators, BRCA2 and other tumor suppressors. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. Detecting internet activity for erectile dysfunction using search engine query data in the Republic of Ireland.

    PubMed

    Davis, Niall F; Smyth, Lisa G; Flood, Hugh D

    2012-12-01

    What's known on the subject? and What does the study add? Despite the increasing prevalence of erectile dysfunction (ED), there is reluctance among symptomatic patients to present to healthcare providers for appropriate advice and treatment. A number of Internet campaigns have been launched by the Irish healthcare media since 2007 aiming to provide easily accessible advice on ED. Novel online technologies appear to provide a useful tool for educating the general public on the symptoms of ED because there has been a significant increase in overall Internet search activity for this term since 2007. • To assess Internet search trends for erectile dysfunction (ED) subsequent to public awareness campaigns being launched within the Republic of Ireland • To assess whether the advent of such campaigns correlates with increased Internet search activity for ED. • Google insights for search was utilized to examine Internet search trends for the term 'erectile dysfunction' across all categories between January 2005 and December 2011. • Search activity was limited to users from the Republic of Ireland within this timeframe. • Additionally, the number of Irish Internet media campaigns and Irish web pages providing information on ED was assessed between January 2005 and December 2011. • Statistical analysis of the data was performed using analysis of variance and Student's t-tests for pairwise comparisons. • There has been a significant increase in mean search activity for ED on an annual basis since 2007 (P < 0.001). • The number of Irish web pages associated with information on ED has also increased significantly on an annual basis since 2007 (P < 0.001). • There have been seven different Irish Internet media campaigns on ED since 2007 compared to two from 2005 to 2007 (P < 0.001). • There was no significant change in mean search activity for ED from 2005 to 2007 • The advent of recent Internet media campaigns and increasing number of Irish web pages is

  14. Integrated Proteomic Pipeline Using Multiple Search Engines for a Proteogenomic Study with a Controlled Protein False Discovery Rate.

    PubMed

    Park, Gun Wook; Hwang, Heeyoun; Kim, Kwang Hoe; Lee, Ju Yeon; Lee, Hyun Kyoung; Park, Ji Yeong; Ji, Eun Sun; Park, Sung-Kyu Robin; Yates, John R; Kwon, Kyung-Hoon; Park, Young Mok; Lee, Hyoung-Joo; Paik, Young-Ki; Kim, Jin Young; Yoo, Jong Shin

    2016-11-04

    In the Chromosome-Centric Human Proteome Project (C-HPP), false-positive identification by peptide spectrum matches (PSMs) after database searches is a major issue for proteogenomic studies using liquid-chromatography and mass-spectrometry-based large proteomic profiling. Here we developed a simple strategy for protein identification, with a controlled false discovery rate (FDR) at the protein level, using an integrated proteomic pipeline (IPP) that consists of four engrailed steps as follows. First, using three different search engines, SEQUEST, MASCOT, and MS-GF+, individual proteomic searches were performed against the neXtProt database. Second, the search results from the PSMs were combined using statistical evaluation tools including DTASelect and Percolator. Third, the peptide search scores were converted into E-scores normalized using an in-house program. Last, ProteinInferencer was used to filter the proteins containing two or more peptides with a controlled FDR of 1.0% at the protein level. Finally, we compared the performance of the IPP to a conventional proteomic pipeline (CPP) for protein identification using a controlled FDR of <1% at the protein level. Using the IPP, a total of 5756 proteins (vs 4453 using the CPP) including 477 alternative splicing variants (vs 182 using the CPP) were identified from human hippocampal tissue. In addition, a total of 10 missing proteins (vs 7 using the CPP) were identified with two or more unique peptides, and their tryptic peptides were validated using MS/MS spectral pattern from a repository database or their corresponding synthetic peptides. This study shows that the IPP effectively improved the identification of proteins, including alternative splicing variants and missing proteins, in human hippocampal tissues for the C-HPP. All RAW files used in this study were deposited in ProteomeXchange (PXD000395).

  15. A geospatial search engine for discovering multi-format geospatial data across the web

    Treesearch

    Christopher Bone; Alan Ager; Ken Bunzel; Lauren Tierney

    2014-01-01

    The volume of publically available geospatial data on the web is rapidly increasing due to advances in server-based technologies and the ease at which data can now be created. However, challenges remain with connecting individuals searching for geospatial data with servers and websites where such data exist. The objective of this paper is to present a publically...

  16. Consolidating Russia and Eurasia Antibiotic Resistance Data for 1992-2014 Using Search Engine.

    PubMed

    Bedenkov, Alexander; Shpinev, Vitaly; Suvorov, Nikolay; Sokolov, Evgeny; Riabenko, Evgeniy

    2016-01-01

    The World Health Organization recognizes the antibiotic resistance problem as a major health threat in the twenty first century. The paper describes an effort to fight it undertaken at the verge of two industries-healthcare and Data Science. One of the major difficulties in monitoring antibiotic resistance is low availability of comprehensive research data. Our aim is to develop a nation-wide antibiotic resistance database using Internet search and data processing algorithms using Russian language publications. An interdisciplinary team built an intelligent Internet search filter to locate all publicly available research data on antibiotic resistance in Russia and Eurasia countries, extracted it, and collated it for analysis. A database was constructed using data from 850 original studies conducted at 153 locations in 12 countries between 1992 and 2014. The studies contained susceptibility and resistance rates of 156 microorganisms to 157 antibiotic drugs. The applied search methodology was highly robust in that it yielded search precision of 58 vs. 20% in a typical Internet search. It allowed finding and collating within the database the following data items (among many others): publication details including title, source, date, authors, etc.; study details: time period, locations, research organization, therapy area, etc.; microorganisms and antibiotic drugs included in the study along with prevalence values of resistant and susceptible strains, and numbers of isolates. The next stage in project development will try to validate the data by matching it to major benchmark studies; in addition, a panel of experts will be convened to evaluate the outcomes. The work provides a supplementary tool to national surveillance systems in antibiotic resistance, and consolidates fragmented research data available for 12 countries for a period of more than 20 years.

  17. Consolidating Russia and Eurasia Antibiotic Resistance Data for 1992–2014 Using Search Engine

    PubMed Central

    Bedenkov, Alexander; Shpinev, Vitaly; Suvorov, Nikolay; Sokolov, Evgeny; Riabenko, Evgeniy

    2016-01-01

    Background: The World Health Organization recognizes the antibiotic resistance problem as a major health threat in the twenty first century. The paper describes an effort to fight it undertaken at the verge of two industries—healthcare and Data Science. One of the major difficulties in monitoring antibiotic resistance is low availability of comprehensive research data. Our aim is to develop a nation-wide antibiotic resistance database using Internet search and data processing algorithms using Russian language publications. Materials and Methods: An interdisciplinary team built an intelligent Internet search filter to locate all publicly available research data on antibiotic resistance in Russia and Eurasia countries, extracted it, and collated it for analysis. A database was constructed using data from 850 original studies conducted at 153 locations in 12 countries between 1992 and 2014. The studies contained susceptibility and resistance rates of 156 microorganisms to 157 antibiotic drugs. Results: The applied search methodology was highly robust in that it yielded search precision of 58 vs. 20% in a typical Internet search. It allowed finding and collating within the database the following data items (among many others): publication details including title, source, date, authors, etc.; study details: time period, locations, research organization, therapy area, etc.; microorganisms and antibiotic drugs included in the study along with prevalence values of resistant and susceptible strains, and numbers of isolates. The next stage in project development will try to validate the data by matching it to major benchmark studies; in addition, a panel of experts will be convened to evaluate the outcomes. Conclusions: The work provides a supplementary tool to national surveillance systems in antibiotic resistance, and consolidates fragmented research data available for 12 countries for a period of more than 20 years. PMID:27014217

  18. Search | Galaxy of Images

    Science.gov Websites

    Books dot header Search Tips Search Keywords in Author Last Name or in the Title of the Books: Enter a books Images FAQ | Privacy | SI Terms of Use | Smithsonian Home DCSIMG

  19. Employability of Graduates--A Search through the Educational Processes of Indian Engineering Institutions

    ERIC Educational Resources Information Center

    Viswanadhan, K. G.

    2008-01-01

    There is a big boom in manpower requirements in IT industries in India in recent years. Presently India is aiming at becoming the major source of manpower in IT and other technical fields in the global scenario. To achieve this target, enhancement of employability of graduates coming out of engineering colleges are very important. A study has been…

  20. Breaking Ground: Improving Undergraduate Engineering Projects through Flipped Teaching of Literature Search Techniques

    ERIC Educational Resources Information Center

    Maddison, Tasha; Beneteau, Donna; Sokoloski, Brandy

    2014-01-01

    This case study describes the use of flipped teaching for information literacy instruction in a new course, "Drill, Blast, and Excavate GeoE 498," within the mining option for geological engineering (GeoE) students. These students will enter the mining industry with less discipline-specific knowledge than a student that graduated with a…

  1. Light Aircraft Engines, the Potential and Problems for Use of Automotive Fuels. Phase 1. Literature Search

    DTIC Science & Technology

    1980-12-01

    14, 1944), 466-500. 22. Smith, Maxwell. AVIATION FUELS. Foulis & Co., Ltd., University Printing House, Cambridge, 1970. 23. Dugan, W. P., and Toulmin ...Plane Engines And Their Fuel Problems," SAE JOURNAL, Vol. 50, No. 5 (May 1942), 188-195. Dugan, W. P., and Toulmin , H. A. THE CARBURETOR ICING TENDENCIES

  2. Strategic plan : providing high precision search to NASA employees using the NASA engineering network

    NASA Technical Reports Server (NTRS)

    Dutra, Jayne E.; Smith, Lisa

    2006-01-01

    The goal of this plan is to briefly describe new technologies available to us in the arenas of information discovery and discuss the strategic value they have for the NASA enterprise with some considerations and suggestions for near term implementations using the NASA Engineering Network (NEN) as a delivery venue.

  3. EPA Metadata Style Guide Keywords and EPA Organization Names

    EPA Pesticide Factsheets

    The following keywords and EPA organization names listed below, along with EPA’s Metadata Style Guide, are intended to provide suggestions and guidance to assist with the standardization of metadata records.

  4. Variational dynamic background model for keyword spotting in handwritten documents

    NASA Astrophysics Data System (ADS)

    Kumar, Gaurav; Wshah, Safwan; Govindaraju, Venu

    2013-12-01

    We propose a bayesian framework for keyword spotting in handwritten documents. This work is an extension to our previous work where we proposed dynamic background model, DBM for keyword spotting that takes into account the local character level scores and global word level scores to learn a logistic regression classifier to separate keywords from non-keywords. In this work, we add a bayesian layer on top of the DBM called the variational dynamic background model, VDBM. The logistic regression classifier uses the sigmoid function to separate keywords from non-keywords. The sigmoid function being neither convex nor concave, exact inference of VDBM becomes intractable. An expectation maximization step is proposed to do approximate inference. The advantage of VDBM over the DBM is multi-fold. Firstly, being bayesian, it prevents over-fitting of data. Secondly, it provides better modeling of data and an improved prediction of unseen data. VDBM is evaluated on the IAM dataset and the results prove that it outperforms our prior work and other state of the art line based word spotting system.

  5. High-performance hardware implementation of a parallel database search engine for real-time peptide mass fingerprinting

    PubMed Central

    Bogdán, István A.; Rivers, Jenny; Beynon, Robert J.; Coca, Daniel

    2008-01-01

    Motivation: Peptide mass fingerprinting (PMF) is a method for protein identification in which a protein is fragmented by a defined cleavage protocol (usually proteolysis with trypsin), and the masses of these products constitute a ‘fingerprint’ that can be searched against theoretical fingerprints of all known proteins. In the first stage of PMF, the raw mass spectrometric data are processed to generate a peptide mass list. In the second stage this protein fingerprint is used to search a database of known proteins for the best protein match. Although current software solutions can typically deliver a match in a relatively short time, a system that can find a match in real time could change the way in which PMF is deployed and presented. In a paper published earlier we presented a hardware design of a raw mass spectra processor that, when implemented in Field Programmable Gate Array (FPGA) hardware, achieves almost 170-fold speed gain relative to a conventional software implementation running on a dual processor server. In this article we present a complementary hardware realization of a parallel database search engine that, when running on a Xilinx Virtex 2 FPGA at 100 MHz, delivers 1800-fold speed-up compared with an equivalent C software routine, running on a 3.06 GHz Xeon workstation. The inherent scalability of the design means that processing speed can be multiplied by deploying the design on multiple FPGAs. The database search processor and the mass spectra processor, running on a reconfigurable computing platform, provide a complete real-time PMF protein identification solution. Contact: d.coca@sheffield.ac.uk PMID:18453553

  6. Tandem Mass Spectrum Sequencing: An Alternative to Database Search Engines in Shotgun Proteomics.

    PubMed

    Muth, Thilo; Rapp, Erdmann; Berven, Frode S; Barsnes, Harald; Vaudel, Marc

    2016-01-01

    Protein identification via database searches has become the gold standard in mass spectrometry based shotgun proteomics. However, as the quality of tandem mass spectra improves, direct mass spectrum sequencing gains interest as a database-independent alternative. In this chapter, the general principle of this so-called de novo sequencing is introduced along with pitfalls and challenges of the technique. The main tools available are presented with a focus on user friendly open source software which can be directly applied in everyday proteomic workflows.

  7. Omicseq: a web-based search engine for exploring omics datasets

    PubMed Central

    Sun, Xiaobo; Pittard, William S.; Xu, Tianlei; Chen, Li; Zwick, Michael E.; Jiang, Xiaoqian; Wang, Fusheng

    2017-01-01

    Abstract The development and application of high-throughput genomics technologies has resulted in massive quantities of diverse omics data that continue to accumulate rapidly. These rich datasets offer unprecedented and exciting opportunities to address long standing questions in biomedical research. However, our ability to explore and query the content of diverse omics data is very limited. Existing dataset search tools rely almost exclusively on the metadata. A text-based query for gene name(s) does not work well on datasets wherein the vast majority of their content is numeric. To overcome this barrier, we have developed Omicseq, a novel web-based platform that facilitates the easy interrogation of omics datasets holistically to improve ‘findability’ of relevant data. The core component of Omicseq is trackRank, a novel algorithm for ranking omics datasets that fully uses the numerical content of the dataset to determine relevance to the query entity. The Omicseq system is supported by a scalable and elastic, NoSQL database that hosts a large collection of processed omics datasets. In the front end, a simple, web-based interface allows users to enter queries and instantly receive search results as a list of ranked datasets deemed to be the most relevant. Omicseq is freely available at http://www.omicseq.org. PMID:28402462

  8. Omicseq: a web-based search engine for exploring omics datasets.

    PubMed

    Sun, Xiaobo; Pittard, William S; Xu, Tianlei; Chen, Li; Zwick, Michael E; Jiang, Xiaoqian; Wang, Fusheng; Qin, Zhaohui S

    2017-07-03

    The development and application of high-throughput genomics technologies has resulted in massive quantities of diverse omics data that continue to accumulate rapidly. These rich datasets offer unprecedented and exciting opportunities to address long standing questions in biomedical research. However, our ability to explore and query the content of diverse omics data is very limited. Existing dataset search tools rely almost exclusively on the metadata. A text-based query for gene name(s) does not work well on datasets wherein the vast majority of their content is numeric. To overcome this barrier, we have developed Omicseq, a novel web-based platform that facilitates the easy interrogation of omics datasets holistically to improve 'findability' of relevant data. The core component of Omicseq is trackRank, a novel algorithm for ranking omics datasets that fully uses the numerical content of the dataset to determine relevance to the query entity. The Omicseq system is supported by a scalable and elastic, NoSQL database that hosts a large collection of processed omics datasets. In the front end, a simple, web-based interface allows users to enter queries and instantly receive search results as a list of ranked datasets deemed to be the most relevant. Omicseq is freely available at http://www.omicseq.org. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. Can Item Keyword Feedback Help Remediate Knowledge Gaps?

    PubMed Central

    Feinberg, Richard A.; Clauser, Amanda L.

    2016-01-01

    ABSTRACT Background  In graduate medical education, assessment results can effectively guide professional development when both assessment and feedback support a formative model. When individuals cannot directly access the test questions and responses, a way of using assessment results formatively is to provide item keyword feedback. Objective  The purpose of the following study was to investigate whether exposure to item keyword feedback aids in learner remediation. Methods  Participants included 319 trainees who completed a medical subspecialty in-training examination (ITE) in 2012 as first-year fellows, and then 1 year later in 2013 as second-year fellows. Performance on 2013 ITE items in which keywords were, or were not, exposed as part of the 2012 ITE score feedback was compared across groups based on the amount of time studying (preparation). For the same items common to both 2012 and 2013 ITEs, response patterns were analyzed to investigate changes in answer selection. Results  Test takers who indicated greater amounts of preparation on the 2013 ITE did not perform better on the items in which keywords were exposed compared to those who were not exposed. The response pattern analysis substantiated overall growth in performance from the 2012 ITE. For items with incorrect responses on both attempts, examinees selected the same option 58% of the time. Conclusions  Results from the current study were unsuccessful in supporting the use of item keywords in aiding remediation. Unfortunately, the results did provide evidence of examinees retaining misinformation. PMID:27777664

  10. Can Item Keyword Feedback Help Remediate Knowledge Gaps?

    PubMed

    Feinberg, Richard A; Clauser, Amanda L

    2016-10-01

    In graduate medical education, assessment results can effectively guide professional development when both assessment and feedback support a formative model. When individuals cannot directly access the test questions and responses, a way of using assessment results formatively is to provide item keyword feedback. The purpose of the following study was to investigate whether exposure to item keyword feedback aids in learner remediation. Participants included 319 trainees who completed a medical subspecialty in-training examination (ITE) in 2012 as first-year fellows, and then 1 year later in 2013 as second-year fellows. Performance on 2013 ITE items in which keywords were, or were not, exposed as part of the 2012 ITE score feedback was compared across groups based on the amount of time studying (preparation). For the same items common to both 2012 and 2013 ITEs, response patterns were analyzed to investigate changes in answer selection. Test takers who indicated greater amounts of preparation on the 2013 ITE did not perform better on the items in which keywords were exposed compared to those who were not exposed. The response pattern analysis substantiated overall growth in performance from the 2012 ITE. For items with incorrect responses on both attempts, examinees selected the same option 58% of the time. Results from the current study were unsuccessful in supporting the use of item keywords in aiding remediation. Unfortunately, the results did provide evidence of examinees retaining misinformation.

  11. A keyword searchable attribute-based encryption scheme with attribute update for cloud storage.

    PubMed

    Wang, Shangping; Ye, Jian; Zhang, Yaling

    2018-01-01

    Ciphertext-policy attribute-based encryption (CP-ABE) scheme is a new type of data encryption primitive, which is very suitable for data cloud storage for its fine-grained access control. Keyword-based searchable encryption scheme enables users to quickly find interesting data stored in the cloud server without revealing any information of the searched keywords. In this work, we provide a keyword searchable attribute-based encryption scheme with attribute update for cloud storage, which is a combination of attribute-based encryption scheme and keyword searchable encryption scheme. The new scheme supports the user's attribute update, especially in our new scheme when a user's attribute need to be updated, only the user's secret key related with the attribute need to be updated, while other user's secret key and the ciphertexts related with this attribute need not to be updated with the help of the cloud server. In addition, we outsource the operation with high computation cost to cloud server to reduce the user's computational burden. Moreover, our scheme is proven to be semantic security against chosen ciphertext-policy and chosen plaintext attack in the general bilinear group model. And our scheme is also proven to be semantic security against chosen keyword attack under bilinear Diffie-Hellman (BDH) assumption.

  12. A keyword searchable attribute-based encryption scheme with attribute update for cloud storage

    PubMed Central

    Wang, Shangping; Zhang, Yaling

    2018-01-01

    Ciphertext-policy attribute-based encryption (CP-ABE) scheme is a new type of data encryption primitive, which is very suitable for data cloud storage for its fine-grained access control. Keyword-based searchable encryption scheme enables users to quickly find interesting data stored in the cloud server without revealing any information of the searched keywords. In this work, we provide a keyword searchable attribute-based encryption scheme with attribute update for cloud storage, which is a combination of attribute-based encryption scheme and keyword searchable encryption scheme. The new scheme supports the user's attribute update, especially in our new scheme when a user's attribute need to be updated, only the user's secret key related with the attribute need to be updated, while other user's secret key and the ciphertexts related with this attribute need not to be updated with the help of the cloud server. In addition, we outsource the operation with high computation cost to cloud server to reduce the user's computational burden. Moreover, our scheme is proven to be semantic security against chosen ciphertext-policy and chosen plaintext attack in the general bilinear group model. And our scheme is also proven to be semantic security against chosen keyword attack under bilinear Diffie-Hellman (BDH) assumption. PMID:29795577

  13. The Search Engine for Multi-Proteoform Complexes: An Online Tool for the Identification and Stoichiometry Determination of Protein Complexes.

    PubMed

    Skinner, Owen S; Schachner, Luis F; Kelleher, Neil L

    2016-12-08

    Recent advances in top-down mass spectrometry using native electrospray now enable the analysis of intact protein complexes with relatively small sample amounts in an untargeted mode. Here, we describe how to characterize both homo- and heteropolymeric complexes with high molecular specificity using input data produced by tandem mass spectrometry of whole protein assemblies. The tool described is a "search engine for multi-proteoform complexes," (SEMPC) and is available for free online. The output is a list of candidate multi-proteoform complexes and scoring metrics, which are used to define a distinct set of one or more unique protein subunits, their overall stoichiometry in the intact complex, and their pre- and post-translational modifications. Thus, we present an approach for the identification and characterization of intact protein complexes from native mass spectrometry data. © 2016 by John Wiley & Sons, Inc. Copyright © 2016 John Wiley & Sons, Inc.

  14. Space Shuttle main engine OPAD: The search for a hardware enhanced plume

    NASA Technical Reports Server (NTRS)

    Powers, W. T.; Cooper, A. E.; Wallace, Tim L.; Buntine, W. L.; Whitaker, K. W.

    1993-01-01

    The process of applying spectroscopy to the Space Shuttle Main Engine (SSME) for plume diagnostics, as it exists today, originated at Marshall Space Flight Center in Huntsville, Alabama, and its implementation was assured largely through the efforts of Sverdrup AEDC, in Tullahoma, Tennessee. This team continues to lead and guide efforts in the plume diagnostics arena. The process, Optical Plume Anomaly Detection (OPAD), formed the basis for various activities in the development of ground-based systems as well as the development of in-flight plume spectroscopy. OPAD currently provides and will continue to provide valuable information relative to future systems definitions, instrumentation development, code validation, and data diagnostic processing. OPAD is based on the detection of anomalous atomic and molecular species in the SSME plume using two complete, stand-alone optical spectrometers. To-date OPAD has acquired data on 44 test firings of the SSME at the Technology Test Bed (TTB) at MSFC. The purpose of this paper will be to provide an introduction to the OPAD system by discussing the process of obtaining data as well as the methods of examining and interpreting the data. It will encompass such issues as selection of instrumentation correlation of data to nominal engine operation, investigation of SSME component erosion via OPAD spectral data, necessity and benefits of plume seeding, application of artificial intelligence (AI) techniques to data analysis, and the present status of efforts to quantify specie erosion utilizing standard plume and chemistry codes as well as radiative models currently under development.

  15. Space Shuttle main engine OPAD: The search for a hardware enhanced plume

    NASA Astrophysics Data System (ADS)

    Powers, W. T.; Cooper, A. E.; Wallace, Tim L.; Buntine, W. L.; Whitaker, K. W.

    1993-11-01

    The process of applying spectroscopy to the Space Shuttle Main Engine (SSME) for plume diagnostics, as it exists today, originated at Marshall Space Flight Center in Huntsville, Alabama, and its implementation was assured largely through the efforts of Sverdrup AEDC, in Tullahoma, Tennessee. This team continues to lead and guide efforts in the plume diagnostics arena. The process, Optical Plume Anomaly Detection (OPAD), formed the basis for various activities in the development of ground-based systems as well as the development of in-flight plume spectroscopy. OPAD currently provides and will continue to provide valuable information relative to future systems definitions, instrumentation development, code validation, and data diagnostic processing. OPAD is based on the detection of anomalous atomic and molecular species in the SSME plume using two complete, stand-alone optical spectrometers. To-date OPAD has acquired data on 44 test firings of the SSME at the Technology Test Bed (TTB) at MSFC. The purpose of this paper will be to provide an introduction to the OPAD system by discussing the process of obtaining data as well as the methods of examining and interpreting the data. It will encompass such issues as selection of instrumentation correlation of data to nominal engine operation, investigation of SSME component erosion via OPAD spectral data, necessity and benefits of plume seeding, application of artificial intelligence (AI) techniques to data analysis, and the present status of efforts to quantify specie erosion utilizing standard plume and chemistry codes as well as radiative models currently under development.

  16. Detecting the Norovirus Season in Sweden Using Search Engine Data – Meeting the Needs of Hospital Infection Control Teams

    PubMed Central

    Edelstein, Michael; Wallensten, Anders; Zetterqvist, Inga; Hulth, Anette

    2014-01-01

    Norovirus outbreaks severely disrupt healthcare systems. We evaluated whether Websök, an internet-based surveillance system using search engine data, improved norovirus surveillance and response in Sweden. We compared Websök users' characteristics with the general population, cross-correlated weekly Websök searches with laboratory notifications between 2006 and 2013, compared the time Websök and laboratory data crossed the epidemic threshold and surveyed infection control teams about their perception and use of Websök. Users of Websök were not representative of the general population. Websök correlated with laboratory data (b = 0.88-0.89) and gave an earlier signal to the onset of the norovirus season compared with laboratory-based surveillance. 17/21 (81%) infection control teams answered the survey, of which 11 (65%) believed Websök could help with infection control plans. Websök is a low-resource, easily replicable system that detects the norovirus season as reliably as laboratory data, but earlier. Using Websök in routine surveillance can help infection control teams prepare for the yearly norovirus season. PMID:24955857

  17. Information is in the eye of the beholder: Seeking information on the MMR vaccine through an Internet search engine.

    PubMed

    Yom-Tov, Elad; Fernandez-Luque, Luis

    2014-01-01

    Vaccination campaigns are one of the most important and successful public health programs ever undertaken. People who want to learn about vaccines in order to make an informed decision on whether to vaccinate are faced with a wealth of information on the Internet, both for and against vaccinations. In this paper we develop an automated way to score Internet search queries and web pages as to the likelihood that a person making these queries or reading those pages would decide to vaccinate. We apply this method to data from a major Internet search engine, while people seek information about the Measles, Mumps and Rubella (MMR) vaccine. We show that our method is accurate, and use it to learn about the information acquisition process of people. Our results show that people who are pro-vaccination as well as people who are anti-vaccination seek similar information, but browsing this information has differing effect on their future browsing. These findings demonstrate the need for health authorities to tailor their information according to the current stance of users.

  18. Predicting glycerophosphoinositol identities in lipidomic datasets using VaLID (Visualization and Phospholipid Identification)--an online bioinformatic search engine.

    PubMed

    McDowell, Graeme S V; Blanchard, Alexandre P; Taylor, Graeme P; Figeys, Daniel; Fai, Stephen; Bennett, Steffany A L

    2014-01-01

    The capacity to predict and visualize all theoretically possible glycerophospholipid molecular identities present in lipidomic datasets is currently limited. To address this issue, we expanded the search-engine and compositional databases of the online Visualization and Phospholipid Identification (VaLID) bioinformatic tool to include the glycerophosphoinositol superfamily. VaLID v1.0.0 originally allowed exact and average mass libraries of 736,584 individual species from eight phospholipid classes: glycerophosphates, glyceropyrophosphates, glycerophosphocholines, glycerophosphoethanolamines, glycerophosphoglycerols, glycerophosphoglycerophosphates, glycerophosphoserines, and cytidine 5'-diphosphate 1,2-diacyl-sn-glycerols to be searched for any mass to charge value (with adjustable tolerance levels) under a variety of mass spectrometry conditions. Here, we describe an update that now includes all possible glycerophosphoinositols, glycerophosphoinositol monophosphates, glycerophosphoinositol bisphosphates, and glycerophosphoinositol trisphosphates. This update expands the total number of lipid species represented in the VaLID v2.0.0 database to 1,473,168 phospholipids. Each phospholipid can be generated in skeletal representation. A subset of species curated by the Canadian Institutes of Health Research Training Program in Neurodegenerative Lipidomics (CTPNL) team is provided as an array of high-resolution structures. VaLID is freely available and responds to all users through the CTPNL resources web site.

  19. Detecting the norovirus season in Sweden using search engine data--meeting the needs of hospital infection control teams.

    PubMed

    Edelstein, Michael; Wallensten, Anders; Zetterqvist, Inga; Hulth, Anette

    2014-01-01

    Norovirus outbreaks severely disrupt healthcare systems. We evaluated whether Websök, an internet-based surveillance system using search engine data, improved norovirus surveillance and response in Sweden. We compared Websök users' characteristics with the general population, cross-correlated weekly Websök searches with laboratory notifications between 2006 and 2013, compared the time Websök and laboratory data crossed the epidemic threshold and surveyed infection control teams about their perception and use of Websök. Users of Websök were not representative of the general population. Websök correlated with laboratory data (b = 0.88-0.89) and gave an earlier signal to the onset of the norovirus season compared with laboratory-based surveillance. 17/21 (81%) infection control teams answered the survey, of which 11 (65%) believed Websök could help with infection control plans. Websök is a low-resource, easily replicable system that detects the norovirus season as reliably as laboratory data, but earlier. Using Websök in routine surveillance can help infection control teams prepare for the yearly norovirus season.

  20. Information is in the eye of the beholder: Seeking information on the MMR vaccine through an Internet search engine

    PubMed Central

    Yom-Tov, Elad; Fernandez-Luque, Luis

    2014-01-01

    Vaccination campaigns are one of the most important and successful public health programs ever undertaken. People who want to learn about vaccines in order to make an informed decision on whether to vaccinate are faced with a wealth of information on the Internet, both for and against vaccinations. In this paper we develop an automated way to score Internet search queries and web pages as to the likelihood that a person making these queries or reading those pages would decide to vaccinate. We apply this method to data from a major Internet search engine, while people seek information about the Measles, Mumps and Rubella (MMR) vaccine. We show that our method is accurate, and use it to learn about the information acquisition process of people. Our results show that people who are pro-vaccination as well as people who are anti-vaccination seek similar information, but browsing this information has differing effect on their future browsing. These findings demonstrate the need for health authorities to tailor their information according to the current stance of users. PMID:25954435

  1. Using the Keyword Mnemonics Method among Adult Learners

    ERIC Educational Resources Information Center

    Campos, Alfredo; Camino, Estefania; Perez-Fabello, Maria Jose

    2011-01-01

    The aim of the present study was to assess the influence of word image vividness on the immediate and long-term recall (one-day interval) of words using either the rote repetition learning method or the keyword mnemonics method in a sample of adults aged 55 to 70 years. Subjects learned a list of concrete and abstract words using either rote…

  2. Collocations of High Frequency Noun Keywords in Prescribed Science Textbooks

    ERIC Educational Resources Information Center

    Menon, Sujatha; Mukundan, Jayakaran

    2012-01-01

    This paper analyses the discourse of science through the study of collocational patterns of high frequency noun keywords in science textbooks used by upper secondary students in Malaysia. Research has shown that one of the areas of difficulty in science discourse concerns lexis, especially that of collocations. This paper describes a corpus-based…

  3. A layered approach to technology transfer of AVIRIS between Earth Search Sciences, Inc. and the Idaho National Engineering Laboratory

    NASA Technical Reports Server (NTRS)

    Ferguson, James S.; Ferguson, Joanne E.; Peel, John, III; Vance, Larry

    1995-01-01

    Since initial contact between Earth Search Sciences, Inc. (ESSI) and the Idaho National Engineering Laboratory (INEL) in February, 1994, at least seven proposals have been submitted in response to a variety of solicitations to commercialize and improve the AVIRIS instrument. These proposals, matching ESSI's unique position with respect to agreements with the National Aeronautics and Space Administration (NASA) and the Jet Propulsion Laboratory (JPL) to utilize, miniaturize, and commercialize the AVIRIS instrument and platform, are combined with the applied engineering of the INEL. Teaming ESSI, NASA/JPL, and INEL with diverse industrial partners has strengthened the respective proposals. These efforts carefully structure the overall project plans to ensure the development, demonstration, and deployment of this concept to the national and international arenas. The objectives of these efforts include: (1) developing a miniaturized commercial, real-time, cost effective version of the AVIRIS instrument; (2) identifying multiple users for AVIRIS; (3) integrating the AVIRIS technology with other technologies; (4) gaining the confidence/acceptance of other government agencies and private industry in AVIRIS; and (5) increasing the technology base of U.S. industry.

  4. Datasets2Tools, repository and search engine for bioinformatics datasets, tools and canned analyses

    PubMed Central

    Torre, Denis; Krawczuk, Patrycja; Jagodnik, Kathleen M.; Lachmann, Alexander; Wang, Zichen; Wang, Lily; Kuleshov, Maxim V.; Ma’ayan, Avi

    2018-01-01

    Biomedical data repositories such as the Gene Expression Omnibus (GEO) enable the search and discovery of relevant biomedical digital data objects. Similarly, resources such as OMICtools, index bioinformatics tools that can extract knowledge from these digital data objects. However, systematic access to pre-generated ‘canned’ analyses applied by bioinformatics tools to biomedical digital data objects is currently not available. Datasets2Tools is a repository indexing 31,473 canned bioinformatics analyses applied to 6,431 datasets. The Datasets2Tools repository also contains the indexing of 4,901 published bioinformatics software tools, and all the analyzed datasets. Datasets2Tools enables users to rapidly find datasets, tools, and canned analyses through an intuitive web interface, a Google Chrome extension, and an API. Furthermore, Datasets2Tools provides a platform for contributing canned analyses, datasets, and tools, as well as evaluating these digital objects according to their compliance with the findable, accessible, interoperable, and reusable (FAIR) principles. By incorporating community engagement, Datasets2Tools promotes sharing of digital resources to stimulate the extraction of knowledge from biomedical research data. Datasets2Tools is freely available from: http://amp.pharm.mssm.edu/datasets2tools. PMID:29485625

  5. Datasets2Tools, repository and search engine for bioinformatics datasets, tools and canned analyses.

    PubMed

    Torre, Denis; Krawczuk, Patrycja; Jagodnik, Kathleen M; Lachmann, Alexander; Wang, Zichen; Wang, Lily; Kuleshov, Maxim V; Ma'ayan, Avi

    2018-02-27

    Biomedical data repositories such as the Gene Expression Omnibus (GEO) enable the search and discovery of relevant biomedical digital data objects. Similarly, resources such as OMICtools, index bioinformatics tools that can extract knowledge from these digital data objects. However, systematic access to pre-generated 'canned' analyses applied by bioinformatics tools to biomedical digital data objects is currently not available. Datasets2Tools is a repository indexing 31,473 canned bioinformatics analyses applied to 6,431 datasets. The Datasets2Tools repository also contains the indexing of 4,901 published bioinformatics software tools, and all the analyzed datasets. Datasets2Tools enables users to rapidly find datasets, tools, and canned analyses through an intuitive web interface, a Google Chrome extension, and an API. Furthermore, Datasets2Tools provides a platform for contributing canned analyses, datasets, and tools, as well as evaluating these digital objects according to their compliance with the findable, accessible, interoperable, and reusable (FAIR) principles. By incorporating community engagement, Datasets2Tools promotes sharing of digital resources to stimulate the extraction of knowledge from biomedical research data. Datasets2Tools is freely available from: http://amp.pharm.mssm.edu/datasets2tools.

  6. Accuracy of weight loss information in Spanish search engine results on the internet.

    PubMed

    Cardel, Michelle I; Chavez, Sarah; Bian, Jiang; Peñaranda, Eribeth; Miller, Darci R; Huo, Tianyao; Modave, François

    2016-11-01

    To systematically assess the quality of online information related to weight loss that Spanish speakers in the U.S. are likely to access. This study evaluated the accessibility and quality of information for websites that were identified from weight loss queries in Spanish and compared this with previously published results in English. The content was scored with respect to five dimensions: nutrition, physical activity, behavior, pharmacotherapy, and surgical recommendations. Sixty-six websites met eligibility criteria (21 commercial, 24 news/media, 10 blogs, 0 medical/government/university, 11 unclassified sites). Of 16 possible points, mean content quality score was 3.4 (SD = 2.0). Approximately 1.5% of sites scored greater than 8 (out of 12) on nutrition, physical activity, and behavior. Unsubstantiated claims were made on 94% of the websites. Content quality scores varied significantly by type of website (P < 0.0001) with unclassified websites scoring the highest (mean = 6.3, SD = 1.4) and blogs scoring the lowest (mean = 2.2, SD = 1.2). All content quality scores were lower for Spanish websites relative to English websites. Weight loss information accessed in Spanish Web searches is suboptimal and relatively worse than weight loss information accessed in English, suggesting that U.S. Spanish speakers accessing weight loss information online may be provided with incomplete and inaccurate information. © 2016 The Obesity Society.

  7. Analysis of the Accuracy of Weight Loss Information Search Engine Results on the Internet

    PubMed Central

    Shokar, Navkiran K.; Peñaranda, Eribeth; Nguyen, Norma

    2014-01-01

    Objectives. We systematically identified and evaluated the quality and comprehensiveness of online information related to weight loss that users were likely to access. Methods. We evaluated the content quality, accessibility of the information, and author credentials for Web sites in 2012 that were identified from weight loss specific queries that we generated. We scored the content with respect to available evidence-based guidelines for weight loss. Results. One hundred three Web sites met our eligibility criteria (21 commercial, 52 news/media, 7 blogs, 14 medical, government, or university, and 9 unclassified sites). The mean content quality score was 3.75 (range = 0–16; SD = 2.48). Approximately 5% (4.85%) of the sites scored greater than 8 (of 12) on nutrition, physical activity, and behavior. Content quality score varied significantly by type of Web site; the medical, government, or university sites (mean = 4.82, SD = 2.27) and blogs (mean = 6.33, SD = 1.99) had the highest scores. Commercial (mean = 2.37, SD = 2.60) or news/media sites (mean = 3.52, SD = 2.31) had the lowest scores (analysis of variance P < .005). Conclusions. The weight loss information that people were likely to access online was often of substandard quality because most comprehensive and quality Web sites ranked too low in search results. PMID:25122030

  8. 'Doctor Google' ending the diagnostic odyssey in lysosomal storage disorders: parents using internet search engines as an efficient diagnostic strategy in rare diseases.

    PubMed

    Bouwman, Machtelt G; Teunissen, Quirine G A; Wijburg, Frits A; Linthorst, Gabor E

    2010-08-01

    The expansion of the internet has resulted in widespread availability of medical information for both patients and physicians. People increasingly spend time on the internet searching for an explanation, diagnosis or treatment for their symptoms. Regarding rare diseases, the use of the internet may be an important tool in the diagnostic process. The authors present two cases in which concerned parents made a correct diagnosis of a lysosomal storage disorder in their child by searching the internet after a long doctor's delay. These cases illustrate the utility of publicly available internet search engines in diagnosing rare disorders and in addition illustrate the lengthy diagnostic odyssey which is common in these disorders.

  9. Searching for Global Descriptors of Engineered Nanomaterial Fate and Transport in the Environment

    PubMed Central

    Nowack, Bernd

    2012-01-01

    CONSPECTUS Engineered nanomaterials (ENMs) are a new class of environmental pollutants. Researchers are beginning to debate whether new modeling paradigms and experimental tests to obtain model parameters are required for ENMs or if approaches for existing pollutants are robust enough to predict ENM distribution between environmental compartments. This Account outlines how experimental research can yield quantitative data for use in ENM fate and exposure models. We first review experimental testing approaches that are employed with ENMs. Then we compare and contrast ENMs against other pollutants. Finally, we summarize the findings and identify research needs that may yield global descriptors for ENMs that are suitable for use in fate and transport modeling. Over the past decade, researchers have made significant progress in understanding factors that influence the fate and transport of ENMs. In some cases researchers have developed approaches toward global descriptor models (experimental, conceptual, and quantitative). We suggest the following global descriptors for ENMs: octanol-water partition coefficients, solid-water partition coefficients, attachment coefficients, and rate constants describing reactions such as dissolution, sedimentation, and degradation. ENMs appear to accumulate at the octanol-water interface and readily interact with other interfaces, such as lipid-water interfaces. Batch experiments to investigate factors that influence retention of ENMs on solid phases are very promising. However ENMs probably do not behave in the same way as dissolved chemicals, and therefore researchers need to use measurement techniques and concepts more commonly associated with colloids. Despite several years of research with ENMs in column studies, available summaries tend to discuss the effects of ionic strength, pH, organic matter, ENM type, packing media, or other parameters qualitatively rather than reporting quantitative values, such as attachment efficiencies

  10. Power Search.

    ERIC Educational Resources Information Center

    Haskin, David

    1997-01-01

    Compares six leading Web search engines (AltaVista, Excite, HotBot, Infoseek, Lycos, and Northern Light), looking at the breadth of their coverage, accuracy, and ease of use, and finds a clear favorite of the six. Includes tips that can improve search results. (AEF)

  11. Automatic sorting of toxicological information into the IUCLID (International Uniform Chemical Information Database) endpoint-categories making use of the semantic search engine Go3R.

    PubMed

    Sauer, Ursula G; Wächter, Thomas; Hareng, Lars; Wareing, Britta; Langsch, Angelika; Zschunke, Matthias; Alvers, Michael R; Landsiedel, Robert

    2014-06-01

    The knowledge-based search engine Go3R, www.Go3R.org, has been developed to assist scientists from industry and regulatory authorities in collecting comprehensive toxicological information with a special focus on identifying available alternatives to animal testing. The semantic search paradigm of Go3R makes use of expert knowledge on 3Rs methods and regulatory toxicology, laid down in the ontology, a network of concepts, terms, and synonyms, to recognize the contents of documents. Search results are automatically sorted into a dynamic table of contents presented alongside the list of documents retrieved. This table of contents allows the user to quickly filter the set of documents by topics of interest. Documents containing hazard information are automatically assigned to a user interface following the endpoint-specific IUCLID5 categorization scheme required, e.g. for REACH registration dossiers. For this purpose, complex endpoint-specific search queries were compiled and integrated into the search engine (based upon a gold standard of 310 references that had been assigned manually to the different endpoint categories). Go3R sorts 87% of the references concordantly into the respective IUCLID5 categories. Currently, Go3R searches in the 22 million documents available in the PubMed and TOXNET databases. However, it can be customized to search in other databases including in-house databanks. Copyright © 2013 Elsevier Ltd. All rights reserved.

  12. Yearning to Give Back: Searching for Social Purpose in Computer Science and Engineering

    PubMed Central

    Carrigan, Coleen M.

    2017-01-01

    Computing is highly segregated and stratified by gender. While there is abundant scholarship investigating this problem, emerging evidence suggests that a hierarchy of value exists between the social and technical dimensions of Computer Science and Engineering (CSE) and this plays a role in the underrepresentation of women in the field. This ethnographic study of women's experiences in computing offers evidence of a systemic preference for the technical dimensions of computing over the social and a correlation between gender and social aspirations. Additionally, it suggests there is a gap between the exaltation of computing's social contributions and the realities of them. My participants expressed a yearning to contribute to the collective well-being of society using their computing skills. I trace moments of rupture in my participants' stories, moments when they felt these aspirations were in conflict with the cultural values in their organizations. I interpret these ruptures within a consideration of yearning, a need my participants had to contribute meaningfully to society that remained unfulfilled. The yearning to align one's altruistic values with one's careers aspirations in CSE illuminates an area for greater exploration on the path to realizing gender equity in computing. I argue that before a case can be made that careers in computing do indeed contribute to social and civil engagements, we must first address the meaning of the social within the values, ideologies and practices of CSE institutions and next, develop ways to measure and evaluate the field's contributions to society. PMID:28790936

  13. Yearning to Give Back: Searching for Social Purpose in Computer Science and Engineering.

    PubMed

    Carrigan, Coleen M

    2017-01-01

    Computing is highly segregated and stratified by gender. While there is abundant scholarship investigating this problem, emerging evidence suggests that a hierarchy of value exists between the social and technical dimensions of Computer Science and Engineering (CSE) and this plays a role in the underrepresentation of women in the field. This ethnographic study of women's experiences in computing offers evidence of a systemic preference for the technical dimensions of computing over the social and a correlation between gender and social aspirations. Additionally, it suggests there is a gap between the exaltation of computing's social contributions and the realities of them. My participants expressed a yearning to contribute to the collective well-being of society using their computing skills. I trace moments of rupture in my participants' stories, moments when they felt these aspirations were in conflict with the cultural values in their organizations. I interpret these ruptures within a consideration of yearning, a need my participants had to contribute meaningfully to society that remained unfulfilled. The yearning to align one's altruistic values with one's careers aspirations in CSE illuminates an area for greater exploration on the path to realizing gender equity in computing. I argue that before a case can be made that careers in computing do indeed contribute to social and civil engagements, we must first address the meaning of the social within the values, ideologies and practices of CSE institutions and next, develop ways to measure and evaluate the field's contributions to society.

  14. Assessing explicit error reporting in the narrative electronic medical record using keyword searching.

    PubMed

    Cao, Hui; Stetson, Peter; Hripcsak, George

    2003-01-01

    In this study, we assessed the explicit reporting of medical errors in the electronic record. We looked for cases in which the provider explicitly stated that he or she or another provider had committed an error. The advantage of the technique is that it is not limited to a specific type of error. Our goals were to 1) measure the rate at which medical errors were documented in medical records, and 2) characterize the types of errors that were reported.

  15. Understanding the delayed-keyword effect on metacomprehension accuracy.

    PubMed

    Thiede, Keith W; Dunlosky, John; Griffin, Thomas D; Wiley, Jennifer

    2005-11-01

    The typical finding from research on metacomprehension is that accuracy is quite low. However, recent studies have shown robust accuracy improvements when judgments follow certain generation tasks (summarizing or keyword listing) but only when these tasks are performed at a delay rather than immediately after reading (K. W. Thiede & M. C. M. Anderson, 2003; K. W. Thiede, M. C. M. Anderson, & D. Therriault, 2003). The delayed and immediate conditions in these studies confounded the delay between reading and generation tasks with other task lags, including the lag between multiple generation tasks and the lag between generation tasks and judgments. The first 2 experiments disentangle these confounded manipulations and provide clear evidence that the delay between reading and keyword generation is the only lag critical to improving metacomprehension accuracy. The 3rd and 4th experiments show that not all delayed tasks produce improvements and suggest that delayed generative tasks provide necessary diagnostic cues about comprehension for improving metacomprehension accuracy.

  16. Evidential significance of automotive paint trace evidence using a pattern recognition based infrared library search engine for the Paint Data Query Forensic Database.

    PubMed

    Lavine, Barry K; White, Collin G; Allen, Matthew D; Fasasi, Ayuba; Weakley, Andrew

    2016-10-01

    A prototype library search engine has been further developed to search the infrared spectral libraries of the paint data query database to identify the line and model of a vehicle from the clear coat, surfacer-primer, and e-coat layers of an intact paint chip. For this study, search prefilters were developed from 1181 automotive paint systems spanning 3 manufacturers: General Motors, Chrysler, and Ford. The best match between each unknown and the spectra in the hit list generated by the search prefilters was identified using a cross-correlation library search algorithm that performed both a forward and backward search. In the forward search, spectra were divided into intervals and further subdivided into windows (which corresponds to the time lag for the comparison) within those intervals. The top five hits identified in each search window were compiled; a histogram was computed that summarized the frequency of occurrence for each library sample, with the IR spectra most similar to the unknown flagged. The backward search computed the frequency and occurrence of each line and model without regard to the identity of the individual spectra. Only those lines and models with a frequency of occurrence greater than or equal to 20% were included in the final hit list. If there was agreement between the forward and backward search results, the specific line and model common to both hit lists was always the correct assignment. Samples assigned to the same line and model by both searches are always well represented in the library and correlate well on an individual basis to specific library samples. For these samples, one can have confidence in the accuracy of the match. This was not the case for the results obtained using commercial library search algorithms, as the hit quality index scores for the top twenty hits were always greater than 99%. Copyright © 2016 Elsevier B.V. All rights reserved.

  17. The Paragon Algorithm, a next generation search engine that uses sequence temperature values and feature probabilities to identify peptides from tandem mass spectra.

    PubMed

    Shilov, Ignat V; Seymour, Sean L; Patel, Alpesh A; Loboda, Alex; Tang, Wilfred H; Keating, Sean P; Hunter, Christie L; Nuwaysir, Lydia M; Schaeffer, Daniel A

    2007-09-01

    The Paragon Algorithm, a novel database search engine for the identification of peptides from tandem mass spectrometry data, is presented. Sequence Temperature Values are computed using a sequence tag algorithm, allowing the degree of implication by an MS/MS spectrum of each region of a database to be determined on a continuum. Counter to conventional approaches, features such as modifications, substitutions, and cleavage events are modeled with probabilities rather than by discrete user-controlled settings to consider or not consider a feature. The use of feature probabilities in conjunction with Sequence Temperature Values allows for a very large increase in the effective search space with only a very small increase in the actual number of hypotheses that must be scored. The algorithm has a new kind of user interface that removes the user expertise requirement, presenting control settings in the language of the laboratory that are translated to optimal algorithmic settings. To validate this new algorithm, a comparison with Mascot is presented for a series of analogous searches to explore the relative impact of increasing search space probed with Mascot by relaxing the tryptic digestion conformance requirements from trypsin to semitrypsin to no enzyme and with the Paragon Algorithm using its Rapid mode and Thorough mode with and without tryptic specificity. Although they performed similarly for small search space, dramatic differences were observed in large search space. With the Paragon Algorithm, hundreds of biological and artifact modifications, all possible substitutions, and all levels of conformance to the expected digestion pattern can be searched in a single search step, yet the typical cost in search time is only 2-5 times that of conventional small search space. Despite this large increase in effective search space, there is no drastic loss of discrimination that typically accompanies the exploration of large search space.

  18. Keyword extraction, ranking, and organization for the neuroinformatics platform.

    PubMed

    Usui, S; Palmes, P; Nagata, K; Taniguchi, T; Ueda, N

    2007-04-01

    Brain-related researches encompass many fields of studies and usually involve worldwide collaborations. Recognizing the value of these international collaborations for efficient use of resources and improving the quality of brain research, the International Neuroinformatics Coordinating Facility (INCF) started to coordinate the effort of establishing neuroinformatics (NI) centers and portal sites among the different participating countries. These NI centers and portal sites will serve as the conduit for the interchange of information and brain-related resources among different countries. In Japan, several NI platforms under the support of NIJC (NI Japan Center) are being developed with one platform called, Visiome, already operating and publicly accessible at "http://www.platform.visiome.org". Each of these platforms requires their own set of keywords that represent important terms covering their respective fields of study. One important function of this predefined keyword list is to help contributors classify the contents of their contributions and group related resources. It is vital, therefore, that this predefined list should be properly chosen to cover the necessary areas. Currently, the process of identifying these appropriate keywords relies on the availability of human experts which does not scale well considering that different areas are rapidly evolving. This problem prompted us to develop a tool to automatically filter the most likely terms preferred by human experts. We tested the effectiveness of the proposed approach using the abstracts of the Vision Research Journal (VR) and Investigative Ophthalmology and Visual Science Journal (IOVS) as source files.

  19. COM5/443: Health Portals, Search Engines and Intranets: Making order of the chaos - A review article

    PubMed Central

    Maini, P

    1999-01-01

    Effective use of technology has led to medical advances that have not only extended life expectancy, but also fuelled an increasingly well informed public's desires to expect more and more from today's healthcare providers. A phenomenal amount of medical information is now present, facilitated by the virtually unrestricted ability to publish on the Internet. The WWW grows by roughly a million pages per day with a significant number devoted to some element of healthcare. As a consequence of the web's rapid, chaotic growth, the resulting network of information lacks organisation and structure and the quest for a method of finding relevant and reliable information quickly is spawning the growth of the Internet Portal Sites. The use of intranets confers some degree of organisation, allowing for structured growth of an organisation's web site but the majority of sites are not sheltered within an intranet. Millions of dollars are being invested in researching the technologies powering search engines and well designed portals, in the hope of imposing some kind of order on the existing chaos. The winner will most likely reap huge financial rewards as advertisers climb on board, hoping to gain massive exposure and the public will undoubtedly gain from relevant and timely answers to their searches and queries. Information is a valuable commodity, but the true value is seen only when it is organised into an accessible and useable form. The coming of age of the Internet has heralded radical changes in the fabric and economics of healthcare and as a consequence our expectations from professionals involved in this industry are greatly increased. Although significant challenges need to be overcome, there are huge benefits to be gained for consumers, although it remains to be seen whether the world becomes a more healthy place.

  20. Anamneses-Based Internet Information Supply: Can a Combination of an Expert System and Meta-Search Engine Help Consumers find the Health Information they Require?

    PubMed Central

    Honekamp, Wilfried; Ostermann, Herwig

    2010-01-01

    An increasing number of people search for health information online. During the last 10 years various researchers have determined the requirements for an ideal consumer health information system. The aim of this study was to figure out, whether medical laymen can find a more accurate diagnosis for a given anamnesis via the developed prototype health information system than via ordinary internet search. In a randomized controlled trial, the prototype information system was evaluated by the assessment of two sample cases. Participants had to determine the diagnosis of a patient with a headache via information found searching the web. A patient’s history sheet and a computer with internet access were provided to the participants and they were guided through the study by an especially designed study website. The intervention group used the prototype information system; the control group used common search engines and portals. The numbers of correct diagnoses in each group were compared. A total of 140 (60/80) participants took part in two study sections. In the first case, which determined a common diagnosis, both groups did equally well. In the second section, which determined a less common and more complex case, the intervention group did significantly better (P=0.031) due to the tailored information supply. Using medical expert systems in combination with a portal searching meta-search engine represents a feasible strategy to provide reliable patient-tailored information and can ultimately contribute to patient safety with respect to information found via the internet. PMID:20502597