Application of kernel functions for accurate similarity search in large chemical databases.
Wang, Xiaohong; Huan, Jun; Smalter, Aaron; Lushington, Gerald H
2010-04-29
Similarity search in chemical structure databases is an important problem with many applications in chemical genomics, drug design, and efficient chemical probe screening, among others. It is widely believed that structure-based methods provide an efficient way to perform such queries. Recently, various graph kernel functions have been designed to capture the intrinsic similarity of graphs. Though successful in constructing accurate predictive and classification models, graph kernel functions cannot be applied to large chemical compound databases because of their high computational complexity and the difficulty of indexing similarity search for large databases. To bridge graph kernel functions and similarity search in chemical databases, we applied a novel kernel-based similarity measurement, developed in our team, to measure the similarity of graph-represented chemicals. Our method uses a hash table to support a new graph kernel function definition, efficient storage, and fast search. We have applied our method, named G-hash, to large chemical databases. Our results show that the G-hash method achieves state-of-the-art performance for k-nearest neighbor (k-NN) classification. Moreover, the similarity measurement and the index structure are scalable to large chemical databases, with a smaller index size and faster query processing time than state-of-the-art indexing methods such as Daylight fingerprints, C-tree and GraphGrep. Efficient similarity query processing for large chemical databases is challenging because running-time efficiency must be balanced against similarity search accuracy. Our previous similarity search method, G-hash, provides a new way to perform similarity search in chemical databases. An experimental study validates the utility of G-hash in chemical databases.
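The idea of a hash table that serves both as the index and as the basis of the kernel can be illustrated with a minimal sketch. The node feature used here (an atom label plus the sorted labels of its neighbors), the unnormalized count-based kernel, and the function names are illustrative assumptions, not the feature set or kernel defined by G-hash.

```python
from collections import Counter

def local_features(atoms, bonds):
    """Count hashable local features of one molecular graph.

    atoms: list of atom labels, e.g. ["C", "C", "O"]
    bonds: list of (i, j) index pairs
    Assumed feature (not the G-hash definition): node label plus the
    sorted labels of its direct neighbors.
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(atoms[j])
        neighbors[j].append(atoms[i])
    return Counter(
        (atoms[i], tuple(sorted(neighbors[i]))) for i in range(len(atoms))
    )

def build_index(database):
    """Hash table: feature key -> {graph id: number of nodes with that key}."""
    index = {}
    for gid, (atoms, bonds) in database.items():
        for key, count in local_features(atoms, bonds).items():
            index.setdefault(key, {})[gid] = count
    return index

def query(index, atoms, bonds, k=1):
    """Score database graphs by the number of matching node pairs.

    Only hash-table entries that share a key with the query are touched,
    so graphs with no feature in common are never visited.
    """
    scores = Counter()
    for key, qcount in local_features(atoms, bonds).items():
        for gid, count in index.get(key, {}).items():
            scores[gid] += qcount * count
    return scores.most_common(k)

# Toy database, for illustration only.
database = {
    "ethanol": (["C", "C", "O"], [(0, 1), (1, 2)]),
    "propane": (["C", "C", "C"], [(0, 1), (1, 2)]),
}
index = build_index(database)
top_hit = query(index, ["C", "C", "O"], [(0, 1), (1, 2)])
```

Because the score is a sum over shared keys, a k-NN query costs time proportional to the number of matching table entries rather than to the number of graphs in the database. A practical kernel would also normalize the score and use richer node features.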
Protein structural similarity search by Ramachandran codes
Lo, Wei-Cheng; Huang, Po-Jung; Chang, Chih-Hung; Lyu, Ping-Chiang
2007-01-01
Background: Protein structural data have increased exponentially, so fast and accurate tools are necessary for structure similarity search. To improve search speed, several methods have been designed to reduce three-dimensional protein structures to one-dimensional text strings that are then analyzed by traditional sequence alignment methods; however, accuracy is usually sacrificed and the speed still cannot match that of sequence similarity search tools. Here, we aimed to improve the linear encoding methodology and develop efficient search tools that can rapidly retrieve structural homologs from large protein databases. Results: We propose a new linear encoding method, SARST (Structural similarity search Aided by Ramachandran Sequential Transformation). SARST transforms protein structures into text strings through a Ramachandran map organized by nearest-neighbor clustering and uses a regenerative approach to produce substitution matrices. Classical sequence similarity search methods can then be applied to structural similarity search. Its accuracy is similar to that of Combinatorial Extension (CE), and it runs over 243,000 times faster, searching 34,000 proteins in 0.34 sec with a 3.2-GHz CPU. SARST provides statistically meaningful expectation values to assess the retrieved information. It has been implemented as a web service and as a stand-alone Java program that runs on many different platforms. Conclusion: As a database search method, SARST can rapidly distinguish high from low similarities and efficiently retrieve homologous structures. It demonstrates that the easily accessible linear encoding methodology has the potential to serve as a foundation for efficient protein structural similarity search tools. These search tools should be applicable to automated and high-throughput functional annotation or prediction for the ever-increasing number of published protein structures in this post-genomic era. PMID:17716377
BLAST and FASTA similarity searching for multiple sequence alignment.
Pearson, William R
2014-01-01
BLAST, FASTA, and other similarity searching programs seek to identify homologous proteins and DNA sequences based on excess sequence similarity. If two sequences share much more similarity than expected by chance, the simplest explanation for the excess similarity is common ancestry, that is, homology. The most effective similarity searches compare protein sequences, rather than DNA sequences, for sequences that encode proteins, and use expectation values, rather than percent identity, to infer homology. The BLAST and FASTA packages of sequence comparison programs provide programs for comparing protein and DNA sequences to protein databases (the most sensitive searches). Protein and translated-DNA comparisons to protein databases routinely allow evolutionary look-back times of 1 to 2 billion years; DNA:DNA searches are 5-10-fold less sensitive. BLAST and FASTA can be run on popular web sites, but can also be downloaded and installed on local computers. With local installation, target databases can be customized for the sequence data being characterized. With today's very large protein databases, search sensitivity can also be improved by searching smaller comprehensive databases, for example, a complete protein set from an evolutionarily neighboring model organism. By default, BLAST and FASTA use scoring strategies targeted at distant evolutionary relationships; for comparisons involving short domains or queries, or searches that seek relatively close homologs (e.g. mouse-human), shallower scoring matrices will be more effective. Both BLAST and FASTA provide very accurate statistical estimates, which can be used to reliably identify protein sequences that diverged more than 2 billion years ago.
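Why a smaller database can improve sensitivity follows from how expectation values scale with search space. The sketch below uses the standard Karlin-Altschul relation E = m · n · 2^(-S'), where S' is the bit score, m the query length and n the total database length. Real programs apply effective-length corrections that are omitted here, and the lengths and score are made-up illustration values.

```python
def expect_value(bit_score, query_len, db_len):
    """Expected number of chance alignments scoring at least bit_score.

    Uses E = m * n * 2**(-S') with raw lengths; effective-length
    corrections used by real search programs are not modeled.
    """
    return query_len * db_len * 2.0 ** (-bit_score)

# Same alignment (50 bits, 300-residue query) against two database sizes.
e_small_db = expect_value(50.0, 300, 2_000_000)
e_large_db = expect_value(50.0, 300, 200_000_000)
```

The same alignment is 100 times less significant in the database that is 100 times larger, which is why a borderline homolog can pass an E-value threshold in a focused database and fail it in a comprehensive one.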
G-Hash: Towards Fast Kernel-based Similarity Search in Large Graph Databases.
Wang, Xiaohong; Smalter, Aaron; Huan, Jun; Lushington, Gerald H
2009-01-01
Structured data including sets, sequences, trees and graphs, pose significant challenges to fundamental aspects of data management such as efficient storage, indexing, and similarity search. With the fast accumulation of graph databases, similarity search in graph databases has emerged as an important research topic. Graph similarity search has applications in a wide range of domains including cheminformatics, bioinformatics, sensor network management, social network management, and XML documents, among others. Most of the current graph indexing methods focus on subgraph query processing, i.e. determining the set of database graphs that contains the query graph and hence do not directly support similarity search. In data mining and machine learning, various graph kernel functions have been designed to capture the intrinsic similarity of graphs. Though successful in constructing accurate predictive and classification models for supervised learning, graph kernel functions have (i) high computational complexity and (ii) non-trivial difficulty to be indexed in a graph database. Our objective is to bridge graph kernel function and similarity search in graph databases by proposing (i) a novel kernel-based similarity measurement and (ii) an efficient indexing structure for graph data management. Our method of similarity measurement builds upon local features extracted from each node and their neighboring nodes in graphs. A hash table is utilized to support efficient storage and fast search of the extracted local features. Using the hash table, a graph kernel function is defined to capture the intrinsic similarity of graphs and for fast similarity query processing. We have implemented our method, which we have named G-hash, and have demonstrated its utility on large chemical graph databases. Our results show that the G-hash method achieves state-of-the-art performance for k-nearest neighbor (k-NN) classification. Most importantly, the new similarity measurement and the index
SlideSort: all pairs similarity search for short reads
Shimizu, Kana; Tsuda, Koji
2011-01-01
Motivation: Recent progress in DNA sequencing technologies calls for fast and accurate algorithms that can evaluate sequence similarity for a huge amount of short reads. Searching similar pairs from a string pool is a fundamental process of de novo genome assembly, genome-wide alignment and other important analyses. Results: In this study, we designed and implemented an exact algorithm SlideSort that finds all similar pairs from a string pool in terms of edit distance. Using an efficient pattern growth algorithm, SlideSort discovers chains of common k-mers to narrow down the search. Compared to existing methods based on single k-mers, our method is more effective in reducing the number of edit distance calculations. In comparison to backtracking methods such as BWA, our method is much faster in finding remote matches, scaling easily to tens of millions of sequences. Our software has an additional function of single link clustering, which is useful in summarizing short reads for further processing. Availability: Executable binary files and C++ libraries are available at http://www.cbrc.jp/~shimizu/slidesort/ for Linux and Windows. Contact: slidesort@m.aist.go.jp; shimizu-kana@aist.go.jp Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21148542
Couvin, David; Zozio, Thierry; Rastogi, Nalin
2017-07-01
Spoligotyping is one of the most commonly used polymerase chain reaction (PCR)-based methods for identification and study of genetic diversity of Mycobacterium tuberculosis complex (MTBC). Despite its known limitations if used alone, the methodology is particularly useful when used in combination with other methods such as mycobacterial interspersed repetitive units - variable number of tandem DNA repeats (MIRU-VNTRs). At a worldwide scale, spoligotyping has allowed identification of information on 103,856 MTBC isolates (corresponding to 98,049 clustered strains plus 5,807 unique isolates from 169 countries of patient origin) contained within the SITVIT2 proprietary database of the Institut Pasteur de la Guadeloupe. The SpolSimilaritySearch web-tool described herein (available at: http://www.pasteur-guadeloupe.fr:8081/SpolSimilaritySearch) incorporates a similarity search algorithm allowing users to get a complete overview of similar spoligotype patterns (with information on presence or absence of 43 spacers) in the aforementioned worldwide database. This tool allows one to analyze spread and evolutionary patterns of MTBC by comparing similar spoligotype patterns, to distinguish between widespread, specific and/or confined patterns, as well as to pinpoint patterns with large deleted blocks, which play an intriguing role in the genetic epidemiology of M. tuberculosis. Finally, the SpolSimilaritySearch tool also provides the country distribution patterns for each queried spoligotype. Copyright © 2017 Elsevier Ltd. All rights reserved.
Biosequence Similarity Search on the Mercury System
Krishnamurthy, Praveen; Buhler, Jeremy; Chamberlain, Roger; Franklin, Mark; Gyang, Kwame; Jacob, Arpith; Lancaster, Joseph
2007-01-01
Biosequence similarity search is an important application in modern molecular biology. Search algorithms aim to identify sets of sequences whose extensional similarity suggests a common evolutionary origin or function. The most widely used similarity search tool for biosequences is BLAST, a program designed to compare query sequences to a database. Here, we present the design of BLASTN, the version of BLAST that searches DNA sequences, on the Mercury system, an architecture that supports high-volume, high-throughput data movement off a data store and into reconfigurable hardware. An important component of application deployment on the Mercury system is the functional decomposition of the application onto both the reconfigurable hardware and the traditional processor. Both the Mercury BLASTN application design and its performance analysis are described. PMID:18846267
Accurate HLA type inference using a weighted similarity graph.
Xie, Minzhu; Li, Jing; Jiang, Tao
2010-12-14
The human leukocyte antigen system (HLA) contains many highly variable genes. HLA genes play an important role in the human immune system, and HLA gene matching is crucial for the success of human organ transplantations. Numerous studies have demonstrated that variation in HLA genes is associated with many autoimmune, inflammatory and infectious diseases. However, typing HLA genes by serology or PCR is time-consuming and expensive, which limits large-scale studies involving HLA genes. Since it is much easier and cheaper to obtain single nucleotide polymorphism (SNP) genotype data, accurate computational algorithms to infer HLA gene types from SNP genotype data are needed. To infer HLA types from SNP genotypes, the first step is to infer SNP haplotypes from genotypes. However, for the same SNP genotype data set, the haplotype configurations inferred by different methods are usually inconsistent, and it is often difficult to decide which one is true. In this paper, we design an accurate HLA gene type inference algorithm by utilizing SNP genotype data from pedigrees, known HLA gene types of some individuals and the relationship between inferred SNP haplotypes and HLA gene types. Given a set of haplotypes inferred from the genotypes of a population consisting of many pedigrees, the algorithm first constructs a weighted similarity graph based on a new haplotype similarity measure and derives constraint edges from known HLA gene types. Based on the principle that different HLA gene alleles should have different background haplotypes, the algorithm searches for an optimal labeling of all the haplotypes with unknown HLA gene types such that the total weight among the same HLA gene types is maximized. To deal with ambiguous haplotype solutions, we use a genetic algorithm to select haplotype configurations that tend to maximize the same optimization criterion. Our experiments on a previously typed subset of the HapMap data show that the algorithm is highly accurate
Distributed Efficient Similarity Search Mechanism in Wireless Sensor Networks
Ahmed, Khandakar; Gregory, Mark A.
2015-01-01
The Wireless Sensor Network similarity search problem has received considerable research attention due to sensor hardware imprecision and environmental parameter variations. Most state-of-the-art distributed data centric storage (DCS) schemes lack optimization for similarity queries of events. In this paper, a DCS scheme with metric based similarity searching (DCSMSS) is proposed. DCSMSS is motivated by a vector distance index called iDistance, which it uses to transform similarity searching into an interval search in one dimension. In addition, a sector based distance routing algorithm is used to efficiently route messages. Extensive simulation results reveal that DCSMSS is highly efficient and significantly outperforms previous approaches in processing similarity search queries. PMID:25751081
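The reduction of a similarity query to a one-dimensional interval search can be sketched as follows. This shows only the general iDistance mapping as commonly described (key = partition index × c + distance to the partition's reference point); it is a centralized toy version, not the DCSMSS scheme or its routing, and the points, reference points and constant c are assumptions chosen for illustration.

```python
import bisect
import math

def nearest_ref(point, refs):
    """Index of, and distance to, the closest reference point."""
    dists = [math.dist(point, r) for r in refs]
    i = min(range(len(refs)), key=dists.__getitem__)
    return i, dists[i]

def build_keys(points, refs, c):
    """Map each point to the 1-D key i * c + dist(point, refs[i]).

    c must exceed every point-to-reference distance so that the key
    ranges of different partitions do not overlap.
    """
    keyed = []
    for p in points:
        i, d = nearest_ref(p, refs)
        keyed.append((i * c + d, p))
    keyed.sort()
    return keyed

def range_query(keyed, refs, c, q, r):
    """Return all points within distance r of q using interval scans."""
    keys = [k for k, _ in keyed]
    hits = []
    for i, ref in enumerate(refs):
        dq = math.dist(q, ref)
        # Triangle inequality: a match in partition i has a key within
        # [i*c + dq - r, i*c + dq + r], clipped to the partition's range.
        start = bisect.bisect_left(keys, max(i * c, i * c + dq - r))
        end = min(bisect.bisect_right(keys, i * c + dq + r),
                  bisect.bisect_left(keys, (i + 1) * c))
        for _, p in keyed[start:end]:
            if math.dist(p, q) <= r:  # verify candidates exactly
                hits.append(p)
    return hits

points = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (11.0, 10.0)]
refs = [(0.0, 0.0), (10.0, 10.0)]
keyed = build_keys(points, refs, c=100.0)
found = range_query(keyed, refs, 100.0, q=(0.5, 0.0), r=0.6)
```

Only the key interval of each partition is scanned, and the final distance check removes false positives introduced by the one-dimensional projection.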
2011-01-01
Background: Data fusion methods are widely used in virtual screening, and make the implicit assumption that the more often a molecule is retrieved in multiple similarity searches, the more likely it is to be active. This paper tests the correctness of this assumption. Results: Sets of 25 searches using either the same reference structure and 25 different similarity measures (similarity fusion) or 25 different reference structures and the same similarity measure (group fusion) show that large numbers of unique molecules are retrieved by just a single search, but that the numbers of unique molecules decrease very rapidly as more searches are considered. This rapid decrease is accompanied by a rapid increase in the fraction of those retrieved molecules that are active. There is an approximately log-log relationship between the numbers of different molecules retrieved and the number of searches carried out, and a rationale for this power-law behaviour is provided. Conclusions: Using multiple searches provides a simple way of increasing the precision of a similarity search, and thus provides a justification for the use of data fusion methods in virtual screening. PMID:21824430
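The quantity studied here, how many distinct molecules are retrieved by exactly f of the searches, is easy to compute from ranked lists. The following is a minimal sketch on made-up toy rankings; the molecule identifiers, the cutoff `top_n` and the function names are hypothetical and do not reproduce the paper's experimental setup.

```python
from collections import Counter

def fuse_by_count(ranked_lists, top_n):
    """Count how many searches retrieve each molecule within their top_n."""
    counts = Counter()
    for ranking in ranked_lists:
        counts.update(set(ranking[:top_n]))
    return counts

def molecules_by_frequency(counts):
    """Map f -> number of distinct molecules retrieved by exactly f searches."""
    return dict(sorted(Counter(counts.values()).items()))

# Three toy searches (e.g. three reference structures, one similarity measure).
searches = [
    ["m1", "m2", "m3"],
    ["m1", "m4", "m2"],
    ["m1", "m5", "m6"],
]
counts = fuse_by_count(searches, top_n=2)
histogram = molecules_by_frequency(counts)
```

Ranking molecules by `counts` is the simplest form of fusion: under the assumption tested in the abstract, molecules retrieved by many searches are the ones most likely to be active.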
Using SQL Databases for Sequence Similarity Searching and Analysis.
Pearson, William R; Mackey, Aaron J
2017-09-13
Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 by John Wiley & Sons, Inc.
Predicting the performance of fingerprint similarity searching.
Vogt, Martin; Bajorath, Jürgen
2011-01-01
Fingerprints are bit string representations of molecular structure that typically encode structural fragments, topological features, or pharmacophore patterns. Various fingerprint designs are utilized in virtual screening and their search performance essentially depends on three parameters: the nature of the fingerprint, the active compounds serving as reference molecules, and the composition of the screening database. It is of considerable interest and practical relevance to predict the performance of fingerprint similarity searching. A quantitative assessment of the potential that a fingerprint search might successfully retrieve active compounds, if available in the screening database, would substantially help to select the type of fingerprint most suitable for a given search problem. The method presented herein utilizes concepts from information theory to relate the fingerprint feature distributions of reference compounds to screening libraries. If these feature distributions do not sufficiently differ, active database compounds that are similar to reference molecules cannot be retrieved because they disappear in the "background." By quantifying the difference in feature distribution using the Kullback-Leibler divergence and relating the divergence to compound recovery rates obtained for different benchmark classes, fingerprint search performance can be quantitatively predicted.
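A rough sketch of the divergence calculation is given below. It assumes that fingerprint bits are independent, so the divergence is a sum of per-bit Bernoulli terms, and it smooths the frequencies with a pseudocount; both choices, along with the toy fingerprints, are simplifying assumptions and not necessarily how the published method models feature distributions.

```python
import math

def bit_frequencies(fingerprints, n_bits, pseudo=0.5):
    """Per-bit 'on' frequency, smoothed so that no value is exactly 0 or 1."""
    total = len(fingerprints) + 2 * pseudo
    return [
        (sum(fp[i] for fp in fingerprints) + pseudo) / total
        for i in range(n_bits)
    ]

def kl_divergence(p, q):
    """Sum of per-bit Bernoulli divergences D(p || q), in nats.

    Treats bits as independent features (an assumption of this sketch).
    """
    return sum(
        pi * math.log(pi / qi) + (1 - pi) * math.log((1 - pi) / (1 - qi))
        for pi, qi in zip(p, q)
    )

# Toy 4-bit fingerprints, for illustration only.
references = [[1, 1, 0, 0], [1, 1, 0, 1]]
database = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 0, 1, 1], [1, 1, 0, 0]]

p_ref = bit_frequencies(references, n_bits=4)
p_db = bit_frequencies(database, n_bits=4)
divergence = kl_divergence(p_ref, p_db)
```

A divergence near zero means the reference compounds look like the database background, which is the situation in which the abstract expects active compounds to be hard to retrieve.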
Ultra-accurate collaborative information filtering via directed user similarity
NASA Astrophysics Data System (ADS)
Guo, Q.; Song, W.-J.; Liu, J.-G.
2014-07-01
A key challenge in collaborative filtering (CF) is how to obtain reliable and accurate results with the help of peers' recommendations. Since the similarities from small-degree users to large-degree users would be larger than the ones in the opposite direction, the large-degree users' selections are recommended extensively by the traditional second-order CF algorithms. By considering the users' similarity direction and the second-order correlations to depress the influence of mainstream preferences, we present the directed second-order CF (HDCF) algorithm specifically to address the challenge of accuracy and diversity of the CF algorithm. The numerical results for two benchmark data sets, MovieLens and Netflix, show that the accuracy of the new algorithm outperforms the state-of-the-art CF algorithms. Compared with the CF algorithm based on random walks proposed by Liu et al. (Int. J. Mod. Phys. C, 20 (2009) 285), the average ranking score could reach 0.0767 and 0.0402, an improvement of 27.3% and 19.1% for MovieLens and Netflix, respectively. In addition, the diversity, precision and recall are also enhanced greatly. Without relying on any context-specific information, tuning the similarity direction of CF algorithms could obtain accurate and diverse recommendations. This work suggests that the user similarity direction is an important factor in improving personalized recommendation performance.
The HMMER Web Server for Protein Sequence Similarity Search.
Prakash, Ananth; Jeffryes, Matt; Bateman, Alex; Finn, Robert D
2017-12-08
Protein sequence similarity search is one of the most commonly used bioinformatics methods for identifying evolutionarily related proteins. In general, sequences that are evolutionarily related share some degree of similarity, and sequence-search algorithms use this principle to identify homologs. The requirement for a fast and sensitive sequence search method led to the development of the HMMER software, which in the latest version (v3.1) uses a combination of sophisticated acceleration heuristics and mathematical and computational optimizations to enable the use of profile hidden Markov models (HMMs) for sequence analysis. The HMMER Web server provides a common platform by linking the HMMER algorithms to databases, thereby enabling the search for homologs, as well as providing sequence and functional annotation by linking external databases. This unit describes three basic protocols and two alternate protocols that explain how to use the HMMER Web server using various input formats and user defined parameters. © 2017 by John Wiley & Sons, Inc.
Accurate estimation of influenza epidemics using Google search data via ARGO.
Yang, Shihao; Santillana, Mauricio; Kou, S C
2015-11-24
Accurate real-time tracking of influenza outbreaks helps public health officials make timely and meaningful decisions that could save lives. We propose an influenza tracking model, ARGO (AutoRegression with GOogle search data), that uses publicly available online search data. In addition to having a rigorous statistical foundation, ARGO outperforms all previously available Google-search-based tracking models, including the latest version of Google Flu Trends, even though it uses only low-quality search data as input from publicly available Google Trends and Google Correlate websites. ARGO not only incorporates the seasonality in influenza epidemics but also captures changes in people's online search behavior over time. ARGO is also flexible, self-correcting, robust, and scalable, making it a potentially powerful tool that can be used for real-time tracking of other social events at multiple temporal and spatial resolutions.
Relational Agreement Measures for Similarity Searching of Cheminformatic Data Sets.
Rivera-Borroto, Oscar Miguel; García-de la Vega, José Manuel; Marrero-Ponce, Yovani; Grau, Ricardo
2016-01-01
Research on similarity searching of cheminformatic data sets has been focused on similarity measures using fingerprints. However, nominal scales are the least informative of all metric scales, increasing the number of tied similarity scores and decreasing the effectiveness of the retrieval engines. Tanimoto's coefficient has been claimed to be the most prominent measure for this task. Nevertheless, this field is far from being exhausted, since the no-free-lunch theorem from computer science predicts that "no similarity measure has overall superiority over the population of data sets". We introduce 12 relational agreement (RA) coefficients for seven metric scales, which are integrated within a group fusion-based similarity searching algorithm. These similarity measures are compared to a reference panel of 21 proximity quantifiers over 17 benchmark data sets (MUV), by using informative descriptors, a feature selection stage, a suitable performance metric, and powerful comparison tests. In this stage, RA coefficients perform favourably with respect to the state-of-the-art proximity measures. Afterward, the RA-based method outperforms another four nearest neighbor searching algorithms over the same data domains. In a third validation stage, RA measures are successfully applied to the virtual screening of the NCI data set. Finally, we discuss a possible molecular interpretation for these similarity variants.
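For context, the Tanimoto coefficient on binary fingerprints, and the tied scores that such nominal-scale data produce, can be shown in a few lines. The fingerprints below are made-up sets of "on" bit positions, and returning 0.0 for two empty fingerprints is a convention chosen for this sketch.

```python
def tanimoto(a, b):
    """Tanimoto coefficient |A & B| / |A | B| for sets of 'on' bit positions.

    Returns 0.0 when both fingerprints are empty (a convention assumed here).
    """
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0

query_fp = {1, 2, 3}
database_fps = {
    "mol_a": {2, 3, 4},
    "mol_b": {1, 2, 5},
    "mol_c": {7, 8},
}
scores = {name: tanimoto(query_fp, fp) for name, fp in database_fps.items()}
```

Here `mol_a` and `mol_b` differ from each other yet receive the same score, so a ranking cannot separate them. This is the tie problem the abstract attributes to nominal-scale descriptors.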
SW#db: GPU-Accelerated Exact Sequence Similarity Database Search.
Korpar, Matija; Šošić, Martin; Blažeka, Dino; Šikić, Mile
2015-01-01
In recent years we have witnessed growth in sequencing yield and in the number of samples sequenced and, as a result, growth of publicly maintained sequence databases. The increase in available data has put high demands on protein similarity search algorithms, which have two ever-opposing goals: keeping running times acceptable while maintaining a high enough level of sensitivity. The most time-consuming step of similarity search is the local alignment of query and database sequences. This step is usually performed using exact local alignment algorithms such as Smith-Waterman. Due to its quadratic time complexity, aligning a query to the whole database is usually too slow. Therefore, the majority of protein similarity search methods apply heuristics before the exact local alignment to reduce the number of candidate sequences in the database. However, there is still a need for the alignment of a query sequence to a reduced database. In this paper we present the SW#db tool and a library for fast exact similarity search. Although its running times, as a standalone tool, are comparable to the running times of BLAST, it is primarily intended to be used for the exact local alignment phase in which the database of sequences has already been reduced. It uses both GPU and CPU parallelization and was 4-5 times faster than SSEARCH, 6-25 times faster than CUDASW++ and more than 20 times faster than SSW at the time of writing, using multiple queries on Swiss-prot and Uniref90 databases.
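The quadratic cost mentioned above is visible in a plain implementation of the Smith-Waterman recurrence. This is a minimal CPU sketch that returns only the best local score; it uses a simple match/mismatch scheme and a linear gap penalty (illustrative values), whereas protein search tools typically use substitution matrices and affine gaps, and it is unrelated to the SW#db GPU implementation.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (linear gap penalty).

    Fills a len(a) x len(b) table row by row, so time is O(len(a) * len(b))
    and memory is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below 0: a local alignment may start anywhere.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

identical = smith_waterman_score("ACGT", "ACGT")
unrelated = smith_waterman_score("AAAA", "TTTT")
embedded = smith_waterman_score("TTACGTTT", "GGACGGG")
```

Every query/database pair needs one such table, which is why heuristics are used to shrink the candidate set first and why the remaining exact alignments are a natural target for GPU and SIMD parallelization.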
Effects of Part-based Similarity on Visual Search: The Frankenbear Experiment
Alexander, Robert G.; Zelinsky, Gregory J.
2012-01-01
Do the target-distractor and distractor-distractor similarity relationships known to exist for simple stimuli extend to real-world objects, and are these effects expressed in search guidance or target verification? Parts of photorealistic distractors were replaced with target parts to create four levels of target-distractor similarity under heterogeneous and homogeneous conditions. We found that increasing target-distractor similarity and decreasing distractor-distractor similarity impaired search guidance and target verification, but that target-distractor similarity and heterogeneity/homogeneity interacted only in measures of guidance; distractor homogeneity lessens effects of target-distractor similarity by causing gaze to fixate the target sooner, not by speeding target detection following its fixation. PMID:22227607
A new method to improve network topological similarity search: applied to fold recognition
Lhota, John; Hauptman, Ruth; Hart, Thomas; Ng, Clara; Xie, Lei
2015-01-01
Motivation: Similarity search is the foundation of bioinformatics. It plays a key role in establishing structural, functional and evolutionary relationships between biological sequences. Although the power of the similarity search has increased steadily in recent years, a high percentage of sequences remain uncharacterized in the protein universe. Thus, new similarity search strategies are needed to efficiently and reliably infer the structure and function of new sequences. The existing paradigm for studying protein sequence, structure, function and evolution has been established based on the assumption that the protein universe is discrete and hierarchical. Cumulative evidence suggests that the protein universe is continuous. As a result, conventional sequence homology search methods may not be able to detect novel structural, functional and evolutionary relationships between proteins from weak and noisy sequence signals. To overcome the limitations in existing similarity search methods, we propose a new algorithmic framework, Enrichment of Network Topological Similarity (ENTS), to improve the performance of large scale similarity searches in bioinformatics. Results: We apply ENTS to a challenging unsolved problem: protein fold recognition. Our rigorous benchmark studies demonstrate that ENTS considerably outperforms state-of-the-art methods. As the concept of ENTS can be applied to any similarity metric, it may provide a general framework for similarity search on any set of biological entities, given their representation as a network. Availability and implementation: Source code freely available upon request. Contact: lxie@iscb.org PMID:25717198
Accurate estimation of influenza epidemics using Google search data via ARGO
Yang, Shihao; Santillana, Mauricio; Kou, S. C.
2015-01-01
Accurate real-time tracking of influenza outbreaks helps public health officials make timely and meaningful decisions that could save lives. We propose an influenza tracking model, ARGO (AutoRegression with GOogle search data), that uses publicly available online search data. In addition to having a rigorous statistical foundation, ARGO outperforms all previously available Google-search–based tracking models, including the latest version of Google Flu Trends, even though it uses only low-quality search data as input from publicly available Google Trends and Google Correlate websites. ARGO not only incorporates the seasonality in influenza epidemics but also captures changes in people’s online search behavior over time. ARGO is also flexible, self-correcting, robust, and scalable, making it a potentially powerful tool that can be used for real-time tracking of other social events at multiple temporal and spatial resolutions. PMID:26553980
RAPSearch: a fast protein similarity search tool for short reads
2011-01-01
Background: Next Generation Sequencing (NGS) is producing enormous corpuses of short DNA reads, affecting emerging fields like metagenomics. Protein similarity search, a key step in annotating protein-coding genes in these short reads and identifying their biological functions, faces daunting challenges because of the sheer size of the short read datasets. Results: We developed a fast protein similarity search tool, RAPSearch, that utilizes a reduced amino acid alphabet and a suffix array to detect seeds of flexible length. For the short reads (translated in 6 frames) we tested, RAPSearch achieved ~20-90 times speedup as compared to BLASTX. RAPSearch missed only a small fraction (~1.3-3.2%) of BLASTX similarity hits, but it also discovered additional homologous proteins (~0.3-2.1%) that BLASTX missed. By contrast, BLAT, a tool that is even slightly faster than RAPSearch, had significant loss of sensitivity as compared to RAPSearch and BLAST. Conclusions: RAPSearch is implemented as open-source software and is accessible at http://omics.informatics.indiana.edu/mg/RAPSearch. It enables faster protein similarity search. The application of RAPSearch in metagenomics has also been demonstrated. PMID:21575167
Robust hashing with local models for approximate similarity search.
Song, Jingkuan; Yang, Yi; Li, Xuelong; Huang, Zi; Yang, Yang
2014-07-01
Similarity search plays an important role in many applications involving high-dimensional data. Due to the known dimensionality curse, the performance of most existing indexing structures degrades quickly as the feature dimensionality increases. Hashing methods, such as locality sensitive hashing (LSH) and its variants, have been widely used to achieve fast approximate similarity search by trading search quality for efficiency. However, most existing hashing methods make use of randomized algorithms to generate hash codes without considering the specific structural information in the data. In this paper, we propose a novel hashing method, namely, robust hashing with local models (RHLM), which learns a set of robust hash functions to map the high-dimensional data points into binary hash codes by effectively utilizing local structural information. In RHLM, for each individual data point in the training dataset, a local hashing model is learned and used to predict the hash codes of its neighboring data points. The local models from all the data points are globally aligned so that an optimal hash code can be assigned to each data point. After obtaining the hash codes of all the training data points, we design a robust method by employing ℓ2,1-norm minimization on the loss function to learn effective hash functions, which are then used to map each database point into its hash code. Given a query data point, the search process first maps it into the query hash code by the hash functions and then explores the buckets whose hash codes are similar to the query hash code. Extensive experimental results conducted on real-life datasets show that the proposed RHLM outperforms the state-of-the-art methods in terms of search quality and efficiency.
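The query procedure described at the end of the abstract (hash the query, then explore buckets with similar codes) can be sketched with the randomized baseline the authors contrast themselves with. The code below is plain random-hyperplane LSH, not RHLM: the hash functions are drawn at random rather than learned, and the vectors, code length and Hamming radius are illustrative assumptions.

```python
import random
from collections import defaultdict

def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplane normals (data-independent hashing)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def hash_code(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(
        int(sum(w * x for w, x in zip(plane, vector)) >= 0)
        for plane in hyperplanes
    )

def build_buckets(points, hyperplanes):
    """Hash table: binary code -> indices of the points with that code."""
    buckets = defaultdict(list)
    for idx, p in enumerate(points):
        buckets[hash_code(p, hyperplanes)].append(idx)
    return buckets

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def candidates(query, buckets, hyperplanes, radius=1):
    """Indices in all buckets whose code is within `radius` bits of the query's."""
    qcode = hash_code(query, hyperplanes)
    return sorted(
        i
        for code, ids in buckets.items()
        if hamming(code, qcode) <= radius
        for i in ids
    )

points = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]]
planes = make_hyperplanes(n_bits=8, dim=2)
buckets = build_buckets(points, planes)
near = candidates(points[0], buckets, planes, radius=1)
```

A learned method such as the one in the abstract would replace `make_hyperplanes` with hash functions fitted to the training data, while the bucket lookup stays the same.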
Web Image Search Re-ranking with Click-based Similarity and Typicality.
Yang, Xiaopeng; Mei, Tao; Zhang, Yong Dong; Liu, Jie; Satoh, Shin'ichi
2016-07-20
In image search re-ranking, besides the well-known semantic gap, the intent gap, which is the gap between the representation of users' query/demand and the real intent of the users, is becoming a major problem restricting the development of image retrieval. To reduce human effects, in this paper, we use image click-through data, which can be viewed as the "implicit feedback" from users, to help overcome the intent gap and further improve image search performance. Generally, the hypothesis that visually similar images should be close in a ranking list and the strategy that images with higher relevance should be ranked higher than others are widely accepted. Thus, to obtain satisfying search results, image similarity and the level of relevance typicality are the respective determining factors. However, when measuring image similarity and typicality, conventional re-ranking approaches only consider visual information and initial ranks of images, while overlooking the influence of click-through data. This paper presents a novel re-ranking approach, named spectral clustering re-ranking with click-based similarity and typicality (SCCST). First, to learn an appropriate similarity measurement, we propose a click-based multi-feature similarity learning algorithm (CMSL), which conducts metric learning based on click-based triplet selection and integrates multiple features into a unified similarity space via multiple kernel learning. Then, based on the learnt click-based image similarity measure, we conduct spectral clustering to group visually and semantically similar images into the same clusters, and get the final re-rank list by calculating click-based cluster typicality and within-cluster click-based image typicality in descending order. Our experiments conducted on two real-world query-image datasets with diverse representative queries show that our proposed re-ranking approach can significantly improve initial search results and outperform several existing re-ranking approaches.
Combined semantic and similarity search in medical image databases
NASA Astrophysics Data System (ADS)
Seifert, Sascha; Thoma, Marisa; Stegmaier, Florian; Hammon, Matthias; Kramer, Martin; Huber, Martin; Kriegel, Hans-Peter; Cavallaro, Alexander; Comaniciu, Dorin
2011-03-01
The current diagnostic process at hospitals is mainly based on reviewing and comparing images coming from multiple time points and modalities in order to monitor disease progression over a period of time. However, for ambiguous cases the radiologist deeply relies on reference literature or second opinion. Although there is a vast amount of acquired images stored in PACS systems which could be reused for decision support, these data sets suffer from weak search capabilities. Thus, we present a search methodology which enables the physician to fulfill intelligent search scenarios on medical image databases combining ontology-based semantic and appearance-based similarity search. It enabled the elimination of 12% of the top ten hits which would arise without taking the semantic context into account.
Wang, Yi; Wan, Jianwu; Guo, Jun; Cheung, Yiu-Ming; Yuen, Pong C
2018-07-01
Similarity search is essential to many important applications and often involves searching at scale on high-dimensional data based on their similarity to a query. In biometric applications, recent vulnerability studies have shown that adversarial machine learning can compromise biometric recognition systems by exploiting the biometric similarity information. Existing methods for biometric privacy protection are in general based on pairwise matching of secured biometric templates and have inherent limitations in search efficiency and scalability. In this paper, we propose an inference-based framework for privacy-preserving similarity search in Hamming space. Our approach builds on an obfuscated distance measure that can conceal Hamming distance in a dynamic interval. Such a mechanism enables us to systematically design statistically reliable methods for retrieving most likely candidates without knowing the exact distance values. We further propose to apply Montgomery multiplication for generating search indexes that can withstand adversarial similarity analysis, and show that information leakage in randomized Montgomery domains can be made negligibly small. Our experiments on public biometric datasets demonstrate that the inference-based approach can achieve a search accuracy close to the best performance possible with secure computation methods, but the associated cost is reduced by orders of magnitude compared to cryptographic primitives.
δ-Similar Elimination to Enhance Search Performance of Multiobjective Evolutionary Algorithms
NASA Astrophysics Data System (ADS)
Aguirre, Hernán; Sato, Masahiko; Tanaka, Kiyoshi
In this paper, we propose δ-similar elimination to improve the search performance of multiobjective evolutionary algorithms in combinatorial optimization problems. This method eliminates similar individuals in objective space to fairly distribute selection among the different regions of the instantaneous Pareto front. We investigate four eliminating methods analyzing their effects using NSGA-II. In addition, we compare the search performance of NSGA-II enhanced by our method and NSGA-II enhanced by controlled elitism.
Similarity relations in visual search predict rapid visual categorization
Mohan, Krithika; Arun, S. P.
2012-01-01
How do we perform rapid visual categorization? It is widely thought that categorization involves evaluating the similarity of an object to other category items, but the underlying features and similarity relations remain unknown. Here, we hypothesized that categorization performance is based on perceived similarity relations between items within and outside the category. To this end, we measured the categorization performance of human subjects on three diverse visual categories (animals, vehicles, and tools) and across three hierarchical levels (superordinate, basic, and subordinate levels among animals). For the same subjects, we measured their perceived pair-wise similarities between objects using a visual search task. Regardless of category and hierarchical level, we found that the time taken to categorize an object could be predicted using its similarity to members within and outside its category. We were able to account for several classic categorization phenomena, such as (a) the longer times required to reject category membership; (b) the longer times to categorize atypical objects; and (c) differences in performance across tasks and across hierarchical levels. These categorization times were also accounted for by a model that extracts coarse structure from an image. The striking agreement observed between categorization and visual search suggests that these two disparate tasks depend on a shared coarse object representation. PMID:23092947
Using multidimensional scaling to quantify similarity in visual search and beyond
Godwin, Hayward J.; Fitzsimmons, Gemma; Robbins, Arryn; Menneer, Tamaryn; Goldinger, Stephen D.
2017-01-01
Visual search is one of the most widely studied topics in vision science, both as an independent topic of interest, and as a tool for studying attention and visual cognition. A wide literature exists that seeks to understand how people find things under varying conditions of difficulty and complexity, and in situations ranging from the mundane (e.g., looking for one’s keys) to those with significant societal importance (e.g., baggage or medical screening). A primary determinant of the ease and probability of success during search are the similarity relationships that exist in the search environment, such as the similarity between the background and the target, or the likeness of the non-targets to one another. A sense of similarity is often intuitive, but it is seldom quantified directly. This presents a problem in that similarity relationships are imprecisely specified, limiting the capacity of the researcher to examine adequately their influence. In this article, we present a novel approach to overcoming this problem that combines multidimensional scaling (MDS) analyses with behavioral and eye-tracking measurements. We propose a method whereby MDS can be repurposed to successfully quantify the similarity of experimental stimuli, thereby opening up theoretical questions in visual search and attention that cannot currently be addressed. These quantifications, in conjunction with behavioral and oculomotor measures, allow for critical observations about how similarity affects performance, information selection, and information processing. We provide a demonstration and tutorial of the approach, identify documented examples of its use, discuss how complementary computer vision methods could also be adopted, and close with a discussion of potential avenues for future application of this technique. PMID:26494381
Wassermann, Anne Mai; Lounkine, Eugen; Glick, Meir
2013-03-25
Virtual screening using bioactivity profiles has become an integral part of currently applied hit finding methods in pharmaceutical industry. However, a significant drawback of this approach is that it is only applicable to compounds that have been biologically tested in the past and have sufficient activity annotations for meaningful profile comparisons. Although bioactivity data generated in pharmaceutical institutions are growing on an unprecedented scale, the number of biologically annotated compounds still covers only a minuscule fraction of chemical space. For a newly synthesized compound or an isolated natural product to be biologically characterized across multiple assays, it may take a considerable amount of time. Consequently, this chemical matter will not be included in virtual screening campaigns based on bioactivity profiles. To overcome this problem, we herein introduce bioturbo similarity searching that uses chemical similarity to map molecules without biological annotations into bioactivity space and then searches for biologically similar compounds in this reference system. In benchmark calculations on primary screening data, we demonstrate that our approach generally achieves higher hit rates and identifies structurally more diverse compounds than approaches using chemical information only. Furthermore, our method is able to discover hits with novel modes of inhibition that traditional 2D and 3D similarity approaches are unlikely to discover. Test calculations on a set of natural products reveal the practical utility of the approach for identifying novel and synthetically more accessible chemical matter.
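The two-stage idea in the abstract above (use chemical similarity to place an unannotated molecule in bioactivity space, then search there) can be sketched in a few lines. Everything specific here is an assumption for illustration: the mapping is taken to be the average profile of the k chemically nearest annotated compounds, fingerprints are sets of on-bits compared by the Tanimoto coefficient, profiles are compared by cosine similarity, and the data are toy values. The abstract does not state which of these choices the authors made.

```python
import math


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def cosine(u, v):
    """Cosine similarity of two equally long activity profiles."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def inferred_profile(query_fp, references, k=2):
    """Average the profiles of the k chemically nearest annotated compounds (assumed mapping)."""
    nearest = sorted(
        references, key=lambda r: tanimoto(query_fp, r["fp"]), reverse=True
    )[:k]
    n_assays = len(nearest[0]["profile"])
    return [sum(r["profile"][i] for r in nearest) / len(nearest) for i in range(n_assays)]


def rank_by_profile(profile, references):
    """Rank annotated compounds by similarity in bioactivity space."""
    scored = [(cosine(profile, r["profile"]), r["name"]) for r in references]
    return [name for _, name in sorted(scored, reverse=True)]


# Toy data: fingerprints are sets of on-bits, profiles are activities in three assays.
references = [
    {"name": "A", "fp": {1, 2, 3}, "profile": [1.0, 0.0, 0.0]},
    {"name": "B", "fp": {1, 2, 4}, "profile": [0.8, 0.2, 0.0]},
    {"name": "C", "fp": {7, 8, 9}, "profile": [0.0, 0.0, 1.0]},
    {"name": "D", "fp": {5, 6}, "profile": [0.9, 0.1, 0.0]},
]
profile = inferred_profile({1, 2, 5}, references)
ranking = rank_by_profile(profile, references)
print(ranking)
```

In the toy data, compound D shares little structure with the query (Tanimoto 0.25) but ranks first in bioactivity space, which mirrors the abstract's claim that the approach can surface structurally more diverse hits than chemical similarity alone.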
Finding similar nucleotide sequences using network BLAST searches.
Ladunga, Istvan
2009-06-01
The Basic Local Alignment Search Tool (BLAST) is a keystone of bioinformatics due to its performance and user-friendliness. Beginner and intermediate users will learn how to design and submit blastn and Megablast searches on the Web pages at the National Center for Biotechnology Information. We map nucleic acid sequences to genomes, find identical or similar mRNA, expressed sequence tag, and noncoding RNA sequences, and run Megablast searches, which are much faster than blastn. Understanding results is assisted by taxonomy reports, genomic views, and multiple alignments. We interpret expected frequency thresholds, biological significance, and statistical significance. Weak hits provide no evidence, but hints for further analyses. We find genes that may code for homologous proteins by translated BLAST. We reduce false positives by filtering out low-complexity regions. Parsed BLAST results can be integrated into analysis pipelines. Links in the output connect to Entrez, PUBMED, structural, sequence, interaction, and expression databases. This facilitates integration with a wide spectrum of biological knowledge.
Mackey, Aaron J; Pearson, William R
2004-10-01
Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.
Suzuki, Hirofumi; Kawabata, Takeshi; Nakamura, Haruki
2016-02-15
Omokage search is a service to search the global shape similarity of biological macromolecules and their assemblies, in both the Protein Data Bank (PDB) and Electron Microscopy Data Bank (EMDB). The server compares global shapes of assemblies independent of sequence order and number of subunits. As a search query, the user inputs a structure ID (PDB ID or EMDB ID) or uploads an atomic model or 3D density map to the server. The search is performed usually within 1 min, using one-dimensional profiles (incremental distance rank profiles) to characterize the shapes. Using the gmfit (Gaussian mixture model fitting) program, the found structures are fitted onto the query structure and their superimposed structures are displayed on the Web browser. Our service provides new structural perspectives to life science researchers. Omokage search is freely accessible at http://pdbj.org/omokage/. © The Author 2015. Published by Oxford University Press.
Su, Xiaoquan; Xu, Jian; Ning, Kang
2012-10-01
Scientists have long sought to compare different microbial communities (also referred to as 'metagenomic samples' here) effectively on a large scale: given a set of unknown samples, find similar metagenomic samples from a large repository and examine how similar these samples are. With the current metagenomic samples accumulated, it is possible to build a database of metagenomic samples of interest. Any metagenomic sample could then be searched against this database to find the most similar metagenomic sample(s). However, on one hand, current databases with a large number of metagenomic samples mostly serve as data repositories that offer few functionalities for analysis; and on the other hand, methods to measure the similarity of metagenomic data work well only for small sets of samples by pairwise comparison. It is not yet clear how to efficiently search for metagenomic samples against a large metagenomic database. In this study, we have proposed a novel method, Meta-Storms, that could systematically and efficiently organize and search metagenomic data. It includes the following components: (i) creating a database of metagenomic samples based on their taxonomical annotations, (ii) efficient indexing of samples in the database based on a hierarchical taxonomy indexing strategy, (iii) searching for a metagenomic sample against the database by a fast scoring function based on quantitative phylogeny and (iv) managing database by index export, index import, data insertion, data deletion and database merging. We have collected more than 1300 metagenomic data from the public domain and in-house facilities, and tested the Meta-Storms method on these datasets. Our experimental results show that Meta-Storms is capable of database creation and effective searching for a large number of metagenomic samples, and it could achieve similar accuracies compared with the current popular significance testing-based methods. Meta-Storms method would serve as a suitable
Perceptual Grouping in Haptic Search: The Influence of Proximity, Similarity, and Good Continuation
ERIC Educational Resources Information Center
Overvliet, Krista E.; Krampe, Ralf Th.; Wagemans, Johan
2012-01-01
We conducted a haptic search experiment to investigate the influence of the Gestalt principles of proximity, similarity, and good continuation. We expected faster search when the distractors could be grouped. We chose edges at different orientations as stimuli because they are processed similarly in the haptic and visual modality. We therefore…
Ertl, Peter; Patiny, Luc; Sander, Thomas; Rufener, Christian; Zasso, Michaël
2015-01-01
Wikipedia, the world's largest and most popular encyclopedia, is an indispensable source of chemistry information. Among other things, it contains entries for over 15,000 chemicals including metabolites, drugs, agrochemicals and industrial chemicals. To provide easy access to this wealth of information we decided to develop a substructure and similarity search tool for chemical structures referenced in Wikipedia. We extracted chemical structures from entries in Wikipedia and implemented a web system allowing structure and similarity searching on these data. The whole search as well as visualization system is written in JavaScript and therefore can run locally within a web page and does not require a central server. The Wikipedia Chemical Structure Explorer is accessible on-line at www.cheminfo.org/wikipedia and is also available as an open source project from GitHub for local installation. The web-based Wikipedia Chemical Structure Explorer provides a useful resource for research as well as for chemical education, enabling both researchers and students easy and user-friendly chemistry searching and identification of relevant information in Wikipedia. The tool can also help to improve the quality of chemical entries in Wikipedia by providing potential contributors with a regularly updated list of entries with problematic structures. And last but not least, this search system is a nice example of how modern web technology can be applied in the field of cheminformatics. Graphical abstract: Wikipedia Chemical Structure Explorer allows substructure and similarity searches on molecules referenced in Wikipedia.
Semantically enabled image similarity search
NASA Astrophysics Data System (ADS)
Casterline, May V.; Emerick, Timothy; Sadeghi, Kolia; Gosse, C. A.; Bartlett, Brent; Casey, Jason
2015-05-01
Georeferenced data of various modalities are increasingly available for intelligence and commercial use; however, effectively exploiting these sources demands a unified data space capable of capturing the unique contribution of each input. This work presents a suite of software tools for representing geospatial vector data and overhead imagery in a shared high-dimension vector or "embedding" space that supports fused learning and similarity search across dissimilar modalities. While the approach is suitable for fusing arbitrary input types, including free text, the present work exploits the obvious but computationally difficult relationship between GIS and overhead imagery. GIS is comprised of temporally-smoothed but information-limited content of a GIS, while overhead imagery provides an information-rich but temporally-limited perspective. This processing framework includes some important extensions of concepts in literature but, more critically, presents a means to accomplish them as a unified framework at scale on commodity cloud architectures.
Cloud4Psi: cloud computing for 3D protein structure similarity searching.
Mrozek, Dariusz; Małysiak-Mrozek, Bożena; Kłapciński, Artur
2014-10-01
Popular methods for 3D protein structure similarity searching, especially those that generate high-quality alignments such as Combinatorial Extension (CE) and Flexible structure Alignment by Chaining Aligned fragment pairs allowing Twists (FATCAT) are still time consuming. As a consequence, performing similarity searching against large repositories of structural data requires increased computational resources that are not always available. Cloud computing provides huge amounts of computational power that can be provisioned on a pay-as-you-go basis. We have developed the cloud-based system that allows scaling of the similarity searching process vertically and horizontally. Cloud4Psi (Cloud for Protein Similarity) was tested in the Microsoft Azure cloud environment and provided good, almost linearly proportional acceleration when scaled out onto many computational units. Cloud4Psi is available as Software as a Service for testing purposes at: http://cloud4psi.cloudapp.net/. For source code and software availability, please visit the Cloud4Psi project home page at http://zti.polsl.pl/dmrozek/science/cloud4psi.htm. © The Author 2014. Published by Oxford University Press.
Cloud4Psi: cloud computing for 3D protein structure similarity searching
Mrozek, Dariusz; Małysiak-Mrozek, Bożena; Kłapciński, Artur
2014-01-01
Summary: Popular methods for 3D protein structure similarity searching, especially those that generate high-quality alignments such as Combinatorial Extension (CE) and Flexible structure Alignment by Chaining Aligned fragment pairs allowing Twists (FATCAT) are still time consuming. As a consequence, performing similarity searching against large repositories of structural data requires increased computational resources that are not always available. Cloud computing provides huge amounts of computational power that can be provisioned on a pay-as-you-go basis. We have developed the cloud-based system that allows scaling of the similarity searching process vertically and horizontally. Cloud4Psi (Cloud for Protein Similarity) was tested in the Microsoft Azure cloud environment and provided good, almost linearly proportional acceleration when scaled out onto many computational units. Availability and implementation: Cloud4Psi is available as Software as a Service for testing purposes at: http://cloud4psi.cloudapp.net/. For source code and software availability, please visit the Cloud4Psi project home page at http://zti.polsl.pl/dmrozek/science/cloud4psi.htm. Contact: dariusz.mrozek@polsl.pl PMID:24930141
OS2: Oblivious similarity based searching for encrypted data outsourced to an untrusted domain
Pervez, Zeeshan; Ahmad, Mahmood; Khattak, Asad Masood; Ramzan, Naeem
2017-01-01
Public cloud storage services are becoming prevalent, and myriad data sharing, archiving and collaborative services have emerged which harness the pay-as-you-go business model of public cloud. To ensure privacy and confidentiality, encrypted data is often outsourced to such services, which further complicates the process of accessing relevant data by using search queries. Search over encrypted data schemes solve this problem by exploiting cryptographic primitives and secure indexing to identify outsourced data that satisfy the search criteria. Almost all of these schemes rely on exact matching between the encrypted data and search criteria. A few schemes that extend the notion of exact matching to similarity-based search lack realism, either because they rely on trusted third parties or because of increased storage and computational complexity. In this paper we propose Oblivious Similarity based Search (OS2) for encrypted data. It enables authorized users to model their own encrypted search queries which are resilient to typographical errors. Unlike conventional methodologies, OS2 ranks the search results by using a similarity measure, offering a better search experience than exact matching. It utilizes an encrypted bloom filter and probabilistic homomorphic encryption to enable authorized users to access relevant data without revealing results of the search query evaluation process to the untrusted cloud service provider. Encrypted bloom filter based search enables OS2 to reduce the search space to potentially relevant encrypted data, avoiding unnecessary computation on public cloud. The efficacy of OS2 is evaluated on Google App Engine for various bloom filter lengths on different cloud configurations. PMID:28692697
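To make the Bloom-filter part of the abstract above concrete, here is a minimal plaintext sketch of typo-tolerant, similarity-ranked matching. It deliberately omits everything cryptographic: there is no encrypted filter and no homomorphic encryption. Splitting keywords into character bigrams, the filter size, the number of hash functions, and all names are assumptions for illustration, not details of OS2.

```python
import hashlib


class BloomFilter:
    """Plain (unencrypted) Bloom filter; sizes are illustrative assumptions."""

    def __init__(self, n_bits=256, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = [False] * n_bits

    def _positions(self, item):
        # Derive n_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        # False means definitely absent; True may be a false positive.
        return all(self.bits[pos] for pos in self._positions(item))


def bigrams(word):
    """Character bigrams of a keyword, padded with start and end markers."""
    padded = f"^{word.lower()}$"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def similarity(query, bloom):
    """Fraction of the query's bigrams that the filter reports as present."""
    grams = bigrams(query)
    return sum(bloom.might_contain(g) for g in grams) / len(grams)


doc_filter = BloomFilter()
for gram in bigrams("encryption"):
    doc_filter.add(gram)

exact = similarity("encryption", doc_filter)
typo = similarity("encrpytion", doc_filter)
print(exact, typo)
```

The misspelled query still shares 8 of its 11 bigrams with the stored keyword, so it receives a high but not perfect score, and results can be ranked by that score instead of being rejected outright as exact matching would do.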
SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters.
Wang, Chunlin; Lefkowitz, Elliot J
2004-10-28
Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. We describe the implementation of SS-Wrapper (Similarity Search Wrapper), a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search) approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST) that provides a complementary solution for BLAST searches when the database is too large to fit into the memory of a single node. Used together
SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters
Wang, Chunlin; Lefkowitz, Elliot J
2004-01-01
Background Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. Results We describe the implementation of SS-Wrapper (Similarity Search Wrapper), a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search) approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST) that provides a complementary solution for BLAST searches when the database is too large to fit into the memory of a single
Querying Event Sequences by Exact Match or Similarity Search: Design and Empirical Evaluation
Wongsuphasawat, Krist; Plaisant, Catherine; Taieb-Maimon, Meirav; Shneiderman, Ben
2012-01-01
Specifying event sequence queries is challenging even for skilled computer professionals familiar with SQL. Most graphical user interfaces for database search use an exact match approach, which is often effective, but near misses may also be of interest. We describe a new similarity search interface, in which users specify a query by simply placing events on a blank timeline and retrieve a similarity-ranked list of results. Behind this user interface is a new similarity measure for event sequences which the users can customize by four decision criteria, enabling them to adjust the impact of missing, extra, or swapped events or the impact of time shifts. We describe a use case with Electronic Health Records based on our ongoing collaboration with hospital physicians. A controlled experiment with 18 participants compared exact match and similarity search interfaces. We report on the advantages and disadvantages of each interface and suggest a hybrid interface combining the best of both. PMID:22379286
Boehm, Markus; Wu, Tong-Ying; Claussen, Holger; Lemmen, Christian
2008-04-24
Large collections of combinatorial libraries are an integral element in today's pharmaceutical industry. It is of great interest to perform similarity searches against all virtual compounds that are synthetically accessible by any such library. Here we describe the successful application of a new software tool CoLibri on 358 combinatorial libraries based on validated reaction protocols to create a single chemistry space containing over 10^12 possible products. Similarity searching with FTrees-FS allows the systematic exploration of this space without the need to enumerate all product structures. The search result is a set of virtual hits which are synthetically accessible by one or more of the existing reaction protocols. Grouping these virtual hits by their synthetic protocols allows the rapid design and synthesis of multiple follow-up libraries. Such library ideas support hit-to-lead design efforts for tasks like follow-up from high-throughput screening hits or scaffold hopping from one hit to another attractive series.
Earthquake detection through computationally efficient similarity search
Yoon, Clara E.; O’Reilly, Ossian; Bergen, Karianne J.; Beroza, Gregory C.
2015-01-01
Seismology is experiencing rapid growth in the quantity of data, which has outpaced the development of processing algorithms. Earthquake detection—identification of seismic events in continuous data—is a fundamental operation for observational seismology. We developed an efficient method to detect earthquakes using waveform similarity that overcomes the disadvantages of existing detection methods. Our method, called Fingerprint And Similarity Thresholding (FAST), can analyze a week of continuous seismic waveform data in less than 2 hours, or 140 times faster than autocorrelation. FAST adapts a data mining algorithm, originally designed to identify similar audio clips within large databases; it first creates compact “fingerprints” of waveforms by extracting key discriminative features, then groups similar fingerprints together within a database to facilitate fast, scalable search for similar fingerprint pairs, and finally generates a list of earthquake detections. FAST detected most (21 of 24) cataloged earthquakes and 68 uncataloged earthquakes in 1 week of continuous data from a station located near the Calaveras Fault in central California, achieving detection performance comparable to that of autocorrelation, with some additional false detections. FAST is expected to realize its full potential when applied to extremely long duration data sets over a distributed network of seismic stations. The widespread application of FAST has the potential to aid in the discovery of unexpected seismic signals, improve seismic monitoring, and promote a greater understanding of a variety of earthquake processes. PMID:26665176
A search for structurally similar cellular internal ribosome entry sites
Baird, Stephen D.; Lewis, Stephen M.; Turcotte, Marcel; Holcik, Martin
2007-01-01
Internal ribosome entry sites (IRES) allow ribosomes to be recruited to mRNA in a cap-independent manner. Some viruses that impair cap-dependent translation initiation utilize IRES to ensure that the viral RNA will efficiently compete for the translation machinery. IRES are also employed for the translation of a subset of cellular messages during conditions that inhibit cap-dependent translation initiation. IRES from viruses like Hepatitis C and Classical Swine Fever virus share a similar structure/function without sharing primary sequence similarity. Of the cellular IRES structures derived so far, none were shown to share an overall structural similarity. Therefore, we undertook a genome-wide search of human 5′UTRs (untranslated regions) with an empirically derived structure of the IRES from the key inhibitor of apoptosis, X-linked inhibitor of apoptosis protein (XIAP), to identify novel IRES that share structure/function similarity. Three of the top matches identified by this search that exhibit IRES activity are the 5′UTRs of Aquaporin 4, ELG1 and NF-kappaB repressing factor (NRF). The structures of AQP4 and ELG1 IRES have limited similarity to the XIAP IRES; however, they share trans-acting factors that bind the XIAP IRES. We therefore propose that cellular IRES are not defined by overall structure, as viral IRES, but are instead dependent upon short motifs and trans-acting factors for their function. PMID:17591613
Optimal neighborhood indexing for protein similarity search.
Peterlongo, Pierre; Noé, Laurent; Lavenier, Dominique; Nguyen, Van Hoa; Kucherov, Gregory; Giraud, Mathieu
2008-12-16
Similarity inference, one of the main bioinformatics tasks, has to face an exponential growth of the biological data. A classical approach used to cope with this data flow involves heuristics with large seed indexes. In order to speed up this technique, the index can be enhanced by storing additional information to limit the number of random memory accesses. However, this improvement leads to a larger index that may become a bottleneck. In the case of protein similarity search, we propose to decrease the index size by reducing the amino acid alphabet. The paper presents two main contributions. First, we show that an optimal neighborhood indexing combining an alphabet reduction and a longer neighborhood leads to a reduction of 35% of memory involved into the process, without sacrificing the quality of results nor the computational time. Second, our approach led us to develop a new kind of substitution score matrices and their associated e-value parameters. In contrast to usual matrices, these matrices are rectangular since they compare amino acid groups from different alphabets. We describe the method used for computing those matrices and we provide some typical examples that can be used in such comparisons. Supplementary data can be found on the website http://bioinfo.lifl.fr/reblosum. We propose a practical index size reduction of the neighborhood data, that does not negatively affect the performance of large-scale search in protein sequences. Such an index can be used in any study involving large protein data. Moreover, rectangular substitution score matrices and their associated statistical parameters can have applications in any study involving an alphabet reduction.
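To make the index-size argument concrete, here is a minimal sketch of a seed index built over a reduced amino acid alphabet. It is a simplification and not the neighborhood indexing scheme of the paper: the ten-group partition in `GROUPS` is a hypothetical grouping chosen for illustration, and `reduce_seq` and `build_index` are made-up helper names.

```python
from collections import defaultdict

# Hypothetical partition of the 20 amino acids into 10 groups (an assumption,
# not the reduced alphabet used in the paper).
GROUPS = ["AST", "C", "DE", "FWY", "G", "H", "ILMV", "KR", "NQ", "P"]
REDUCE = {aa: str(i) for i, group in enumerate(GROUPS) for aa in group}


def reduce_seq(seq):
    """Rewrite a protein sequence over the reduced alphabet (one digit per group)."""
    return "".join(REDUCE[aa] for aa in seq)


def build_index(seqs, k=3):
    """Map every reduced k-mer to the (sequence id, position) pairs where it occurs."""
    index = defaultdict(list)
    for sid, seq in enumerate(seqs):
        reduced = reduce_seq(seq)
        for pos in range(len(reduced) - k + 1):
            index[reduced[pos:pos + k]].append((sid, pos))
    return index


index = build_index(["ILV", "MLI"])
```

With 10 groups there are at most 10^3 distinct 3-mers to index instead of 20^3, and sequences that differ only by substitutions within a group (here `ILV` and `MLI`) fall under the same key.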
Optimal neighborhood indexing for protein similarity search
Peterlongo, Pierre; Noé, Laurent; Lavenier, Dominique; Nguyen, Van Hoa; Kucherov, Gregory; Giraud, Mathieu
2008-01-01
Background: Similarity inference, one of the main bioinformatics tasks, has to face an exponential growth of the biological data. A classical approach used to cope with this data flow involves heuristics with large seed indexes. In order to speed up this technique, the index can be enhanced by storing additional information to limit the number of random memory accesses. However, this improvement leads to a larger index that may become a bottleneck. In the case of protein similarity search, we propose to decrease the index size by reducing the amino acid alphabet. Results: The paper presents two main contributions. First, we show that an optimal neighborhood indexing combining an alphabet reduction and a longer neighborhood leads to a reduction of 35% of memory involved into the process, without sacrificing the quality of results nor the computational time. Second, our approach led us to develop a new kind of substitution score matrices and their associated e-value parameters. In contrast to usual matrices, these matrices are rectangular since they compare amino acid groups from different alphabets. We describe the method used for computing those matrices and we provide some typical examples that can be used in such comparisons. Supplementary data can be found on the website. Conclusion: We propose a practical index size reduction of the neighborhood data, that does not negatively affect the performance of large-scale search in protein sequences. Such an index can be used in any study involving large protein data. Moreover, rectangular substitution score matrices and their associated statistical parameters can have applications in any study involving an alphabet reduction. PMID:19087280
Managing biomedical image metadata for search and retrieval of similar images.
Korenblum, Daniel; Rubin, Daniel; Napel, Sandy; Rodriguez, Cesar; Beaulieu, Chris
2011-08-01
Radiology images are generally disconnected from the metadata describing their contents, such as imaging observations ("semantic" metadata), which are usually described in text reports that are not directly linked to the images. We developed a system, the Biomedical Image Metadata Manager (BIMM) to (1) address the problem of managing biomedical image metadata and (2) facilitate the retrieval of similar images using semantic feature metadata. Our approach allows radiologists, researchers, and students to take advantage of the vast and growing repositories of medical image data by explicitly linking images to their associated metadata in a relational database that is globally accessible through a Web application. BIMM receives input in the form of standard-based metadata files using Web service and parses and stores the metadata in a relational database allowing efficient data query and maintenance capabilities. Upon querying BIMM for images, 2D regions of interest (ROIs) stored as metadata are automatically rendered onto preview images included in search results. The system's "match observations" function retrieves images with similar ROIs based on specific semantic features describing imaging observation characteristics (IOCs). We demonstrate that the system, using IOCs alone, can accurately retrieve images with diagnoses matching the query images, and we evaluate its performance on a set of annotated liver lesion images. BIMM has several potential applications, e.g., computer-aided detection and diagnosis, content-based image retrieval, automating medical analysis protocols, and gathering population statistics like disease prevalences. The system provides a framework for decision support systems, potentially improving their diagnostic accuracy and selection of appropriate therapies.
Pervez, Zeeshan; Ahmad, Mahmood; Khattak, Asad Masood; Ramzan, Naeem; Khan, Wajahat Ali
2017-01-01
Public cloud storage services are becoming prevalent and myriad data sharing, archiving and collaborative services have emerged which harness the pay-as-you-go business model of public cloud. To ensure privacy and confidentiality, encrypted data is often outsourced to such services, which further complicates the process of accessing relevant data by using search queries. Search over encrypted data schemes solve this problem by exploiting cryptographic primitives and secure indexing to identify outsourced data that satisfy the search criteria. Almost all of these schemes rely on exact matching between the encrypted data and search criteria. The few schemes that extend the notion of exact matching to similarity-based search lack realism, either because they rely on trusted third parties or because of increased storage and computational complexity. In this paper we propose Oblivious Similarity based Search ([Formula: see text]) for encrypted data. It enables authorized users to model their own encrypted search queries which are resilient to typographical errors. Unlike conventional methodologies, [Formula: see text] ranks the search results by using a similarity measure, offering a better search experience than exact matching. It utilizes an encrypted bloom filter and probabilistic homomorphic encryption to enable authorized users to access relevant data without revealing results of the search query evaluation process to the untrusted cloud service provider. Encrypted bloom filter based search enables [Formula: see text] to reduce the search space to potentially relevant encrypted data, avoiding unnecessary computation on the public cloud. The efficacy of [Formula: see text] is evaluated on Google App Engine for various bloom filter lengths on different cloud configurations.
Semantic similarity measure in biomedical domain leverage web search engine.
Chen, Chi-Huang; Hsieh, Sheau-Ling; Weng, Yung-Ching; Chang, Wen-Yung; Lai, Feipei
2010-01-01
Semantic similarity measures play an essential role in Information Retrieval and Natural Language Processing. In this paper we propose a page-count-based semantic similarity measure and apply it in biomedical domains. Previous research in semantic-web-related applications has deployed various semantic similarity measures. Despite the usefulness of the measurements in those applications, measuring semantic similarity between two terms remains a challenging task. The proposed method exploits page counts returned by a Web search engine. We define various similarity scores for two given terms P and Q, using the page counts for querying P, Q and P AND Q. Moreover, we propose a novel approach to compute semantic similarity using lexico-syntactic patterns with page counts. These different similarity scores are integrated by adapting support vector machines, to leverage the robustness of semantic similarity measures. Experimental results on two datasets achieve correlation coefficients of 0.798 on the dataset provided by A. Hliaoutakis, 0.705 on the dataset provided by T. Pedersen with physician scores and 0.496 on the dataset provided by T. Pedersen et al. with expert scores.
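As an illustration of scores built from the page counts for P, Q and P AND Q, the sketch below computes two widely used co-occurrence measures. The abstract does not name the scores it defines, so the choice of Jaccard and Dice here is an assumption, and the function names and the example counts are hypothetical.

```python
def web_jaccard(count_p, count_q, count_pq):
    """Jaccard-style score: pages containing both terms over pages containing either."""
    either = count_p + count_q - count_pq
    return count_pq / either if either > 0 else 0.0


def web_dice(count_p, count_q, count_pq):
    """Dice-style score: twice the joint count over the sum of the single counts."""
    total = count_p + count_q
    return 2 * count_pq / total if total > 0 else 0.0


# Made-up page counts for two terms and their conjunction.
jaccard = web_jaccard(100, 50, 25)
dice = web_dice(100, 50, 25)
```

Several such scores could then serve as input features to a learned combiner, which matches the abstract's description of integrating the scores with support vector machines.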
He, Jieyue; Li, Chaojun; Ye, Baoliu; Zhong, Wei
2012-06-25
Most computational algorithms mainly focus on detecting highly connected subgraphs in PPI networks as protein complexes but ignore their inherent organization. Furthermore, many of these algorithms are computationally expensive. However, recent analysis indicates that experimentally detected protein complexes generally contain Core/attachment structures. In this paper, a Greedy Search Method based on Core-Attachment structure (GSM-CA) is proposed. The GSM-CA method detects densely connected regions in large protein-protein interaction networks based on the edge weight and two criteria for determining core nodes and attachment nodes. The GSM-CA method improves the prediction accuracy compared to other similar module detection approaches, however it is computationally expensive. Many module detection approaches are based on the traditional hierarchical methods, which is also computationally inefficient because the hierarchical tree structure produced by these approaches cannot provide adequate information to identify whether a network belongs to a module structure or not. In order to speed up the computational process, the Greedy Search Method based on Fast Clustering (GSM-FC) is proposed in this work. The edge weight based GSM-FC method uses a greedy procedure to traverse all edges just once to separate the network into the suitable set of modules. The proposed methods are applied to the protein interaction network of S. cerevisiae. Experimental results indicate that many significant functional modules are detected, most of which match the known complexes. Results also demonstrate that the GSM-FC algorithm is faster and more accurate as compared to other competing algorithms. Based on the new edge weight definition, the proposed algorithm takes advantages of the greedy search procedure to separate the network into the suitable set of modules. Experimental analysis shows that the identified modules are statistically significant. The algorithm can reduce the
Fast structure similarity searches among protein models: efficient clustering of protein fragments
2012-01-01
Background: For many predictive applications a large number of models is generated and later clustered in subsets based on structure similarity. In most clustering algorithms an all-vs-all root mean square deviation (RMSD) comparison is performed. Most of the time is typically spent on comparison of non-similar structures. For sets with more than, say, 10,000 models this procedure is very time-consuming and alternative faster algorithms, restricting comparisons only to the most similar structures, would be useful. Results: We exploit the inverse triangle inequality on the RMSD between two structures given the RMSDs with a third structure. The lower bound on RMSD may be used, when restricting the search of similarity to a reasonably low RMSD threshold value, to speed up similarity searches significantly. Tests are performed on large sets of decoys which are widely used as test cases for predictive methods, with a speed-up of up to 100 times with respect to all-vs-all comparison depending on the set and parameters used. Sample applications are shown. Conclusions: The algorithm presented here allows fast comparison of large data sets of structures with limited memory requirements. As an example of application we present clustering of more than 100000 fragments of length 5 from the top500H dataset into a few hundred representative fragments. A more realistic scenario is provided by the search of similarity within the very large decoy sets used for the tests. Other applications regard filtering nearly identical conformations in selected CASP9 datasets and clustering molecular dynamics snapshots. Availability: A Linux executable and a Perl script with examples are given in the supplementary material (Additional file 1). The source code is available upon request from the authors. PMID:22642815
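The pruning idea can be sketched as follows. For any distance that obeys the triangle inequality, |d(a, r) - d(b, r)| is a lower bound on d(a, b), so a pair whose bound already exceeds the threshold needs no full comparison. The code below is a minimal sketch under stated assumptions: Euclidean distance between points stands in for RMSD between superposed structures, a single reference (the first item) is used, and `similar_pairs` is a hypothetical name.

```python
import math


def dist(a, b):
    """Stand-in metric: Euclidean distance between two points (Python 3.8+)."""
    return math.dist(a, b)


def similar_pairs(items, threshold, metric=dist):
    """Find pairs within `threshold`, skipping pairs ruled out by the lower bound.

    Returns the list of similar index pairs and the number of full
    comparisons that were actually carried out.
    """
    ref = items[0]
    to_ref = [metric(x, ref) for x in items]  # one distance per item
    pairs, n_full = [], 0
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            # Inverse triangle inequality: d(i, j) >= |d(i, ref) - d(j, ref)|
            if abs(to_ref[i] - to_ref[j]) > threshold:
                continue
            n_full += 1
            if metric(items[i], items[j]) <= threshold:
                pairs.append((i, j))
    return pairs, n_full


items = [(0.0, 0.0), (0.5, 0.0), (10.0, 0.0), (10.2, 0.0)]
pairs, n_full = similar_pairs(items, threshold=1.0)
```

Here only two of the six possible pairs are compared in full. The bound never discards a truly similar pair, so the result is exact; how much work it saves depends on the threshold and on how the items are spread relative to the reference.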
Khashan, Raed S
2015-01-01
As the number of available ligand-receptor complexes is increasing, researchers are becoming more dedicated to mine these complexes to aid in the drug design and development process. We present free software which is developed as a tool for performing similarity search across ligand-receptor complexes for identifying binding pockets which are similar to that of a target receptor. The search is based on 3D-geometric and chemical similarity of the atoms forming the binding pocket. For each match identified, the ligand's fragment(s) corresponding to that binding pocket are extracted, thus forming a virtual library of fragments (FragVLib) that is useful for structure-based drug design. The program provides a very useful tool to explore available databases.
Efficient blind search for similar-waveform earthquakes in years of continuous seismic data
NASA Astrophysics Data System (ADS)
Yoon, C. E.; Bergen, K.; Rong, K.; Elezabi, H.; Bailis, P.; Levis, P.; Beroza, G. C.
2017-12-01
Cross-correlating an earthquake waveform template with continuous seismic data has proven to be a sensitive, discriminating detector of small events missing from earthquake catalogs, but a key limitation of this approach is that it requires advance knowledge of the earthquake signals we wish to detect. To overcome this limitation, we can perform a blind search for events with similar waveforms, comparing waveforms from all possible times within the continuous data (Brown et al., 2008). However, the runtime for naive blind search scales quadratically with the duration of continuous data, making it impractical to process years of continuous data. The Fingerprint And Similarity Thresholding (FAST) detection method (Yoon et al., 2015) enables a comprehensive blind search for similar-waveform earthquakes in a fast, scalable manner by adapting data-mining techniques originally developed for audio and image search within massive databases. FAST converts seismic waveforms into compact "fingerprints", which are efficiently organized and searched within a database. In this way, FAST avoids the unnecessary comparison of dissimilar waveforms. To date, the longest duration of continuous data used for event detection with FAST was 3 months at a single station near Guy-Greenbrier, Arkansas, which revealed microearthquakes closely correlated with stages of hydraulic fracturing (Yoon et al., 2017). In this presentation we introduce an optimized, parallel version of the FAST software with improvements to the fingerprinting algorithm and the ability to detect events using continuous data from a network of stations (Bergen et al., 2016). We demonstrate its ability to detect low-magnitude earthquakes within several years of continuous data at locations of interest in California.
Fu, Yong-Bi; Yang, Mo-Hua; Zeng, Fangqin; Biligetu, Bill
2017-01-01
Molecular plant breeding with the aid of molecular markers has played an important role in modern plant breeding over the last two decades. Many marker-based predictions for quantitative traits have been made to enhance parental selection, but the trait prediction accuracy remains generally low, even with the aid of dense, genome-wide SNP markers. To search for more accurate trait-specific prediction with informative SNP markers, we conducted a literature review on the prediction issues in molecular plant breeding and on the applicability of an RNA-Seq technique for developing function-associated specific trait (FAST) SNP markers. To understand whether and how FAST SNP markers could enhance trait prediction, we also performed a theoretical reasoning on the effectiveness of these markers in a trait-specific prediction, and verified the reasoning through computer simulation. To this end, the search yielded an alternative to regular genomic selection with FAST SNP markers that could be explored to achieve more accurate trait-specific prediction. Continuous search for better alternatives is encouraged to enhance marker-based predictions for an individual quantitative trait in molecular plant breeding. PMID:28729875
Huurneman, Bianca; Boonstra, F Nienke
2015-01-22
In typically developing children, crowding decreases with increasing age. The influence of target-distractor similarity with respect to orientation and element spacing on visual search performance was investigated in 29 school-age children with normal vision (4- to 6-year-olds [N = 16], 7- to 8-year-olds [N = 13]). Children were instructed to search for a target E among distractor Es (feature search: all flanking Es pointing right; conjunction search: flankers in three orientations). Orientation of the target was manipulated in four directions: right (target absent), left (inversed), up, and down (vertical). Spacing was varied in four steps: 0.04°, 0.5°, 1°, and 2°. During feature search, high target-distractor similarity had a stronger impact on performance than spacing: Orientation affected accuracy until spacing was 1°, and spacing only influenced accuracy for identifying inversed targets. Spatial analyses showed that orientation affected oculomotor strategy: Children made more fixations in the "inversed" target area (4.6) than the vertical target areas (1.8 and 1.9). Furthermore, age groups differed in fixation duration: 4- to 6-year-old children showed longer fixation durations than 7- to 8-year-olds at the two largest element spacings (p = 0.039 and p = 0.027). Conjunction search performance was unaffected by spacing. Four conclusions can be drawn from this study: (a) Target-distractor similarity governs visual search performance in school-age children, (b) children make more fixations in target areas when target-distractor similarity is high, (c) 4- to 6-year-olds show longer fixation durations than 7- to 8-year-olds at 1° and 2° element spacing, and (d) spacing affects feature but not conjunction search-a finding that might indicate top-down control ameliorates crowding in children. © 2015 ARVO.
Efficient searching and annotation of metabolic networks using chemical similarity
Pertusi, Dante A.; Stine, Andrew E.; Broadbelt, Linda J.; Tyo, Keith E.J.
2015-01-01
Motivation: The urgent need for efficient and sustainable biological production of fuels and high-value chemicals has elicited a wave of in silico techniques for identifying promising novel pathways to these compounds in large putative metabolic networks. To date, these approaches have primarily used general graph search algorithms, which are prohibitively slow as putative metabolic networks may exceed 1 million compounds. To alleviate this limitation, we report two methods, SimIndex (SI) and SimZyme, which use chemical similarity of 2D chemical fingerprints to efficiently navigate large metabolic networks and propose enzymatic connections between the constituent nodes. We also report a Byers–Waterman type pathway search algorithm for further paring down pertinent networks. Results: Benchmarking tests run with SI show it can reduce the number of nodes visited in searching a putative network by 100-fold with a computational time improvement of up to 10^5-fold. Subsequent Byers–Waterman search application further reduces the number of nodes searched by up to 100-fold, while SimZyme demonstrates ∼90% accuracy in matching query substrates with enzymes. Using these modules, we have designed and annotated an alternative to the methylerythritol phosphate pathway to produce isopentenyl pyrophosphate with more favorable thermodynamics than the native pathway. These algorithms will have a significant impact on our ability to use large metabolic networks that lack annotation of promiscuous reactions. Availability and implementation: Python files will be available for download at http://tyolab.northwestern.edu/tools/. Contact: k-tyo@northwestern.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25417203
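For readers unfamiliar with fingerprint-based chemical similarity, the sketch below shows one common way to score it. The abstract does not state which coefficient SimIndex or SimZyme use, so the Tanimoto coefficient is an assumption here; fingerprints are modelled as Python integers used as bit sets, and `tanimoto`, `most_similar` and the toy library are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared on-bits divided by on-bits present in either."""
    union = bin(fp_a | fp_b).count("1")
    return bin(fp_a & fp_b).count("1") / union if union else 0.0


def most_similar(query, library, top=1):
    """Return the names of the `top` library entries most similar to the query."""
    ranked = sorted(library, key=lambda name: tanimoto(query, library[name]), reverse=True)
    return ranked[:top]


# Toy 4-bit fingerprints; real 2D fingerprints have hundreds or thousands of bits.
library = {"a": 0b1100, "b": 0b0011}
best = most_similar(0b1110, library)
```

Ranking candidate nodes by such a score is one way a search could expand the most promising compounds first instead of visiting a network exhaustively.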
Does linear separability really matter? Complex visual search is explained by simple search
Vighneshvel, T.; Arun, S. P.
2013-01-01
Visual search in real life involves complex displays with a target among multiple types of distracters, but in the laboratory, it is often tested using simple displays with identical distracters. Can complex search be understood in terms of simple searches? This link may not be straightforward if complex search has emergent properties. One such property is linear separability, whereby search is hard when a target cannot be separated from its distracters using a single linear boundary. However, evidence in favor of linear separability is based on testing stimulus configurations in an external parametric space that need not be related to their true perceptual representation. We therefore set out to assess whether linear separability influences complex search at all. Our null hypothesis was that complex search performance depends only on classical factors such as target-distracter similarity and distracter homogeneity, which we measured using simple searches. Across three experiments involving a variety of artificial and natural objects, differences between linearly separable and nonseparable searches were explained using target-distracter similarity and distracter heterogeneity. Further, simple searches accurately predicted complex search regardless of linear separability (r = 0.91). Our results show that complex search is explained by simple search, refuting the widely held belief that linear separability influences visual search. PMID:24029822
Efficient searching and annotation of metabolic networks using chemical similarity.
Pertusi, Dante A; Stine, Andrew E; Broadbelt, Linda J; Tyo, Keith E J
2015-04-01
The urgent need for efficient and sustainable biological production of fuels and high-value chemicals has elicited a wave of in silico techniques for identifying promising novel pathways to these compounds in large putative metabolic networks. To date, these approaches have primarily used general graph search algorithms, which are prohibitively slow as putative metabolic networks may exceed 1 million compounds. To alleviate this limitation, we report two methods, SimIndex (SI) and SimZyme, which use chemical similarity of 2D chemical fingerprints to efficiently navigate large metabolic networks and propose enzymatic connections between the constituent nodes. We also report a Byers-Waterman type pathway search algorithm for further paring down pertinent networks. Benchmarking tests run with SI show it can reduce the number of nodes visited in searching a putative network by 100-fold with a computational time improvement of up to 10^5-fold. Subsequent Byers-Waterman search application further reduces the number of nodes searched by up to 100-fold, while SimZyme demonstrates ∼90% accuracy in matching query substrates with enzymes. Using these modules, we have designed and annotated an alternative to the methylerythritol phosphate pathway to produce isopentenyl pyrophosphate with more favorable thermodynamics than the native pathway. These algorithms will have a significant impact on our ability to use large metabolic networks that lack annotation of promiscuous reactions. Python files will be available for download at http://tyolab.northwestern.edu/tools/. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Query-seeded iterative sequence similarity searching improves selectivity 5–20-fold
Li, Weizhong; Lopez, Rodrigo
2017-01-01
Iterative similarity search programs, like psiblast, jackhmmer, and psisearch, are much more sensitive than pairwise similarity search methods like blast and ssearch because they build a position specific scoring model (a PSSM or HMM) that captures the pattern of sequence conservation characteristic to a protein family. But models are subject to contamination; once an unrelated sequence has been added to the model, homologs of the unrelated sequence will also produce high scores, and the model can diverge from the original protein family. Examination of alignment errors during psiblast PSSM contamination suggested a simple strategy for dramatically reducing PSSM contamination. psiblast PSSMs are built from the query-based multiple sequence alignment (MSA) implied by the pairwise alignments between the query model (PSSM, HMM) and the subject sequences in the library. When the original query sequence residues are inserted into gapped positions in the aligned subject sequence, the resulting PSSM rarely produces alignment over-extensions or alignments to unrelated sequences. This simple step, which tends to anchor the PSSM to the original query sequence and slightly increase target percent identity, can reduce the frequency of false-positive alignments more than 20-fold compared with psiblast and jackhmmer, with little loss in search sensitivity. PMID:27923999
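The query-seeding step can be pictured on a single pairwise alignment. The sketch below is an illustration under stated assumptions rather than the published implementation: alignments are given as two equal-length strings with `-` for gaps, columns where the query itself has a gap are dropped (an assumption about how a query-based MSA row is formed), and `seed_subject` is a hypothetical name.

```python
def seed_subject(aligned_query, aligned_subject, gap="-"):
    """Build the subject's row of a query-based MSA, filling its gaps from the query.

    Wherever the aligned subject has a gap, the original query residue is
    inserted instead, which anchors the row to the query sequence.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    row = []
    for q_res, s_res in zip(aligned_query, aligned_subject):
        if q_res == gap:
            continue  # column not present in a query-anchored alignment
        row.append(q_res if s_res == gap else s_res)
    return "".join(row)


seeded = seed_subject("ACDEFG", "AC--FG")
```

In this toy case the two gapped subject positions are filled with the query residues `D` and `E`, so the row contributes query-like counts at those columns when the scoring model is rebuilt.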
Applying Statistical Models and Parametric Distance Measures for Music Similarity Search
NASA Astrophysics Data System (ADS)
Lukashevich, Hanna; Dittmar, Christian; Bastuck, Christoph
Automatically deriving similarity relations between music pieces is an inherent field of music information retrieval research. Due to the nearly unrestricted amount of musical data, real-world similarity search algorithms have to be highly efficient and scalable. One possible solution is to represent each music excerpt with a statistical model (e.g., a Gaussian mixture model) and thus to reduce the computational costs by applying parametric distance measures between the models. In this paper we discuss the combinations of applying different parametric modelling techniques and distance measures and weigh the benefits of each one against the others.
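A parametric distance compares two fitted models directly from their parameters, without revisiting the underlying audio frames. The sketch below is a simplification: it uses a single Gaussian with diagonal covariance per excerpt, for which the Kullback-Leibler divergence has a closed form, whereas a full Gaussian mixture model has no such closed form in general and would need an approximation. The abstract does not say which distance measures were evaluated, so the symmetrised KL divergence and the function names are assumptions.

```python
import math


def kl_diag(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two Gaussians with diagonal covariance (variances > 0)."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )


def symmetric_kl(mean_p, var_p, mean_q, var_q):
    """Symmetrised KL divergence, usable as a dissimilarity between two models."""
    return kl_diag(mean_p, var_p, mean_q, var_q) + kl_diag(mean_q, var_q, mean_p, var_p)


# Two one-dimensional models with unit variance whose means differ by 1.
d = symmetric_kl([0.0], [1.0], [1.0], [1.0])
```

The cost of one comparison depends only on the feature dimension, not on the length of the excerpts, which is the source of the efficiency gain the abstract refers to.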
Evolution and Optimality of Similar Neural Mechanisms for Perception and Action during Search
Zhang, Sheng; Eckstein, Miguel P.
2010-01-01
A prevailing theory proposes that the brain's two visual pathways, the ventral and dorsal, lead to differing visual processing and world representations for conscious perception than those for action. Others have claimed that perception and action share much of their visual processing. But which of these two neural architectures is favored by evolution? Successful visual search is life-critical and here we investigate the evolution and optimality of neural mechanisms mediating perception and eye movement actions for visual search in natural images. We implement an approximation to the ideal Bayesian searcher with two separate processing streams, one controlling the eye movements and the other stream determining the perceptual search decisions. We virtually evolved the neural mechanisms of the searchers' two separate pathways built from linear combinations of primary visual cortex receptive fields (V1) by making the simulated individuals' probability of survival depend on the perceptual accuracy finding targets in cluttered backgrounds. We find that for a variety of targets, backgrounds, and dependence of target detectability on retinal eccentricity, the mechanisms of the searchers' two processing streams converge to similar representations showing that mismatches in the mechanisms for perception and eye movements lead to suboptimal search. Three exceptions which resulted in partial or no convergence were a case of an organism for which the targets are equally detectable across the retina, an organism with sufficient time to foveate all possible target locations, and a strict two-pathway model with no interconnections and differential pre-filtering based on parvocellular and magnocellular lateral geniculate cell properties. Thus, similar neural mechanisms for perception and eye movement actions during search are optimal and should be expected from the effects of natural selection on an organism with limited time to search for food that is not equi-detectable across
An accurate algorithm to calculate the Hurst exponent of self-similar processes
NASA Astrophysics Data System (ADS)
Fernández-Martínez, M.; Sánchez-Granero, M. A.; Trinidad Segovia, J. E.; Román-Sánchez, I. M.
2014-06-01
In this paper, we introduce a new approach which generalizes the GM2 algorithm (introduced in Sánchez-Granero et al. (2008) [52]) as well as fractal dimension algorithms (FD1, FD2 and FD3) (first appeared in Sánchez-Granero et al. (2012) [51]), providing an accurate algorithm to calculate the Hurst exponent of self-similar processes. We prove that this algorithm performs properly in the case of short time series when fractional Brownian motions and Lévy stable motions are considered. We conclude the paper with a dynamic study of the Hurst exponent evolution in the S&P500 index stocks.
Hu, Qian-Nan; Deng, Zhe; Hu, Huanan; Cao, Dong-Sheng; Liang, Yi-Zeng
2011-09-01
Biochemical reactions play a key role to help sustain life and allow cells to grow. RxnFinder was developed to search biochemical reactions from KEGG reaction database using three search criteria: molecular structures, molecular fragments and reaction similarity. RxnFinder is helpful to get reference reactions for biosynthesis and xenobiotics metabolism. RxnFinder is freely available via: http://sdd.whu.edu.cn/rxnfinder. Contact: qnhu@whu.edu.cn.
Worley, K C; Wiese, B A; Smith, R F
1995-09-01
BEAUTY (BLAST enhanced alignment utility) is an enhanced version of the NCBI's BLAST data base search tool that facilitates identification of the functions of matched sequences. We have created new data bases of conserved regions and functional domains for protein sequences in NCBI's Entrez data base, and BEAUTY allows this information to be incorporated directly into BLAST search results. A Conserved Regions Data Base, containing the locations of conserved regions within Entrez protein sequences, was constructed by (1) clustering the entire data base into families, (2) aligning each family using our PIMA multiple sequence alignment program, and (3) scanning the multiple alignments to locate the conserved regions within each aligned sequence. A separate Annotated Domains Data Base was constructed by extracting the locations of all annotated domains and sites from sequences represented in the Entrez, PROSITE, BLOCKS, and PRINTS data bases. BEAUTY performs a BLAST search of those Entrez sequences with conserved regions and/or annotated domains. BEAUTY then uses the information from the Conserved Regions and Annotated Domains data bases to generate, for each matched sequence, a schematic display that allows one to directly compare the relative locations of (1) the conserved regions, (2) annotated domains and sites, and (3) the locally aligned regions matched in the BLAST search. In addition, BEAUTY search results include World-Wide Web hypertext links to a number of external data bases that provide a variety of additional types of information on the function of matched sequences. This convenient integration of protein families, conserved regions, annotated domains, alignment displays, and World-Wide Web resources greatly enhances the biological informativeness of sequence similarity searches. 
BEAUTY searches can be performed remotely on our system using the "BCM Search Launcher" World-Wide Web pages (URL is http://gc.bcm.tmc.edu:8088/search-launcher/launcher.html).
Tai, David; Fang, Jianwen
2012-08-27
The large sizes of today's chemical databases require efficient algorithms to perform similarity searches. It can be very time consuming to compare two large chemical databases. This paper seeks to build upon existing research efforts by describing a novel strategy for accelerating existing search algorithms for comparing large chemical collections. The quest for efficiency has focused on developing better indexing algorithms by creating heuristics for searching an individual chemical against a chemical library by detecting and eliminating needless similarity calculations. For comparing two chemical collections, these algorithms simply execute searches for each chemical in the query set sequentially. The strategy presented in this paper achieves a speedup over these algorithms by indexing the set of all query chemicals, so that redundant calculations that arise in sequential searches are eliminated. We implement this novel algorithm by developing a similarity search program called Symmetric inDexing or SymDex. SymDex shows a maximum speedup of over 232% compared to the state-of-the-art single query search algorithm over real data for various fingerprint lengths. Considerable speedup is seen even for batch searches where query set sizes are relatively small compared to typical database sizes. To the best of our knowledge, SymDex is the first search algorithm designed specifically for comparing chemical libraries. It can be adapted to most, if not all, existing indexing algorithms and shows potential for accelerating future similarity search algorithms for comparing chemical databases.
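As a rough illustration of the kind of pruning such fingerprint searches rely on, the sketch below compares a whole query set against a library and skips pairs that cannot reach a similarity threshold. It is not the SymDex algorithm: the Tanimoto coefficient, the bit-count bound (Tanimoto can never exceed the ratio of the smaller to the larger bit count), the grouping of queries by bit count, and the names `tanimoto` and `batch_search` are all assumptions made for this example.

```python
from collections import defaultdict


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient of two fingerprints stored as Python ints (bit sets)."""
    union = bin(a | b).count("1")
    return bin(a & b).count("1") / union if union else 1.0


def batch_search(queries, library, threshold):
    """Return {query_index: [(library_index, similarity), ...]} for hits >= threshold.

    Illustrative only: queries and library entries are grouped by bit count so
    that one bound check can rule out a whole block of pairwise comparisons.
    """
    lib_buckets = defaultdict(list)
    for j, fp in enumerate(library):
        lib_buckets[bin(fp).count("1")].append((j, fp))

    query_groups = defaultdict(list)
    for i, q in enumerate(queries):
        query_groups[bin(q).count("1")].append((i, q))

    hits = {i: [] for i in range(len(queries))}
    for q_count, group in query_groups.items():
        for l_count, bucket in lib_buckets.items():
            hi, lo = max(q_count, l_count), min(q_count, l_count)
            # Upper bound: Tanimoto(a, b) <= lo / hi, so skip hopeless blocks.
            if hi and lo / hi < threshold:
                continue
            for i, q in group:
                for j, fp in bucket:
                    s = tanimoto(q, fp)
                    if s >= threshold:
                        hits[i].append((j, s))
    return hits
```

The bound check is shared by every query with the same bit count, which is one simple way a batch search can avoid repeating work that a sequence of independent single-query searches would redo.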
SHOP: scaffold HOPping by GRID-based similarity searches.
Bergmann, Rikke; Linusson, Anna; Zamora, Ismael
2007-05-31
A new GRID-based method for scaffold hopping (SHOP) is presented. In a fully automatic manner, scaffolds were identified in a database based on three types of 3D-descriptors. SHOP's ability to recover scaffolds was assessed and validated by searching a database spiked with fragments of known ligands of three different protein targets relevant for drug discovery using a rational approach based on statistical experimental design. Five out of eight and seven out of eight thrombin scaffolds and all seven HIV protease scaffolds were recovered within the top 10 and 31 out of 31 neuraminidase scaffolds were in the 31 top-ranked scaffolds. SHOP also identified new scaffolds with substantially different chemotypes from the queries. Docking analysis indicated that the new scaffolds would have similar binding modes to those of the respective query scaffolds observed in X-ray structures. The databases contained scaffolds from published combinatorial libraries to ensure that identified scaffolds could be feasibly synthesized.
Semantic similarity measures in the biomedical domain by leveraging a web search engine.
Hsieh, Sheau-Ling; Chang, Wen-Yung; Chen, Chi-Huang; Weng, Yung-Ching
2013-07-01
Various web-related semantic similarity measures have been studied. However, measuring semantic similarity between two terms remains a challenging task. Traditional ontology-based methodologies have the limitation that both concepts must reside in the same ontology tree(s). Unfortunately, in practice, this assumption does not always hold. On the other hand, if the corpus is sufficient, corpus-based methodologies can overcome the limitation. The web is now an enormous and continuously growing corpus. Therefore, a method of estimating semantic similarity is proposed that exploits the page counts of two biomedical concepts returned by the Google AJAX web search engine. The features are extracted as the co-occurrence patterns of two given terms P and Q, by querying P, Q, as well as P AND Q, and the web search hit counts of the defined lexico-syntactic patterns. The similarity scores of the different patterns are evaluated, by adapting support vector machines for classification, to leverage the robustness of semantic similarity measures. Experimental results validated against two datasets (dataset 1 provided by A. Hliaoutakis; dataset 2 provided by T. Pedersen) are presented and discussed. In dataset 1, the proposed approach achieves the best correlation coefficient (0.802) under SNOMED-CT. In dataset 2, the proposed method obtains the best correlation coefficient (SNOMED-CT: 0.705; MeSH: 0.723) with physician scores compared with the measures of other methods. However, the correlation coefficients with coder scores (SNOMED-CT: 0.496; MeSH: 0.539) showed the opposite outcome. In conclusion, the semantic similarity findings of the proposed method are close to physicians' ratings. Furthermore, the study provides a cornerstone investigation for extracting fully relevant information from digitized, free-text medical records in the National Taiwan University Hospital database.
Kurgan, Lukasz; Cios, Krzysztof; Chen, Ke
2008-05-01
Protein structure prediction methods provide accurate results when a homologous protein is predicted, while poorer predictions are obtained in the absence of homologous templates. However, some protein chains that share twilight-zone pairwise identity can form similar folds and thus determining structural similarity without the sequence similarity would be desirable for the structure prediction. The folding type of a protein or its domain is defined as the structural class. Current structural class prediction methods that predict the four structural classes defined in SCOP provide up to 63% accuracy for the datasets in which sequence identity of any pair of sequences belongs to the twilight-zone. We propose SCPRED method that improves prediction accuracy for sequences that share twilight-zone pairwise similarity with sequences used for the prediction. SCPRED uses a support vector machine classifier that takes several custom-designed features as its input to predict the structural classes. Based on extensive design that considers over 2300 index-, composition- and physicochemical properties-based features along with features based on the predicted secondary structure and content, the classifier's input includes 8 features based on information extracted from the secondary structure predicted with PSI-PRED and one feature computed from the sequence. Tests performed with datasets of 1673 protein chains, in which any pair of sequences shares twilight-zone similarity, show that SCPRED obtains 80.3% accuracy when predicting the four SCOP-defined structural classes, which is superior when compared with over a dozen recent competing methods that are based on support vector machine, logistic regression, and ensemble of classifiers predictors. The SCPRED can accurately find similar structures for sequences that share low identity with sequence used for the prediction. The high predictive accuracy achieved by SCPRED is attributed to the design of the features, which are
Kurgan, Lukasz; Cios, Krzysztof; Chen, Ke
2008-01-01
Background: Protein structure prediction methods provide accurate results when a homologous protein is predicted, while poorer predictions are obtained in the absence of homologous templates. However, some protein chains that share twilight-zone pairwise identity can form similar folds and thus determining structural similarity without the sequence similarity would be desirable for the structure prediction. The folding type of a protein or its domain is defined as the structural class. Current structural class prediction methods that predict the four structural classes defined in SCOP provide up to 63% accuracy for the datasets in which sequence identity of any pair of sequences belongs to the twilight-zone. We propose SCPRED method that improves prediction accuracy for sequences that share twilight-zone pairwise similarity with sequences used for the prediction. Results: SCPRED uses a support vector machine classifier that takes several custom-designed features as its input to predict the structural classes. Based on extensive design that considers over 2300 index-, composition- and physicochemical properties-based features along with features based on the predicted secondary structure and content, the classifier's input includes 8 features based on information extracted from the secondary structure predicted with PSI-PRED and one feature computed from the sequence. Tests performed with datasets of 1673 protein chains, in which any pair of sequences shares twilight-zone similarity, show that SCPRED obtains 80.3% accuracy when predicting the four SCOP-defined structural classes, which is superior when compared with over a dozen recent competing methods that are based on support vector machine, logistic regression, and ensemble of classifiers predictors. Conclusion: The SCPRED can accurately find similar structures for sequences that share low identity with sequence used for the prediction. The high predictive accuracy achieved by SCPRED is attributed to the design of
Twin Similarities in Holland Types as Shown by Scores on the Self-Directed Search
ERIC Educational Resources Information Center
Chauvin, Ida; McDaniel, Janelle R.; Miller, Mark J.; King, James M.; Eddlemon, Ondie L. M.
2012-01-01
This study examined the degree of similarity between scores on the Self-Directed Search from one set of identical twins. Predictably, a high congruence score was found. Results from a biographical sheet are discussed as well as implications of the results for career counselors.
Biggs, Adam T; Mitroff, Stephen R
2014-01-01
Visual search, locating target items among distractors, underlies daily activities ranging from critical tasks (e.g., looking for dangerous objects during security screening) to commonplace ones (e.g., finding your friends in a crowded bar). Both professional and nonprofessional individuals conduct visual searches, and the present investigation is aimed at understanding how they perform similarly and differently. We administered a multiple-target visual search task to both professional (airport security officers) and nonprofessional participants (members of the Duke University community) to determine how search abilities differ between these populations and what factors might predict accuracy. There were minimal overall accuracy differences, although the professionals were generally slower to respond. However, the factors that predicted accuracy varied drastically between groups; variability in search consistency (how similarly an individual searched from trial to trial in terms of speed) best explained accuracy for professional searchers (more consistent professionals were more accurate), whereas search speed (how long an individual took to complete a search when no targets were present) best explained accuracy for nonprofessional searchers (slower nonprofessionals were more accurate). These findings suggest that professional searchers may utilize different search strategies from those of nonprofessionals, and that search consistency, in particular, may provide a valuable tool for enhancing professional search accuracy.
Dhir, Somdutta; Pacurar, Mircea; Franklin, Dino; Gáspári, Zoltán; Kertész-Farkas, Attila; Kocsor, András; Eisenhaber, Frank; Pongor, Sándor
2010-11-01
SBASE is a project initiated to detect known domain types and predict domain architectures using sequence similarity searching (Simon et al., Protein Seq Data Anal, 5: 39-42, 1992, Pongor et al, Nucl. Acids. Res. 21:3111-3115, 1992). The current approach uses a curated collection of domain sequences - the SBASE domain library - and standard similarity search algorithms, followed by postprocessing based on simple statistics of the domain similarity network (http://hydra.icgeb.trieste.it/sbase/). It is especially useful in detecting rare, atypical examples of known domain types which are sometimes missed even by more sophisticated methodologies. This approach does not require multiple alignment or machine learning techniques, and can be a useful complement to other domain detection methodologies. This article gives an overview of the project history as well as of the concepts and principles developed within the project.
Clinical Diagnostics in Human Genetics with Semantic Similarity Searches in Ontologies
Köhler, Sebastian; Schulz, Marcel H.; Krawitz, Peter; Bauer, Sebastian; Dölken, Sandra; Ott, Claus E.; Mundlos, Christine; Horn, Denise; Mundlos, Stefan; Robinson, Peter N.
2009-01-01
The differential diagnostic process attempts to identify candidate diseases that best explain a set of clinical features. This process can be complicated by the fact that the features can have varying degrees of specificity, as well as by the presence of features unrelated to the disease itself. Depending on the experience of the physician and the availability of laboratory tests, clinical abnormalities may be described in greater or lesser detail. We have adapted semantic similarity metrics to measure phenotypic similarity between queries and hereditary diseases annotated with the use of the Human Phenotype Ontology (HPO) and have developed a statistical model to assign p values to the resulting similarity scores, which can be used to rank the candidate diseases. We show that our approach outperforms simpler term-matching approaches that do not take the semantic interrelationships between terms into account. The advantage of our approach was greater for queries containing phenotypic noise or imprecise clinical descriptions. The semantic network defined by the HPO can be used to refine the differential diagnosis by suggesting clinical features that, if present, best differentiate among the candidate diagnoses. Thus, semantic similarity searches in ontologies represent a useful way of harnessing the semantic structure of human phenotypic abnormalities to help with the differential diagnosis. We have implemented our methods in a freely available web application for the field of human Mendelian disorders. PMID:19800049
EGNOS-Based Multi-Sensor Accurate and Reliable Navigation in Search-And-Rescue Missions with UAVs
NASA Astrophysics Data System (ADS)
Molina, P.; Colomina, I.; Vitoria, T.; Silva, P. F.; Stebler, Y.; Skaloud, J.; Kornus, W.; Prades, R.
2011-09-01
This paper will introduce and describe the goals, concept and overall approach of the European 7th Framework Programme's project named CLOSE-SEARCH, which stands for 'Accurate and safe EGNOS-SoL Navigation for UAV-based low-cost SAR operations'. The goal of CLOSE-SEARCH is to integrate in a helicopter-type unmanned aerial vehicle, a thermal imaging sensor and a multi-sensor navigation system (based on the use of a Barometric Altimeter (BA), a Magnetometer (MAGN), a Redundant Inertial Navigation System (RINS) and an EGNOS-enabled GNSS receiver) with an Autonomous Integrity Monitoring (AIM) capability, to support the search component of Search-And-Rescue operations in remote, difficult-to-access areas and/or in time critical situations. The proposed integration will result in a hardware and software prototype that will demonstrate end-to-end functionality, that is, to fly in patterns over a region of interest (possibly inaccessible) during day or night and also under adverse weather conditions, and to locate disaster survivors or lost people there through the detection of body heat. This paper will identify the technical challenges of the proposed approach, from navigating with a BA/MAGN/RINS/GNSS-EGNOS-based integrated system to the interpretation of thermal images for person identification. Moreover, the AIM approach will be described together with the proposed integrity requirements. Finally, this paper will show some results obtained in the project during the first test campaign performed on November 2010. On that day, a prototype was flown in three different missions to assess its high-level performance and to observe some fundamental mission parameters such as the optimal flying height and flying speed to enable body recognition. The second test campaign is scheduled for the end of 2011.
Efficient protein structure search using indexing methods
2013-01-01
Understanding functions of proteins is one of the most important challenges in many studies of biological processes. The function of a protein can be predicted by analyzing the functions of structurally similar proteins, thus finding structurally similar proteins accurately and efficiently from a large set of proteins is crucial. A protein structure can be represented as a vector by 3D-Zernike Descriptor (3DZD) which compactly represents the surface shape of the protein tertiary structure. This simplified representation accelerates the searching process. However, computing the similarity of two protein structures is still computationally expensive, thus it is hard to efficiently process many simultaneous requests of structurally similar protein search. This paper proposes indexing techniques which substantially reduce the search time to find structurally similar proteins. In particular, we first exploit two indexing techniques, i.e., iDistance and iKernel, on the 3DZDs. After that, we extend the techniques to further improve the search speed for protein structures. The extended indexing techniques build and utilize a reduced index constructed from the first few attributes of 3DZDs of protein structures. To retrieve top-k similar structures, top-10 × k similar structures are first found using the reduced index, and top-k structures are selected among them. We also modify the indexing techniques to support θ-based nearest neighbor search, which returns data points within distance θ of the query point. The results show that both iDistance and iKernel significantly enhance the searching speed. In top-k nearest neighbor search, the searching time is reduced 69.6%, 77%, 77.4% and 87.9%, respectively using iDistance, iKernel, the extended iDistance, and the extended iKernel. In θ-based nearest neighbor search, the searching time is reduced 80%, 81%, 95.6% and 95.6% using iDistance, iKernel, the extended iDistance, and the extended iKernel, respectively. PMID:23691543
Efficient protein structure search using indexing methods.
Kim, Sungchul; Sael, Lee; Yu, Hwanjo
2013-01-01
Understanding functions of proteins is one of the most important challenges in many studies of biological processes. The function of a protein can be predicted by analyzing the functions of structurally similar proteins, thus finding structurally similar proteins accurately and efficiently from a large set of proteins is crucial. A protein structure can be represented as a vector by 3D-Zernike Descriptor (3DZD) which compactly represents the surface shape of the protein tertiary structure. This simplified representation accelerates the searching process. However, computing the similarity of two protein structures is still computationally expensive, thus it is hard to efficiently process many simultaneous requests of structurally similar protein search. This paper proposes indexing techniques which substantially reduce the search time to find structurally similar proteins. In particular, we first exploit two indexing techniques, i.e., iDistance and iKernel, on the 3DZDs. After that, we extend the techniques to further improve the search speed for protein structures. The extended indexing techniques build and utilize a reduced index constructed from the first few attributes of 3DZDs of protein structures. To retrieve top-k similar structures, top-10 × k similar structures are first found using the reduced index, and top-k structures are selected among them. We also modify the indexing techniques to support θ-based nearest neighbor search, which returns data points within distance θ of the query point. The results show that both iDistance and iKernel significantly enhance the searching speed. In top-k nearest neighbor search, the searching time is reduced 69.6%, 77%, 77.4% and 87.9%, respectively using iDistance, iKernel, the extended iDistance, and the extended iKernel. In θ-based nearest neighbor search, the searching time is reduced 80%, 81%, 95.6% and 95.6% using iDistance, iKernel, the extended iDistance, and the extended iKernel, respectively.
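The two-stage retrieval described in this abstract (find the top 10 × k candidates with a reduced index built from the first few attributes, then select the top k among them) can be sketched as follows. This is a minimal sketch under stated assumptions: a linear scan stands in for the iDistance and iKernel indexes, Euclidean distance is assumed as the dissimilarity, and the function name `top_k_two_stage` and the default `n_reduced=3` are hypothetical choices for the example.

```python
import heapq
import math


def top_k_two_stage(query, vectors, k, n_reduced=3, factor=10):
    """Approximate top-k search in two stages.

    Stage 1 ranks every vector using only its first `n_reduced` attributes
    and keeps the best `factor * k` candidates (a linear scan stands in for
    the reduced index). Stage 2 re-ranks those candidates with the full
    Euclidean distance and returns the indices of the best k.
    """
    n_candidates = min(factor * k, len(vectors))
    candidates = heapq.nsmallest(
        n_candidates,
        range(len(vectors)),
        key=lambda i: math.dist(query[:n_reduced], vectors[i][:n_reduced]),
    )
    return heapq.nsmallest(
        k, candidates, key=lambda i: math.dist(query, vectors[i])
    )
```

Because stage 1 sees only a prefix of each descriptor, a true neighbor can be dropped when the candidate pool is too small; the oversampling factor trades extra stage-2 work for a lower chance of that happening.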
Accurate segmenting of cervical tumors in PET imaging based on similarity between adjacent slices.
Chen, Liyuan; Shen, Chenyang; Zhou, Zhiguo; Maquilan, Genevieve; Thomas, Kimberly; Folkert, Michael R; Albuquerque, Kevin; Wang, Jing
2018-06-01
Because in PET imaging cervical tumors are close to the bladder with high capacity for the secreted 18F-FDG tracer, conventional intensity-based segmentation methods often misclassify the bladder as a tumor. Based on the observation that tumor position and area do not change dramatically from slice to slice, we propose a two-stage scheme that facilitates segmentation. In the first stage, we used a graph-cut based algorithm to obtain initial contouring of the tumor based on local similarity information between voxels; this was achieved through manual contouring of the cervical tumor on one slice. In the second stage, initial tumor contours were fine-tuned to more accurate segmentation by incorporating similarity information on tumor shape and position among adjacent slices, according to an intensity-spatial-distance map. Experimental results illustrate that the proposed two-stage algorithm provides a more effective approach to segmenting cervical tumors in 3D 18F-FDG PET images than the benchmarks used for comparison. Copyright © 2018 Elsevier Ltd. All rights reserved.
Towards novel organic high-Tc superconductors: Data mining using density of states similarity search
NASA Astrophysics Data System (ADS)
Geilhufe, R. Matthias; Borysov, Stanislav S.; Kalpakchi, Dmytro; Balatsky, Alexander V.
2018-02-01
Identifying novel functional materials with desired key properties is an important part of bridging the gap between fundamental research and technological advancement. In this context, high-throughput calculations combined with data-mining techniques highly accelerated this process in different areas of research during the past years. The strength of a data-driven approach for materials prediction lies in narrowing down the search space of thousands of materials to a subset of prospective candidates. Recently, the open-access organic materials database OMDB was released providing electronic structure data for thousands of previously synthesized three-dimensional organic crystals. Based on the OMDB, we report about the implementation of a novel density of states similarity search tool which is capable of retrieving materials with similar density of states to a reference material. The tool is based on the approximate nearest neighbor algorithm as implemented in the ANNOY library and can be applied via the OMDB web interface. The approach presented here is wide ranging and can be applied to various problems where the density of states is responsible for certain key properties of a material. As the first application, we report about materials exhibiting electronic structure similarities to the aromatic hydrocarbon p-terphenyl which was recently discussed as a potential organic high-temperature superconductor exhibiting a transition temperature in the order of 120 K under strong potassium doping. Although the mechanism driving the remarkable transition temperature remains under debate, we argue that the density of states, reflecting the electronic structure of a material, might serve as a crucial ingredient for the observed high Tc. To provide candidates which might exhibit comparable properties, we present 15 purely organic materials with similar features to p-terphenyl within the electronic structure, which also tend to have structural similarities with p
Active browsing using similarity pyramids
NASA Astrophysics Data System (ADS)
Chen, Jau-Yuen; Bouman, Charles A.; Dalton, John C.
1998-12-01
In this paper, we describe a new approach to managing large image databases, which we call active browsing. Active browsing integrates relevance feedback into the browsing environment, so that users can modify the database's organization to suit the desired task. Our method is based on a similarity pyramid data structure, which hierarchically organizes the database, so that it can be efficiently browsed. At coarse levels, the similarity pyramid allows users to view the database as large clusters of similar images. Alternatively, users can 'zoom into' finer levels to view individual images. We discuss relevance feedback for the browsing process, and argue that it is fundamentally different from relevance feedback for more traditional search-by-query tasks. We propose two fundamental operations for active browsing: pruning and reorganization. Both of these operations depend on a user-defined relevance set, which represents the image or set of images desired by the user. We present statistical methods for accurately pruning the database, and we propose a new 'worm hole' distance metric for reorganizing the database, so that members of the relevance set are grouped together.
How task demands influence scanpath similarity in a sequential number-search task.
Dewhurst, Richard; Foulsham, Tom; Jarodzka, Halszka; Johansson, Roger; Holmqvist, Kenneth; Nyström, Marcus
2018-06-07
More and more researchers are considering the omnibus eye movement sequence, the scanpath, in their studies of visual and cognitive processing (e.g. Hayes, Petrov, & Sederberg, 2011; Madsen, Larson, Loschky, & Rebello, 2012; Ni et al., 2011; von der Malsburg & Vasishth, 2011). However, it remains unclear how recent methods for comparing scanpaths perform in experiments producing variable scanpaths, and whether these methods supplement more traditional analyses of individual oculomotor statistics. We address this problem for MultiMatch (Jarodzka et al., 2010; Dewhurst et al., 2012), evaluating its performance with a visual search-like task in which participants must fixate a series of target numbers in a prescribed order. This task should produce predictable sequences of fixations and thus provide a testing ground for scanpath measures. Task difficulty was manipulated by making the targets more or less visible through changes in font and the presence of distractors or visual noise. These changes in task demands led to slower search and more fixations. Importantly, they also resulted in a reduction in the between-subjects scanpath similarity, demonstrating that participants' gaze patterns became more heterogeneous in terms of saccade length and angle, and fixation position. This implies a divergent strategy or random component to eye-movement behaviour which increases as the task becomes more difficult. Interestingly, the duration of fixations along aligned vectors showed the opposite pattern, becoming more similar between observers in 2 of the 3 difficulty manipulations. This provides important information for vision scientists who may wish to use scanpath metrics to quantify variations in gaze across a spectrum of perceptual and cognitive tasks. Copyright © 2018 Elsevier Ltd. All rights reserved.
Accurate expectancies diminish perceptual distraction during visual search
Sy, Jocelyn L.; Guerin, Scott A.; Stegman, Anna; Giesbrecht, Barry
2014-01-01
The load theory of visual attention proposes that efficient selective perceptual processing of task-relevant information during search is determined automatically by the perceptual demands of the display. If the perceptual demands required to process task-relevant information are not enough to consume all available capacity, then the remaining capacity automatically and exhaustively “spills-over” to task-irrelevant information. The spill-over of perceptual processing capacity increases the likelihood that task-irrelevant information will impair performance. In two visual search experiments, we tested the automaticity of the allocation of perceptual processing resources by measuring the extent to which the processing of task-irrelevant distracting stimuli was modulated by both perceptual load and top-down expectations using behavior, functional magnetic resonance imaging, and electrophysiology. Expectations were generated using a trial-by-trial cue that provided information about the likely load of the upcoming visual search task. When the cues were valid, behavioral interference was eliminated and the influence of load on frontoparietal and visual cortical responses was attenuated relative to when the cues were invalid. In conditions in which task-irrelevant information interfered with performance and modulated visual activity, individual differences in mean blood oxygenation level dependent responses measured from the left intraparietal sulcus were negatively correlated with individual differences in the severity of distraction. These results are consistent with the interpretation that a top-down biasing mechanism interacts with perceptual load to support filtering of task-irrelevant information. PMID:24904374
Budowski-Tal, Inbal; Nov, Yuval; Kolodny, Rachel
2010-02-23
Fast identification of protein structures that are similar to a specified query structure in the entire Protein Data Bank (PDB) is fundamental in structure and function prediction. We present FragBag: an ultrafast and accurate method for comparing protein structures. We describe a protein structure by the collection of its overlapping short contiguous backbone segments, and discretize this set using a library of fragments. Then, we succinctly represent the protein as a "bag-of-fragments" (a vector that counts the number of occurrences of each fragment) and measure the similarity between two structures by the similarity between their vectors. Our representation has two additional benefits: (i) it can be used to construct an inverted index, for implementing a fast structural search engine of the entire PDB, and (ii) one can specify a structure as a collection of substructures, without combining them into a single structure; this is valuable for structure prediction, when there are reliable predictions only of parts of the protein. We use receiver operating characteristic curve analysis to quantify the success of FragBag in identifying neighbor candidate sets in a dataset of over 2,900 structures. The gold standard is the set of neighbors found by six state of the art structural aligners. Our best FragBag library finds more accurate candidate sets than the three other filter methods: the SGM, PRIDE, and a method by Zotenko et al. More interestingly, FragBag performs on a par with the computationally expensive, yet highly trusted structural aligners STRUCTAL and CE.
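The bag-of-fragments representation described in this abstract can be illustrated with a small sketch. It is not the FragBag implementation: the pairwise-distance signature used to match a backbone segment to its nearest library fragment, the cosine similarity between count vectors, and the names `signature`, `bag_of_fragments` and `cosine` are all assumptions introduced for the example.

```python
import math
from itertools import combinations


def signature(segment):
    """Rotation- and translation-invariant descriptor: all pairwise distances."""
    return [math.dist(p, q) for p, q in combinations(segment, 2)]


def bag_of_fragments(backbone, library, frag_len):
    """Count, for each library fragment, how many overlapping backbone
    segments of length `frag_len` are closest to it (by signature distance)."""
    lib_sigs = [signature(f) for f in library]
    counts = [0] * len(library)
    for start in range(len(backbone) - frag_len + 1):
        sig = signature(backbone[start:start + frag_len])
        nearest = min(
            range(len(library)), key=lambda j: math.dist(sig, lib_sigs[j])
        )
        counts[nearest] += 1
    return counts


def cosine(u, v):
    """Cosine similarity between two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

Once every structure is reduced to a fixed-length count vector, comparing two structures costs only a vector operation, and the non-zero entries of each vector are the natural keys for an inverted index.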
Mobile Visual Search Based on Histogram Matching and Zone Weight Learning
NASA Astrophysics Data System (ADS)
Zhu, Chuang; Tao, Li; Yang, Fan; Lu, Tao; Jia, Huizhu; Xie, Xiaodong
2018-01-01
In this paper, we propose a novel image retrieval algorithm for mobile visual search. At first, a short visual codebook is generated based on the descriptor database to represent the statistical information of the dataset. Then, an accurate local descriptor similarity score is computed by merging the tf-idf weighted histogram matching and the weighting strategy in compact descriptors for visual search (CDVS). At last, both the global descriptor matching score and the local descriptor similarity score are summed up to rerank the retrieval results according to the learned zone weights. The results show that the proposed approach outperforms the state-of-the-art image retrieval method in CDVS.
Predicting user click behaviour in search engine advertisements
NASA Astrophysics Data System (ADS)
Daryaie Zanjani, Mohammad; Khadivi, Shahram
2015-10-01
According to the specific requirements and interests of users, search engines select and display advertisements that match user needs and have higher probability of attracting users' attention based on their previous search history. New objects such as user, advertisement or query cause a deterioration of precision in targeted advertising due to their lack of history. This article surveys this challenge. In the case of new objects, we first extract similar observed objects to the new object and then we use their history as the history of new object. Similarity between objects is measured based on correlation, which is a relation between user and advertisement when the advertisement is displayed to the user. This method is used for all objects, so it has helped us to accurately select relevant advertisements for users' queries. In our proposed model, we assume that similar users behave in a similar manner. We find that users with few queries are similar to new users. We will show that correlation between users and advertisements' keywords is high. Thus, users who pay attention to advertisements' keywords, click similar advertisements. In addition, users who pay attention to specific brand names might have similar behaviours too.
Automatic Content Creation for Games to Train Students Distinguishing Similar Chinese Characters
NASA Astrophysics Data System (ADS)
Lai, Kwong-Hung; Leung, Howard; Tang, Jeff K. T.
In learning Chinese, many students often have the problem of mixing up similar characters. This can cause misunderstanding and miscommunication in daily life. It is thus important for students learning the Chinese language to be able to distinguish similar characters and understand their proper usage. In this paper, we propose a game style framework in which the game content in identifying similar Chinese characters in idioms and words is created automatically. Our prior work on analyzing students’ Chinese handwriting can be applied in the similarity measure of Chinese characters. We extend this work by adding the component of radical extraction to speed up the search process. Experimental results show that the proposed method is more accurate and faster in finding more similar Chinese characters compared with the baseline method without considering the radical information.
Path integration mediated systematic search: a Bayesian model.
Vickerstaff, Robert J; Merkle, Tobias
2012-08-21
The systematic search behaviour is a backup system that increases the chances of desert ants finding their nest entrance after foraging when the path integrator has failed to guide them home accurately enough. Here we present a mathematical model of the systematic search that is based on extensive behavioural studies in North African desert ants Cataglyphis fortis. First, a simple search heuristic utilising Bayesian inference and a probability density function is developed. This model, which optimises the short-term nest detection probability, is then compared to three simpler search heuristics and to recorded search patterns of Cataglyphis ants. To compare the different searches a method to quantify search efficiency is established as well as an estimate of the error rate in the ants' path integrator. We demonstrate that the Bayesian search heuristic is able to automatically adapt to increasing levels of positional uncertainty to produce broader search patterns, just as desert ants do, and that it outperforms the three other search heuristics tested. The searches produced by it are also arguably the most similar in appearance to the ant's searches. Copyright © 2012 Elsevier Ltd. All rights reserved.
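A much simplified sketch of a Bayesian search heuristic of this kind follows. It assumes a one-dimensional grid of cells, a Gaussian prior centred on the path integrator's estimate, and a fixed detection probability per visit. The function names and parameters (`gaussian_prior`, `update_after_miss`, `greedy_search`, `p_detect`) are hypothetical, and this is not the authors' model.

```python
import math


def gaussian_prior(n_cells, centre, sigma):
    """Discrete Gaussian belief over nest position; sigma models positional uncertainty."""
    weights = [math.exp(-0.5 * ((i - centre) / sigma) ** 2) for i in range(n_cells)]
    total = sum(weights)
    return [w / total for w in weights]


def update_after_miss(belief, cell, p_detect):
    """Bayes' rule: belief after searching `cell` and not finding the nest there."""
    posterior = list(belief)
    posterior[cell] *= (1.0 - p_detect)
    total = sum(posterior)
    return [p / total for p in posterior]


def greedy_search(belief, p_detect, steps):
    """Visit the currently most probable cell at each step (short-term optimisation)."""
    visited = []
    for _ in range(steps):
        cell = max(range(len(belief)), key=lambda i: belief[i])
        visited.append(cell)
        belief = update_after_miss(belief, cell, p_detect)
    return visited, belief
```

With a larger `sigma` the belief is flatter, so the greedy rule moves away from the centre sooner, which mirrors the broader search patterns described for higher positional uncertainty.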
Akala, Hoseah M.; Macharia, Rosaline W.; Juma, Dennis W.; Cheruiyot, Agnes C.; Andagalu, Ben; Brown, Mathew L.; El-Shemy, Hany A.; Nyanjom, Steven G.
2017-01-01
Malaria causes about half a million deaths annually, with Plasmodium falciparum being responsible for 90% of all the cases. Recent reports on artemisinin resistance in Southeast Asia warrant urgent discovery of novel drugs for the treatment of malaria. However, most bioactive compounds fail to progress to treatments due to safety concerns. Drug repositioning offers an alternative strategy where drugs that have already been approved as safe for other diseases could be used to treat malaria. This study screened approved drugs for antimalarial activity using an in silico chemogenomics approach prior to in vitro verification. All the P. falciparum protein sequences available in NCBI RefSeq were mined and used to perform a similarity search against the DrugBank, TTD and STITCH databases to identify similar putative drug targets. Druggability indices of the potential P. falciparum drug targets were obtained from the TDR Targets database. Functional amino acid residues of the drug targets were determined using the ConSurf server, and these were used to fine-tune the similarity search. This study predicted 133 approved drugs that could target 34 P. falciparum proteins. A literature search of PubMed and Google Scholar showed that 105 of the 133 drugs had previously been tested against malaria, with most showing activity. For further validation, drug susceptibility assays using the SYBR Green I method were done on a representative group of 10 predicted drugs, eight of which showed activity against the P. falciparum 3D7 clone. Seven had IC50 values ranging from 1 μM to 50 μM. This study also suggests drug-target associations and hence possible mechanisms of action of the drugs that showed antiplasmodial activity. The study results validate the use of a proteome-wide target similarity approach in identifying approved drugs with activity against P. falciparum, and the approach could be adapted for other pathogens. PMID:29088219
An integrative approach for measuring semantic similarities using gene ontology.
Peng, Jiajie; Li, Hongxiang; Jiang, Qinghua; Wang, Yadong; Chen, Jin
2014-01-01
Gene Ontology (GO) provides rich information and a convenient way to study gene functional similarity, which has been successfully used in various applications. However, existing GO-based similarity measurements are limited because each measure considers only a subset of GO information. An appropriate integration of the existing measures that takes more GO information into account is needed. We propose a novel integrative measure called InteGO2 to automatically select appropriate seed measures and then to integrate them using a metaheuristic search method. The experimental results show that InteGO2 significantly improves the performance of gene similarity in human, Arabidopsis and yeast on both molecular function and biological process GO categories. InteGO2 computes gene-to-gene similarities more accurately than the existing measures tested and has high robustness. The supplementary document and software are available at http://mlg.hit.edu.cn:8082/.
Massive problem reports mining and analysis based parallelism for similar search
NASA Astrophysics Data System (ADS)
Zhou, Ya; Hu, Cailin; Xiong, Han; Wei, Xiafei; Li, Ling
2017-05-01
Massive problem reports and solutions, accumulated over time and continuously collected in XML Spreadsheet (XMLSS) format from enterprises and organizations, record comprehensive descriptions of problems that can help technicians trace problems and their solutions. It is a significant and challenging task to manage and analyze these massive semi-structured data effectively in order to provide solutions to similar problems, support decisions on immediate problems, and assist product optimization for users during hardware and software maintenance. For this purpose, we build a data management system to manage, mine and analyze these data; search results can be categorized and organized into several categories so that users can quickly find where the results of interest are located. Experimental results demonstrate that this system greatly outperforms a traditional centralized management system in performance and in its ability to adapt to heterogeneous data. In addition, because topics are re-extracted, each cluster is described more precisely and reasonably.
Indexed variation graphs for efficient and accurate resistome profiling.
Rowe, Will P M; Winn, Martyn D
2018-05-14
Antimicrobial resistance remains a major threat to global health. Profiling the collective antimicrobial resistance genes within a metagenome (the "resistome") facilitates greater understanding of antimicrobial resistance gene diversity and dynamics. In turn, this can allow for gene surveillance, individualised treatment of bacterial infections and more sustainable use of antimicrobials. However, resistome profiling can be complicated by high similarity between reference genes, as well as the sheer volume of sequencing data and the complexity of analysis workflows. We have developed an efficient and accurate method for resistome profiling that addresses these complications and improves upon currently available tools. Our method combines a variation graph representation of gene sets with an LSH Forest indexing scheme to allow for fast classification of metagenomic sequence reads using similarity-search queries. Subsequent hierarchical local alignment of classified reads against graph traversals enables accurate reconstruction of full-length gene sequences using a scoring scheme. We provide our implementation, GROOT, and show it to be both faster and more accurate than a current reference-dependent tool for resistome profiling. GROOT runs on a laptop and can process a typical 2 gigabyte metagenome in 2 minutes using a single CPU. Our method is not restricted to resistome profiling and has the potential to improve current metagenomic workflows. GROOT is written in Go and is available at https://github.com/will-rowe/groot (MIT license). Contact: will.rowe@stfc.ac.uk. Supplementary data are available at Bioinformatics online.
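LSH-based indexes of this kind are commonly built on MinHash signatures, which approximate the Jaccard similarity between k-mer sets. The sketch below shows only that building block, under the assumption that sequences are plain strings and that seeded BLAKE2b digests serve as the hash family. It is not GROOT's implementation, it omits the variation graph and the LSH Forest itself, and all function names and default parameters are hypothetical.

```python
import hashlib


def kmers(seq, k):
    """Set of all length-k substrings of a sequence."""
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def _hash(kmer, seed):
    """64-bit hash of a k-mer; the seed selects one member of the hash family."""
    data = f"{seed}:{kmer}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(seq, k=7, num_hashes=64):
    """MinHash signature: the minimum hash value of the k-mer set under each seed."""
    ks = kmers(seq, k)
    return [min(_hash(x, seed) for x in ks) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions, an estimate of Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)
```

An index would then group signatures so that only candidates likely to exceed a similarity threshold are compared, rather than scoring a read against every reference.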
Similar compounds searching system by using the gene expression microarray database.
Toyoshiba, Hiroyoshi; Sawada, Hiroshi; Naeshiro, Ichiro; Horinouchi, Akira
2009-04-10
Numerous microarrays have been examined and several public and commercial databases have been developed. However, it is not easy to compare in-house microarray data with those in a database because of insufficient reproducibility due to differences in the experimental conditions. As one approach to using these databases, we developed the similar compounds searching system (SCSS) on a toxicogenomics database. The datasets of 55 compounds administered to rats in the Toxicogenomics Project (TGP) database in Japan were used in this study. Using the fold-change ranking method developed by Lamb et al. [Lamb, J., Crawford, E.D., Peck, D., Modell, J.W., Blat, I.C., Wrobel, M.J., Lerner, J., Brunet, J.P., Subramanian, A., Ross, K.N., Reich, M., Hieronymus, H., Wei, G., Armstrong, S.A., Haggarty, S.J., Clemons, P.A., Wei, R., Carr, S.A., Lander, E.S., Golub, T.R., 2006. The connectivity map: using gene-expression signatures to connect small molecules, genes, and disease. Science 313, 1929-1935] and a criterion called hit ratio, the system lets us compare in-house microarray data with those in the database. In-house generated data for clofibrate, phenobarbital, and a proprietary compound were tested to evaluate the performance of the SCSS method. Phenobarbital and clofibrate, which were included in the TGP database, scored highest by the SCSS method. Other high-scoring compounds had effects similar to either phenobarbital (a cytochrome P450s inducer) or clofibrate (a peroxisome proliferator). Some of the high-scoring compounds identified using the rats administered the proprietary compound are known to cause similar toxicological changes in different species. Our results suggest that the SCSS method could be used in drug discovery and development. Moreover, this method may be a powerful tool to understand the mechanisms by which biological systems respond to various chemical compounds and may also predict adverse effects of new compounds.
VisSearch: A Collaborative Web Searching Environment
ERIC Educational Resources Information Center
Lee, Young-Jin
2005-01-01
VisSearch is a collaborative Web searching environment intended for sharing Web search results among people with similar interests, such as college students taking the same course. It facilitates students' Web searches by visualizing various Web searching processes. It also collects the visualized Web search results and applies an association rule…
Kinoshita, Kengo; Murakami, Yoichi; Nakamura, Haruki
2007-07-01
We have developed a method to predict ligand-binding sites in a new protein structure by searching for similar binding sites in the Protein Data Bank (PDB). The similarities are measured according to the shapes of the molecular surfaces and their electrostatic potentials. A new web server, eF-seek, provides an interface to our search method. It simply requires a coordinate file in the PDB format, and generates a prediction result as a virtual complex structure, with the putative ligands in a PDB format file as the output. In addition, the predicted interacting interface is displayed to facilitate the examination of the virtual complex structure on our own applet viewer with the web browser (URL: http://eF-site.hgc.jp/eF-seek).
Bogdanov, Yuri F; Dadashev, Sergei Y; Grishaeva, Tatiana M
2003-01-01
Evolutionarily distant organisms have not only orthologs, but also nonhomologous proteins that build functionally similar subcellular structures. For instance, this is true of the protein components of the synaptonemal complex (SC), a universal ultrastructure that ensures the successful pairing and recombination of homologous chromosomes during meiosis. We aimed at developing a method to search databases for genes that code for such nonhomologous but functionally analogous proteins. Advantage was taken of the ultrastructural parameters of the SC and the conformation of the SC proteins responsible for these. Proteins involved in the SC central space are known to be similar in secondary structure. Using published data, we found a highly significant correlation between the width of the SC central space and the length of the rod-shaped central domain of mammalian and yeast intermediate proteins forming transversal filaments in the SC central space. Based on this, we suggested a method for searching genome databases of distant organisms for genes whose virtual proteins meet the above correlation requirement. Our recent finding of the Drosophila melanogaster CG17604 gene coding for a synaptonemal complex transversal filament protein received experimental support from another lab. With the same strategy, we showed that the Arabidopsis thaliana and Caenorhabditis elegans genomes contain unique genes coding for such proteins.
Beyond the search surface: visual search and attentional engagement.
Duncan, J; Humphreys, G
1992-05-01
Treisman (1991) described a series of visual search studies testing feature integration theory against an alternative (Duncan & Humphreys, 1989) in which feature and conjunction search are basically similar. Here the latter account is noted to have 2 distinct levels: (a) a summary of search findings in terms of stimulus similarities, and (b) a theory of how visual attention is brought to bear on relevant objects. Working at the 1st level, Treisman found that even when similarities were calibrated and controlled, conjunction search was much harder than feature search. The theory, however, can only really be tested at the 2nd level, because the 1st is an approximation. An account of the findings is developed at the 2nd level, based on the 2 processes of input-template matching and spreading suppression. New data show that, when both of these factors are controlled, feature and conjunction search are equally difficult. Possibilities for unification of the alternative views are considered.
Dobi, Krisztina; Flachner, Beáta; Pukáncsik, Mária; Máthé, Enikő; Bognár, Melinda; Szaszkó, Mária; Magyar, Csaba; Hajdú, István; Lőrincz, Zsolt; Simon, István; Fülöp, Ferenc; Cseh, Sándor; Dormán, György
2015-10-01
Rapid in silico selection of target-focused libraries from commercial repositories is an attractive and cost-effective approach. If structures of active compounds are available, rapid 2D similarity search can be performed on multimillion compound databases, but the generated library requires further focusing. We report here a combination of the 2D approach with pharmacophore matching which was used for selecting 5-HT6 antagonists. In the first screening round, 12 of the 91 compounds screened showed >85% antagonist efficacy. For the second-round (hit validation) screening phase, pharmacophore models were built, applied, and compared with the routine 2D similarity search. Three pharmacophore models were created based on the structure of the reference compounds and the first-round hit compounds. The pharmacophore search resulted in a high hit rate (40%) and led to novel chemotypes, while the 2D similarity search had a slightly better hit rate (51%) but lacked novelty. To demonstrate the power of the virtual screening cascade, ligand efficiency indices were also calculated and their steady improvement was confirmed. © 2015 John Wiley & Sons A/S.
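A 2D similarity search of the kind mentioned here is typically a Tanimoto comparison of structural fingerprints. The following minimal sketch assumes each fingerprint is given as a set of on-bit indices. The compound names, bit values and the `threshold` default are invented for illustration, and the abstract does not state which fingerprint or cutoff the authors used.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_search(query, library, threshold=0.7):
    """Return (score, name) pairs at or above the threshold, best first."""
    hits = [(tanimoto(query, fp), name) for name, fp in library.items()]
    return sorted((h for h in hits if h[0] >= threshold), reverse=True)


# Hypothetical fingerprints, purely illustrative.
library = {
    "cpd_a": {1, 2, 3, 4},
    "cpd_b": {1, 2, 3, 9},
    "cpd_c": {7, 8},
}
hits = similarity_search({1, 2, 3, 4}, library, threshold=0.5)
```

A pharmacophore filter, as in the second screening round, would then be applied to the compounds returned by such a search.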
Extended Graph-Based Models for Enhanced Similarity Search in Cavbase.
Krotzky, Timo; Fober, Thomas; Hüllermeier, Eyke; Klebe, Gerhard
2014-01-01
To calculate similarities between molecular structures, measures based on the maximum common subgraph are frequently applied. For the comparison of protein binding sites, these measures are not fully appropriate since graphs representing binding sites on a detailed atomic level tend to get very large. In combination with an NP-hard problem, a large graph leads to a computationally demanding task. Therefore, for the comparison of binding sites, a less detailed coarse graph model is used building upon so-called pseudocenters. Consistently, a loss of structural data is caused since many atoms are discarded and no information about the shape of the binding site is considered. This is usually resolved by performing subsequent calculations based on additional information. These steps are usually quite expensive, making the whole approach very slow. The main drawback of a graph-based model solely based on pseudocenters, however, is the loss of information about the shape of the protein surface. In this study, we propose a novel and efficient modeling formalism that does not increase the size of the graph model compared to the original approach, but leads to graphs containing considerably more information assigned to the nodes. More specifically, additional descriptors considering surface characteristics are extracted from the local surface and attributed to the pseudocenters stored in Cavbase. These properties are evaluated as additional node labels, which lead to a gain of information and allow for much faster but still very accurate comparisons between different structures.
O'Loughlin, Declan; Oliveira, Bárbara L; Elahi, Muhammad Adnan; Glavin, Martin; Jones, Edward; Popović, Milica; O'Halloran, Martin
2017-12-06
Inaccurate estimation of average dielectric properties can have a tangible impact on microwave radar-based breast images. Despite this, recent patient imaging studies have used a fixed estimate although this is known to vary from patient to patient. Parameter search algorithms are a promising technique for estimating the average dielectric properties from the reconstructed microwave images themselves without additional hardware. In this work, qualities of accurately reconstructed images are identified from point spread functions. As the qualities of accurately reconstructed microwave images are similar to the qualities of focused microscopic and photographic images, this work proposes the use of focal quality metrics for average dielectric property estimation. The robustness of the parameter search is evaluated using experimental dielectrically heterogeneous phantoms on the three-dimensional volumetric image. Based on a very broad initial estimate of the average dielectric properties, this paper shows how these metrics can be used as suitable fitness functions in parameter search algorithms to reconstruct clear and focused microwave radar images.
Visual Search Efficiency is Greater for Human Faces Compared to Animal Faces
Simpson, Elizabeth A.; Mertins, Haley L.; Yee, Krysten; Fullerton, Alison; Jakobsen, Krisztina V.
2015-01-01
The Animate Monitoring Hypothesis proposes that humans and animals were the most important categories of visual stimuli for ancestral humans to monitor, as they presented important challenges and opportunities for survival and reproduction; however, it remains unknown whether animal faces are located as efficiently as human faces. We tested this hypothesis by examining whether human, primate, and mammal faces elicit similarly efficient searches, or whether human faces are privileged. In the first three experiments, participants located a target (human, primate, or mammal face) among distractors (non-face objects). We found fixations on human faces were faster and more accurate than primate faces, even when controlling for search category specificity. A final experiment revealed that, even when task-irrelevant, human faces slowed searches for non-faces, suggesting some bottom-up processing may be responsible for the human face search efficiency advantage. PMID:24962122
Borozan, Ivan; Watt, Stuart; Ferretti, Vincent
2015-05-01
Alignment-based sequence similarity searches, while accurate for some types of sequences, can produce incorrect results when used on more divergent but functionally related sequences that have undergone the sequence rearrangements observed in many bacterial and viral genomes. Here, we propose a classification model that exploits the complementary nature of alignment-based and alignment-free similarity measures with the aim of improving the accuracy with which DNA and protein sequences are characterized. Our model classifies sequences using a combined sequence similarity score calculated by adaptively weighting the contribution of different sequence similarity measures. Weights are determined independently for each sequence in the test set and reflect the discriminatory ability of individual similarity measures in the training set. Because the similarity between some sequences is determined more accurately with one type of measure rather than another, our classifier allows different sets of weights to be associated with different sequences. Using five different similarity measures, we show that our model significantly improves the classification accuracy over the current composition- and alignment-based models, when predicting the taxonomic lineage for both short viral sequence fragments and complete viral sequences. We also show that our model can be used effectively for the classification of reads from a real metagenome dataset as well as protein sequences. All the datasets and the code used in this study are freely available at https://collaborators.oicr.on.ca/vferretti/borozan_csss/csss.html. Contact: ivan.borozan@gmail.com. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
Borozan, Ivan; Watt, Stuart; Ferretti, Vincent
2015-01-01
Motivation: Alignment-based sequence similarity searches, while accurate for some types of sequences, can produce incorrect results when used on more divergent but functionally related sequences that have undergone the sequence rearrangements observed in many bacterial and viral genomes. Here, we propose a classification model that exploits the complementary nature of alignment-based and alignment-free similarity measures with the aim of improving the accuracy with which DNA and protein sequences are characterized. Results: Our model classifies sequences using a combined sequence similarity score calculated by adaptively weighting the contribution of different sequence similarity measures. Weights are determined independently for each sequence in the test set and reflect the discriminatory ability of individual similarity measures in the training set. Because the similarity between some sequences is determined more accurately with one type of measure rather than another, our classifier allows different sets of weights to be associated with different sequences. Using five different similarity measures, we show that our model significantly improves the classification accuracy over the current composition- and alignment-based models, when predicting the taxonomic lineage for both short viral sequence fragments and complete viral sequences. We also show that our model can be used effectively for the classification of reads from a real metagenome dataset as well as protein sequences. Availability and implementation: All the datasets and the code used in this study are freely available at https://collaborators.oicr.on.ca/vferretti/borozan_csss/csss.html. Contact: ivan.borozan@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25573913
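The idea of a per-sequence weighted combination of similarity measures can be sketched as follows. The abstract does not give the weighting formula, so the rule used here (each measure is weighted by the gap between its best score and its mean score over the training sequences) is an assumed stand-in for "discriminatory ability", and `adaptive_weights` and `classify` are hypothetical names.

```python
def adaptive_weights(scores_by_measure):
    """Per-query weights for each similarity measure.

    scores_by_measure maps a measure name to the query's similarity to every
    training sequence. The gap between the best and the mean score is an
    assumed proxy for how well that measure discriminates for this query.
    """
    gaps = {m: max(s) - sum(s) / len(s) for m, s in scores_by_measure.items()}
    total = sum(gaps.values())
    if total == 0:
        # No measure discriminates: fall back to uniform weights.
        return {m: 1.0 / len(gaps) for m in gaps}
    return {m: g / total for m, g in gaps.items()}


def classify(scores_by_measure, labels):
    """Label the query with the class of its best match under the combined score."""
    weights = adaptive_weights(scores_by_measure)
    combined = [
        sum(weights[m] * scores_by_measure[m][i] for m in weights)
        for i in range(len(labels))
    ]
    best = max(range(len(labels)), key=combined.__getitem__)
    return labels[best], weights
```

In this sketch a measure that scores every training sequence equally receives zero weight for that query, so the decision rests on the measures that separate the candidates.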
Finding Protein and Nucleotide Similarities with FASTA
Pearson, William R.
2016-01-01
The FASTA programs provide a comprehensive set of rapid similarity searching tools (fasta36, fastx36, tfastx36, fasty36, tfasty36), similar to those provided by the BLAST package, as well as programs for slower, optimal, local and global similarity searches (ssearch36, ggsearch36) and for searching with short peptides and oligonucleotides (fasts36, fastm36). The FASTA programs use an empirical strategy for estimating statistical significance that accommodates a range of similarity scoring matrices and gap penalties, improving alignment boundary accuracy and search sensitivity (Unit 3.5). The FASTA programs can produce “BLAST-like” alignment and tabular output, for ease of integration into existing analysis pipelines, and can search small, representative databases, and then report results for a larger set of sequences, using links from the smaller dataset. The FASTA programs work with a wide variety of database formats, including mySQL and postgreSQL databases (Unit 9.4). The programs also provide a strategy for integrating domain and active site annotations into alignments and highlighting the mutational state of functionally critical residues. These protocols describe how to use the FASTA programs to characterize protein and DNA sequences, using protein:protein, protein:DNA, and DNA:DNA comparisons. PMID:27010337
Finding Protein and Nucleotide Similarities with FASTA.
Pearson, William R
2016-03-24
The FASTA programs provide a comprehensive set of rapid similarity searching tools (fasta36, fastx36, tfastx36, fasty36, tfasty36), similar to those provided by the BLAST package, as well as programs for slower, optimal, local, and global similarity searches (ssearch36, ggsearch36), and for searching with short peptides and oligonucleotides (fasts36, fastm36). The FASTA programs use an empirical strategy for estimating statistical significance that accommodates a range of similarity scoring matrices and gap penalties, improving alignment boundary accuracy and search sensitivity. The FASTA programs can produce "BLAST-like" alignment and tabular output, for ease of integration into existing analysis pipelines, and can search small, representative databases, and then report results for a larger set of sequences, using links from the smaller dataset. The FASTA programs work with a wide variety of database formats, including mySQL and postgreSQL databases. The programs also provide a strategy for integrating domain and active site annotations into alignments and highlighting the mutational state of functionally critical residues. These protocols describe how to use the FASTA programs to characterize protein and DNA sequences, using protein:protein, protein:DNA, and DNA:DNA comparisons. Copyright © 2016 John Wiley & Sons, Inc.
Earthquake Fingerprints: Representing Earthquake Waveforms for Similarity-Based Detection
NASA Astrophysics Data System (ADS)
Bergen, K.; Beroza, G. C.
2016-12-01
New earthquake detection methods, such as Fingerprint and Similarity Thresholding (FAST), use fast approximate similarity search to identify similar waveforms in long-duration data without templates (Yoon et al. 2015). These methods have two key components: fingerprint extraction and an efficient search algorithm. Fingerprint extraction converts waveforms into fingerprints, compact signatures that represent short-duration waveforms for identification and search. Earthquakes are detected using an efficient indexing and search scheme, such as locality-sensitive hashing, that identifies similar waveforms in a fingerprint database. The quality of the search results, and thus the earthquake detection results, is strongly dependent on the fingerprinting scheme. Fingerprint extraction should map similar earthquake waveforms to similar waveform fingerprints to ensure a high detection rate, even under additive noise and small distortions. Additionally, fingerprints corresponding to noise intervals should have mutually dissimilar fingerprints to minimize false detections. In this work, we compare the performance of multiple fingerprint extraction approaches for the earthquake waveform similarity search problem. We apply existing audio fingerprinting (used in content-based audio identification systems) and time series indexing techniques and present modified versions that are specifically adapted for seismic data. We also explore data-driven fingerprinting approaches that can take advantage of labeled or unlabeled waveform data. For each fingerprinting approach we measure its ability to identify similar waveforms in a low signal-to-noise setting, and quantify the trade-off between true and false detection rates in the presence of persistent noise sources. We compare the performance using known event waveforms from eight independent stations in the Northern California Seismic Network.
Ertl, P
1998-02-01
Easy to use, interactive, and platform-independent WWW-based tools are ideal for development of chemical applications. By using the newly emerging Web technologies such as Java applets and sophisticated scripting, it is possible to deliver powerful molecular processing capabilities directly to the desk of synthetic organic chemists. In Novartis Crop Protection in Basel, a Web-based molecular modelling system has been in use since 1995. In this article two new modules of this system are presented: a program for interactive calculation of important hydrophobic, electronic, and steric properties of organic substituents, and a module for substituent similarity searches enabling the identification of bioisosteric functional groups. Various possible applications of calculated substituent parameters are also discussed, including automatic design of molecules with the desired properties and creation of targeted virtual combinatorial libraries.
A literature search tool for intelligent extraction of disease-associated genes.
Jung, Jae-Yoon; DeLuca, Todd F; Nelson, Tristan H; Wall, Dennis P
2014-01-01
To extract disorder-associated genes from the scientific literature in PubMed with greater sensitivity for literature-based support than existing methods. We developed a PubMed query to retrieve disorder-related, original research articles. Then we applied a rule-based text-mining algorithm with keyword matching to extract target disorders, genes with significant results, and the type of study described by the article. We compared our resulting candidate disorder genes and supporting references with existing databases. We demonstrated that our candidate gene set covers nearly all genes in manually curated databases, and that the references supporting the disorder-gene link are more extensive and accurate than other general purpose gene-to-disorder association databases. We implemented a novel publication search tool to find target articles, specifically focused on links between disorders and genotypes. Through comparison against gold-standard manually updated gene-disorder databases and comparison with automated databases of similar functionality we show that our tool can search through the entirety of PubMed to extract the main gene findings for human diseases rapidly and accurately.
MOST: most-similar ligand based approach to target prediction.
Huang, Tao; Mi, Hong; Lin, Cheng-Yuan; Zhao, Ling; Zhong, Linda L D; Liu, Feng-Bin; Zhang, Ge; Lu, Ai-Ping; Bian, Zhao-Xiang
2017-03-11
Many computational approaches have been used for target prediction, including machine learning, reverse docking, bioactivity spectra analysis, and chemical similarity searching. Recent studies have suggested that chemical similarity searching may be driven by the most-similar ligand. However, the extent of bioactivity of most-similar ligands has been oversimplified or even neglected in these studies, and this has impaired the prediction power. Here we propose the MOst-Similar ligand-based Target inference approach, namely MOST, which uses fingerprint similarity and explicit bioactivity of the most-similar ligands to predict targets of the query compound. Performance of MOST was evaluated by using combinations of different fingerprint schemes, machine learning methods, and bioactivity representations. In sevenfold cross-validation with a benchmark Ki dataset from CHEMBL release 19 containing 61,937 bioactivity data of 173 human targets, MOST achieved high average prediction accuracy (0.95 for pKi ≥ 5, and 0.87 for pKi ≥ 6). The Morgan fingerprint was shown to be slightly better than FP2. Logistic Regression and Random Forest methods performed better than Naïve Bayes. In a temporal validation, the Ki dataset from CHEMBL19 was used to train models and predict the bioactivity of newly deposited ligands in CHEMBL20. MOST also performed well with high accuracy (0.90 for pKi ≥ 5, and 0.76 for pKi ≥ 6) when Logistic Regression and the Morgan fingerprint were employed. Furthermore, the p values associated with explicit bioactivity were found to be a robust index for removing false positive predictions. Implicit bioactivity did not offer this capability. Finally, p values generated with Logistic Regression, the Morgan fingerprint and explicit activity were integrated with a false discovery rate (FDR) control procedure to reduce false positives in a multiple-target prediction scenario, and the success of this strategy was demonstrated with a case of fluanisone
Domain similarity based orthology detection.
Bitard-Feildel, Tristan; Kemena, Carsten; Greenwood, Jenny M; Bornberg-Bauer, Erich
2015-05-13
Orthologous protein detection software mostly uses pairwise comparisons of amino-acid sequences to assert whether two proteins are orthologous or not. Accordingly, when the number of sequences for comparison increases, the number of comparisons to compute grows in a quadratic order. A current challenge of bioinformatic research, especially when taking into account the increasing number of sequenced organisms available, is to make this ever-growing number of comparisons computationally feasible in a reasonable amount of time. We propose to speed up the detection of orthologous proteins by using strings of domains to characterize the proteins. We present two new protein similarity measures, a cosine and a maximal weight matching score based on domain content similarity, and new software, named porthoDom. The qualities of the cosine and the maximal weight matching similarity measures are compared against curated datasets. The measures show that domain content similarities are able to correctly group proteins into their families. Accordingly, the cosine similarity measure is used inside porthoDom, the wrapper developed for proteinortho. porthoDom makes use of domain content similarity measures to group proteins together before searching for orthologs. By using domains instead of amino acid sequences, the reduction of the search space decreases the computational complexity of an all-against-all sequence comparison. We demonstrate that representing and comparing proteins as strings of discrete domains, i.e. as a concatenation of their unique identifiers, allows a drastic simplification of search space. porthoDom has the advantage of speeding up orthology detection while maintaining a degree of accuracy similar to proteinortho. The implementation of porthoDom is released using python and C++ languages and is available under the GNU GPL licence 3 at http://www.bornberglab.org/pages/porthoda .
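The cosine measure on domain content described above can be illustrated with a minimal Python sketch. It assumes each protein is given as a list of domain identifiers and that repeated domains are counted; the function name `domain_cosine` and the identifiers in the usage note are hypothetical, and the actual porthoDom scoring may weight domains differently.

```python
from collections import Counter
from math import sqrt


def domain_cosine(domains_a, domains_b):
    """Cosine similarity between two proteins given as lists of domain IDs.

    Assumption: each protein is a bag of domains (multiplicity counts);
    this is an illustrative sketch, not the porthoDom implementation.
    """
    counts_a = Counter(domains_a)
    counts_b = Counter(domains_b)
    # Dot product over the domains the two proteins share.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[d] * counts_b[d] for d in shared)
    norm_a = sqrt(sum(c * c for c in counts_a.values()))
    norm_b = sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a protein without annotated domains matches nothing
    return dot / (norm_a * norm_b)
```

For example, `domain_cosine(["D1", "D2"], ["D1", "D2"])` is close to 1.0, while two proteins with no domain in common score 0.0. Proteins could then be grouped by thresholding this score before the more expensive sequence comparison is run.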
HBLAST: Parallelised sequence similarity--A Hadoop MapReducable basic local alignment search tool.
O'Driscoll, Aisling; Belogrudov, Vladislav; Carroll, John; Kropp, Kai; Walsh, Paul; Ghazal, Peter; Sleator, Roy D
2015-04-01
The recent exponential growth of genomic databases has resulted in the common task of sequence alignment becoming one of the major bottlenecks in the field of computational biology. It is typical for these large datasets and complex computations to require cost prohibitive High Performance Computing (HPC) to function. As such, parallelised solutions have been proposed but many exhibit scalability limitations and are incapable of effectively processing "Big Data" - the name attributed to datasets that are extremely large, complex and require rapid processing. The Hadoop framework, comprised of distributed storage and a parallelised programming framework known as MapReduce, is specifically designed to work with such datasets but it is not trivial to efficiently redesign and implement bioinformatics algorithms according to this paradigm. The parallelisation strategy of "divide and conquer" for alignment algorithms can be applied to both data sets and input query sequences. However, scalability is still an issue due to memory constraints or large databases, with very large database segmentation leading to additional performance decline. Herein, we present Hadoop Blast (HBlast), a parallelised BLAST algorithm that proposes a flexible method to partition both databases and input query sequences using "virtual partitioning". HBlast presents improved scalability over existing solutions and well balanced computational work load while keeping database segmentation and recompilation to a minimum. Enhanced BLAST search performance on cheap memory constrained hardware has significant implications for in field clinical diagnostic testing; enabling faster and more accurate identification of pathogenic DNA in human blood or tissue samples.
FLASHFLOOD: A 3D Field-based similarity search and alignment method for flexible molecules
NASA Astrophysics Data System (ADS)
Pitman, Michael C.; Huber, Wolfgang K.; Horn, Hans; Krämer, Andreas; Rice, Julia E.; Swope, William C.
2001-07-01
A three-dimensional field-based similarity search and alignment method for flexible molecules is introduced. The conformational space of a flexible molecule is represented in terms of fragments and torsional angles of allowed conformations. A user-definable property field is used to compute features of fragment pairs. Features are generalizations of CoMMA descriptors (Silverman, B.D. and Platt, D.E., J. Med. Chem., 39 (1996) 2129.) that characterize local regions of the property field by its local moments. The features are invariant under coordinate system transformations. Features taken from a query molecule are used to form alignments with fragment pairs in the database. An assembly algorithm is then used to merge the fragment pairs into full structures, aligned to the query. Key to the method is the use of a context adaptive descriptor scaling procedure as the basis for similarity. This allows the user to tune the weights of the various feature components based on examples relevant to the particular context under investigation. The property fields may range from simple, phenomenological fields, to fields derived from quantum mechanical calculations. We apply the method to the dihydrofolate/methotrexate benchmark system, and show that when one injects relevant contextual information into the descriptor scaling procedure, better results are obtained more efficiently. We also show how the method works and include computer times for a query from a database that represents approximately 23 million conformers of seventeen flexible molecules.
The Application of Similar Image Retrieval in Electronic Commerce
Hu, YuPing; Yin, Hua; Han, Dezhi; Yu, Fei
2014-01-01
Traditional online shopping platform (OSP), which searches product information by keywords, faces three problems: indirect search mode, large search space, and inaccuracy in search results. For solving these problems, we discuss and research the application of similar image retrieval in electronic commerce. Aiming at improving the network customers' experience and providing merchants with the accuracy of advertising, we design a reasonable and extensive electronic commerce application system, which includes three subsystems: image search display subsystem, image search subsystem, and product information collecting subsystem. This system can provide seamless connection between information platform and OSP, on which consumers can automatically and directly search similar images according to the pictures from information platform. At the same time, it can be used to provide accuracy of internet marketing for enterprises. The experiment shows the efficiency of constructing the system. PMID:24883411
Application of 3D Zernike descriptors to shape-based ligand similarity searching.
Venkatraman, Vishwesh; Chakravarthy, Padmasini Ramji; Kihara, Daisuke
2009-12-17
The identification of promising drug leads from a large database of compounds is an important step in the preliminary stages of drug design. Although shape is known to play a key role in the molecular recognition process, its application to virtual screening poses significant hurdles both in terms of the encoding scheme and speed. In this study, we have examined the efficacy of the alignment independent three-dimensional Zernike descriptor (3DZD) for fast shape based similarity searching. Performance of this approach was compared with several other methods including the statistical moments based ultrafast shape recognition scheme (USR) and SIMCOMP, a graph matching algorithm that compares atom environments. Three benchmark datasets are used to thoroughly test the methods in terms of their ability for molecular classification, retrieval rate, and performance under the situation that simulates actual virtual screening tasks over a large pharmaceutical database. The 3DZD performed better than or comparable to the other methods examined, depending on the datasets and evaluation metrics used. Reasons for the success and the failure of the shape based methods for specific cases are investigated. Based on the results for the three datasets, general conclusions are drawn with regard to their efficiency and applicability. The 3DZD has unique ability for fast comparison of three-dimensional shape of compounds. Examples analyzed illustrate the advantages and the room for improvements for the 3DZD.
NASA Astrophysics Data System (ADS)
Lee, Feifei; Kotani, Koji; Chen, Qiu; Ohmi, Tadahiro
2010-02-01
In this paper, a fast search algorithm for MPEG-4 video clips in a video database is proposed. An adjacent pixel intensity difference quantization (APIDQ) histogram, which has previously been applied reliably to human face recognition, is used as the feature vector of a VOP (video object plane). Instead of a fully decompressed video sequence, partially decoded data, namely the DC sequence of the video object, are extracted from the video sequence. Combined with active search, a temporal pruning algorithm, fast and robust video search can be realized. The proposed search algorithm has been evaluated on a total of 15 hours of video consisting of TV programs such as drama, talk, news, etc., searching for 200 given MPEG-4 video clips, each 15 seconds long. Experimental results show that the proposed algorithm can detect a similar video clip in merely 80 ms, and that an Equal Error Rate (EER) of 2% is achieved in the drama and news categories, which is more accurate and robust than the conventional fast video search algorithm.
Document similarity measures and document browsing
NASA Astrophysics Data System (ADS)
Ahmadullin, Ildus; Fan, Jian; Damera-Venkata, Niranjan; Lim, Suk Hwan; Lin, Qian; Liu, Jerry; Liu, Sam; O'Brien-Strain, Eamonn; Allebach, Jan
2011-03-01
Managing large document databases is an important task today. Being able to automatically compare document layouts and classify and search documents with respect to their visual appearance proves to be desirable in many applications. We measure single page documents' similarity with respect to distance functions between three document components: background, text, and saliency. Each document component is represented as a Gaussian mixture distribution; and distances between different documents' components are calculated as probabilistic similarities between corresponding distributions. The similarity measure between documents is represented as a weighted sum of the components' distances. Using this document similarity measure, we propose a browsing mechanism operating on a document dataset. For these purposes, we use a hierarchical browsing environment which we call the document similarity pyramid. It allows the user to browse a large document dataset and to search for documents in the dataset that are similar to the query. The user can browse the dataset on different levels of the pyramid, and zoom into the documents that are of interest.
Application of 3D Zernike descriptors to shape-based ligand similarity searching
2009-01-01
Background The identification of promising drug leads from a large database of compounds is an important step in the preliminary stages of drug design. Although shape is known to play a key role in the molecular recognition process, its application to virtual screening poses significant hurdles both in terms of the encoding scheme and speed. Results In this study, we have examined the efficacy of the alignment independent three-dimensional Zernike descriptor (3DZD) for fast shape based similarity searching. Performance of this approach was compared with several other methods including the statistical moments based ultrafast shape recognition scheme (USR) and SIMCOMP, a graph matching algorithm that compares atom environments. Three benchmark datasets are used to thoroughly test the methods in terms of their ability for molecular classification, retrieval rate, and performance under the situation that simulates actual virtual screening tasks over a large pharmaceutical database. The 3DZD performed better than or comparable to the other methods examined, depending on the datasets and evaluation metrics used. Reasons for the success and the failure of the shape based methods for specific cases are investigated. Based on the results for the three datasets, general conclusions are drawn with regard to their efficiency and applicability. Conclusion The 3DZD has unique ability for fast comparison of three-dimensional shape of compounds. Examples analyzed illustrate the advantages and the room for improvements for the 3DZD. PMID:20150998
Nasr, Ramzi; Vernica, Rares; Li, Chen; Baldi, Pierre
2012-01-01
In ligand-based screening, retrosynthesis, and other chemoinformatics applications, one often seeks to search large databases of molecules in order to retrieve molecules that are similar to a given query. With the expanding size of molecular databases, the efficiency and scalability of data structures and algorithms for chemical searches are becoming increasingly important. Remarkably, both the chemoinformatics and information retrieval communities have converged on similar solutions whereby molecules or documents are represented by binary vectors, or fingerprints, indexing their substructures such as labeled paths for molecules and n-grams for text, with the same Jaccard-Tanimoto similarity measure. As a result, similarity search methods from one field can be adapted to the other. Here we adapt recent, state-of-the-art, inverted index methods from information retrieval to speed up similarity searches in chemoinformatics. Our results show a several-fold speed-up improvement over previous methods for both threshold searches and top-K searches. We also provide a mathematical analysis that allows one to predict the level of pruning achieved by the inverted index approach, and validate the quality of these predictions through simulation experiments. All results can be replicated using data freely downloadable from http://cdb.ics.uci.edu/. PMID:22462644
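The idea of an inverted index for Jaccard-Tanimoto threshold search can be sketched in a few lines of Python. This is a generic illustration, not the specific method evaluated in the abstract above: fingerprints are assumed to be sets of on-bit indices, the names `build_index` and `threshold_search` are hypothetical, and the only pruning shown is the standard size bound Tanimoto(A, B) <= min(|A|, |B|) / max(|A|, |B|).

```python
from collections import defaultdict


def build_index(fingerprints):
    """Map each on-bit to the ids of the fingerprints that contain it."""
    index = defaultdict(list)
    for mol_id, bits in fingerprints.items():
        for bit in bits:
            index[bit].append(mol_id)
    return index


def threshold_search(query, fingerprints, index, threshold):
    """Return {mol_id: tanimoto} for fingerprints with similarity >= threshold.

    Assumes threshold > 0, so only molecules sharing at least one bit
    with the query can qualify.
    """
    query = set(query)
    if not query:
        return {}
    # Count shared bits by walking only the postings lists of the query bits.
    shared = defaultdict(int)
    for bit in query:
        for mol_id in index.get(bit, ()):
            shared[mol_id] += 1
    hits = {}
    for mol_id, common in shared.items():
        size = len(fingerprints[mol_id])
        # Size bound: skip candidates that cannot reach the threshold.
        if min(size, len(query)) < threshold * max(size, len(query)):
            continue
        similarity = common / (len(query) + size - common)
        if similarity >= threshold:
            hits[mol_id] = similarity
    return hits
```

With `fps = {"m1": {1, 2, 3, 4}, "m2": {3, 4, 5}, "m3": {7, 8}}`, a query `{1, 2, 3}` at threshold 0.5 touches only the postings for bits 1, 2 and 3, so `m3` is never examined. Ordering the postings by fingerprint size would allow whole lists to be cut short, which is the kind of pruning the analysis in the abstract is concerned with.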
An efficient and accurate 3D displacements tracking strategy for digital volume correlation
NASA Astrophysics Data System (ADS)
Pan, Bing; Wang, Bo; Wu, Dafang; Lubineau, Gilles
2014-07-01
Owing to its inherent computational complexity, practical implementation of digital volume correlation (DVC) for internal displacement and strain mapping faces important challenges in improving its computational efficiency. In this work, an efficient and accurate 3D displacement tracking strategy is proposed for fast DVC calculation. The efficiency advantage is achieved by using three improvements. First, to eliminate the need to update the Hessian matrix in each iteration, an efficient 3D inverse compositional Gauss-Newton (3D IC-GN) algorithm is introduced to replace existing forward additive algorithms for accurate sub-voxel displacement registration. Second, to ensure that the 3D IC-GN algorithm converges accurately and rapidly and to avoid time-consuming integer-voxel displacement searching, a generalized reliability-guided displacement tracking strategy is designed to transfer an accurate and complete initial guess of deformation for each calculation point from its computed neighbors. Third, to avoid the repeated computation of sub-voxel intensity interpolation coefficients, an interpolation coefficient lookup table is established for tricubic interpolation. The computational complexity of the proposed fast DVC and the existing typical DVC algorithms is first analyzed quantitatively according to the necessary arithmetic operations. Then, numerical tests are performed to verify the performance of the fast DVC algorithm in terms of measurement accuracy and computational efficiency. The experimental results indicate that, compared with the existing DVC algorithm, the presented fast DVC algorithm produces similar precision and slightly higher accuracy at a substantially reduced computational cost.
FunSimMat: a comprehensive functional similarity database
Schlicker, Andreas; Albrecht, Mario
2008-01-01
Functional similarity based on Gene Ontology (GO) annotation is used in diverse applications like gene clustering, gene expression data analysis, protein interaction prediction and evaluation. However, there exists no comprehensive resource of functional similarity values although such a database would facilitate the use of functional similarity measures in different applications. Here, we describe FunSimMat (Functional Similarity Matrix, http://funsimmat.bioinf.mpi-inf.mpg.de/), a large new database that provides several different semantic similarity measures for GO terms. It offers various precomputed functional similarity values for proteins contained in UniProtKB and for protein families in Pfam and SMART. The web interface allows users to efficiently perform both semantic similarity searches with GO terms and functional similarity searches with proteins or protein families. All results can be downloaded in tab-delimited files for use with other tools. An additional XML–RPC interface gives automatic online access to FunSimMat for programs and remote services. PMID:17932054
Efficient heuristics for maximum common substructure search.
Englert, Péter; Kovács, Péter
2015-05-26
Maximum common substructure search is a computationally hard optimization problem with diverse applications in the field of cheminformatics, including similarity search, lead optimization, molecule alignment, and clustering. Most of these applications have strict constraints on running time, so heuristic methods are often preferred. However, the development of an algorithm that is both fast enough and accurate enough for most practical purposes is still a challenge. Moreover, in some applications, the quality of a common substructure depends not only on its size but also on various topological features of the one-to-one atom correspondence it defines. Two state-of-the-art heuristic algorithms for finding maximum common substructures have been implemented at ChemAxon Ltd., and effective heuristics have been developed to improve both their efficiency and the relevance of the atom mappings they provide. The implementations have been thoroughly evaluated and compared with existing solutions (KCOMBU and Indigo). The heuristics have been found to greatly improve the performance and applicability of the algorithms. The purpose of this paper is to introduce the applied methods and present the experimental results.
Finding accurate frontiers: A knowledge-intensive approach to relational learning
NASA Technical Reports Server (NTRS)
Pazzani, Michael; Brunk, Clifford
1994-01-01
An approach to analytic learning is described that searches for accurate entailments of a Horn Clause domain theory. A hill-climbing search, guided by an information based evaluation function, is performed by applying a set of operators that derive frontiers from domain theories. The analytic learning system is one component of a multi-strategy relational learning system. We compare the accuracy of concepts learned with this analytic strategy to concepts learned with an analytic strategy that operationalizes the domain theory.
[Eye movement study in multiple object search process].
Xu, Zhaofang; Liu, Zhongqi; Wang, Xingwei; Zhang, Xin
2017-04-01
The aim of this study is to investigate the regularity of search time and the characteristics of eye movement behavior in multi-objective visual search. The experimental task was implemented in software and presented characters on a 24 inch computer display. The subjects were asked to search for three targets among the characters. The three target characters in the same group were highly similar to one another, while the degree of similarity between target characters and distraction characters varied across groups. We recorded the search time and eye movement data throughout the experiment. The eye movement data showed that the number of fixation points was large when the target characters and distraction characters were similar. The subjects used three kinds of visual search patterns: parallel search, serial search, and parallel-serial search. The last pattern gave the best search performance of the three; that is, the subjects who used the parallel-serial search pattern took less time to find the targets. The order in which the targets were presented significantly affected search performance, and the degree of similarity between target characters and distraction characters also affected search performance.
Searching through synaesthetic colors.
Laeng, Bruno
2009-10-01
Synaesthesia can be characterized by illusory colors being elicited automatically when one reads an alphanumeric symbol. These colors can affect attention; synaesthetes can show advantages in visual search of achromatic symbols that normally cause slow searches. However, some studies have failed to find these advantages, challenging the conclusion that synaesthetic colors influence attention in a manner similar to the influence of perceptual colors. In the present study, we investigated 2 synaesthetes who reported colors localized in space over alphanumeric symbols' shapes. The Euclidean distance in CIE xyY color space between two synaesthetic colors was computed for each specific visual search, so that the relationship between color distance (CD) and efficiency of search could be explored with simple regression analyses. Target-to-distractors color salience systematically predicted the speed of search, but the CD between a target or distractors and the physically presented achromatic color did not. When the synaesthetic colors of a target and distractors were nearly complementary, searches resembled popout performance with real colors. Control participants who performed searches for the same symbols (which were colored according to the synaesthetic colors) showed search functions very similar to those shown by the synaesthetes for the physically achromatic symbols.
SETI The Search for Extraterrestrial Intelligence.
ERIC Educational Resources Information Center
Jones, Barrie W.
1991-01-01
Discussed is the search for life on other planets similar to Earth based on the Drake equation. Described are search strategies and microwave searches. The reasons why people are searching are also discussed. (KR)
NASA Technical Reports Server (NTRS)
Albornoz, Caleb Ronald
2012-01-01
Thousands of millions of documents are stored and updated daily on the World Wide Web. Most of the information is not efficiently organized to build knowledge from the stored data. Nowadays, search engines are mainly used by users who rely on their skills to look for the information needed. This paper presents different techniques search engine users can apply in Google Search to improve the relevancy of search results. According to the Pew Research Center, the average person spends eight hours a month searching for the right information. For instance, a company that employs 1000 employees wastes $2.5 million on looking for nonexistent and/or not found information. The cost is very high because decisions are made based on the information that is readily available to use. Whenever the information necessary to formulate an argument is not available or found, poor decisions may be made and mistakes will be more likely to occur. Also, the survey indicates that only 56% of Google users feel confident with their current search skills. Moreover, just 76% of the information that is available on the Internet is accurate.
Custom Search Engines: Tools & Tips
ERIC Educational Resources Information Center
Notess, Greg R.
2008-01-01
Few have the resources to build a Google or Yahoo! from scratch. Yet anyone can build a search engine based on a subset of the large search engines' databases. Use Google Custom Search Engine or Yahoo! Search Builder or any of the other similar programs to create a vertical search engine targeting sites of interest to users. The basic steps to…
Index Relativity and Patron Search Strategy.
ERIC Educational Resources Information Center
Allison, DeeAnn; Childers, Scott
2002-01-01
Describes a study at the University of Nebraska-Lincoln that compared searches in two different keyword indexes with similar content where search results were dependent on search strategy quality, search engine execution, and content. Results showed search engine execution had an impact on the number of matches and that users ignored search help…
Searching Process with Raita Algorithm and its Application
NASA Astrophysics Data System (ADS)
Rahim, Robbi; Saleh Ahmar, Ansari; Abdullah, Dahlan; Hartama, Dedy; Napitupulu, Darmawan; Putera Utama Siahaan, Andysah; Hasan Siregar, Muhammad Noor; Nasution, Nurliana; Sundari, Siti; Sriadhi, S.
2018-04-01
Searching is a common process performed by many computer users. The Raita algorithm is one algorithm that can be used to match and find information according to the patterns entered. The Raita algorithm was applied to a file search application written in the Java programming language; in testing, the file search ran quickly, returned accurate results, and supported many data types.
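For reference, the Raita matching order (last character, then first, then middle, then the remaining characters) with a Horspool-style bad-character shift can be sketched as follows. The abstract describes a Java application; this Python version and the function name `raita_search` are only an illustrative assumption, not the authors' code.

```python
def raita_search(text, pattern):
    """Return the start indices of all occurrences of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Bad-character shift table built from every pattern character
    # except the last one (as in the Horspool algorithm).
    shift = {}
    for i, ch in enumerate(pattern[:-1]):
        shift[ch] = m - 1 - i
    first, middle, last = pattern[0], pattern[m // 2], pattern[-1]
    matches = []
    pos = 0
    while pos <= n - m:
        window_last = text[pos + m - 1]
        # Raita order: last, first, middle, then the interior characters.
        if (window_last == last
                and text[pos] == first
                and text[pos + m // 2] == middle
                and text[pos + 1:pos + m - 1] == pattern[1:m - 1]):
            matches.append(pos)
        # Shift by the table entry for the window's last character,
        # or by the full pattern length if it never occurs in pattern[:-1].
        pos += shift.get(window_last, m)
    return matches
```

Checking the last, first and middle characters before the rest tends to reject mismatching windows early on natural-language text, where neighbouring characters are strongly correlated. The same function works on `bytes` objects, which is one way a file search could handle arbitrary file types.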
Saha, S. K.; Dutta, R.; Choudhury, R.; Kar, R.; Mandal, D.; Ghoshal, S. P.
2013-01-01
In this paper, opposition-based harmony search has been applied for the optimal design of linear phase FIR filters. RGA, PSO, and DE have also been adopted for the sake of comparison. The original harmony search algorithm is chosen as the parent one, and opposition-based approach is applied. During the initialization, randomly generated population of solutions is chosen, opposite solutions are also considered, and the fitter one is selected as a priori guess. In harmony memory, each such solution passes through memory consideration rule, pitch adjustment rule, and then opposition-based reinitialization generation jumping, which gives the optimum result corresponding to the least error fitness in multidimensional search space of FIR filter design. Incorporation of different control parameters in the basic HS algorithm results in the balancing of exploration and exploitation of search space. Low pass, high pass, band pass, and band stop FIR filters are designed with the proposed OHS and other aforementioned algorithms individually for comparative optimization performance. A comparison of simulation results reveals the optimization efficacy of the OHS over the other optimization techniques for the solution of the multimodal, nondifferentiable, nonlinear, and constrained FIR filter design problems. PMID:23844390
Fast and Accurate Circuit Design Automation through Hierarchical Model Switching.
Huynh, Linh; Tagkopoulos, Ilias
2015-08-21
In computer-aided biological design, the trifecta of characterized part libraries, accurate models and optimal design parameters is crucial for producing reliable designs. As the number of parts and model complexity increase, however, it becomes exponentially more difficult for any optimization method to search the solution space, hence creating a trade-off that hampers efficient design. To address this issue, we present a hierarchical computer-aided design architecture that uses a two-step approach for biological design. First, a simple model of low computational complexity is used to predict circuit behavior and assess candidate circuit branches through branch-and-bound methods. Then, a complex, nonlinear circuit model is used for a fine-grained search of the reduced solution space, thus achieving more accurate results. Evaluation with a benchmark of 11 circuits and a library of 102 experimental designs with known characterization parameters demonstrates a speed-up of 3 orders of magnitude when compared to other design methods that provide optimality guarantees.
Almeida, Renita A; Dickinson, J Edwin; Maybery, Murray T; Badcock, Johanna C; Badcock, David R
2010-12-01
The Embedded Figures Test (EFT) requires detecting a shape within a complex background and individuals with autism or high Autism-spectrum Quotient (AQ) scores are faster and more accurate on this task than controls. This research aimed to uncover the visual processes producing this difference. Previously we developed a search task using radial frequency (RF) patterns with controllable amounts of target/distracter overlap on which high AQ participants showed more efficient search than low AQ observers. The current study extended the design of this search task by adding two lines which traverse the display on random paths sometimes intersecting target/distracters, other times passing between them. As with the EFT, these lines segment and group the display in ways that are task irrelevant. We tested two new groups of observers and found that while RF search was slowed by the addition of segmenting lines for both groups, the high AQ group retained a consistent search advantage (reflected in a shallower gradient for reaction time as a function of set size) over the low AQ group. Further, the high AQ group were significantly faster and more accurate on the EFT compared to the low AQ group. That is, the results from the present RF search task demonstrate that segmentation and grouping created by intersecting lines does not further differentiate the groups and is therefore unlikely to be a critical factor underlying the EFT performance difference. However, once again, we found that superior EFT performance was associated with shallower gradients on the RF search task.
Searching for periodic sources with LIGO. II. Hierarchical searches
NASA Astrophysics Data System (ADS)
Brady, Patrick R.; Creighton, Teviet
2000-04-01
The detection of quasi-periodic sources of gravitational waves requires the accumulation of signal to noise over long observation times. This represents the most difficult data analysis problem facing experimenters with detectors such as those at LIGO. If not removed, Earth-motion induced Doppler modulations and intrinsic variations of the gravitational-wave frequency make the signals impossible to detect. These effects can be corrected (removed) using a parametrized model for the frequency evolution. In a previous paper, we introduced such a model and computed the number of independent parameter space points for which corrections must be applied to the data stream in a coherent search. Since this number increases with the observation time, the sensitivity of a search for continuous gravitational-wave signals is computationally bound when data analysis proceeds at a similar rate to data acquisition. In this paper, we extend the formalism developed by Brady et al. [Phys. Rev. D 57, 2101 (1998)], and we compute the number of independent corrections N_p(ΔT,N) required for incoherent search strategies. These strategies rely on the method of stacked power spectra: a demodulated time series is divided into N segments of length ΔT, each segment is Fourier transformed, a power spectrum is computed, and the N spectra are summed up. This method is incoherent; phase information is lost from segment to segment. Nevertheless, power from a signal with fixed frequency (in the corrected time series) is accumulated in a single frequency bin, and amplitude signal to noise accumulates as ~N^(1/4) (assuming the segment length ΔT is held fixed). For fixed available computing power, there are optimal values for N and ΔT which maximize the sensitivity of a search in which data analysis takes a total time NΔT. We estimate that the optimal sensitivity of an all-sky search that uses incoherent stacks is a factor of 2-4 better than achieved using coherent Fourier transforms, assuming the
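The stacking procedure described in this abstract (split into segments, transform each, sum the power spectra) can be sketched as follows. This is a minimal illustration, assuming the time series has already been demodulated; the function names are hypothetical, and a naive discrete Fourier transform is used in place of an FFT to keep the sketch dependency-free.

```python
import cmath
import math


def power_spectrum(segment):
    """Power in each non-negative frequency bin of one segment (naive DFT)."""
    n = len(segment)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(segment))
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def stacked_power(series, n_segments):
    """Sum the power spectra of n_segments equal-length segments.

    Phase is discarded when each spectrum is squared, so the sum is incoherent.
    """
    seg_len = len(series) // n_segments
    stacked = [0.0] * (seg_len // 2 + 1)
    for i in range(n_segments):
        segment = series[i * seg_len:(i + 1) * seg_len]
        for k, p in enumerate(power_spectrum(segment)):
            stacked[k] += p
    return stacked
```

For a fixed-frequency signal, every segment contributes power to the same bin, so the peak of the stacked spectrum grows with the number of segments.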
The development of organized visual search
Woods, Adam J.; Goksun, Tilbe; Chatterjee, Anjan; Zelonis, Sarah; Mehta, Anika; Smith, Sabrina E.
2013-01-01
Visual search plays an important role in guiding behavior. Children have more difficulty performing conjunction search tasks than adults. The present research evaluates whether developmental differences in children's ability to organize serial visual search (i.e., search organization skills) contribute to performance limitations in a typical conjunction search task. We evaluated 134 children between the ages of 2 and 17 on separate tasks measuring search for targets defined by a conjunction of features or by distinct features. Our results demonstrated that children organize their visual search better as they get older. As children's skills at organizing visual search improve they become more accurate at locating targets with conjunction of features amongst distractors, but not for targets with distinct features. Developmental limitations in children's abilities to organize their visual search of the environment are an important component of poor conjunction search in young children. In addition, our findings provide preliminary evidence that, like other visuospatial tasks, exposure to reading may influence children's spatial orientation to the visual environment when performing a visual search. PMID:23584560
Automatic search for maximum similarity between molecular electrostatic potential distributions
NASA Astrophysics Data System (ADS)
Manaut, Francesc; Sanz, Ferran; José, Jaume; Milesi, Massimo
1991-08-01
A new computer program has been developed to automatically obtain the relative position of two molecules in which the similarity between molecular electrostatic-potential distributions is greatest. These distributions are considered in a volume around the molecules, and the similarity is measured by the Spearman rank coefficient. The program has been tested using several pairs of molecules: water vs. water; phenylethylamine and phenylpropylamine vs. benzylamine; and methotrexate vs. dihydrofolic acid.
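The Spearman rank coefficient named in this abstract is the Pearson correlation of the ranks of two samples. A minimal sketch follows, assuming the two electrostatic-potential distributions have already been sampled at the same points and flattened into equal-length lists; the sampling and the search over relative positions are not shown, and the function names are hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(a, b):
    """Spearman rank coefficient: Pearson correlation computed on the ranks.

    Undefined (division by zero) if either input is constant.
    """
    ra, rb = average_ranks(a), average_ranks(b)
    n = len(ra)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(ra, rb))
    var_a = sum((x - mean_a) ** 2 for x in ra)
    var_b = sum((y - mean_b) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5
```

Because only ranks enter the formula, any monotonic rescaling of one potential leaves the similarity unchanged.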
Design of transonic airfoil sections using a similarity theory
NASA Technical Reports Server (NTRS)
Nixon, D.
1978-01-01
A study of the available methods for transonic airfoil and wing design indicates that the most powerful technique is the numerical optimization procedure. However, the computer time for this method is relatively large because of the amount of computation required in the searches during optimization. The optimization method requires that base and calibration solutions be computed to determine a minimum drag direction. The design space is then computationally searched in this direction; it is these searches that dominate the computation time. A recent similarity theory allows certain transonic flows to be calculated rapidly from the base and calibration solutions. In this paper the application of the similarity theory to design problems is examined with the object of at least partially eliminating the costly searches of the design optimization method. An example of an airfoil design is presented.
Wang, Zhiyuan; Lleras, Alejandro; Buetti, Simona
2018-04-17
Our lab recently found evidence that efficient visual search (with a fixed target) is characterized by logarithmic Reaction Time (RT) × Set Size functions whose steepness is modulated by the similarity between target and distractors. To determine whether this pattern of results was based on low-level visual factors uncontrolled by previous experiments, we minimized the possibility of crowding effects in the display, compensated for the cortical magnification factor by magnifying search items based on their eccentricity, and compared search performance on such displays to performance on displays without magnification compensation. In both cases, the RT × Set Size functions were found to be logarithmic, and the modulation of the log slopes by target-distractor similarity was replicated. Consistent with previous results in the literature, cortical magnification compensation eliminated most target eccentricity effects. We conclude that the log functions and their modulation by target-distractor similarity relations reflect a parallel exhaustive processing architecture for early vision.
Selective scanpath repetition during memory-guided visual search.
Wynn, Jordana S; Bone, Michael B; Dragan, Michelle C; Hoffman, Kari L; Buchsbaum, Bradley R; Ryan, Jennifer D
2016-01-02
Visual search efficiency improves with repetition of a search display, yet the mechanisms behind these processing gains remain unclear. According to Scanpath Theory, memory retrieval is mediated by repetition of the pattern of eye movements or "scanpath" elicited during stimulus encoding. Using this framework, we tested the prediction that scanpath recapitulation reflects relational memory guidance during repeated search events. Younger and older subjects were instructed to find changing targets within flickering naturalistic scenes. Search efficiency (search time, number of fixations, fixation duration) and scanpath similarity (repetition) were compared across age groups for novel (V1) and repeated (V2) search events. Younger adults outperformed older adults on all efficiency measures at both V1 and V2, while the search time benefit for repeated viewing (V1-V2) did not differ by age. Fixation-binned scanpath similarity analyses revealed repetition of initial and final (but not middle) V1 fixations at V2, with older adults repeating more initial V1 fixations than young adults. In young adults only, early scanpath similarity correlated negatively with search time at test, indicating increased efficiency, whereas the similarity of V2 fixations to middle V1 fixations predicted poor search performance. We conclude that scanpath compression mediates increased search efficiency by selectively recapitulating encoding fixations that provide goal-relevant input. Extending Scanpath Theory, results suggest that scanpath repetition varies as a function of time and memory integrity.
Selective scanpath repetition during memory-guided visual search
Wynn, Jordana S.; Bone, Michael B.; Dragan, Michelle C.; Hoffman, Kari L.; Buchsbaum, Bradley R.; Ryan, Jennifer D.
2016-01-01
Visual search efficiency improves with repetition of a search display, yet the mechanisms behind these processing gains remain unclear. According to Scanpath Theory, memory retrieval is mediated by repetition of the pattern of eye movements or “scanpath” elicited during stimulus encoding. Using this framework, we tested the prediction that scanpath recapitulation reflects relational memory guidance during repeated search events. Younger and older subjects were instructed to find changing targets within flickering naturalistic scenes. Search efficiency (search time, number of fixations, fixation duration) and scanpath similarity (repetition) were compared across age groups for novel (V1) and repeated (V2) search events. Younger adults outperformed older adults on all efficiency measures at both V1 and V2, while the search time benefit for repeated viewing (V1–V2) did not differ by age. Fixation-binned scanpath similarity analyses revealed repetition of initial and final (but not middle) V1 fixations at V2, with older adults repeating more initial V1 fixations than young adults. In young adults only, early scanpath similarity correlated negatively with search time at test, indicating increased efficiency, whereas the similarity of V2 fixations to middle V1 fixations predicted poor search performance. We conclude that scanpath compression mediates increased search efficiency by selectively recapitulating encoding fixations that provide goal-relevant input. Extending Scanpath Theory, results suggest that scanpath repetition varies as a function of time and memory integrity. PMID:27570471
A new protocol to accurately determine microtubule lattice seam location
Zhang, Rui; Nogales, Eva
2015-09-28
Microtubules (MTs) are cylindrical polymers of αβ-tubulin that display pseudo-helical symmetry due to the presence of a lattice seam of heterologous lateral contacts. The structural similarity between α- and β-tubulin makes it difficult to computationally distinguish them in the noisy cryo-EM images, unless a marker protein for the tubulin dimer, such as kinesin motor domain, is present. We have developed a new data processing protocol that can accurately determine αβ-tubulin register and seam location for MT segments. Our strategy can handle difficult situations, where the marker protein is relatively small or the decoration of marker protein is sparse. Using this new seam-search protocol, combined with movie processing for data from a direct electron detection camera, we were able to determine the cryo-EM structures of MT at 3.5 Å resolution in different functional states. The successful distinction of α- and β-tubulin allowed us to visualize the nucleotide state at the E-site and the configuration of lateral contacts at the seam.
Noesis: Ontology based Scoped Search Engine and Resource Aggregator for Atmospheric Science
NASA Astrophysics Data System (ADS)
Ramachandran, R.; Movva, S.; Li, X.; Cherukuri, P.; Graves, S.
2006-12-01
The goal for search engines is to return results that are both accurate and complete. The search engines should find only what you really want and find everything you really want. Search engines (even meta search engines) lack semantics. The basis for search is simply based on string matching between the user's query term and the resource database and the semantics associated with the search string is not captured. For example, if an atmospheric scientist is searching for "pressure" related web resources, most search engines return inaccurate results such as web resources related to blood pressure. In this presentation Noesis, which is a meta-search engine and a resource aggregator that uses domain ontologies to provide scoped search capabilities will be described. Noesis uses domain ontologies to help the user scope the search query to ensure that the search results are both accurate and complete. The domain ontologies guide the user to refine their search query and thereby reduce the user's burden of experimenting with different search strings. Semantics are captured by refining the query terms to cover synonyms, specializations, generalizations and related concepts. Noesis also serves as a resource aggregator. It categorizes the search results from different online resources such as education materials, publications, datasets, web search engines that might be of interest to the user.
Chemical-text hybrid search engines.
Zhou, Yingyao; Zhou, Bin; Jiang, Shumei; King, Frederick J
2010-01-01
As the amount of chemical literature increases, it is critical that researchers be enabled to accurately locate documents related to a particular aspect of a given compound. Existing solutions, based on text and chemical search engines alone, suffer from the inclusion of "false negative" and "false positive" results, and cannot accommodate diverse repertoire of formats currently available for chemical documents. To address these concerns, we developed an approach called Entity-Canonical Keyword Indexing (ECKI), which converts a chemical entity embedded in a data source into its canonical keyword representation prior to being indexed by text search engines. We implemented ECKI using Microsoft Office SharePoint Server Search, and the resultant hybrid search engine not only supported complex mixed chemical and keyword queries but also was applied to both intranet and Internet environments. We envision that the adoption of ECKI will empower researchers to pose more complex search questions that were not readily attainable previously and to obtain answers at much improved speed and accuracy.
Search for gravitational waves from LIGO-Virgo science run and data interpretation
NASA Astrophysics Data System (ADS)
Biswas, Rahul
Search for gravitational wave events was performed on data jointly taken during LIGO's fifth science run (S5) and Virgo's first science run (VSR1). The data taken during this period was broken down into five separate months. I shall report the analysis performed on one of these months. Apart from the search, I shall describe the work related to estimation of rate based on the loudest event in the search. I shall demonstrate methods used in construction of rate intervals at 90% confidence level and combination of rates from multiple experiments of similar duration. To have confidence in our detection, accurate estimation of false alarm probability (F.A.P.) associated with the event candidate is required. Current false alarm estimation techniques limit our ability to measure the F.A.P. to about 1 in 100. I shall describe a method that significantly improves this estimate using information from multiple detectors. Besides accurate knowledge of F.A.P., detection is also dependent on our ability to distinguish real signals from noise. Several tests exist which use the quality of the signal to differentiate between real and noise signals. The chi-square test is one such computationally expensive test applied in our search; we shall examine the dependence of the chi-square parameter on the signal to noise ratio (SNR) for a given signal, which will help us to model the chi-square parameter based on SNR. The two detectors at Hanford, WA, H1 (4 km) and H2 (2 km), share the same vacuum system and hence their noise is correlated. Our present method of background estimation cannot capture this correlation and often underestimates the background when only H1 and H2 are operating. I shall describe a novel method of time reversed filtering to correctly estimate the background.
Krysan, Maria
2008-06-01
In a departure from most studies of the causes of racial residential segregation that focus on the three main factors of economics, preferences, and discrimination, this paper examines one of the mechanisms through which segregation may be perpetuated: the housing search process itself. Data come from a 2004 face-to-face survey of an area probability sample of African American and white householders living in the three counties of the Detroit metropolitan area (n = 734). These data are used to address three research questions: (1) What are the strategies people use to find housing, and are there racial differences in those strategies? (2) Do whites and African Americans report similar or different experiences in the search for housing? (3) Do the locations in which people search for housing vary by race? Results show that once controlling for the type of search and background characteristics, the search strategies are generally similar for whites and blacks, though more so for buyers than renters: for example, black renters use more informal strategies and networks than do white renters. Analyses that look at the features of these strategies, however, reveal some significant racial differences. Search experiences are similar in terms of length and number of homes inspected, but other objective and subjective questions about the search show blacks at a disadvantage compared to whites: African Americans submit more offers/applications for homes, report more difficulties, and are much more likely to feel they were taken advantage of during the search. The racial characteristics of the communities in which blacks and whites search are quite different: whites mainly search in white communities, while African Americans search in communities with a variety of racial compositions. The paper concludes with a call for further research on housing search strategies, with particular attention to the role of social networks.
Krysan, Maria
2008-01-01
In a departure from most studies of the causes of racial residential segregation that focus on the three main factors of economics, preferences, and discrimination, this paper examines one of the mechanisms through which segregation may be perpetuated: the housing search process itself. Data come from a 2004 face-to-face survey of an area probability sample of African American and white householders living in the three counties of the Detroit metropolitan area (n=734). These data are used to address three research questions: (1) What are the strategies people use to find housing, and are there racial differences in those strategies? (2) Do whites and African Americans report similar or different experiences in the search for housing? (3) Do the locations in which people search for housing vary by race? Results show that once controlling for the type of search and background characteristics, the search strategies are generally similar for whites and blacks, though more so for buyers than renters: for example, black renters use more informal strategies and networks than do white renters. Analyses that look at the features of these strategies, however, reveal some significant racial differences. Search experiences are similar in terms of length and number of homes inspected, but other objective and subjective questions about the search show blacks at a disadvantage compared to whites: African Americans submit more offers/applications for homes, report more difficulties, and are much more likely to feel they were taken advantage of during the search. The racial characteristics of the communities in which blacks and whites search are quite different: whites mainly search in white communities, while African Americans search in communities with a variety of racial compositions. The paper concludes with a call for further research on housing search strategies, with particular attention to the role of social networks. PMID:19069060
NASA Astrophysics Data System (ADS)
Tao, Laifa; Lu, Chen; Noktehdan, Azadeh
2015-10-01
Battery capacity estimation is a significant recent challenge given the complex physical and chemical processes that occur within batteries and the restrictions on the accessibility of capacity degradation data. In this study, we describe an approach called dynamic spatial time warping, which is used to determine the similarities of two arbitrary curves. Unlike classical dynamic time warping methods, this approach can maintain the invariance of curve similarity to the rotations and translations of curves, which is vital in curve similarity search. Moreover, it utilizes the online charging or discharging data that are easily collected and do not require special assumptions. The accuracy of this approach is verified using NASA battery datasets. Results suggest that the proposed approach provides a highly accurate means of estimating battery capacity at less time cost than traditional dynamic time warping methods do for different individuals and under various operating conditions.
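For reference, the classical dynamic time warping baseline that this abstract compares against can be sketched as below. This is the textbook algorithm, not the authors' dynamic spatial time warping, and the function name is hypothetical.

```python
def dtw_distance(a, b):
    """Classical dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

A curve and a time-stretched copy of it, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, have distance 0, whereas a vertically shifted copy, such as `[1, 2, 3]` and `[2, 3, 4]`, does not. That sensitivity to translation is the limitation the abstract says the proposed approach avoids.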
Category and Word Search: Generalizing Search Principles to Complex Processing.
1982-03-01
complex processing (e.g., LaBerge & Samuels, 1974; Shiffrin & Schneider, 1977). In the present paper we examine how well the major phenomena in simple visual...subjects are searching for novel characters (LaBerge, 1973). The relatively large and rapid CH practice effects for word and category search are analogous...1974) demonstrated interference effects of irrelevant flanking letters. Shaffer and LaBerge (1979) showed a similar effect with words and semantic
Zhang, Xuetao; Huang, Jie; Yigit-Elliott, Serap; Rosenholtz, Ruth
2015-03-16
Observers can quickly search among shaded cubes for one lit from a unique direction. However, replace the cubes with similar 2-D patterns that do not appear to have a 3-D shape, and search difficulty increases. These results have challenged models of visual search and attention. We demonstrate that cube search displays differ from those with "equivalent" 2-D search items in terms of the informativeness of fairly low-level image statistics. This informativeness predicts peripheral discriminability of target-present from target-absent patches, which in turn predicts visual search performance, across a wide range of conditions. Comparing model performance on a number of classic search tasks, cube search does not appear unexpectedly easy. Easy cube search, per se, does not provide evidence for preattentive computation of 3-D scene properties. However, search asymmetries derived from rotating and/or flipping the cube search displays cannot be explained by the information in our current set of image statistics. This may merely suggest a need to modify the model's set of 2-D image statistics. Alternatively, it may be difficult cube search that provides evidence for preattentive computation of 3-D scene properties. By attributing 2-D luminance variations to a shaded 3-D shape, 3-D scene understanding may slow search for 2-D features of the target. © 2015 ARVO.
Zhang, Xuetao; Huang, Jie; Yigit-Elliott, Serap; Rosenholtz, Ruth
2015-01-01
Observers can quickly search among shaded cubes for one lit from a unique direction. However, replace the cubes with similar 2-D patterns that do not appear to have a 3-D shape, and search difficulty increases. These results have challenged models of visual search and attention. We demonstrate that cube search displays differ from those with “equivalent” 2-D search items in terms of the informativeness of fairly low-level image statistics. This informativeness predicts peripheral discriminability of target-present from target-absent patches, which in turn predicts visual search performance, across a wide range of conditions. Comparing model performance on a number of classic search tasks, cube search does not appear unexpectedly easy. Easy cube search, per se, does not provide evidence for preattentive computation of 3-D scene properties. However, search asymmetries derived from rotating and/or flipping the cube search displays cannot be explained by the information in our current set of image statistics. This may merely suggest a need to modify the model's set of 2-D image statistics. Alternatively, it may be difficult cube search that provides evidence for preattentive computation of 3-D scene properties. By attributing 2-D luminance variations to a shaded 3-D shape, 3-D scene understanding may slow search for 2-D features of the target. PMID:25780063
Dobi, Krisztina; Hajdú, István; Flachner, Beáta; Fabó, Gabriella; Szaszkó, Mária; Bognár, Melinda; Magyar, Csaba; Simon, István; Szisz, Dániel; Lőrincz, Zsolt; Cseh, Sándor; Dormán, György
2014-05-28
Rapid in silico selection of target focused libraries from commercial repositories is an attractive and cost effective approach. If structures of active compounds are available, rapid 2D similarity search can be performed on multimillion compound databases, but the generated library requires further focusing by various 2D/3D chemoinformatics tools. We report here a combination of the 2D approach with a ligand-based 3D method (Screen3D) which applies flexible matching to align reference and target compounds in a dynamic manner and thus to assess their structural and conformational similarity. In the first case study we compared the 2D and 3D similarity scores on an existing dataset derived from the biological evaluation of a PDE5 focused library. Based on the obtained similarity metrics a fusion score was proposed. The fusion score was applied to refine the 2D similarity search in a second case study where we aimed at selecting and evaluating a PDE4B focused library. The application of this fused 2D/3D similarity measure led to an increase of the hit rate from 8.5% (1st round, 47% inhibition at 10 µM) to 28.5% (2nd round at 50% inhibition at 10 µM) and the best two hits had 53 nM inhibitory activities.
Information filtering based on transferring similarity.
Sun, Duo; Zhou, Tao; Liu, Jian-Guo; Liu, Run-Ran; Jia, Chun-Xiao; Wang, Bing-Hong
2009-07-01
In this Brief Report, we propose an index of user similarity, namely, the transferring similarity, which involves all high-order similarities between users. Accordingly, we design a modified collaborative filtering algorithm, which provides remarkably more accurate predictions than the standard collaborative filtering. More interestingly, we find that the algorithmic performance will approach its optimal value when the parameter, contained in the definition of transferring similarity, gets close to its critical value, before which the series expansion of transferring similarity is convergent and after which it is divergent. Our study is complementary to the one reported in [E. A. Leicht, P. Holme, and M. E. J. Newman, Phys. Rev. E 73, 026120 (2006)], and is relevant to the missing link prediction problem.
Younger, Paula; Boddy, Kate
2009-06-01
The researchers involved in this study work at Exeter Health library and at the Complementary Medicine Unit, Peninsula School of Medicine and Dentistry (PCMD). Within this collaborative environment it is possible to access the electronic resources of three institutions. This includes access to AMED and other databases using different interfaces. The aim of this study was to investigate whether searching different interfaces to the AMED allied health and complementary medicine database produced the same results when using identical search terms. The following Internet-based AMED interfaces were searched: DIALOG DataStar; EBSCOhost and OVID SP_UI01.00.02. Search results from all three databases were saved in an endnote database to facilitate analysis. A checklist was also compiled comparing interface features. In our initial search, DIALOG returned 29 hits, OVID 14 and Ebsco 8. If we assume that DIALOG returned 100% of potential hits, OVID initially returned only 48% of hits and EBSCOhost only 28%. In our search, a researcher using the Ebsco interface to carry out a simple search on AMED would miss over 70% of possible search hits. Subsequent EBSCOhost searches on different subjects failed to find between 21 and 86% of the hits retrieved using the same keywords via DIALOG DataStar. In two cases, the simple EBSCOhost search failed to find any of the results found via DIALOG DataStar. Depending on the interface, the number of hits retrieved from the same database with the same simple search can vary dramatically. Some simple searches fail to retrieve a substantial percentage of citations. This may result in an uninformed literature review, research funding application or treatment intervention. In addition to ensuring that keywords, spelling and medical subject headings (MeSH) accurately reflect the nature of the search, database users should include wildcards and truncation and adapt their search strategy substantially to retrieve the maximum number of appropriate
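The percentages quoted in this abstract follow from the three hit counts once DIALOG is taken as the 100% reference. A short worked check, using only the counts given above (the variable and function names are illustrative):

```python
def recall_percent(hits, reference_hits):
    """Share of the reference interface's hits that another interface returned."""
    return 100.0 * hits / reference_hits


# Hit counts reported for the initial search.
counts = {"DIALOG": 29, "OVID": 14, "EBSCOhost": 8}
reference = counts["DIALOG"]

recall = {name: recall_percent(n, reference) for name, n in counts.items()}
missed = {name: 100.0 - r for name, r in recall.items()}

for name in counts:
    print(f"{name}: {recall[name]:.1f}% retrieved, {missed[name]:.1f}% missed")
```

OVID's 14 of 29 rounds to 48% and EBSCOhost's 8 of 29 rounds to 28%, so the EBSCOhost search misses about 72% of the reference hits, consistent with the "over 70%" stated in the abstract.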
Edge-SIFT: discriminative binary descriptor for scalable partial-duplicate mobile search.
Zhang, Shiliang; Tian, Qi; Lu, Ke; Huang, Qingming; Gao, Wen
2013-07-01
As the basis of large-scale partial duplicate visual search on mobile devices, image local descriptor is expected to be discriminative, efficient, and compact. Our study shows that the popularly used histogram-based descriptors, such as scale invariant feature transform (SIFT) are not optimal for this task. This is mainly because histogram representation is relatively expensive to compute on mobile platforms and loses significant spatial clues, which are important for improving discriminative power and matching near-duplicate image patches. To address these issues, we propose to extract a novel binary local descriptor named Edge-SIFT from the binary edge maps of scale- and orientation-normalized image patches. By preserving both locations and orientations of edges and compressing the sparse binary edge maps with a boosting strategy, the final Edge-SIFT shows strong discriminative power with compact representation. Furthermore, we propose a fast similarity measurement and an indexing framework with flexible online verification. Hence, the Edge-SIFT allows an accurate and efficient image search and is ideal for computation sensitive scenarios such as a mobile image search. Experiments on a large-scale dataset manifest that the Edge-SIFT shows superior retrieval accuracy to Oriented BRIEF (ORB) and is superior to SIFT in the aspects of retrieval precision, efficiency, compactness, and transmission cost.
Block Architecture Problem with Depth First Search Solution and Its Application
NASA Astrophysics Data System (ADS)
Rahim, Robbi; Abdullah, Dahlan; Simarmata, Janner; Pranolo, Andri; Saleh Ahmar, Ansari; Hidayat, Rahmat; Napitupulu, Darmawan; Nurdiyanto, Heri; Febriadi, Bayu; Zamzami, Z.
2018-01-01
Searching is a common process performed by many computer users. The Raita algorithm is one algorithm that can be used to match and find information according to the patterns entered. Here the Raita algorithm is applied to a file search application written in the Java programming language; in testing, the file search ran quickly, returned accurate results, and supported many data types.
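For readers unfamiliar with it, the Raita algorithm is a Boyer-Moore-Horspool variant that checks the last, first, and middle characters of the window before comparing the rest. The sketch below is a generic in-memory version for illustration only; it is not the authors' Java file-search application, and the function name is hypothetical.

```python
def raita_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Horspool-style bad-character table: for each character in pattern[:-1],
    # the distance from its last occurrence to the end of the pattern.
    shift = {}
    for i in range(m - 1):
        shift[pattern[i]] = m - 1 - i
    first, middle, last = pattern[0], pattern[m // 2], pattern[m - 1]
    matches = []
    pos = 0
    while pos <= n - m:
        window_last = text[pos + m - 1]
        # Cheap checks first (last, first, middle), then the full comparison.
        if (window_last == last
                and text[pos] == first
                and text[pos + m // 2] == middle
                and text[pos:pos + m] == pattern):
            matches.append(pos)
        # Shift by the table entry for the window's last character.
        pos += shift.get(window_last, m)
    return matches
```

For example, `raita_search("abracadabra", "abra")` returns `[0, 7]`.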
Predicting consumer behavior with Web search.
Goel, Sharad; Hofman, Jake M; Lahaie, Sébastien; Pennock, David M; Watts, Duncan J
2010-10-12
Recent work has demonstrated that Web search volume can "predict the present," meaning that it can be used to accurately track outcomes such as unemployment levels, auto and home sales, and disease prevalence in near real time. Here we show that what consumers are searching for online can also predict their collective future behavior days or even weeks in advance. Specifically we use search query volume to forecast the opening weekend box-office revenue for feature films, first-month sales of video games, and the rank of songs on the Billboard Hot 100 chart, finding in all cases that search counts are highly predictive of future outcomes. We also find that search counts generally boost the performance of baseline models fit on other publicly available data, where the boost varies from modest to dramatic, depending on the application in question. Finally, we reexamine previous work on tracking flu trends and show that, perhaps surprisingly, the utility of search data relative to a simple autoregressive model is modest. We conclude that in the absence of other data sources, or where small improvements in predictive performance are material, search queries provide a useful guide to the near future.
Genetic algorithms as global random search methods
NASA Technical Reports Server (NTRS)
Peck, Charles C.; Dhawan, Atam P.
1995-01-01
Genetic algorithm behavior is described in terms of the construction and evolution of the sampling distributions over the space of candidate solutions. This novel perspective is motivated by analysis indicating that the schema theory is inadequate for completely and properly explaining genetic algorithm behavior. Based on the proposed theory, it is argued that the similarities of candidate solutions should be exploited directly, rather than encoding candidate solutions and then exploiting their similarities. Proportional selection is characterized as a global search operator, and recombination is characterized as the search process that exploits similarities. Sequential algorithms and many deletion methods are also analyzed. It is shown that by properly constraining the search breadth of recombination operators, convergence of genetic algorithms to a global optimum can be ensured.
Novel citation-based search method for scientific literature: application to meta-analyses.
Janssens, A Cecile J W; Gwinn, M
2015-10-13
Finding eligible studies for meta-analysis and systematic reviews relies on keyword-based searching as the gold standard, despite its inefficiency. Searching based on direct citations is not sufficiently comprehensive. We propose a novel strategy that ranks articles on their degree of co-citation with one or more "known" articles before reviewing their eligibility. In two independent studies, we aimed to reproduce the results of literature searches for sets of published meta-analyses (n = 10 and n = 42). For each meta-analysis, we extracted co-citations for the randomly selected 'known' articles from the Web of Science database, counted their frequencies and screened all articles with a score above a selection threshold. In the second study, we extended the method by retrieving direct citations for all selected articles. In the first study, we retrieved 82% of the studies included in the meta-analyses while screening only 11% as many articles as were screened for the original publications. Articles that we missed were published in non-English languages, published before 1975, published very recently, or available only as conference abstracts. In the second study, we retrieved 79% of included studies while screening half the original number of articles. Citation searching appears to be an efficient and reasonably accurate method for finding articles similar to one or more articles of interest for meta-analysis and reviews.
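The ranking step described above (count how often each article is co-cited with a "known" article, then screen those scoring above a threshold) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the input is taken to be a mapping from each citing paper to its reference list, and the names `cocitation_scores` and `screen` are hypothetical. The study itself drew co-citations from the Web of Science database, and its exact scoring may differ.

```python
from collections import Counter


def cocitation_scores(citing_to_refs, known):
    """Count, for each candidate article, the citing papers that cite it
    together with at least one known article (assumed scoring rule)."""
    known = set(known)
    scores = Counter()
    for refs in citing_to_refs.values():
        refs = set(refs)
        if refs & known:              # this paper cites a known article
            for article in refs - known:
                scores[article] += 1  # so everything else it cites is co-cited
    return scores


def screen(scores, threshold):
    """Return articles scoring above the threshold, highest score first."""
    selected = [a for a, s in scores.items() if s > threshold]
    return sorted(selected, key=lambda a: (-scores[a], a))


# Toy data: p1..p4 are citing papers, k1 and k2 the known articles.
citing = {
    "p1": ["k1", "a", "b"],
    "p2": ["k1", "a"],
    "p3": ["a", "c"],
    "p4": ["k2", "b", "a"],
}
scores = cocitation_scores(citing, ["k1", "k2"])
```

With this toy data, article `a` is co-cited three times and `b` twice, while `c` is never co-cited with a known article and therefore never reaches screening.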
Comparing image search behaviour in the ARRS GoldMiner search engine and a clinical PACS/RIS.
De-Arteaga, Maria; Eggel, Ivan; Do, Bao; Rubin, Daniel; Kahn, Charles E; Müller, Henning
2015-08-01
Information search has changed the way we manage knowledge and the ubiquity of information access has made search a frequent activity, whether via Internet search engines or increasingly via mobile devices. Medical information search is in this respect no different and much research has been devoted to analyzing the way in which physicians aim to access information. Medical image search is a much smaller domain but has gained much attention as it has different characteristics than search for text documents. While web search log files have been analysed many times to better understand user behaviour, the log files of hospital internal systems for search in a PACS/RIS (Picture Archival and Communication System, Radiology Information System) have rarely been analysed. Such a comparison between a hospital PACS/RIS search and a web system for searching images of the biomedical literature is the goal of this paper. Objectives are to identify similarities and differences in search behaviour of the two systems, which could then be used to optimize existing systems and build new search engines. Log files of the ARRS GoldMiner medical image search engine (freely accessible on the Internet) containing 222,005 queries, and log files of Stanford's internal PACS/RIS search called radTF containing 18,068 queries were analysed. Each query was preprocessed and all query terms were mapped to the RadLex (Radiology Lexicon) terminology, a comprehensive lexicon of radiology terms created and maintained by the Radiological Society of North America, so the semantic content in the queries and the links between terms could be analysed, and synonyms for the same concept could be detected. RadLex was mainly created for the use in radiology reports, to aid structured reporting and the preparation of educational material (Lanlotz, 2006) [1]. In standard medical vocabularies such as MeSH (Medical Subject Headings) and UMLS (Unified Medical Language System) specific terms of radiology are often
Accurate Grid-based Clustering Algorithm with Diagonal Grid Searching and Merging
NASA Astrophysics Data System (ADS)
Liu, Feng; Ye, Chengcheng; Zhu, Erzhou
2017-09-01
Due to the advent of big data, data mining technology has attracted more and more attention. As an important data analysis method, the grid clustering algorithm is fast but has relatively low accuracy. This paper presents an improved clustering algorithm that combines grid and density parameters. The algorithm first divides the data space into valid meshes and invalid meshes through grid parameters. Secondly, from the starting point located at the first point of the diagonal of the grids, the algorithm takes the direction of “horizontal right, vertical down” to merge the valid meshes. Furthermore, through boundary grid processing, the invalid grids are searched and merged when the adjacent left, above, and diagonal-direction grids are all valid ones. By doing this, the accuracy of clustering is improved. The experimental results show that the proposed algorithm is accurate and relatively fast when compared with some popularly used algorithms.
Nakashima, Ryoichi; Yokosawa, Kazuhiko
2013-02-01
A common search paradigm requires observers to search for a target among undivided spatial arrays of many items. Yet our visual environment is populated with items that are typically arranged within smaller (subdivided) spatial areas outlined by dividers (e.g., frames). It remains unclear how dividers impact visual search performance. In this study, we manipulated the presence and absence of frames and the number of frames subdividing search displays. Observers searched for a target O among Cs, a typically inefficient search task, and for a target C among Os, a typically efficient search. The results indicated that the presence of divider frames in a search display initially interferes with visual search tasks when targets are quickly detected (i.e., efficient search), leading to early interference; conversely, frames later facilitate visual search in tasks in which targets take longer to detect (i.e., inefficient search), leading to late facilitation. Such interference and facilitation appear only for conditions with a specific number of frames. Relative to previous studies of grouping (due to item proximity or similarity), these findings suggest that frame enclosures of multiple items may induce a grouping effect that influences search performance.
Accurate mass measurements and their appropriate use for reliable analyte identification.
Godfrey, A Ruth; Brenton, A Gareth
2012-09-01
Accurate mass instrumentation is becoming increasingly available to non-expert users. This data can be mis-used, particularly for analyte identification. Current best practice in assigning potential elemental formula for reliable analyte identification has been described with modern informatic approaches to analyte elucidation, including chemometric characterisation, data processing and searching using facilities such as the Chemical Abstracts Service (CAS) Registry and Chemspider.
New generation of the multimedia search engines
NASA Astrophysics Data System (ADS)
Mijes Cruz, Mario Humberto; Soto Aldaco, Andrea; Maldonado Cano, Luis Alejandro; López Rodríguez, Mario; Rodríguez Vázqueza, Manuel Antonio; Amaya Reyes, Laura Mariel; Cano Martínez, Elizabeth; Pérez Rosas, Osvaldo Gerardo; Rodríguez Espejo, Luis; Flores Secundino, Jesús Abimelek; Rivera Martínez, José Luis; García Vázquez, Mireya Saraí; Zamudio Fuentes, Luis Miguel; Sánchez Valenzuela, Juan Carlos; Montoya Obeso, Abraham; Ramírez Acosta, Alejandro Álvaro
2016-09-01
Current search engines are based upon search methods that combine words (text-based search), which has been efficient until now. However, the Internet's growing demand indicates that its content becomes more diverse with each passing day. Text-based searches are becoming limited, as most of the information on the Internet can be found in different types of content known as multimedia content (images, audio files, video files). What needs to be improved in current search engines is search content and precision, as well as an accurate display of the search results the user expects. Any search can be more precise if it uses more text parameters, but this does not improve the content or speed of the search itself. One solution is to improve them by characterizing the content of multimedia files for search. In this article, an analysis of new-generation multimedia search engines is presented, focusing on the needs arising from new technologies. Multimedia content has become a central part of the flow of information in our daily life. This reflects the necessity of having multimedia search engines, as well as of knowing the real tasks that they must fulfil. This analysis shows that there are not many search engines that can perform content searches. Research on new-generation multimedia search engines is a multidisciplinary area in constant growth, generating tools that satisfy the different needs of new-generation systems.
Haunted by a doppelgänger: irrelevant facial similarity affects rule-based judgments.
von Helversen, Bettina; Herzog, Stefan M; Rieskamp, Jörg
2014-01-01
Judging other people is a common and important task. Every day professionals make decisions that affect the lives of other people when they diagnose medical conditions, grant parole, or hire new employees. To prevent discrimination, professional standards require that decision makers render accurate and unbiased judgments solely based on relevant information. Facial similarity to previously encountered persons can be a potential source of bias. Psychological research suggests that people only rely on similarity-based judgment strategies if the provided information does not allow them to make accurate rule-based judgments. Our study shows, however, that facial similarity to previously encountered persons influences judgment even in situations in which relevant information is available for making accurate rule-based judgments and where similarity is irrelevant for the task and relying on similarity is detrimental. In two experiments in an employment context we show that applicants who looked similar to high-performing former employees were judged as more suitable than applicants who looked similar to low-performing former employees. This similarity effect was found despite the fact that the participants used the relevant résumé information about the applicants by following a rule-based judgment strategy. These findings suggest that similarity-based and rule-based processes simultaneously underlie human judgment.
Implicit Object Naming in Visual Search: Evidence from Phonological Competition
Walenchok, Stephen C.; Hout, Michael C.; Goldinger, Stephen D.
2016-01-01
During visual search, people are distracted by objects that visually resemble search targets; search is impaired when targets and distractors share overlapping features. In this study, we examined whether a nonvisual form of similarity, overlapping object names, can also affect search performance. In three experiments, people searched for images of real-world objects (e.g., a beetle) among items whose names either all shared the same phonological onset (/bi/), or were phonologically varied. Participants either searched for one or three potential targets per trial, with search targets designated either visually or verbally. We examined standard visual search (Experiments 1 and 3) and a self-paced serial search task wherein participants manually rejected each distractor (Experiment 2). We hypothesized that people would maintain visual templates when searching for single targets, but would rely more on object names when searching for multiple items and when targets were verbally cued. This reliance on target names would make performance susceptible to interference from similar-sounding distractors. Experiments 1 and 2 showed the predicted interference effect in conditions with high memory load and verbal cues. In Experiment 3, eye-movement results showed that phonological interference resulted from small increases in dwell time to all distractors. The results suggest that distractor names are implicitly activated during search, slowing attention disengagement when targets and distractors share similar names. PMID:27531018
Online Information Search Performance and Search Strategies in a Health Problem-Solving Scenario.
Sharit, Joseph; Taha, Jessica; Berkowsky, Ronald W; Profita, Halley; Czaja, Sara J
2015-01-01
Although access to Internet health information can be beneficial, solving complex health-related problems online is challenging for many individuals. In this study, we investigated the performance of a sample of 60 adults ages 18 to 85 years in using the Internet to resolve a relatively complex health information problem. The impact of age, Internet experience, and cognitive abilities on measures of search time, amount of search, and search accuracy was examined, and a model of Internet information seeking was developed to guide the characterization of participants' search strategies. Internet experience was found to have no impact on performance measures. Older participants exhibited longer search times and lower amounts of search but similar search accuracy performance as their younger counterparts. Overall, greater search accuracy was related to an increased amount of search but not to increased search duration and was primarily attributable to higher cognitive abilities, such as processing speed, reasoning ability, and executive function. There was a tendency for those who were younger, had greater Internet experience, and had higher cognitive abilities to use a bottom-up (i.e., analytic) search strategy, although use of a top-down (i.e., browsing) strategy was not necessarily unsuccessful. Implications of the findings for future studies and design interventions are discussed.
Online Information Search Performance and Search Strategies in a Health Problem-Solving Scenario
Sharit, Joseph; Taha, Jessica; Berkowsky, Ronald W.; Profita, Halley; Czaja, Sara J.
2017-01-01
Although access to Internet health information can be beneficial, solving complex health-related problems online is challenging for many individuals. In this study, we investigated the performance of a sample of 60 adults ages 18 to 85 years in using the Internet to resolve a relatively complex health information problem. The impact of age, Internet experience, and cognitive abilities on measures of search time, amount of search, and search accuracy was examined, and a model of Internet information seeking was developed to guide the characterization of participants’ search strategies. Internet experience was found to have no impact on performance measures. Older participants exhibited longer search times and lower amounts of search but similar search accuracy performance as their younger counterparts. Overall, greater search accuracy was related to an increased amount of search but not to increased search duration and was primarily attributable to higher cognitive abilities, such as processing speed, reasoning ability, and executive function. There was a tendency for those who were younger, had greater Internet experience, and had higher cognitive abilities to use a bottom-up (i.e., analytic) search strategy, although use of a top-down (i.e., browsing) strategy was not necessarily unsuccessful. Implications of the findings for future studies and design interventions are discussed. PMID:29056885
Turning Search into Knowledge Management.
ERIC Educational Resources Information Center
Kaufman, David
2002-01-01
Discussion of knowledge management for electronic data focuses on creating a high quality similarity ranking algorithm. Topics include similarity ranking and unstructured data management; searching, categorization, and summarization of documents; query evaluation; considering sentences in addition to keywords; and vector models. (LRW)
Tachyon search speeds up retrieval of similar sequences by several orders of magnitude.
Tan, Joshua; Kuchibhatla, Durga; Sirota, Fernanda L; Sherman, Westley A; Gattermayer, Tobias; Kwoh, Chia Yee; Eisenhaber, Frank; Schneider, Georg; Maurer-Stroh, Sebastian
2012-06-15
The usage of current sequence search tools becomes increasingly slower as databases of protein sequences continue to grow exponentially. Tachyon, a new algorithm that identifies closely related protein sequences ~200 times faster than standard BLAST, circumvents this limitation with a reduced database and oligopeptide matching heuristic. The tool is publicly accessible as a webserver at http://tachyon.bii.a-star.edu.sg and can also be accessed programmatically through SOAP.
Predicting consumer behavior with Web search
Goel, Sharad; Hofman, Jake M.; Lahaie, Sébastien; Pennock, David M.; Watts, Duncan J.
2010-01-01
Recent work has demonstrated that Web search volume can “predict the present,” meaning that it can be used to accurately track outcomes such as unemployment levels, auto and home sales, and disease prevalence in near real time. Here we show that what consumers are searching for online can also predict their collective future behavior days or even weeks in advance. Specifically we use search query volume to forecast the opening weekend box-office revenue for feature films, first-month sales of video games, and the rank of songs on the Billboard Hot 100 chart, finding in all cases that search counts are highly predictive of future outcomes. We also find that search counts generally boost the performance of baseline models fit on other publicly available data, where the boost varies from modest to dramatic, depending on the application in question. Finally, we reexamine previous work on tracking flu trends and show that, perhaps surprisingly, the utility of search data relative to a simple autoregressive model is modest. We conclude that in the absence of other data sources, or where small improvements in predictive performance are material, search queries provide a useful guide to the near future. PMID:20876140
Generating Personalized Web Search Using Semantic Context
Xu, Zheng; Chen, Hai-Yan; Yu, Jie
2015-01-01
The “one size fits the all” criticism of search engines is that when queries are submitted, the same results are returned to different users. In order to solve this problem, personalized search is proposed, since it can provide different search results based upon the preferences of users. However, existing methods concentrate more on the long-term and independent user profile, and thus reduce the effectiveness of personalized search. In this paper, the method captures the user context to provide accurate preferences of users for effectively personalized search. First, the short-term query context is generated to identify related concepts of the query. Second, the user context is generated based on the click through data of users. Finally, a forgetting factor is introduced to merge the independent user context in a user session, which maintains the evolution of user preferences. Experimental results fully confirm that our approach can successfully represent user context according to individual user information needs. PMID:26000335
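The abstract does not give the formula behind its forgetting factor, so the following Python sketch shows only one plausible reading: each query in a session contributes a weighted set of concepts, and older contributions are decayed geometrically before newer ones are added. The function name `merge_contexts`, the dictionary representation, and the default decay of 0.5 are all assumptions for illustration, not the authors' model.

```python
def merge_contexts(contexts, forgetting=0.5):
    """Merge per-query concept weights, oldest first, decaying older weights.

    contexts   : list of {concept: weight} dicts, one per query in a session
    forgetting : multiplier in (0, 1] applied to accumulated weights at each
                 step (hypothetical parameter; 1.0 means nothing is forgotten)
    """
    merged = {}
    for ctx in contexts:
        # Decay everything remembered so far, then add the newest context.
        merged = {concept: w * forgetting for concept, w in merged.items()}
        for concept, w in ctx.items():
            merged[concept] = merged.get(concept, 0.0) + w
    return merged


# Toy session of two queries.
session = [
    {"jaguar car": 1.0},
    {"jaguar car": 1.0, "engine": 1.0},
]
profile = merge_contexts(session, forgetting=0.5)
```

In this toy session the repeated concept ends with weight 1.5 (0.5 carried over plus 1.0 new), while the concept seen only in the latest query has weight 1.0, so recent and recurring interests both stay prominent.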
Interest in Anesthesia as Reflected by Keyword Searches using Common Search Engines.
Liu, Renyu; García, Paul S; Fleisher, Lee A
2012-01-23
Since current general interest in anesthesia is unknown, we analyzed internet keyword searches to gauge general interest in anesthesia in comparison with surgery and pain. The trend of keyword searches from 2004 to 2010 related to anesthesia and anaesthesia was investigated using Google Insights for Search. The trend of number of peer reviewed articles on anesthesia cited on PubMed and Medline from 2004 to 2010 was investigated. The average cost on advertising on anesthesia, surgery and pain was estimated using Google AdWords. Searching results in other common search engines were also analyzed. Correlation between year and relative number of searches was determined with p< 0.05 considered statistically significant. Searches for the keyword "anesthesia" or "anaesthesia" diminished since 2004 reflected by Google Insights for Search (p< 0.05). The search for "anesthesia side effects" is trending up over the same time period while the search for "anesthesia and safety" is trending down. The search phrase "before anesthesia" is searched more frequently than "preanesthesia" and the search for "before anesthesia" is trending up. Using "pain" as a keyword is steadily increasing over the years indicated. While different search engines may provide different total number of searching results (available posts), the ratios of searching results between some common keywords related to perioperative care are comparable, indicating similar trend. The peer reviewed manuscripts on "anesthesia" and the proportion of papers on "anesthesia and outcome" are trending up. Estimates for spending of advertising dollars are less for anesthesia-related terms when compared to that for pain or surgery due to relative smaller number of searching traffic. General interest in anesthesia (anaesthesia) as measured by internet searches appears to be decreasing. Pain, preanesthesia evaluation, anesthesia and outcome and side effects of anesthesia are the critical areas that anesthesiologists should
ERIC Educational Resources Information Center
Almeida, Renita A.; Dickinson, J. Edwin; Maybery, Murray T.; Badcock, Johanna C.; Badcock, David R.
2010-01-01
The Embedded Figures Test (EFT) requires detecting a shape within a complex background and individuals with autism or high Autism-spectrum Quotient (AQ) scores are faster and more accurate on this task than controls. This research aimed to uncover the visual processes producing this difference. Previously we developed a search task using radial…
SPARK: Adapting Keyword Query to Semantic Search
NASA Astrophysics Data System (ADS)
Zhou, Qi; Wang, Chong; Xiong, Miao; Wang, Haofen; Yu, Yong
Semantic search promises to provide more accurate results than present-day keyword search. However, progress with semantic search has been delayed due to the complexity of its query languages. In this paper, we explore a novel approach of adapting keywords to querying the semantic web: the approach automatically translates keyword queries into formal logic queries so that end users can use familiar keywords to perform semantic search. A prototype system named 'SPARK' has been implemented in light of this approach. Given a keyword query, SPARK outputs a ranked list of SPARQL queries as the translation result. The translation in SPARK consists of three major steps: term mapping, query graph construction and query ranking. Specifically, a probabilistic query ranking model is proposed to select the most likely SPARQL query. In the experiment, SPARK achieved an encouraging translation result.
Wang, Dayong; Otto, Charles; Jain, Anil K
2017-06-01
Given the prevalence of social media websites, one challenge facing computer vision researchers is to devise methods to search for persons of interest among the billions of shared photos on these websites. Despite significant progress in face recognition, searching a large collection of unconstrained face images remains a difficult problem. To address this challenge, we propose a face search system which combines a fast search procedure, coupled with a state-of-the-art commercial off the shelf (COTS) matcher, in a cascaded framework. Given a probe face, we first filter the large gallery of photos to find the top-k most similar faces using features learned by a convolutional neural network. The k retrieved candidates are re-ranked by combining similarities based on deep features and those output by the COTS matcher. We evaluate the proposed face search system on a gallery containing 80 million web-downloaded face images. Experimental results demonstrate that while the deep features perform worse than the COTS matcher on a mugshot dataset (93.7 percent versus 98.6 percent TAR@FAR of 0.01 percent), fusing the deep features with the COTS matcher improves the overall performance (99.5 percent TAR@FAR of 0.01 percent). This shows that the learned deep features provide complementary information over representations used in state-of-the-art face matchers. On the unconstrained face image benchmarks, the performance of the learned deep features is competitive with reported accuracies. LFW database: 98.20 percent accuracy under the standard protocol and 88.03 percent TAR@FAR of 0.1 percent under the BLUFR protocol; IJB-A benchmark: 51.0 percent TAR@FAR of 0.1 percent (verification), rank 1 retrieval of 82.2 percent (closed-set search), 61.5 percent FNIR@FAR of 1 percent (open-set search). The proposed face search system offers an excellent trade-off between accuracy and scalability on galleries with millions of images. Additionally, in a face search experiment involving
Health search engine with e-document analysis for reliable search results.
Gaudinat, Arnaud; Ruch, Patrick; Joubert, Michel; Uziel, Philippe; Strauss, Anne; Thonnet, Michèle; Baud, Robert; Spahni, Stéphane; Weber, Patrick; Bonal, Juan; Boyer, Celia; Fieschi, Marius; Geissbuhler, Antoine
2006-01-01
After a review of the existing practical solutions available to citizens for retrieving eHealth documents, the paper describes an original specialized search engine, WRAPIN. WRAPIN uses advanced cross-lingual information retrieval technologies to check information quality by synthesizing medical concepts, conclusions and references contained in the health literature, to identify accurate, relevant sources. Thanks to MeSH terminology [1] (Medical Subject Headings from the U.S. National Library of Medicine) and advanced approaches such as conclusion extraction from structured documents and reformulation of the query, WRAPIN offers the user privileged access to navigate through multilingual documents without language or medical prerequisites. The results of an evaluation conducted on the WRAPIN prototype show that results of the WRAPIN search engine are perceived by users as informative (65%, versus 59% for a general-purpose search engine) and as reliable and trustworthy (72%, versus 41% for the other engine). However, there is room for improvement, such as increasing database coverage, explaining the original functionalities, and adapting to the audience. Thanks to the evaluation outcomes, WRAPIN is now in operation on the HON web site (http://www.healthonnet.org), free of charge. Intended for citizens, it is a good alternative to general-purpose search engines when the user looks up trustworthy health and medical information or wants to automatically check doubtful content of a Web page.
NIBBS-search for fast and accurate prediction of phenotype-biased metabolic systems.
Schmidt, Matthew C; Rocha, Andrea M; Padmanabhan, Kanchana; Shpanskaya, Yekaterina; Banfield, Jill; Scott, Kathleen; Mihelcic, James R; Samatova, Nagiza F
2012-01-01
Understanding of genotype-phenotype associations is important not only for furthering our knowledge on internal cellular processes, but also essential for providing the foundation necessary for genetic engineering of microorganisms for industrial use (e.g., production of bioenergy or biofuels). However, genotype-phenotype associations alone do not provide enough information to alter an organism's genome to either suppress or exhibit a phenotype. It is important to look at the phenotype-related genes in the context of the genome-scale network to understand how the genes interact with other genes in the organism. Identification of metabolic subsystems involved in the expression of the phenotype is one way of placing the phenotype-related genes in the context of the entire network. A metabolic system refers to a metabolic network subgraph; nodes are compounds and edges labels are the enzymes that catalyze the reaction. The metabolic subsystem could be part of a single metabolic pathway or span parts of multiple pathways. Arguably, comparative genome-scale metabolic network analysis is a promising strategy to identify these phenotype-related metabolic subsystems. Network Instance-Based Biased Subgraph Search (NIBBS) is a graph-theoretic method for genome-scale metabolic network comparative analysis that can identify metabolic systems that are statistically biased toward phenotype-expressing organismal networks. We set up experiments with target phenotypes like hydrogen production, TCA expression, and acid-tolerance. We show via extensive literature search that some of the resulting metabolic subsystems are indeed phenotype-related and formulate hypotheses for other systems in terms of their role in phenotype expression. NIBBS is also orders of magnitude faster than MULE, one of the most efficient maximal frequent subgraph mining algorithms that could be adjusted for this problem. Also, the set of phenotype-biased metabolic systems output by NIBBS comes very close to
NIBBS-Search for Fast and Accurate Prediction of Phenotype-Biased Metabolic Systems
Padmanabhan, Kanchana; Shpanskaya, Yekaterina; Banfield, Jill; Scott, Kathleen; Mihelcic, James R.; Samatova, Nagiza F.
2012-01-01
Understanding of genotype-phenotype associations is important not only for furthering our knowledge on internal cellular processes, but also essential for providing the foundation necessary for genetic engineering of microorganisms for industrial use (e.g., production of bioenergy or biofuels). However, genotype-phenotype associations alone do not provide enough information to alter an organism's genome to either suppress or exhibit a phenotype. It is important to look at the phenotype-related genes in the context of the genome-scale network to understand how the genes interact with other genes in the organism. Identification of metabolic subsystems involved in the expression of the phenotype is one way of placing the phenotype-related genes in the context of the entire network. A metabolic system refers to a metabolic network subgraph; nodes are compounds and edges labels are the enzymes that catalyze the reaction. The metabolic subsystem could be part of a single metabolic pathway or span parts of multiple pathways. Arguably, comparative genome-scale metabolic network analysis is a promising strategy to identify these phenotype-related metabolic subsystems. Network Instance-Based Biased Subgraph Search (NIBBS) is a graph-theoretic method for genome-scale metabolic network comparative analysis that can identify metabolic systems that are statistically biased toward phenotype-expressing organismal networks. We set up experiments with target phenotypes like hydrogen production, TCA expression, and acid-tolerance. We show via extensive literature search that some of the resulting metabolic subsystems are indeed phenotype-related and formulate hypotheses for other systems in terms of their role in phenotype expression. NIBBS is also orders of magnitude faster than MULE, one of the most efficient maximal frequent subgraph mining algorithms that could be adjusted for this problem. Also, the set of phenotype-biased metabolic systems output by NIBBS comes very close to
Dual Target Search is Neither Purely Simultaneous nor Purely Successive.
Cave, Kyle R; Menneer, Tamaryn; Nomani, Mohammad S; Stroud, Michael J; Donnelly, Nick
2017-08-31
Previous research shows that visual search for two different targets is less efficient than search for a single target. Stroud, Menneer, Cave and Donnelly (2012) concluded that two target colours are represented separately based on modeling the fixation patterns. Although those analyses provide evidence for two separate target representations, they do not show whether participants search simultaneously for both targets, or first search for one target and then the other. Some studies suggest that multiple target representations are simultaneously active, while others indicate that search can be voluntarily simultaneous, or switching, or a mixture of both. Stroud et al.'s participants were not explicitly instructed to use any particular strategy. These data were revisited to determine which strategy was employed. Each fixated item was categorised according to whether its colour was more similar to one target or the other. Once an item similar to one target is fixated, the next fixated item is more likely to be similar to that target than the other, showing that at a given moment during search, one target is generally favoured. However, the search for one target is not completed before search for the other begins. Instead, there are often short runs of one or two fixations to distractors similar to one target, with each run followed by a switch to the other target. Thus, the results suggest that one target is more highly weighted than the other at any given time, but not to the extent that search is purely successive.
Modelling eye movements in a categorical search task
Zelinsky, Gregory J.; Adeli, Hossein; Peng, Yifan; Samaras, Dimitris
2013-01-01
We introduce a model of eye movements during categorical search, the task of finding and recognizing categorically defined targets. It extends a previous model of eye movements during search (target acquisition model, TAM) by using distances from a support vector machine classification boundary to create probability maps indicating pixel-by-pixel evidence for the target category in search images. Other additions include functionality enabling target-absent searches, and a fixation-based blurring of the search images now based on a mapping between visual and collicular space. We tested this model on images from a previously conducted variable set-size (6/13/20) present/absent search experiment where participants searched for categorically defined teddy bear targets among random category distractors. The model not only captured target-present/absent set-size effects, but also accurately predicted for all conditions the numbers of fixations made prior to search judgements. It also predicted the percentages of first eye movements during search landing on targets, a conservative measure of search guidance. Effects of set size on false negative and false positive errors were also captured, but error rates in general were overestimated. We conclude that visual features discriminating a target category from non-targets can be learned and used to guide eye movements during categorical search. PMID:24018720
Guiding Conformation Space Search with an All-Atom Energy Potential
Brunette, TJ; Brock, Oliver
2009-01-01
The most significant impediment for protein structure prediction is the inadequacy of conformation space search. Conformation space is too large and the energy landscape too rugged for existing search methods to consistently find near-optimal minima. To alleviate this problem, we present model-based search, a novel conformation space search method. Model-based search uses highly accurate information obtained during search to build an approximate, partial model of the energy landscape. Model-based search aggregates information in the model as it progresses, and in turn uses this information to guide exploration towards regions most likely to contain a near-optimal minimum. We validate our method by predicting the structure of 32 proteins, ranging in length from 49 to 213 amino acids. Our results demonstrate that model-based search is more effective at finding low-energy conformations in high-dimensional conformation spaces than existing search methods. The reduction in energy translates into structure predictions of increased accuracy. PMID:18536015
Analysis of a librarian-mediated literature search service.
Friesen, Carol; Lê, Mê-Linh; Cooke, Carol; Raynard, Melissa
2015-01-01
Librarian-mediated literature searching is a key service provided at medical libraries. This analysis outlines ten years of data on 19,248 literature searches and describes information on the volume and frequency of search requests, time spent per search, databases used, and professional designations of the patron requestors. Combined with information on best practices for expert searching and evaluations of similar services, these findings were used to form recommendations on the improvement and standardization of a literature search service at a large health library system.
Optimal directed searches for continuous gravitational waves
NASA Astrophysics Data System (ADS)
Ming, Jing; Krishnan, Badri; Papa, Maria Alessandra; Aulbert, Carsten; Fehrmann, Henning
2016-03-01
Wide parameter space searches for long-lived continuous gravitational wave signals are computationally limited. It is therefore critically important that the available computational resources are used rationally. In this paper we consider directed searches, i.e., targets for which the sky position is known accurately but the frequency and spin-down parameters are completely unknown. Given a list of such potential astrophysical targets, we therefore need to prioritize. On which target(s) should we spend scarce computing resources? What parameter space region in frequency and spin-down should we search through? Finally, what is the optimal search setup that we should use? In this paper we present a general framework that allows us to solve all three of these problems. This framework is based on maximizing the probability of making a detection subject to a constraint on the maximum available computational cost. We illustrate the method for a simplified problem.
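The idea of maximizing detection probability under a computing budget can be illustrated with a toy sketch. This is not the framework from the paper: it assumes a short list of hypothetical candidate searches, each with a made-up cost and an independent detection probability, and simply brute-forces every affordable subset. The names `best_allocation`, `candidates`, and `budget` are illustrative only.

```python
from itertools import combinations
from math import prod


def best_allocation(candidates, budget):
    """Pick the affordable subset with the highest chance of >= 1 detection.

    candidates: list of (name, cost, detection_probability) tuples.
    Assumption (not from the abstract): detections are independent, so
    P(at least one detection) = 1 - prod(1 - p_i) over the chosen searches.
    """
    best_names, best_p = (), 0.0
    for r in range(1, len(candidates) + 1):
        for subset in combinations(candidates, r):
            cost = sum(c for _, c, _ in subset)
            if cost > budget:
                continue  # violates the computational-cost constraint
            p_any = 1.0 - prod(1.0 - p for _, _, p in subset)
            if p_any > best_p:
                best_names = tuple(n for n, _, _ in subset)
                best_p = p_any
    return best_names, best_p


# Hypothetical targets with invented costs and probabilities.
candidates = [("A", 4, 0.10), ("B", 3, 0.08), ("C", 2, 0.05)]
names, p = best_allocation(candidates, budget=5)
print(names, round(p, 3))
```

With these invented numbers, two cheaper searches together beat the single most promising one, which is the kind of trade-off a prioritization scheme has to weigh. Exhaustive enumeration only scales to a handful of candidates.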
Improving visual search in instruction manuals using pictograms.
Kovačević, Dorotea; Brozović, Maja; Možina, Klementina
2016-11-01
Instruction manuals provide important messages about the proper use of a product. They should communicate in such a way that they facilitate users' searches for specific information. Despite the increasing research interest in visual search, there is a lack of empirical knowledge concerning the role of pictograms in search performance during the browsing of a manual's pages. This study investigates how the inclusion of pictograms improves the search for the target information. Furthermore, it examines whether this search process is influenced by the visual similarity between the pictograms and the searched for information. On the basis of eye-tracking measurements, as objective indicators of the participants' visual attention, it was found that pictograms can be a useful element of search strategy. Another interesting finding was that boldface highlighting is a more effective method for improving user experience in information seeking, rather than the similarity between the pictorial and adjacent textual information. Implications for designing effective user manuals are discussed. Practitioner Summary: Users often view instruction manuals with the aim of finding specific information. We used eye-tracking technology to examine different manual pages in order to improve the user's visual search for target information. The results indicate that the use of pictograms and bold highlighting of relevant information facilitate the search process.
Natural Language Search Interfaces: Health Data Needs Single-Field Variable Search.
Jay, Caroline; Harper, Simon; Dunlop, Ian; Smith, Sam; Sufi, Shoaib; Goble, Carole; Buchan, Iain
2016-01-14
Data discovery, particularly the discovery of key variables and their inter-relationships, is key to secondary data analysis, and in-turn, the evolving field of data science. Interface designers have presumed that their users are domain experts, and so they have provided complex interfaces to support these "experts." Such interfaces hark back to a time when searches needed to be accurate first time as there was a high computational cost associated with each search. Our work is part of a governmental research initiative between the medical and social research funding bodies to improve the use of social data in medical research. The cross-disciplinary nature of data science can make no assumptions regarding the domain expertise of a particular scientist, whose interests may intersect multiple domains. Here we consider the common requirement for scientists to seek archived data for secondary analysis. This has more in common with search needs of the "Google generation" than with their single-domain, single-tool forebears. Our study compares a Google-like interface with traditional ways of searching for noncomplex health data in a data archive. Two user interfaces are evaluated for the same set of tasks in extracting data from surveys stored in the UK Data Archive (UKDA). One interface, Web search, is "Google-like," enabling users to browse, search for, and view metadata about study variables, whereas the other, traditional search, has standard multioption user interface. Using a comprehensive set of tasks with 20 volunteers, we found that the Web search interface met data discovery needs and expectations better than the traditional search. A task × interface repeated measures analysis showed a main effect indicating that answers found through the Web search interface were more likely to be correct (F1,19=37.3, P<.001), with a main effect of task (F3,57=6.3, P<.001). Further, participants completed the task significantly faster using the Web search interface (F1
Natural Language Search Interfaces: Health Data Needs Single-Field Variable Search
Smith, Sam; Sufi, Shoaib; Goble, Carole; Buchan, Iain
2016-01-01
Background Data discovery, particularly the discovery of key variables and their inter-relationships, is key to secondary data analysis, and in-turn, the evolving field of data science. Interface designers have presumed that their users are domain experts, and so they have provided complex interfaces to support these “experts.” Such interfaces hark back to a time when searches needed to be accurate first time as there was a high computational cost associated with each search. Our work is part of a governmental research initiative between the medical and social research funding bodies to improve the use of social data in medical research. Objective The cross-disciplinary nature of data science can make no assumptions regarding the domain expertise of a particular scientist, whose interests may intersect multiple domains. Here we consider the common requirement for scientists to seek archived data for secondary analysis. This has more in common with search needs of the “Google generation” than with their single-domain, single-tool forebears. Our study compares a Google-like interface with traditional ways of searching for noncomplex health data in a data archive. Methods Two user interfaces are evaluated for the same set of tasks in extracting data from surveys stored in the UK Data Archive (UKDA). One interface, Web search, is “Google-like,” enabling users to browse, search for, and view metadata about study variables, whereas the other, traditional search, has standard multioption user interface. Results Using a comprehensive set of tasks with 20 volunteers, we found that the Web search interface met data discovery needs and expectations better than the traditional search. A task × interface repeated measures analysis showed a main effect indicating that answers found through the Web search interface were more likely to be correct (F 1,19=37.3, P<.001), with a main effect of task (F 3,57=6.3, P<.001). Further, participants completed the task
Interest in Anesthesia as Reflected by Keyword Searches using Common Search Engines
Liu, Renyu; García, Paul S.; Fleisher, Lee A.
2012-01-01
Background Since current general interest in anesthesia is unknown, we analyzed internet keyword searches to gauge general interest in anesthesia in comparison with surgery and pain. Methods The trend of keyword searches from 2004 to 2010 related to anesthesia and anaesthesia was investigated using Google Insights for Search. The trend of number of peer reviewed articles on anesthesia cited on PubMed and Medline from 2004 to 2010 was investigated. The average cost on advertising on anesthesia, surgery and pain was estimated using Google AdWords. Searching results in other common search engines were also analyzed. Correlation between year and relative number of searches was determined with p< 0.05 considered statistically significant. Results Searches for the keyword “anesthesia” or “anaesthesia” diminished since 2004 reflected by Google Insights for Search (p< 0.05). The search for “anesthesia side effects” is trending up over the same time period while the search for “anesthesia and safety” is trending down. The search phrase “before anesthesia” is searched more frequently than “preanesthesia” and the search for “before anesthesia” is trending up. Using “pain” as a keyword is steadily increasing over the years indicated. While different search engines may provide different total number of searching results (available posts), the ratios of searching results between some common keywords related to perioperative care are comparable, indicating similar trend. The peer reviewed manuscripts on “anesthesia” and the proportion of papers on “anesthesia and outcome” are trending up. Estimates for spending of advertising dollars are less for anesthesia-related terms when compared to that for pain or surgery due to relative smaller number of searching traffic. Conclusions General interest in anesthesia (anaesthesia) as measured by internet searches appears to be decreasing. Pain, preanesthesia evaluation, anesthesia and outcome and side
Reading and visual search: a developmental study in normal children.
Seassau, Magali; Bucci, Maria-Pia
2013-01-01
Studies dealing with developmental aspects of binocular eye movement behaviour during reading are scarce. In this study we have explored binocular strategies during reading and during visual search tasks in a large population of normal young readers. Binocular eye movements were recorded using an infrared video-oculography system in sixty-nine children (aged 6 to 15) and in a group of 10 adults (aged 24 to 39). The main findings are (i) in both tasks the number of progressive saccades (to the right) and regressive saccades (to the left) decreases with age; (ii) the amplitude of progressive saccades increases with age in the reading task only; (iii) in both tasks, the duration of fixations as well as the total duration of the task decreases with age; (iv) in both tasks, the amplitude of disconjugacy recorded during and after the saccades decreases with age; (v) children are significantly more accurate in reading than in visual search after 10 years of age. Data reported here confirms and expands previous studies on children's reading. The new finding is that younger children show poorer coordination than adults, both while reading and while performing a visual search task. Both reading skills and binocular saccades coordination improve with age and children reach a similar level to adults after the age of 10. This finding is most likely related to the fact that learning mechanisms responsible for saccade yoking develop during childhood until adolescence.
Perceptual load corresponds with factors known to influence visual search.
Roper, Zachary J J; Cosman, Joshua D; Vecera, Shaun P
2013-10-01
One account of the early versus late selection debate in attention proposes that perceptual load determines the locus of selection. Attention selects stimuli at a late processing level under low-load conditions but selects stimuli at an early level under high-load conditions. Despite the successes of perceptual load theory, a noncircular definition of perceptual load remains elusive. We investigated the factors that influence perceptual load by using manipulations that have been studied extensively in visual search, namely target-distractor similarity and distractor-distractor similarity. Consistent with previous work, search was most efficient when targets and distractors were dissimilar and the displays contained homogeneous distractors; search became less efficient when target-distractor similarity increased irrespective of display heterogeneity. Importantly, we used these same stimuli in a typical perceptual load task that measured attentional spillover to a task-irrelevant flanker. We found a strong correspondence between search efficiency and perceptual load; stimuli that generated efficient searches produced flanker interference effects, suggesting that such displays involved low perceptual load. Flanker interference effects were reduced in displays that produced less efficient searches. Furthermore, our results demonstrate that search difficulty, as measured by search intercept, has little bearing on perceptual load. We conclude that rather than be arbitrarily defined, perceptual load might be defined by well-characterized, continuous factors that influence visual search. PsycINFO Database Record (c) 2013 APA, all rights reserved.
Perceptual load corresponds with factors known to influence visual search
Roper, Zachary J. J.; Cosman, Joshua D.; Vecera, Shaun P.
2014-01-01
One account of the early versus late selection debate in attention proposes that perceptual load determines the locus of selection. Attention selects stimuli at a late processing level under low-load conditions but selects stimuli at an early level under high-load conditions. Despite the successes of perceptual load theory, a non-circular definition of perceptual load remains elusive. We investigated the factors that influence perceptual load by using manipulations that have been studied extensively in visual search, namely target-distractor similarity and distractor-distractor similarity. Consistent with previous work, search was most efficient when targets and distractors were dissimilar and the displays contained homogeneous distractors; search became less efficient when target-distractor similarity increased irrespective of display heterogeneity. Importantly, we used these same stimuli in a typical perceptual load task that measured attentional spill-over to a task-irrelevant flanker. We found a strong correspondence between search efficiency and perceptual load; stimuli that generated efficient searches produced flanker interference effects, suggesting that such displays involved low perceptual load. Flanker interference effects were reduced in displays that produced less efficient searches. Furthermore, our results demonstrate that search difficulty, as measured by search intercept, has little bearing on perceptual load. These results suggest that perceptual load might be defined in part by well-characterized, continuous factors that influence visual search. PMID:23398258
Dynamic Search and Working Memory in Social Recall
ERIC Educational Resources Information Center
Hills, Thomas T.; Pachur, Thorsten
2012-01-01
What are the mechanisms underlying search in social memory (e.g., remembering the people one knows)? Do the search mechanisms involve dynamic local-to-global transitions similar to semantic search, and are these transitions governed by the general control of attention, associated with working memory span? To find out, we asked participants to…
Dao, Tien Tuan; Hoang, Tuan Nha; Ta, Xuan Hien; Tho, Marie Christine Ho Ba
2013-02-01
Human musculoskeletal system resources (HMSR) are valuable for learning and medical purposes. Internet-based information from conventional search engines such as Google or Yahoo cannot respond to the need for useful, accurate, reliable and good-quality human musculoskeletal resources related to medical processes, pathological knowledge and practical expertise. In the present work, an advanced knowledge-based personalized search engine was developed. Our search engine was based on a client-server multi-layer multi-agent architecture and the principle of semantic web services to dynamically acquire accurate and reliable HMSR information by a semantic processing and visualization approach. A security-enhanced mechanism was applied to protect the medical information. A multi-agent crawler was implemented to develop a content-based database of HMSR information. A new semantic-based PageRank score and related mathematical formulas were also defined and implemented. As a result, semantic web service descriptions were presented in OWL, WSDL and OWL-S formats. Operational scenarios with related web-based interfaces for personal computers and mobile devices were presented and analyzed. Functional comparison between our knowledge-based search engine, a conventional search engine and a semantic search engine showed the originality and the robustness of our knowledge-based personalized search engine. In fact, our knowledge-based personalized search engine allows different users, such as orthopedic patients and experts, healthcare system managers, or medical students, to remotely access useful, accurate, reliable and good-quality HMSR information for their learning and medical purposes. Copyright © 2012 Elsevier Inc. All rights reserved.
GWFASTA: server for FASTA search in eukaryotic and microbial genomes.
Issac, Biju; Raghava, G P S
2002-09-01
Similarity searches are a powerful method for solving important biological problems such as database scanning, evolutionary studies, gene prediction, and protein structure prediction. FASTA is a widely used sequence comparison tool for rapid database scanning. Here we describe the GWFASTA server that was developed to assist the FASTA user in similarity searches against partially and/or completely sequenced genomes. GWFASTA consists of more than 60 microbial genomes, eight eukaryote genomes, and proteomes of annotated genomes. In fact, it provides the maximum number of databases for similarity searching from a single platform. GWFASTA allows the submission of more than one sequence as a single query for a FASTA search. It also provides integrated post-processing of FASTA output, including compositional analysis of proteins, multiple sequences alignment, and phylogenetic analysis. Furthermore, it summarizes the search results organism-wise for prokaryotes and chromosome-wise for eukaryotes. Thus, the integration of different tools for sequence analyses makes GWFASTA a powerful tool for biologists.
Visual search and autism symptoms: What young children search for and co-occurring ADHD matter.
Doherty, Brianna R; Charman, Tony; Johnson, Mark H; Scerif, Gaia; Gliga, Teodora
2018-05-03
Superior visual search is one of the most common findings in the autism spectrum disorder (ASD) literature. Here, we ascertain how generalizable these findings are across task and participant characteristics, in light of recent replication failures. We tested 106 3-year-old children at familial risk for ASD, a sample that presents high ASD and ADHD symptoms, and 25 control participants, in three multi-target search conditions: easy exemplar search (look for cats amongst artefacts), difficult exemplar search (look for dogs amongst chairs/tables perceptually similar to dogs), and categorical search (look for animals amongst artefacts). Performance was related to dimensional measures of ASD and ADHD, in agreement with current research domain criteria (RDoC). We found that ASD symptom severity did not associate with enhanced performance in search, but did associate with poorer categorical search in particular, consistent with literature describing impairments in categorical knowledge in ASD. Furthermore, ASD and ADHD symptoms were both associated with more disorganized search paths across all conditions. Thus, ASD traits do not always convey an advantage in visual search; on the contrary, ASD traits may be associated with difficulties in search depending upon the nature of the stimuli (e.g., exemplar vs. categorical search) and the presence of co-occurring symptoms. © 2018 John Wiley & Sons Ltd.
Tachyon search speeds up retrieval of similar sequences by several orders of magnitude
Tan, Joshua; Kuchibhatla, Durga; Sirota, Fernanda L.; Sherman, Westley A.; Gattermayer, Tobias; Kwoh, Chia Yee; Eisenhaber, Frank; Schneider, Georg; Maurer-Stroh, Sebastian
2012-01-01
Summary: Current sequence search tools become increasingly slow as databases of protein sequences continue to grow exponentially. Tachyon, a new algorithm that identifies closely related protein sequences ~200 times faster than standard BLAST, circumvents this limitation with a reduced database and oligopeptide matching heuristic. Availability and implementation: The tool is publicly accessible as a webserver at http://tachyon.bii.a-star.edu.sg and can also be accessed programmatically through SOAP. Contact: sebastianms@bii.a-star.edu.sg Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22531216
A cross-species analysis method to analyze animal models' similarity to human's disease state
2012-01-01
Background Animal models are indispensable tools in studying the cause of human diseases and searching for the treatments. The scientific value of an animal model depends on the accurate mimicry of human diseases. The primary goal of the current study was to develop a cross-species method by using the animal models' expression data to evaluate the similarity to human diseases' and assess drug molecules' efficiency in drug research. Therefore, we hoped to reveal that it is feasible and useful to compare gene expression profiles across species in the studies of pathology, toxicology, drug repositioning, and drug action mechanism. Results We developed a cross-species analysis method to analyze animal models' similarity to human diseases and effectiveness in drug research by utilizing the existing animal gene expression data in the public database, and mined some meaningful information to help drug research, such as potential drug candidates, possible drug repositioning, side effects and analysis in pharmacology. New animal models could be evaluated by our method before they are used in drug discovery. We applied the method to several cases of known animal model expression profiles and obtained some useful information to help drug research. We found that trichostatin A and some other HDACs could have very similar response across cell lines and species at gene expression level. Mouse hypoxia model could accurately mimic the human hypoxia, while mouse diabetes drug model might have some limitation. The transgenic mouse of Alzheimer was a useful model and we deeply analyzed the biological mechanisms of some drugs in this case. In addition, all the cases could provide some ideas for drug discovery and drug repositioning. Conclusions We developed a new cross-species gene expression module comparison method to use animal models' expression data to analyse the effectiveness of animal models in drug research. Moreover, through data integration, our method could be applied for
A cross-species analysis method to analyze animal models' similarity to human's disease state.
Yu, Shuhao; Zheng, Lulu; Li, Yun; Li, Chunyan; Ma, Chenchen; Li, Yixue; Li, Xuan; Hao, Pei
2012-01-01
Animal models are indispensable tools in studying the cause of human diseases and searching for the treatments. The scientific value of an animal model depends on the accurate mimicry of human diseases. The primary goal of the current study was to develop a cross-species method by using the animal models' expression data to evaluate the similarity to human diseases' and assess drug molecules' efficiency in drug research. Therefore, we hoped to reveal that it is feasible and useful to compare gene expression profiles across species in the studies of pathology, toxicology, drug repositioning, and drug action mechanism. We developed a cross-species analysis method to analyze animal models' similarity to human diseases and effectiveness in drug research by utilizing the existing animal gene expression data in the public database, and mined some meaningful information to help drug research, such as potential drug candidates, possible drug repositioning, side effects and analysis in pharmacology. New animal models could be evaluated by our method before they are used in drug discovery. We applied the method to several cases of known animal model expression profiles and obtained some useful information to help drug research. We found that trichostatin A and some other HDACs could have very similar response across cell lines and species at gene expression level. Mouse hypoxia model could accurately mimic the human hypoxia, while mouse diabetes drug model might have some limitation. The transgenic mouse of Alzheimer was a useful model and we deeply analyzed the biological mechanisms of some drugs in this case. In addition, all the cases could provide some ideas for drug discovery and drug repositioning. We developed a new cross-species gene expression module comparison method to use animal models' expression data to analyse the effectiveness of animal models in drug research. Moreover, through data integration, our method could be applied for drug research, such as
Semantic Features for Classifying Referring Search Terms
DOE Office of Scientific and Technical Information (OSTI.GOV)
May, Chandler J.; Henry, Michael J.; McGrath, Liam R.
2012-05-11
When an internet user clicks on a result in a search engine, a request is submitted to the destination web server that includes a referrer field containing the search terms given by the user. Using this information, website owners can analyze the search terms leading to their websites to better understand their visitors' needs. This work explores some of the features that can be used for classification-based analysis of such referring search terms. We present initial results for the example task of classifying HTTP requests' countries of origin. A system that can accurately predict the country of origin from query text may be a valuable complement to IP lookup methods which are susceptible to the obfuscation of dereferrers or proxies. We suggest that the addition of semantic features improves classifier performance in this example application. We begin by looking at related work and presenting our approach. After describing initial experiments and results, we discuss paths forward for this work.
The Talent Search Model: Past, Present, and Future
ERIC Educational Resources Information Center
Swiatek, Mary Ann
2007-01-01
Typical standardized achievement tests cannot provide accurate information about gifted students' abilities because they are not challenging enough for such students. Talent searches solve this problem through above-level testing--using tests designed for older students to raise the ceiling for younger, gifted students. Currently, talent search…
Extracting similar terms from multiple EMR-based semantic embeddings to support chart reviews.
Cheng Ye, M S; Fabbri, Daniel
2018-05-21
Word embeddings project semantically similar terms into nearby points in a vector space. When trained on clinical text, these embeddings can be leveraged to improve keyword search and text highlighting. In this paper, we present methods to refine the selection process of similar terms from multiple EMR-based word embeddings, and evaluate their performance quantitatively and qualitatively across multiple chart review tasks. Word embeddings were trained on each clinical note type in an EMR. These embeddings were then combined, weighted, and truncated to select a refined set of similar terms to be used in keyword search and text highlighting. To evaluate their quality, we measured the similar terms' information retrieval (IR) performance using precision-at-K (P@5, P@10). Additionally a user study evaluated users' search term preferences, while a timing study measured the time to answer a question from a clinical chart. The refined terms outperformed the baseline method's information retrieval performance (e.g., increasing the average P@5 from 0.48 to 0.60). Additionally, the refined terms were preferred by most users, and reduced the average time to answer a question. Clinical information can be more quickly retrieved and synthesized when using semantically similar term from multiple embeddings. Copyright © 2018. Published by Elsevier Inc.
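Precision-at-K, the retrieval metric reported above as P@5 and P@10, is the fraction of the top K ranked items that are relevant. The sketch below is a minimal illustration; the function name `precision_at_k` and the example terms are hypothetical and not taken from the study.

```python
from typing import Sequence, Set


def precision_at_k(ranked_terms: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked terms that appear in the relevant set.

    Divides by k even when fewer than k terms were returned, which is one
    common convention (an assumption here, not stated in the abstract).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for term in ranked_terms[:k] if term in relevant)
    return hits / k


# Invented example: candidate similar terms for a keyword, ranked by score.
ranked = ["mi", "heart attack", "stemi", "chest pain", "angina"]
relevant = {"mi", "heart attack", "stemi"}
print(precision_at_k(ranked, relevant, 5))  # 3 of the top 5 are relevant -> 0.6
```

Averaging this value over many query terms gives the kind of mean P@5 figure quoted in the abstract.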
Parkesh, Raman; Childs-Disney, Jessica L; Nakamori, Masayuki; Kumar, Amit; Wang, Eric; Wang, Thomas; Hoskins, Jason; Tran, Tuan; Housman, David; Thornton, Charles A; Disney, Matthew D
2012-03-14
Myotonic dystrophy type 1 (DM1) is a triplet repeating disorder caused by expanded CTG repeats in the 3'-untranslated region of the dystrophia myotonica protein kinase (DMPK) gene. The transcribed repeats fold into an RNA hairpin with multiple copies of a 5'CUG/3'GUC motif that binds the RNA splicing regulator muscleblind-like 1 protein (MBNL1). Sequestration of MBNL1 by expanded r(CUG) repeats causes splicing defects in a subset of pre-mRNAs including the insulin receptor, the muscle-specific chloride ion channel, sarco(endo)plasmic reticulum Ca(2+) ATPase 1, and cardiac troponin T. Based on these observations, the development of small-molecule ligands that target specifically expanded DM1 repeats could be of use as therapeutics. In the present study, chemical similarity searching was employed to improve the efficacy of pentamidine and Hoechst 33258 ligands that have been shown previously to target the DM1 triplet repeat. A series of in vitro inhibitors of the RNA-protein complex were identified with low micromolar IC(50)'s, which are >20-fold more potent than the query compounds. Importantly, a bis-benzimidazole identified from the Hoechst query improves DM1-associated pre-mRNA splicing defects in cell and mouse models of DM1 (when dosed with 1 mM and 100 mg/kg, respectively). Since Hoechst 33258 was identified as a DM1 binder through analysis of an RNA motif-ligand database, these studies suggest that lead ligands targeting RNA with improved biological activity can be identified by using a synergistic approach that combines analysis of known RNA-ligand interactions with chemical similarity searching.
Basophile: Accurate Fragment Charge State Prediction Improves Peptide Identification Rates
Wang, Dong; Dasari, Surendra; Chambers, Matthew C.; ...
2013-03-07
In shotgun proteomics, database search algorithms rely on fragmentation models to predict fragment ions that should be observed for a given peptide sequence. The most widely used strategy (Naive model) is oversimplified, cleaving all peptide bonds with equal probability to produce fragments of all charges below that of the precursor ion. More accurate models, based on fragmentation simulation, are too computationally intensive for on-the-fly use in database search algorithms. We have created an ordinal-regression-based model called Basophile that takes fragment size and basic residue distribution into account when determining the charge retention during CID/higher-energy collision induced dissociation (HCD) of charged peptides. This model improves the accuracy of predictions by reducing the number of unnecessary fragments that are routinely predicted for highly-charged precursors. Basophile increased the identification rates by 26% (on average) over the Naive model, when analyzing triply-charged precursors from ion trap data. Basophile achieves simplicity and speed by solving the prediction problem with an ordinal regression equation, which can be incorporated into any database search software for shotgun proteomic identification.
Deciu, Cosmin; Sun, Jun; Wall, Mark A
2007-09-01
We discuss several aspects related to load balancing of database search jobs in a distributed computing environment, such as a Linux cluster. Load balancing is a technique for making the most of multiple computational resources, which is particularly relevant in environments in which the usage of such resources is very high. The particular case of the Sequest program is considered here, but the general methodology should apply to any similar database search program. We show how the runtimes for Sequest searches of tandem mass spectral data can be predicted from profiles of previous representative searches, and how this information can be used for better load balancing of novel data. A well-known heuristic load balancing method is shown to be applicable to this problem, and its performance is analyzed for a variety of search parameters.
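The abstract above does not name the heuristic it uses. As one illustration of how predicted runtimes can drive load balancing, the sketch below uses the longest-processing-time-first (LPT) greedy rule, which is an assumption chosen for this example and not necessarily the authors' method. The job names, runtimes, and the function `lpt_schedule` are hypothetical.

```python
import heapq


def lpt_schedule(predicted_runtimes, n_workers):
    """Assign jobs to workers greedily, longest predicted runtime first."""
    # Min-heap of (current load, worker index): the least-loaded worker is on top.
    heap = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(heap)
    assignment = {w: [] for w in range(n_workers)}
    by_runtime = sorted(predicted_runtimes.items(), key=lambda kv: kv[1], reverse=True)
    for job, runtime in by_runtime:
        load, worker = heapq.heappop(heap)
        assignment[worker].append(job)
        heapq.heappush(heap, (load + runtime, worker))
    loads = {w: sum(predicted_runtimes[j] for j in jobs) for w, jobs in assignment.items()}
    return assignment, loads


# Hypothetical predicted runtimes (in minutes) for five search jobs on two nodes.
jobs = {"a": 7.0, "b": 5.0, "c": 4.0, "d": 3.0, "e": 1.0}
assignment, loads = lpt_schedule(jobs, 2)
```

With these made-up runtimes both workers end up with a total predicted load of 10 minutes; in general the quality of the balance depends on how accurate the runtime predictions are.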
Using internet searches for influenza surveillance.
Polgreen, Philip M; Chen, Yiling; Pennock, David M; Nelson, Forrest D
2008-12-01
The Internet is an important source of health information. Thus, the frequency of Internet searches may provide information regarding infectious disease activity. As an example, we examined the relationship between searches for influenza and actual influenza occurrence. Using search queries from the Yahoo! search engine ( http://search.yahoo.com ) from March 2004 through May 2008, we counted daily unique queries originating in the United States that contained influenza-related search terms. Counts were divided by the total number of searches, and the resulting daily fraction of searches was averaged over the week. We estimated linear models, using searches with 1-10-week lead times as explanatory variables to predict the percentage of cultures positive for influenza and deaths attributable to pneumonia and influenza in the United States. With use of the frequency of searches, our models predicted an increase in cultures positive for influenza 1-3 weeks in advance of when they occurred (P < .001), and similar models predicted an increase in mortality attributable to pneumonia and influenza up to 5 weeks in advance (P < .001). Search-term surveillance may provide an additional tool for disease surveillance.
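The normalization step described above (daily influenza-related query counts divided by total searches, then averaged over the week) can be sketched as follows. The counts are invented for illustration, the function name is hypothetical, and the choice to drop an incomplete trailing week is an assumption of this sketch; the regression step of the study is not reproduced.

```python
def weekly_search_fraction(flu_counts, total_counts):
    """Average the daily fraction of flu-related queries over consecutive 7-day weeks."""
    if len(flu_counts) != len(total_counts):
        raise ValueError("both series must cover the same days")
    daily = [flu / total for flu, total in zip(flu_counts, total_counts)]
    # Step through complete weeks only; an incomplete trailing week is dropped.
    return [sum(daily[i:i + 7]) / 7 for i in range(0, len(daily) - 6, 7)]


# Hypothetical daily counts for two weeks.
flu = [10] * 7 + [20] * 7
total = [1000] * 14
weekly = weekly_search_fraction(flu, total)
```

The resulting weekly series is the kind of explanatory variable that could then be lagged by one or more weeks in a linear model.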
Using the Dual-Target Cost to Explore the Nature of Search Target Representations
ERIC Educational Resources Information Center
Stroud, Michael J.; Menneer, Tamaryn; Cave, Kyle R.; Donnelly, Nick
2012-01-01
Eye movements were monitored to examine search efficiency and infer how color is mentally represented to guide search for multiple targets. Observers located a single color target very efficiently by fixating colors similar to the target. However, simultaneous search for 2 colors produced a dual-target cost. In addition, as the similarity between…
Keehn, Brandon; Joseph, Robert M
2016-03-01
In multiple conjunction search, the target is not known in advance but is defined only with respect to the distractors in a given search array, thus reducing the contributions of bottom-up and top-down attentional and perceptual processes during search. This study investigated whether the superior visual search skills typically demonstrated by individuals with autism spectrum disorder (ASD) would be evident in multiple conjunction search. Thirty-two children with ASD and 32 age- and nonverbal IQ-matched typically developing (TD) children were administered a multiple conjunction search task. Contrary to findings from the large majority of studies on visual search in ASD, response times of individuals with ASD were significantly slower than those of their TD peers. Evidence of slowed performance in ASD suggests that the mechanisms responsible for superior ASD performance in other visual search paradigms are not available in multiple conjunction search. Although the ASD group failed to exhibit superior performance, they showed efficient search and intertrial priming levels similar to the TD group. Efficient search indicates that ASD participants were able to group distractors into distinct subsets. In summary, while demonstrating grouping and priming effects comparable to those exhibited by their TD peers, children with ASD were slowed in their performance on a multiple conjunction search task, suggesting that their usual superior performance in visual search tasks is specifically dependent on top-down and/or bottom-up attentional and perceptual processes. © 2015 International Society for Autism Research, Wiley Periodicals, Inc.
Privacy-Preserving Patient Similarity Learning in a Federated Environment: Development and Analysis
Sun, Jimeng; Wang, Fei; Wang, Shuang; Jun, Chi-Hyuck; Jiang, Xiaoqian
2018-01-01
Background: There is an urgent need for the development of global analytic frameworks that can perform analyses in a privacy-preserving federated environment across multiple institutions without privacy leakage. A few studies on the topic of federated medical analysis have been conducted recently with the focus on several algorithms. However, none of them have solved similar patient matching, which is useful for applications such as cohort construction for cross-institution observational studies, disease surveillance, and clinical trials recruitment. Objective: The aim of this study was to present a privacy-preserving platform in a federated setting for patient similarity learning across institutions. Without sharing patient-level information, our model can find similar patients from one hospital to another. Methods: We proposed a federated patient hashing framework and developed a novel algorithm to learn context-specific hash codes to represent patients across institutions. The similarities between patients can be efficiently computed using the resulting hash codes of corresponding patients. To avoid security attack from reverse engineering on the model, we applied homomorphic encryption to patient similarity search in a federated setting. Results: We used sequential medical events extracted from the Multiparameter Intelligent Monitoring in Intensive Care-III database to evaluate the proposed algorithm in predicting the incidence of five diseases independently. Our algorithm achieved averaged area under the curves of 0.9154 and 0.8012 with balanced and imbalanced data, respectively, in κ-nearest neighbor with κ=3. We also confirmed privacy preservation in similarity search by using homomorphic encryption. Conclusions: The proposed algorithm can help search similar patients across institutions effectively to support federated data analysis in a privacy-preserving manner. PMID:29653917
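To illustrate how similarities can be computed cheaply from hash codes and used for nearest-neighbor search, here is a minimal sketch. It assumes binary codes compared by the fraction of matching bits (a Hamming-style similarity), which the abstract does not specify; the patient identifiers and codes are invented, and neither the learned hashing nor the homomorphic encryption of the study is reproduced.

```python
def hamming_similarity(code_a, code_b):
    """Fraction of positions at which two equal-length binary codes agree."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    matches = sum(a == b for a, b in zip(code_a, code_b))
    return matches / len(code_a)


def k_nearest(query_code, database, k=3):
    """Return the ids of the k database entries most similar to the query code."""
    ranked = sorted(
        database.items(),
        key=lambda item: hamming_similarity(query_code, item[1]),
        reverse=True,
    )
    return [patient_id for patient_id, _ in ranked[:k]]


# Hypothetical 8-bit hash codes for four patients held at another institution.
database = {
    "p1": "10110010",
    "p2": "10110011",
    "p3": "01001101",
    "p4": "10100000",
}
neighbors = k_nearest("10110010", database, k=3)
```

In this toy data the query matches `p1` exactly, differs from `p2` in one bit and from `p4` in two bits, so those three are returned in that order.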
SIFTER search: a web server for accurate phylogeny-based protein function prediction
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sahraeian, Sayed M.; Luo, Kevin R.; Brenner, Steven E.
We are awash in proteins discovered through high-throughput sequencing projects. As only a minuscule fraction of these have been experimentally characterized, computational methods are widely used for automated annotation. Here, we introduce a user-friendly web interface for accurate protein function prediction using the SIFTER algorithm. SIFTER is a state-of-the-art sequence-based gene molecular function prediction algorithm that uses a statistical model of function evolution to incorporate annotations throughout the phylogenetic tree. Due to the resources needed by the SIFTER algorithm, running SIFTER locally is not trivial for most users, especially for large-scale problems. The SIFTER web server thus provides access to precomputed predictions on 16 863 537 proteins from 232 403 species. Users can explore SIFTER predictions with queries for proteins, species, functions, and homologs of sequences not in the precomputed prediction set. Lastly, the SIFTER web server is accessible at http://sifter.berkeley.edu/ and the source code can be downloaded.
SIFTER search: a web server for accurate phylogeny-based protein function prediction
Sahraeian, Sayed M.; Luo, Kevin R.; Brenner, Steven E.
2015-05-15
We are awash in proteins discovered through high-throughput sequencing projects. As only a minuscule fraction of these have been experimentally characterized, computational methods are widely used for automated annotation. Here, we introduce a user-friendly web interface for accurate protein function prediction using the SIFTER algorithm. SIFTER is a state-of-the-art sequence-based gene molecular function prediction algorithm that uses a statistical model of function evolution to incorporate annotations throughout the phylogenetic tree. Due to the resources needed by the SIFTER algorithm, running SIFTER locally is not trivial for most users, especially for large-scale problems. The SIFTER web server thus provides access to precomputed predictions on 16 863 537 proteins from 232 403 species. Users can explore SIFTER predictions with queries for proteins, species, functions, and homologs of sequences not in the precomputed prediction set. Lastly, the SIFTER web server is accessible at http://sifter.berkeley.edu/ and the source code can be downloaded.
Matsuda, Fumio; Shinbo, Yoko; Oikawa, Akira; Hirai, Masami Yokota; Fiehn, Oliver; Kanaya, Shigehiko; Saito, Kazuki
2009-01-01
Background: In metabolomics research using mass spectrometry (MS), systematic searching of high-resolution mass data against compound databases is often the first step of metabolite annotation to determine elemental compositions possessing similar theoretical mass numbers. However, incorrect hits derived from errors in mass analyses will be included in the results of elemental composition searches. To assess the quality of peak annotation information, a novel methodology for false discovery rate (FDR) evaluation is presented in this study. Based on the FDR analyses, several aspects of an elemental composition search, including setting a threshold, estimating FDR, and the types of elemental composition databases most reliable for searching, are discussed. Methodology/Principal Findings: The FDR can be determined from one measured value (i.e., the hit rate for search queries) and four parameters determined by Monte Carlo simulation. The results indicate that relatively high FDR values (30–50%) were obtained when searching time-of-flight (TOF)/MS data using the KNApSAcK and KEGG databases. In addition, searches against large all-in-one databases (e.g., PubChem) always produced unacceptable results (FDR >70%). The estimated FDRs suggest that the quality of search results can be improved not only by performing more accurate mass analysis but also by modifying the properties of the compound database. A theoretical analysis indicates that FDR could be improved by using a compound database that is smaller but has more complete entries. Conclusions/Significance: High accuracy mass analysis, such as Fourier transform (FT)-MS, is needed for reliable annotation (FDR <10%). In addition, a small, customized compound database is preferable for high-quality annotation of metabolome data. PMID:19847304
Attentional Control via Parallel Target-Templates in Dual-Target Search
Barrett, Doug J. K.; Zobay, Oliver
2014-01-01
Simultaneous search for two targets has been shown to be slower and less accurate than independent searches for the same two targets. Recent research suggests this ‘dual-target cost’ may be attributable to a limit in the number of target-templates that can guide search at any one time. The current study investigated this possibility by comparing behavioural responses during single- and dual-target searches for targets defined by their orientation. The results revealed an increase in reaction times for dual- compared to single-target searches that was largely independent of the number of items in the display. Response accuracy also decreased on dual- compared to single-target searches: dual-target accuracy was higher than predicted by a model restricting search guidance to a single target-template and lower than predicted by a model simulating two independent single-target searches. These results are consistent with a parallel model of dual-target search in which attentional control is exerted by more than one target-template at a time. The requirement to maintain two target-templates simultaneously, however, appears to impose a reduction in the specificity of the memory representation that guides search for each target. PMID:24489793
Lee, Jeongmi; Geng, Joy J
2017-02-01
The efficiency of finding an object in a crowded environment depends largely on the similarity of nontargets to the search target. Models of attention theorize that the similarity is determined by representations stored within an "attentional template" held in working memory. However, the degree to which the contents of the attentional template are individually unique and where those idiosyncratic representations are encoded in the brain are unknown. We investigated this problem using representational similarity analysis of human fMRI data to measure the common and idiosyncratic representations of famous face morphs during an identity categorization task; data from the categorization task were then used to predict performance on a separate identity search task. We hypothesized that the idiosyncratic categorical representations of the continuous face morphs would predict their distractability when searching for each target identity. The results identified that patterns of activation in the lateral prefrontal cortex (LPFC) as well as in face-selective areas in the ventral temporal cortex were highly correlated with the patterns of behavioral categorization of face morphs and search performance that were common across subjects. However, the individually unique components of the categorization behavior were reliably decoded only in right LPFC. Moreover, the neural pattern in right LPFC successfully predicted idiosyncratic variability in search performance, such that reaction times were longer when distractors had a higher probability of being categorized as the target identity. These results suggest that the prefrontal cortex encodes individually unique components of categorical representations that are also present in attentional templates for target search. Everyone's perception of the world is uniquely shaped by personal experiences and preferences. Using functional MRI, we show that individual differences in the categorization of face morphs between two identities
Lo, Yu-Chen; Senese, Silvia; Li, Chien-Ming; Hu, Qiyang; Huang, Yong; Damoiseaux, Robert; Torres, Jorge Z.
2015-01-01
Target identification is one of the most critical steps following cell-based phenotypic chemical screens aimed at identifying compounds with potential uses in cell biology and for developing novel disease therapies. Current in silico target identification methods, including chemical similarity database searches, are limited to single or sequential ligand analyses that have limited capabilities for accurate deconvolution of a large number of compounds with diverse chemical structures. Here, we present CSNAP (Chemical Similarity Network Analysis Pulldown), a new computational target identification method that utilizes chemical similarity networks for large-scale chemotype (consensus chemical pattern) recognition and drug target profiling. Our benchmark study showed that CSNAP can achieve an overall higher accuracy (>80%) of target prediction with respect to representative chemotypes in large (>200) compound sets, in comparison to the SEA approach (60–70%). Additionally, CSNAP is capable of integrating with biological knowledge-based databases (Uniprot, GO) and high-throughput biology platforms (proteomic, genetic, etc.) for system-wise drug target validation. To demonstrate the utility of the CSNAP approach, we combined CSNAP's target prediction with experimental ligand evaluation to identify the major mitotic targets of hit compounds from a cell-based chemical screen and we highlight novel compounds targeting microtubules, an important cancer therapeutic target. The CSNAP method is freely available and can be accessed from the CSNAP web server (http://services.mbi.ucla.edu/CSNAP/). PMID:25826798
A SETI experiment. [Search for Extra Terrestrial Intelligence]
NASA Technical Reports Server (NTRS)
Mclaughlin, W. I.
1986-01-01
In order to increase the probability of contact in the search for extraterrestrial intelligence (SETI), it has been proposed to search more intensively in certain regions of the electromagnetic spectrum ('the water hole'). The present paper describes a similar narrowing of the search in the time domain. Application of this strategy results in the SETI experiments searching for signals from the Tau Ceti system late in 1986 and early in 1987, and from the Epsilon Eridani system in mid 1988.
A similarity-based data warehousing environment for medical images.
Teixeira, Jefferson William; Annibal, Luana Peixoto; Felipe, Joaquim Cezar; Ciferri, Ricardo Rodrigues; Ciferri, Cristina Dutra de Aguiar
2015-11-01
A core issue of the decision-making process in the medical field is to support the execution of analytical (OLAP) similarity queries over images in data warehousing environments. In this paper, we focus on this issue. We propose imageDWE, a non-conventional data warehousing environment that enables the storage of intrinsic features taken from medical images in a data warehouse and supports OLAP similarity queries over them. To comply with this goal, we introduce the concept of perceptual layer, which is an abstraction used to represent an image dataset according to a given feature descriptor in order to enable similarity search. Based on this concept, we propose the imageDW, an extended data warehouse with dimension tables specifically designed to support one or more perceptual layers. We also detail how to build an imageDW and how to load image data into it. Furthermore, we show how to process OLAP similarity queries composed of a conventional predicate and a similarity search predicate that encompasses the specification of one or more perceptual layers. Moreover, we introduce an index technique to improve the OLAP query processing over images. We carried out performance tests over a data warehouse environment that consolidated medical images from exams of several modalities. The results demonstrated the feasibility and efficiency of our proposed imageDWE to manage images and to process OLAP similarity queries. The results also demonstrated that the use of the proposed index technique guaranteed a great improvement in query processing. Copyright © 2015 Elsevier Ltd. All rights reserved.
Prior knowledge of category size impacts visual search.
Wu, Rachel; McGee, Brianna; Echiverri, Chelsea; Zinszer, Benjamin D
2018-03-30
Prior research has shown that category search can be similar to one-item search (as measured by the N2pc ERP marker of attentional selection) for highly familiar, smaller categories (e.g., letters and numbers) because the finite set of items in a category can be grouped into one unit to guide search. Other studies have shown that larger, more broadly defined categories (e.g., healthy food) also can elicit N2pc components during category search, but the amplitude of these components is typically attenuated. Two experiments investigated whether the perceived size of a familiar category impacts category and exemplar search. We presented participants with 16 familiar company logos: 8 from a smaller category (social media companies) and 8 from a larger category (entertainment/recreation manufacturing companies). The ERP results from Experiment 1 revealed that, in a two-item search array, search was more efficient for the smaller category of logos compared to the larger category. In a four-item search array (Experiment 2), where two of the four items were placeholders, search was largely similar between the category types, but there was more attentional capture by nontarget members from the same category as the target for smaller rather than larger categories. These results support a growing literature on how prior knowledge of categories affects attentional selection and capture during visual search. We discuss the implications of these findings in relation to assessing cognitive abilities across the lifespan, given that prior knowledge typically increases with age. © 2018 Society for Psychophysiological Research.
[Pain registries and similar data collections : A systematic review].
Freytag, A; Scriba, B; Kaiser, U; Meißner, W
2016-12-01
Registries and similar data collections are a valuable addition to prospective studies as they provide data from real-life treatment. In pain medicine only a few such data collections exist so far. The aim of the study was to identify German-language registries or similar data collections that record patient-reported and pain-associated outcomes together with other data. A systematic search was carried out, which included the following sources: the databases PubMed/MEDLINE and Embase, the German Registry for Clinical Trials (DRKS), ClinicalTrials.gov and registry portals known to us. Furthermore, an extended internet search was carried out via Google Scholar. References from personal scientific contacts and from operators of registries were also included. Questionnaires regarding registry items were sent to registry operators. Out of 381 search hits, 37 potentially relevant projects received a questionnaire and 35 answered. From the 35 responders, 23 registries or similar data collections fulfilling inclusion criteria could be identified: 5 primarily pain-associated, 3 therapy-associated, 2 population-associated and 13 disease-associated (rheumatism/arthritis 5, joints/spine 4, hernias 1 and cancer 3). The reader obtains contact information on relevant data collections associated with pain, the contents, objectives and the pain assessment instruments applied. This review could give an important impulse for increased networking in health services research on pain. A limitation of the study was that identification of registries was made difficult due to an inconsistent definition and application of the term "registry", incomplete or insufficiently updated registry portals, missing scientific publications as well as two non-responders.
Spatial partitions systematize visual search and enhance target memory.
Solman, Grayden J F; Kingstone, Alan
2017-02-01
Humans are remarkably capable of finding desired objects in the world, despite the scale and complexity of naturalistic environments. Broadly, this ability is supported by an interplay between exploratory search and guidance from episodic memory for previously observed target locations. Here we examined how the environment itself may influence this interplay. In particular, we examined how partitions in the environment (like buildings, rooms, and furniture) can impact memory during repeated search. We report that the presence of partitions in a display, independent of item configuration, reliably improves episodic memory for item locations. Repeated search through partitioned displays was faster overall and was characterized by more rapid ballistic orienting in later repetitions. Explicit recall was also both faster and more accurate when displays were partitioned. Finally, we found that search paths were more regular and systematic when displays were partitioned. Given the ubiquity of partitions in real-world environments, these results provide important insights into the mechanisms of naturalistic search and its relation to memory.
RAG-3D: A search tool for RNA 3D substructures
Zahran, Mai; Sevim Bayrak, Cigdem; Elmetwaly, Shereef; ...
2015-08-24
In this study, to address many challenges in RNA structure/function prediction, the characterization of RNA's modular architectural units is required. Using the RNA-As-Graphs (RAG) database, we have previously explored the existence of secondary structure (2D) submotifs within larger RNA structures. Here we present RAG-3D, a dataset of RNA tertiary (3D) structures and substructures plus a web-based search tool, designed to exploit graph representations of RNAs for the goal of searching for similar 3D structural fragments. The objects in RAG-3D consist of 3D structures translated into 3D graphs, cataloged based on the connectivity between their secondary structure elements. Each graph is additionally described in terms of its subgraph building blocks. The RAG-3D search tool then compares a query RNA 3D structure to those in the database to obtain structurally similar structures and substructures. This comparison reveals conserved 3D RNA features and thus may suggest functional connections. Though RNA search programs based on similarity in sequence, 2D, and/or 3D structural elements are available, our graph-based search tool may be advantageous for illuminating similarities that are not obvious; using motifs rather than sequence space also reduces search times considerably. Ultimately, such substructuring could be useful for RNA 3D structure prediction, structure/function inference and inverse folding.
RAG-3D: a search tool for RNA 3D substructures
Zahran, Mai; Sevim Bayrak, Cigdem; Elmetwaly, Shereef; Schlick, Tamar
2015-01-01
To address many challenges in RNA structure/function prediction, the characterization of RNA's modular architectural units is required. Using the RNA-As-Graphs (RAG) database, we have previously explored the existence of secondary structure (2D) submotifs within larger RNA structures. Here we present RAG-3D—a dataset of RNA tertiary (3D) structures and substructures plus a web-based search tool—designed to exploit graph representations of RNAs for the goal of searching for similar 3D structural fragments. The objects in RAG-3D consist of 3D structures translated into 3D graphs, cataloged based on the connectivity between their secondary structure elements. Each graph is additionally described in terms of its subgraph building blocks. The RAG-3D search tool then compares a query RNA 3D structure to those in the database to obtain structurally similar structures and substructures. This comparison reveals conserved 3D RNA features and thus may suggest functional connections. Though RNA search programs based on similarity in sequence, 2D, and/or 3D structural elements are available, our graph-based search tool may be advantageous for illuminating similarities that are not obvious; using motifs rather than sequence space also reduces search times considerably. Ultimately, such substructuring could be useful for RNA 3D structure prediction, structure/function inference and inverse folding. PMID:26304547
RAG-3D: A search tool for RNA 3D substructures
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zahran, Mai; Sevim Bayrak, Cigdem; Elmetwaly, Shereef
In this study, to address many challenges in RNA structure/function prediction, the characterization of RNA's modular architectural units is required. Using the RNA-As-Graphs (RAG) database, we have previously explored the existence of secondary structure (2D) submotifs within larger RNA structures. Here we present RAG-3D, a dataset of RNA tertiary (3D) structures and substructures plus a web-based search tool, designed to exploit graph representations of RNAs for the goal of searching for similar 3D structural fragments. The objects in RAG-3D consist of 3D structures translated into 3D graphs, cataloged based on the connectivity between their secondary structure elements. Each graph is additionally described in terms of its subgraph building blocks. The RAG-3D search tool then compares a query RNA 3D structure to those in the database to obtain structurally similar structures and substructures. This comparison reveals conserved 3D RNA features and thus may suggest functional connections. Though RNA search programs based on similarity in sequence, 2D, and/or 3D structural elements are available, our graph-based search tool may be advantageous for illuminating similarities that are not obvious; using motifs rather than sequence space also reduces search times considerably. Ultimately, such substructuring could be useful for RNA 3D structure prediction, structure/function inference and inverse folding.
Fast protein tertiary structure retrieval based on global surface shape similarity.
Sael, Lee; Li, Bin; La, David; Fang, Yi; Ramani, Karthik; Rustamov, Raif; Kihara, Daisuke
2008-09-01
Characterization and identification of similar tertiary structures of proteins provide rich information for investigating function and evolution. The importance of structure similarity searches is increasing as structure databases continue to expand, partly due to the structural genomics projects. A crucial drawback of conventional protein structure comparison methods, which compare structures by their main-chain orientation or the spatial arrangement of secondary structure, is that a database search is too slow to be done in real time. Here we introduce a global surface shape representation by three-dimensional (3D) Zernike descriptors, which represent a protein structure compactly as a series expansion of 3D functions. With this simplified representation, a search against a few thousand structures takes less than a minute. To investigate the agreement between the surface representation defined by the 3D Zernike descriptor and the conventional main-chain based representation, a benchmark was performed against a protein classification generated by the combinatorial extension algorithm. Despite the different representation, the 3D Zernike descriptor retrieved proteins of the same conformation defined by combinatorial extension in 89.6% of the cases within the top five closest structures. The real-time protein structure search by 3D Zernike descriptor will open up new possibilities for large-scale global and local protein surface shape comparison. 2008 Wiley-Liss, Inc.
Privacy-Preserving Patient Similarity Learning in a Federated Environment: Development and Analysis.
Lee, Junghye; Sun, Jimeng; Wang, Fei; Wang, Shuang; Jun, Chi-Hyuck; Jiang, Xiaoqian
2018-04-13
There is an urgent need for the development of global analytic frameworks that can perform analyses in a privacy-preserving federated environment across multiple institutions without privacy leakage. A few studies on the topic of federated medical analysis have been conducted recently with the focus on several algorithms. However, none of them have solved similar patient matching, which is useful for applications such as cohort construction for cross-institution observational studies, disease surveillance, and clinical trials recruitment. The aim of this study was to present a privacy-preserving platform in a federated setting for patient similarity learning across institutions. Without sharing patient-level information, our model can find similar patients from one hospital to another. We proposed a federated patient hashing framework and developed a novel algorithm to learn context-specific hash codes to represent patients across institutions. The similarities between patients can be efficiently computed using the resulting hash codes of corresponding patients. To avoid security attack from reverse engineering on the model, we applied homomorphic encryption to patient similarity search in a federated setting. We used sequential medical events extracted from the Multiparameter Intelligent Monitoring in Intensive Care-III database to evaluate the proposed algorithm in predicting the incidence of five diseases independently. Our algorithm achieved averaged area under the curves of 0.9154 and 0.8012 with balanced and imbalanced data, respectively, in κ-nearest neighbor with κ=3. We also confirmed privacy preservation in similarity search by using homomorphic encryption. The proposed algorithm can help search similar patients across institutions effectively to support federated data analysis in a privacy-preserving manner. ©Junghye Lee, Jimeng Sun, Fei Wang, Shuang Wang, Chi-Hyuck Jun, Xiaoqian Jiang. Originally published in JMIR Medical Informatics (http
Global Image Dissimilarity in Macaque Inferotemporal Cortex Predicts Human Visual Search Efficiency
Sripati, Arun P.; Olson, Carl R.
2010-01-01
Finding a target in a visual scene can be easy or difficult depending on the nature of the distractors. Research in humans has suggested that search is more difficult the more similar the target and distractors are to each other. However, it has not yielded an objective definition of similarity. We hypothesized that visual search performance depends on similarity as determined by the degree to which two images elicit overlapping patterns of neuronal activity in visual cortex. To test this idea, we recorded from neurons in monkey inferotemporal cortex (IT) and assessed visual search performance in humans using pairs of images formed from the same local features in different global arrangements. The ability of IT neurons to discriminate between two images was strongly predictive of the ability of humans to discriminate between them during visual search, accounting overall for 90% of the variance in human performance. A simple physical measure of global similarity – the degree of overlap between the coarse footprints of a pair of images – largely explains both the neuronal and the behavioral results. To explain the relation between population activity and search behavior, we propose a model in which the efficiency of global oddball search depends on contrast-enhancing lateral interactions in high-order visual cortex. PMID:20107054
NASA Astrophysics Data System (ADS)
Wihardi, Y.; Setiawan, W.; Nugraha, E.
2018-01-01
In this research we try to build a content-based image retrieval system (CBIRS) based on a learned distance/similarity function using Linear Discriminant Analysis (LDA) and Histogram of Oriented Gradients (HoG) features. Our method is invariant to the depiction style of an image, covering image-to-image, sketch-to-image, and painting-to-image similarity. LDA can decrease execution time compared to the state-of-the-art method, but it still needs improvement in terms of accuracy. The inaccuracy in our experiment arises because we did not perform a sliding-window search and because of the low number of natural-world images used as negative samples.
CLAST: CUDA implemented large-scale alignment search tool.
Yano, Masahiro; Mori, Hiroshi; Akiyama, Yutaka; Yamada, Takuji; Kurokawa, Ken
2014-12-11
Metagenomics is a powerful methodology to study microbial communities, but it is highly dependent on nucleotide sequence similarity searching against sequence databases. Metagenomic analyses with next-generation sequencing technologies produce enormous numbers of reads from microbial communities, and many reads are derived from microbes whose genomes have not yet been sequenced, limiting the usefulness of existing sequence similarity search tools. Therefore, there is a clear need for a sequence similarity search tool that can rapidly detect weak similarity in large datasets. We developed a tool, which we named CLAST (CUDA implemented large-scale alignment search tool), that enables analyses of millions of reads and thousands of reference genome sequences, and runs on NVIDIA Fermi architecture graphics processing units. CLAST has four main advantages over existing alignment tools. First, CLAST was capable of identifying sequence similarities ~80.8 times faster than BLAST and 9.6 times faster than BLAT. Second, CLAST executes global alignment as the default (local alignment is also an option), enabling CLAST to assign reads to taxonomic and functional groups based on evolutionarily distant nucleotide sequences with high accuracy. Third, CLAST does not need a preprocessed sequence database like Burrows-Wheeler Transform-based tools, and this enables CLAST to incorporate large, frequently updated sequence databases. Fourth, CLAST requires <2 GB of main memory, making it possible to run CLAST on a standard desktop computer or server node. CLAST achieved very high speed (similar to the Burrows-Wheeler Transform-based Bowtie 2 for long reads) and sensitivity (equal to BLAST, BLAT, and FR-HIT) without the need for extensive database preprocessing or a specialized computing platform. Our results demonstrate that CLAST has the potential to be one of the most powerful and realistic approaches to analyze the massive amount of sequence data from next-generation sequencing
Sequence search on a supercomputer.
Gotoh, O; Tagashira, Y
1986-01-10
A set of programs was developed for searching nucleic acid and protein sequence data bases for sequences similar to a given sequence. The programs, written in FORTRAN 77, were optimized for vector processing on a Hitachi S810-20 supercomputer. A search of a 500-residue protein sequence against the entire PIR data base Ver. 1.0 (1) (0.5 M residues) is carried out in a CPU time of 45 sec. About 4 min is required for an exhaustive search of a 1500-base nucleotide sequence against all mammalian sequences (1.2M bases) in Genbank Ver. 29.0. The CPU time is reduced to about a quarter with a faster version.
Identification of "Known Unknowns" Utilizing Accurate Mass Data and ChemSpider
NASA Astrophysics Data System (ADS)
Little, James L.; Williams, Antony J.; Pshenichnov, Alexey; Tkachenko, Valery
2012-01-01
In many cases, an unknown to an investigator is actually known in the chemical literature, a reference database, or an internet resource. We refer to these types of compounds as "known unknowns." ChemSpider is a very valuable internet database of known compounds useful in the identification of these types of compounds in commercial, environmental, forensic, and natural product samples. The database contains over 26 million entries from hundreds of data sources and is provided as a free resource to the community. Accurate mass mass spectrometry data is used to query the database by either elemental composition or a monoisotopic mass. Searching by elemental composition is the preferred approach. However, it is often difficult to determine a unique elemental composition for compounds with molecular weights greater than 600 Da. In these cases, searching by the monoisotopic mass is advantageous. In either case, the search results are refined by sorting the number of references associated with each compound in descending order. This raises the most useful candidates to the top of the list for further evaluation. These approaches were shown to be successful in identifying "known unknowns" noted in our laboratory and for compounds of interest to others.
Optimizing random searches on three-dimensional lattices
NASA Astrophysics Data System (ADS)
Yang, Benhao; Yang, Shunkun; Zhang, Jiaquan; Li, Daqing
2018-07-01
Search is a universal behavior related to many types of intelligent individuals. While most studies have focused on search in two or infinite-dimensional space, it remains unclear how search can be optimized in three-dimensional space. Here we study random searches on three-dimensional (3d) square lattices with periodic boundary conditions, and explore the optimal search strategy with a power-law step length distribution, p(l) ∼ l^(-μ), known as Lévy flights. We find that compared to random searches on two-dimensional (2d) lattices, the optimal exponent μ_opt on 3d lattices is relatively smaller in the non-destructive case and remains similar in the destructive case. We also find that μ_opt decreases as the lattice length in the z direction increases under high target density. Our findings may help us to understand the role of spatial dimension in search behaviors.
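As a rough illustration of the kind of walk described above, the following Python sketch draws step lengths from a truncated discrete power law p(l) ∝ l^(-μ) and moves a walker on a periodic 3d lattice. The function names, the truncation at half the lattice size, and the choice of axis-aligned jumps are assumptions made for this sketch; the abstract does not specify these details, and target detection is not modelled.

```python
import random


def levy_step_probabilities(mu, l_max):
    """Normalised probabilities for step lengths 1..l_max with p(l) ~ l^(-mu)."""
    weights = [l ** (-mu) for l in range(1, l_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def levy_walk(size, mu, n_steps, seed=0):
    """Walk on a size x size x size lattice with periodic boundaries.

    Each step picks a power-law distributed length and a random
    axis-aligned direction (an assumption of this sketch).
    """
    rng = random.Random(seed)
    l_max = max(1, size // 2)  # hypothetical truncation of the step length
    probs = levy_step_probabilities(mu, l_max)
    lengths = list(range(1, l_max + 1))

    pos = (0, 0, 0)
    path = [pos]
    for _ in range(n_steps):
        length = rng.choices(lengths, weights=probs)[0]
        axis = rng.randrange(3)
        sign = rng.choice((-1, 1))
        coords = list(pos)
        # Python's % always returns a non-negative result, which gives
        # the periodic wrap-around directly.
        coords[axis] = (coords[axis] + sign * length) % size
        pos = tuple(coords)
        path.append(pos)
    return path
```

Smaller values of `mu` put more weight on long jumps, while larger values make the walk resemble a nearest-neighbour random walk; a search-efficiency study would add targets and count the steps needed to find them.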
A new method to search for high-redshift clusters using photometric redshifts
DOE Office of Scientific and Technical Information (OSTI.GOV)
Castignani, G.; Celotti, A.; Chiaberge, M.
2014-09-10
We describe a new method (Poisson probability method, PPM) to search for high-redshift galaxy clusters and groups by using photometric redshift information and galaxy number counts. The method relies on Poisson statistics and is primarily introduced to search for megaparsec-scale environments around a specific beacon. The PPM is tailored to both the properties of the FR I radio galaxies in the Chiaberge et al. sample, which are selected within the COSMOS survey, and to the specific data set used. We test the efficiency of our method of searching for cluster candidates against simulations. Two different approaches are adopted. (1) We use two z ∼ 1 X-ray detected cluster candidates found in the COSMOS survey and we shift them to higher redshift up to z = 2. We find that the PPM detects the cluster candidates up to z = 1.5, and it correctly estimates both the redshift and size of the two clusters. (2) We simulate spherically symmetric clusters of different size and richness, and we locate them at different redshifts (i.e., z = 1.0, 1.5, and 2.0) in the COSMOS field. We find that the PPM detects the simulated clusters within the considered redshift range with a statistical 1σ redshift accuracy of ∼0.05. The PPM is an efficient alternative method for high-redshift cluster searches that may also be applied to both present and future wide field surveys such as SDSS Stripe 82, LSST, and Euclid. Accurate photometric redshifts and a survey depth similar or better than that of COSMOS (e.g., I < 25) are required.
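The abstract does not spell out the PPM procedure, so the sketch below only illustrates the generic Poisson number-count idea it relies on: given an expected background count in a region, how unlikely is the observed count? The function name and parameters are hypothetical and this is not the authors' algorithm.

```python
import math


def poisson_tail(n_obs, mean_bg):
    """P(N >= n_obs) for N ~ Poisson(mean_bg).

    A small value means the observed galaxy count is unlikely to be a
    background fluctuation, i.e. the region is a candidate overdensity.
    """
    # Sum the probability mass for counts 0 .. n_obs - 1, then take the complement.
    cdf_below = sum(
        math.exp(-mean_bg) * mean_bg ** k / math.factorial(k)
        for k in range(n_obs)
    )
    return max(0.0, 1.0 - cdf_below)
```

For instance, `poisson_tail(12, 4.0)` asks how often a field with an expected background of 4 galaxies would show 12 or more by chance.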
Cultural similarity, cultural competence, and nurse workforce diversity.
McGinnis, Sandra L; Brush, Barbara L; Moore, Jean
2010-11-01
Proponents of health workforce diversity argue that increasing the number of minority health care providers will enhance cultural similarity between patients and providers as well as the health system's capacity to provide culturally competent care. Measuring cultural similarity has been difficult, however, given that current benchmarks of workforce diversity categorize health workers by major racial/ethnic classifications rather than by cultural measures. This study examined the use of national racial/ethnic categories in both patient and registered nurse (RN) populations and found them to be a poor indicator of cultural similarity. Rather, we found that cultural similarity between RN and patient populations needs to be established at the level of local labor markets and broadened to include other cultural parameters such as country of origin, primary language, and self-identified ancestry. Only then can the relationship between cultural similarity and cultural competence be accurately determined and its outcomes measured.
Learning context-sensitive shape similarity by graph transduction.
Bai, Xiang; Yang, Xingwei; Latecki, Longin Jan; Liu, Wenyu; Tu, Zhuowen
2010-05-01
Shape similarity and shape retrieval are very important topics in computer vision. The recent progress in this domain has been mostly driven by designing smart shape descriptors for providing better similarity measure between pairs of shapes. In this paper, we provide a new perspective to this problem by considering the existing shapes as a group, and study their similarity measures to the query shape in a graph structure. Our method is general and can be built on top of any existing shape similarity measure. For a given similarity measure, a new similarity is learned through graph transduction. The new similarity is learned iteratively so that the neighbors of a given shape influence its final similarity to the query. The basic idea here is related to PageRank ranking, which forms a foundation of Google Web search. The presented experimental results demonstrate that the proposed approach yields significant improvements over the state-of-art shape matching algorithms. We obtained a retrieval rate of 91.61 percent on the MPEG-7 data set, which is the highest ever reported in the literature. Moreover, the learned similarity by the proposed method also achieves promising improvements on both shape classification and shape clustering.
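The iterative idea described above can be sketched as a simple propagation over a row-normalised affinity matrix, with the query's score held fixed. This is a minimal illustration under assumptions: the function name, the fixed iteration count, and the plain row normalisation are choices made here, and the abstract does not give the exact update rule used in the paper.

```python
def propagate_similarity(affinity, query, n_iter=5):
    """Learn query-specific similarity scores by propagation on a graph.

    affinity: square list-of-lists of pairwise similarities (non-negative).
    query:    index of the query shape; its score is clamped to 1.
    Returns one score per shape; higher means more similar to the query.
    """
    n = len(affinity)
    # Row-normalise so each row acts as a set of transition probabilities.
    trans = [[w / sum(row) for w in row] for row in affinity]

    scores = [0.0] * n
    scores[query] = 1.0
    for _ in range(n_iter):
        # Each shape's new score is the weighted average of its neighbours' scores.
        new_scores = [
            sum(trans[i][j] * scores[j] for j in range(n)) for i in range(n)
        ]
        new_scores[query] = 1.0  # keep the query clamped
        scores = new_scores
    return scores
```

With only the query clamped, all scores drift toward 1 if the loop runs for a very long time on a connected graph, so the number of iterations acts as a tuning parameter in this sketch. The benefit is that a shape reachable from the query through a chain of close neighbours can outrank a shape with the same direct similarity but no such chain.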
Haim, Mario; Arendt, Florian; Scherr, Sebastian
2017-02-01
Despite evidence that suicide rates can increase after suicides are widely reported in the media, appropriate depictions of suicide in the media can help people to overcome suicidal crises and can thus elicit preventive effects. We argue on the level of individual media users that a similar ambivalence can be postulated for search results on online suicide-related search queries. Importantly, the filter bubble hypothesis (Pariser, 2011) states that search results are biased by algorithms based on a person's previous search behavior. In this study, we investigated whether suicide-related search queries, including either potentially suicide-preventive or -facilitative terms, influence subsequent search results. This might thus protect or harm suicidal Internet users. We utilized a 3 (search history: suicide-related harmful, suicide-related helpful, and suicide-unrelated) × 2 (reactive: clicking the top-most result link and no clicking) experimental design applying agent-based testing. While findings show no influences either of search histories or of reactivity on search results in a subsequent situation, the presentation of a helpline offer raises concerns about possible detrimental algorithmic decision-making: Algorithms "decided" whether or not to present a helpline, and this automated decision, then, followed the agent throughout the rest of the observation period. Implications for policy-making and search providers are discussed.
Forecasting influenza outbreak dynamics in Melbourne from Internet search query surveillance data.
Moss, Robert; Zarebski, Alexander; Dawson, Peter; McCaw, James M
2016-07-01
Accurate forecasting of seasonal influenza epidemics is of great concern to healthcare providers in temperate climates, as these epidemics vary substantially in their size, timing and duration from year to year, making it a challenge to deliver timely and proportionate responses. Previous studies have shown that Bayesian estimation techniques can accurately predict when an influenza epidemic will peak many weeks in advance, using existing surveillance data, but these methods must be tailored both to the target population and to the surveillance system. Our aim was to evaluate whether forecasts of similar accuracy could be obtained for metropolitan Melbourne (Australia). We used the bootstrap particle filter and a mechanistic infection model to generate epidemic forecasts for metropolitan Melbourne (Australia) from weekly Internet search query surveillance data reported by Google Flu Trends for 2006-14. Optimal observation models were selected from hundreds of candidates using a novel approach that treats forecasts akin to receiver operating characteristic (ROC) curves. We show that the timing of the epidemic peak can be accurately predicted 4-6 weeks in advance, but that the magnitude of the epidemic peak and the overall burden are much harder to predict. We then discuss how the infection and observation models and the filtering process may be refined to improve forecast robustness, thereby improving the utility of these methods for healthcare decision support. © 2016 The Authors. Influenza and Other Respiratory Viruses Published by John Wiley & Sons Ltd.
Pressure probe and isopiestic psychrometer measure similar turgor
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nonami, H.; Boyer, J.S.; Steudle, E.
1987-03-01
Turgor measured with a miniature pressure probe was compared to that measured with an isopiestic thermocouple psychrometer in mature regions of soybean (Glycine max (L.) Merr.) stems. The probe measured turgor directly in cells of intact stems whereas the psychrometer measured the water potential and osmotic potential of excised stem segments and turgor was calculated by difference. When care was taken to prevent dehydration when working with the pressure probe, and diffusive resistance and dilution errors with the psychrometer, both methods gave similar values of turgor whether the plants were dehydrating or rehydrating. This finding, together with the previously demonstrated similarity in turgor measured with the isopiestic psychrometer and a pressure chamber, indicates that the pressure probe provides accurate measurements of turgor despite the need to penetrate the cell. On the other hand, it suggests that as long as precautions are taken to obtain accurate values for the water potential and osmotic potential, turgor can be determined by isopiestic psychrometry in tissues not accessible to the pressure probe for physical reasons.
Pressure probe and isopiestic psychrometer measure similar turgor.
Nonami, H; Boyer, J S; Steudle, E
1987-03-01
Turgor measured with a miniature pressure probe was compared to that measured with an isopiestic thermocouple psychrometer in mature regions of soybean (Glycine max [L.] Merr.) stems. The probe measured turgor directly in cells of intact stems whereas the psychrometer measured the water potential and osmotic potential of excised stem segments and turgor was calculated by difference. When care was taken to prevent dehydration when working with the pressure probe, and diffusive resistance and dilution errors with the psychrometer, both methods gave similar values of turgor whether the plants were dehydrating or rehydrating. This finding, together with the previously demonstrated similarity in turgor measured with the isopiestic psychrometer and a pressure chamber, indicates that the pressure probe provides accurate measurements of turgor despite the need to penetrate the cell. On the other hand, it suggests that as long as precautions are taken to obtain accurate values for the water potential and osmotic potential, turgor can be determined by isopiestic psychrometry in tissues not accessible to the pressure probe for physical reasons.
Learning deep similarity in fundus photography
NASA Astrophysics Data System (ADS)
Chudzik, Piotr; Al-Diri, Bashir; Caliva, Francesco; Ometto, Giovanni; Hunter, Andrew
2017-02-01
Similarity learning is one of the most fundamental tasks in image analysis. The ability to extract similar images in the medical domain as part of content-based image retrieval (CBIR) systems has been researched for many years. The vast majority of methods used in CBIR systems are based on hand-crafted feature descriptors. The approximation of a similarity mapping for medical images is difficult due to the wide variety of pixel-level structures of interest. In fundus photography (FP) analysis, a subtle difference in, e.g., the shape and size of lesions and vessels can result in a different diagnosis. In this work, we demonstrated how to learn a similarity function for image patches derived directly from FP image data without the need for manually designed feature descriptors. We used a convolutional neural network (CNN) with a novel architecture adapted for similarity learning to accomplish this task. Furthermore, we explored and studied multiple CNN architectures. We show that our method can approximate the similarity between FP patches more efficiently and accurately than the state-of-the-art feature descriptors, including SIFT and SURF, using a publicly available dataset. Finally, we observe that our approach, which is purely data-driven, learns that features such as vessel calibre and orientation are important discriminative factors, which resembles how humans reason about similarity. To the best of the authors' knowledge, this is the first attempt to approximate a visual similarity mapping in FP.
NASA Astrophysics Data System (ADS)
Ma, Yitao; Miura, Sadahiko; Honjo, Hiroaki; Ikeda, Shoji; Hanyu, Takahiro; Ohno, Hideo; Endoh, Tetsuo
2017-04-01
A high-density nonvolatile associative memory (NV-AM) based on spin transfer torque magnetoresistive random access memory (STT-MRAM), which achieves highly concurrent and ultralow-power nearest neighbor search with full adaptivity of the template data format, has been proposed and fabricated using the 90 nm CMOS/70 nm perpendicular-magnetic-tunnel-junction hybrid process. A truly compact current-mode circuitry is developed to realize flexibly controllable and highly parallel similarity evaluation, which makes the NV-AM adaptable to any dimensionality and component-bit of template data. A compact dual-stage time-domain minimum searching circuit is also developed, which can freely extend the system for more template data by connecting multiple NV-AM cores without additional circuits for integrated processing. Both the embedded STT-MRAM module and the computing circuit modules in this NV-AM chip are synchronously power-gated to completely eliminate standby power and maximally reduce operation power by only activating the currently accessed circuit blocks. The operations of a prototype chip at 40 MHz are demonstrated by measurement. The average operation power is only 130 µW, and the circuit density is less than 11 µm^2/bit. Compared with the latest conventional works in both volatile and nonvolatile approaches, more than 31.3% circuit area reductions and 99.2% power improvements are achieved, respectively. Further power performance analyses are discussed, which verify the special superiority of the proposed NV-AM in low-power and large-memory-based VLSIs.
Recognition memory is modulated by visual similarity.
Yago, Elena; Ishai, Alumit
2006-06-01
We used event-related fMRI to test whether recognition memory depends on visual similarity between familiar prototypes and novel exemplars. Subjects memorized portraits, landscapes, and abstract compositions by six painters with a unique style, and later performed a memory recognition task. The prototypes were presented with new exemplars that were either visually similar or dissimilar. Behaviorally, novel, dissimilar items were detected faster and more accurately. We found activation in a distributed cortical network that included face- and object-selective regions in the visual cortex, where familiar prototypes evoked stronger responses than new exemplars; attention-related regions in parietal cortex, where responses elicited by new exemplars were reduced with decreased similarity to the prototypes; and the hippocampus and memory-related regions in parietal and prefrontal cortices, where stronger responses were evoked by the dissimilar exemplars. Our findings suggest that recognition memory is mediated by classification of novel exemplars as a match or a mismatch, based on their visual similarity to familiar prototypes.
WCSTools 3.0: More Tools for Image Astrometry and Catalog Searching
NASA Astrophysics Data System (ADS)
Mink, Douglas J.
For five years, WCSTools has provided image astrometry for astronomers who need accurate positions for objects they wish to observe. Other functions have been added and improved since the package was first released. Support has been added for new catalogs, such as the GSC-ACT, 2MASS Point Source Catalog, and GSC II, as they have been published. A simple command line interface can search any supported catalog, returning information in several standard formats, whether the catalog is on a local disk or searchable over the World Wide Web. The catalog searching routine can be located on either end (or both ends!) of such a web connection, and the output from one catalog search can be used as the input to another search.
GEMINI: a computationally-efficient search engine for large gene expression datasets.
DeFreitas, Timothy; Saddiki, Hachem; Flaherty, Patrick
2016-02-24
Low-cost DNA sequencing allows organizations to accumulate massive amounts of genomic data and use that data to answer a diverse range of research questions. Presently, users must search for relevant genomic data using a keyword, accession number or meta-data tag. However, in this search paradigm the form of the query - a text-based string - is mismatched with the form of the target - a genomic profile. To improve access to massive genomic data resources, we have developed a fast search engine, GEMINI, that uses a genomic profile as a query to search for similar genomic profiles. GEMINI implements a nearest-neighbor search algorithm using a vantage-point tree to store a database of n profiles and in certain circumstances achieves an [Formula: see text] expected query time in the limit. We tested GEMINI on breast and ovarian cancer gene expression data from The Cancer Genome Atlas project and show that it achieves a query time that scales as the logarithm of the number of records in practice on genomic data. In a database with 10^5 samples, GEMINI identifies the nearest neighbor in 0.05 sec compared to a brute force search time of 0.6 sec. GEMINI is a fast search engine that uses a query genomic profile to search for similar profiles in a very large genomic database. It enables users to identify similar profiles independent of sample label, data origin or other meta-data information.
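To make the vantage-point tree idea concrete, here is a minimal Python sketch of building such a tree and running an exact nearest-neighbor query with triangle-inequality pruning. It is a generic textbook-style illustration, not GEMINI's implementation; the Euclidean metric, the class and function names, and the random choice of vantage point are assumptions of this sketch.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class VPNode:
    def __init__(self, point, radius, inside, outside):
        self.point = point      # the vantage point stored at this node
        self.radius = radius    # median distance from the vantage point
        self.inside = inside    # subtree of points with distance < radius
        self.outside = outside  # subtree of points with distance >= radius


def build_vp_tree(points, rng):
    """Recursively split the points by distance to a random vantage point."""
    if not points:
        return None
    points = list(points)
    vp = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vp, 0.0, None, None)
    dists = sorted(euclidean(vp, p) for p in points)
    radius = dists[len(dists) // 2]
    inside = [p for p in points if euclidean(vp, p) < radius]
    outside = [p for p in points if euclidean(vp, p) >= radius]
    return VPNode(vp, radius, build_vp_tree(inside, rng), build_vp_tree(outside, rng))


def nearest(node, query, best=None):
    """Return (distance, point) of the nearest stored point to the query."""
    if node is None:
        return best
    d = euclidean(query, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    # Search the side of the boundary that contains the query first.
    if d < node.radius:
        first, second = node.inside, node.outside
    else:
        first, second = node.outside, node.inside
    best = nearest(first, query, best)
    # By the triangle inequality, the other side can only hold a closer
    # point if the current best ball reaches across the boundary.
    if abs(d - node.radius) <= best[0]:
        best = nearest(second, query, best)
    return best
```

When the pruning test fails for most nodes, only one branch is followed per level, which is where the logarithmic query behaviour reported for favourable data comes from; in high dimensions or with poorly separated data the search may visit many more nodes.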
Förster, Jens
2009-02-01
Nine studies showed a bidirectional link (a) between a global processing style and generation of similarities and (b) between a local processing style and generation of dissimilarities. In Experiments 1-4, participants were primed with global versus local perception styles and then asked to work on an allegedly unrelated generation task. Across materials, participants generated more similarities than dissimilarities after global priming, whereas for participants with local priming, the opposite was true. Experiments 5-6 demonstrated a bidirectional link whereby participants who were first instructed to search for similarities attended more to the gestalt of a stimulus than to its details, whereas the reverse was true for those who were initially instructed to search for dissimilarities. Because important psychological variables are correlated with processing styles, in Experiments 7-9, temporal distance, a promotion focus, and high power were predicted and shown to enhance the search for similarities, whereas temporal proximity, a prevention focus, and low power enhanced the search for dissimilarities. (PsycINFO Database Record (c) 2009 APA, all rights reserved).
Accurate visible speech synthesis based on concatenating variable length motion capture data.
Ma, Jiyong; Cole, Ron; Pellom, Bryan; Ward, Wayne; Wise, Barbara
2006-01-01
We present a novel approach to synthesizing accurate visible speech based on searching and concatenating optimal variable-length units in a large corpus of motion capture data. Based on a set of visual prototypes selected on a source face and a corresponding set designated for a target face, we propose a machine learning technique to automatically map the facial motions observed on the source face to the target face. In order to model the long distance coarticulation effects in visible speech, a large-scale corpus that covers the most common syllables in English was collected, annotated and analyzed. For any input text, a search algorithm to locate the optimal sequences of concatenated units for synthesis is described. A new algorithm to adapt lip motions from a generic 3D face model to a specific 3D face model is also proposed. A complete, end-to-end visible speech animation system is implemented based on the approach. This system is currently used in more than 60 kindergarten through third grade classrooms to teach students to read using a lifelike conversational animated agent. To evaluate the quality of the visible speech produced by the animation system, both subjective evaluation and objective evaluation are conducted. The evaluation results show that the proposed approach is accurate and powerful for visible speech synthesis.
Survival Processing Enhances Visual Search Efficiency.
Cho, Kit W
2018-05-01
Words rated for their survival relevance are remembered better than when rated using other well-known memory mnemonics. This finding, which is known as the survival advantage effect and has been replicated in many studies, suggests that our memory systems are molded by natural selection pressures. In two experiments, the present study used a visual search task to examine whether there is likewise a survival advantage for our visual systems. Participants rated words for their survival relevance or for their pleasantness before locating that object's picture in a search array with 8 or 16 objects. Although there was no difference in search times among the two rating scenarios when set size was 8, survival processing reduced visual search times when set size was 16. These findings reflect a search efficiency effect and suggest that similar to our memory systems, our visual systems are also tuned toward self-preservation.
FitSearch: a robust way to interpret a yeast fitness profile in terms of drug's mode-of-action.
Lee, Minho; Han, Sangjo; Chang, Hyeshik; Kwak, Youn-Sig; Weller, David M; Kim, Dongsup
2013-01-01
Yeast deletion-mutant collections have been successfully used to infer the mode-of-action of drugs especially by profiling chemical-genetic and genetic-genetic interactions on a genome-wide scale. Although tens of thousands of those profiles are publicly available, a lack of an accurate method for mining such data has been a major bottleneck for more widespread use of these useful resources. For general usage of those public resources, we designed FitRankDB as a general repository of fitness profiles, and developed a new search algorithm, FitSearch, for identifying the profiles that have a high similarity score with statistical significance for a given fitness profile. We demonstrated that our new repository and algorithm are highly beneficial to researchers who are attempting to make hypotheses based on unknown modes-of-action of bioactive compounds, regardless of the types of experiments that have been performed using yeast deletion-mutant collections on various measurement platforms, especially non-chip-based platforms. We showed that our new database and algorithm are useful when attempting to construct a hypothesis regarding the unknown function of a bioactive compound through small-scale experiments with a yeast deletion collection in a platform-independent manner. The FitRankDB and FitSearch enhance the ease of searching public yeast fitness profiles and obtaining insights into unknown mechanisms of action of drugs. FitSearch is freely available at http://fitsearch.kaist.ac.kr.
FitSearch: a robust way to interpret a yeast fitness profile in terms of drug's mode-of-action
2013-01-01
Background Yeast deletion-mutant collections have been successfully used to infer the mode-of-action of drugs, especially by profiling chemical-genetic and genetic-genetic interactions on a genome-wide scale. Although tens of thousands of those profiles are publicly available, the lack of an accurate method for mining such data has been a major bottleneck for more widespread use of these useful resources. Results For general usage of those public resources, we designed FitRankDB as a general repository of fitness profiles, and developed a new search algorithm, FitSearch, for identifying the profiles that have a high similarity score with statistical significance for a given fitness profile. We demonstrated that our new repository and algorithm are highly beneficial to researchers attempting to make hypotheses based on unknown modes-of-action of bioactive compounds, regardless of the types of experiments that have been performed using yeast deletion-mutant collections on different measurement platforms, especially non-chip-based platforms. Conclusions We showed that our new database and algorithm are useful when attempting to construct a hypothesis regarding the unknown function of a bioactive compound through small-scale experiments with a yeast deletion collection in a platform-independent manner. FitRankDB and FitSearch enhance the ease of searching public yeast fitness profiles and obtaining insights into unknown mechanisms of action of drugs. FitSearch is freely available at http://fitsearch.kaist.ac.kr. PMID:23368702
NASA Astrophysics Data System (ADS)
Bergen, K.; Yoon, C. E.; OReilly, O. J.; Beroza, G. C.
2015-12-01
Recent improvements in computational efficiency for waveform correlation-based detections achieved by new methods such as Fingerprint and Similarity Thresholding (FAST) promise to allow large-scale blind search for similar waveforms in long-duration continuous seismic data. Waveform similarity search applied to datasets of months to years of continuous seismic data will identify significantly more events than traditional detection methods. With the anticipated increase in number of detections and associated increase in false positives, manual inspection of the detection results will become infeasible. This motivates the need for new approaches to process the output of similarity-based detection. We explore data mining techniques for improved detection post-processing. We approach this by considering similarity-detector output as a sparse similarity graph with candidate events as vertices and similarities as weighted edges. Image processing techniques are leveraged to define candidate events and combine results individually processed at multiple stations. Clustering and graph analysis methods are used to identify groups of similar waveforms and assign a confidence score to candidate detections. Anomaly detection and classification are applied to waveform data for additional false detection removal. A comparison of methods will be presented and their performance will be demonstrated on a suspected induced and non-induced earthquake sequence.
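The graph view of detector output described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: the edge format `(event_a, event_b, similarity)`, the function name `cluster_detections` and the `min_similarity` threshold are assumptions introduced here, and plain connected components stand in for whatever clustering and graph analysis methods the study actually compares.

```python
from collections import defaultdict


def cluster_detections(edges, min_similarity):
    """Group candidate events into connected components of a similarity graph.

    edges: iterable of (event_a, event_b, similarity) tuples (assumed format).
    min_similarity: edges with a lower weight are ignored (assumed threshold).
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, similarity in edges:
        root_a, root_b = find(a), find(b)  # registers both vertices
        if similarity >= min_similarity and root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(list)
    for node in list(parent):
        groups[find(node)].append(node)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    demo = [("e1", "e2", 0.9), ("e2", "e3", 0.8), ("e3", "e4", 0.2), ("e4", "e5", 0.95)]
    print(cluster_detections(demo, 0.5))
```

Larger groups of mutually similar waveforms could then be given higher confidence than isolated pairs, although how the study scores confidence is not stated in the abstract.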
Tree decomposition based fast search of RNA structures including pseudoknots in genomes.
Song, Yinglei; Liu, Chunmei; Malmberg, Russell; Pan, Fangfang; Cai, Liming
2005-01-01
Searching genomes for RNA secondary structure with computational methods has become an important approach to the annotation of non-coding RNAs. However, due to the lack of efficient algorithms for accurate RNA structure-sequence alignment, computer programs capable of quickly and effectively searching genomes for RNA secondary structures have not been available. In this paper, a novel RNA structure profiling model is introduced based on the notion of a conformational graph to specify the consensus structure of an RNA family. Tree decomposition yields a small tree width t for such conformational graphs (e.g., t = 2 for stem loops and only a slight increase for pseudoknots). Within this modelling framework, the optimal alignment of a sequence to the structure model corresponds to finding a maximum valued isomorphic subgraph and consequently can be accomplished through dynamic programming on the tree decomposition of the conformational graph in time O(k^t N^2), where k is a small parameter and N is the size of the profiled RNA structure. Experiments show that the application of the alignment algorithm to search in genomes yields the same search accuracy as methods based on a Covariance model with a significant reduction in computation time. In particular, very accurate searches of tmRNAs in bacteria genomes and of telomerase RNAs in yeast genomes can be accomplished in days, as opposed to months required by other methods. The tree decomposition based searching tool is free upon request and can be downloaded at our site http://w.uga.edu/RNA-informatics/software/index.php.
GeoSearch: A lightweight broking middleware for geospatial resources discovery
NASA Astrophysics Data System (ADS)
Gui, Z.; Yang, C.; Liu, K.; Xia, J.
2012-12-01
With petabytes of geodata and thousands of geospatial web services available over the Internet, it is critical to support geoscience research and applications by finding the best-fit geospatial resources from the massive and heterogeneous resources. Past decades' developments witnessed the operation of many service components to facilitate geospatial resource management and discovery. However, efficient and accurate geospatial resource discovery is still a big challenge due to the following reasons: 1) The entry barriers (also called "learning curves") hinder the usability of discovery services to end users. Different portals and catalogues always adopt various access protocols, metadata formats and GUI styles to organize, present and publish metadata. It is hard for end users to learn all these technical details and differences. 2) The cost for federating heterogeneous services is high. To provide sufficient resources and facilitate data discovery, many registries adopt a periodic harvesting mechanism to retrieve metadata from other federated catalogues. These time-consuming processes lead to network and storage burdens, data redundancy, and also the overhead of maintaining data consistency. 3) The heterogeneous semantics issues in data discovery. Since keyword matching is still the primary search method in many operational discovery services, the search accuracy (precision and recall) is hard to guarantee. Semantic technologies (such as semantic reasoning and similarity evaluation) offer a solution to these issues. However, integrating semantic technologies with existing services is challenging due to the expandability limitations on the service frameworks and metadata templates. 4) The capabilities to help users make a final selection are inadequate. Most of the existing search portals lack intuitive and diverse information visualization methods and functions (sort, filter) to present, explore and analyze search results. Furthermore, the presentation of the value
A hierarchical transition state search algorithm
NASA Astrophysics Data System (ADS)
del Campo, Jorge M.; Köster, Andreas M.
2008-07-01
A hierarchical transition state search algorithm is developed and its implementation in the density functional theory program deMon2k is described. This search algorithm combines the double ended saddle interpolation method with local uphill trust region optimization. A new formalism for the incorporation of the distance constraint in the saddle interpolation method is derived. The similarities between the constrained optimizations in the local trust region method and the saddle interpolation are highlighted. The saddle interpolation and local uphill trust region optimizations are validated on a test set of 28 representative reactions. The hierarchical transition state search algorithm is applied to an intramolecular Diels-Alder reaction with several internal rotors, which makes automatic transition state search rather challenging. The obtained reaction mechanism is discussed in the context of the experimentally observed product distribution.
Visual search for features and conjunctions in development.
Lobaugh, N J; Cole, S; Rovet, J F
1998-12-01
Visual search performance was examined in three groups of children 7 to 12 years of age and in young adults. Colour and orientation feature searches and a conjunction search were conducted. Reaction time (RT) showed expected improvements in processing speed with age. Comparisons of RTs on target-present and target-absent trials were consistent with parallel search on the two feature conditions and with serial search in the conjunction condition. The RT results indicated searches for features and conjunctions were treated similarly for children and adults. However, the youngest children missed more targets at the largest array sizes, most strikingly in conjunction search. Based on an analysis of speed/accuracy trade-offs, we suggest that low target-distractor discriminability leads to an undersampling of array elements, and is responsible for the high number of misses in the youngest children.
Accurate millimetre and submillimetre rest frequencies for cis- and trans-dithioformic acid, HCSSH
NASA Astrophysics Data System (ADS)
Prudenzano, D.; Laas, J.; Bizzocchi, L.; Lattanzi, V.; Endres, C.; Giuliano, B. M.; Spezzano, S.; Palumbo, M. E.; Caselli, P.
2018-04-01
Context. A better understanding of sulphur chemistry is needed to solve the interstellar sulphur depletion problem. A way to achieve this goal is to study new S-bearing molecules in the laboratory, obtaining accurate rest frequencies for an astronomical search. We focus on dithioformic acid, HCSSH, which is the sulphur analogue of formic acid. Aims: The aim of this study is to provide an accurate line list of the two HCSSH trans and cis isomers in their electronic ground state and a comprehensive centrifugal distortion analysis with an extension of measurements in the millimetre and submillimetre range. Methods: We studied the two isomers in the laboratory using an absorption spectrometer employing the frequency-modulation technique. The molecules were produced directly within a free-space cell by glow discharge of a gas mixture. We measured lines belonging to the electronic ground state up to 478 GHz, with a total number of 204 and 139 new rotational transitions, respectively, for trans and cis isomers. The final dataset also includes lines in the centimetre range available from literature. Results: The extension of the measurements in the mm and submm range led to an accurate set of rotational and centrifugal distortion parameters. This allows us to predict frequencies with estimated uncertainties as low as 5 kHz at 1 mm wavelength. Hence, the new dataset provided by this study can be used for astronomical search. Frequency lists are only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/612/A56
Setting the public agenda for online health search: a white paper and action agenda.
Greenberg, Liza; D'Andrea, Guy; Lorence, Dan
2004-06-08
Searches for health information are among the most common reasons that consumers use the Internet. Both consumers and quality experts have raised concerns about the quality of information on the Web and the ability of consumers to find accurate information that meets their needs. To produce a national stakeholder-driven agenda for research, technical improvements, and education that will improve the results of consumer searches for health information on the Internet. URAC, a national accreditation organization, and Consumer WebWatch (CWW), a project of Consumers Union (a consumer advocacy organization), conducted a review of factors influencing the results of online health searches. The organizations convened two stakeholder groups of consumers, quality experts, search engine experts, researchers, health-care providers, informatics specialists, and others. Meeting participants reviewed existing information and developed recommendations for improving the results of online consumer searches for health information. Participants were not asked to vote on or endorse the recommendations. Our working definition of a quality Web site was one that contained accurate, reliable, and complete information. The Internet has greatly improved access to health information for consumers. There is great variation in how consumers seek information via the Internet, and in how successful they are in searching for health information. Further, there is variation among Web sites, both in quality and accessibility. Many Web site features affect the capability of search engines to find and index them. Research is needed to define quality elements of Web sites that could be retrieved by search engines and understand how to meet the needs of different types of searchers. Technological research should seek to develop more sophisticated approaches for tagging information, and to develop searches that "learn" from consumer behavior. Finally, education initiatives are needed to help consumers
Setting the Public Agenda for Online Health Search: A White Paper and Action Agenda
D'Andrea, Guy; Lorence, Dan
2004-01-01
Background Searches for health information are among the most common reasons that consumers use the Internet. Both consumers and quality experts have raised concerns about the quality of information on the Web and the ability of consumers to find accurate information that meets their needs. Objective To produce a national stakeholder-driven agenda for research, technical improvements, and education that will improve the results of consumer searches for health information on the Internet. Methods URAC, a national accreditation organization, and Consumer WebWatch (CWW), a project of Consumers Union (a consumer advocacy organization), conducted a review of factors influencing the results of online health searches. The organizations convened two stakeholder groups of consumers, quality experts, search engine experts, researchers, health-care providers, informatics specialists, and others. Meeting participants reviewed existing information and developed recommendations for improving the results of online consumer searches for health information. Participants were not asked to vote on or endorse the recommendations. Our working definition of a quality Web site was one that contained accurate, reliable, and complete information. Results The Internet has greatly improved access to health information for consumers. There is great variation in how consumers seek information via the Internet, and in how successful they are in searching for health information. Further, there is variation among Web sites, both in quality and accessibility. Many Web site features affect the capability of search engines to find and index them. Conclusions Research is needed to define quality elements of Web sites that could be retrieved by search engines and understand how to meet the needs of different types of searchers. Technological research should seek to develop more sophisticated approaches for tagging information, and to develop searches that "learn" from consumer behavior. Finally
Mental models accurately predict emotion transitions.
Thornton, Mark A; Tamir, Diana I
2017-06-06
Successful social interactions depend on people's ability to predict others' future actions and emotions. People possess many mechanisms for perceiving others' current emotional states, but how might they use this information to predict others' future states? We hypothesized that people might capitalize on an overlooked aspect of affective experience: current emotions predict future emotions. By attending to regularities in emotion transitions, perceivers might develop accurate mental models of others' emotional dynamics. People could then use these mental models of emotion transitions to predict others' future emotions from currently observable emotions. To test this hypothesis, studies 1-3 used data from three extant experience-sampling datasets to establish the actual rates of emotional transitions. We then collected three parallel datasets in which participants rated the transition likelihoods between the same set of emotions. Participants' ratings of emotion transitions predicted others' experienced transitional likelihoods with high accuracy. Study 4 demonstrated that four conceptual dimensions of mental state representation (valence, social impact, rationality, and human mind) inform participants' mental models. Study 5 used 2 million emotion reports on the Experience Project to replicate both of these findings: again people reported accurate models of emotion transitions, and these models were informed by the same four conceptual dimensions. Importantly, neither these conceptual dimensions nor holistic similarity could fully explain participants' accuracy, suggesting that their mental models contain accurate information about emotion dynamics above and beyond what might be predicted by static emotion knowledge alone.
Mental models accurately predict emotion transitions
Thornton, Mark A.; Tamir, Diana I.
2017-01-01
Successful social interactions depend on people’s ability to predict others’ future actions and emotions. People possess many mechanisms for perceiving others’ current emotional states, but how might they use this information to predict others’ future states? We hypothesized that people might capitalize on an overlooked aspect of affective experience: current emotions predict future emotions. By attending to regularities in emotion transitions, perceivers might develop accurate mental models of others’ emotional dynamics. People could then use these mental models of emotion transitions to predict others’ future emotions from currently observable emotions. To test this hypothesis, studies 1–3 used data from three extant experience-sampling datasets to establish the actual rates of emotional transitions. We then collected three parallel datasets in which participants rated the transition likelihoods between the same set of emotions. Participants’ ratings of emotion transitions predicted others’ experienced transitional likelihoods with high accuracy. Study 4 demonstrated that four conceptual dimensions of mental state representation—valence, social impact, rationality, and human mind—inform participants’ mental models. Study 5 used 2 million emotion reports on the Experience Project to replicate both of these findings: again people reported accurate models of emotion transitions, and these models were informed by the same four conceptual dimensions. Importantly, neither these conceptual dimensions nor holistic similarity could fully explain participants’ accuracy, suggesting that their mental models contain accurate information about emotion dynamics above and beyond what might be predicted by static emotion knowledge alone. PMID:28533373
OrChem - An open source chemistry search engine for Oracle(R).
Rijnbeek, Mark; Steinbeck, Christoph
2009-10-22
Registration, indexing and searching of chemical structures in relational databases is one of the core areas of cheminformatics. However, little detail has been published on the inner workings of search engines and their development has been mostly closed-source. We decided to develop an open source chemistry extension for Oracle, the de facto database platform in the commercial world. Here we present OrChem, an extension for the Oracle 11G database that adds registration and indexing of chemical structures to support fast substructure and similarity searching. The cheminformatics functionality is provided by the Chemistry Development Kit. OrChem provides similarity searching with response times in the order of seconds for databases with millions of compounds, depending on a given similarity cut-off. For substructure searching, it can make use of multiple processor cores on today's powerful database servers to provide fast response times in equally large data sets. OrChem is free software and can be redistributed and/or modified under the terms of the GNU Lesser General Public License as published by the Free Software Foundation. All software is available via http://orchem.sourceforge.net.
OrChem - An open source chemistry search engine for Oracle®
2009-01-01
Background Registration, indexing and searching of chemical structures in relational databases is one of the core areas of cheminformatics. However, little detail has been published on the inner workings of search engines and their development has been mostly closed-source. We decided to develop an open source chemistry extension for Oracle, the de facto database platform in the commercial world. Results Here we present OrChem, an extension for the Oracle 11G database that adds registration and indexing of chemical structures to support fast substructure and similarity searching. The cheminformatics functionality is provided by the Chemistry Development Kit. OrChem provides similarity searching with response times in the order of seconds for databases with millions of compounds, depending on a given similarity cut-off. For substructure searching, it can make use of multiple processor cores on today's powerful database servers to provide fast response times in equally large data sets. Availability OrChem is free software and can be redistributed and/or modified under the terms of the GNU Lesser General Public License as published by the Free Software Foundation. All software is available via http://orchem.sourceforge.net. PMID:20298521
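A similarity search with a cut-off, as mentioned above, can be illustrated with a short Python sketch. The abstract does not say which similarity measure OrChem uses, so the Tanimoto coefficient here is an assumption (it is a common choice for fingerprint comparison), and the fingerprints, the `database` dictionary and the function names are hypothetical. Nothing below reflects OrChem's or the Chemistry Development Kit's actual API.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def similarity_search(query_fp, database, cutoff):
    """Return (name, score) pairs scoring at least `cutoff`, best first.

    database: dict mapping a compound name to its fingerprint (assumed layout).
    """
    hits = []
    for name, fingerprint in database.items():
        score = tanimoto(query_fp, fingerprint)
        if score >= cutoff:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


if __name__ == "__main__":
    toy_db = {"a": {1, 2, 3, 4}, "b": {1, 2, 5, 6}, "c": {7, 8}}
    print(similarity_search({1, 2, 3, 4}, toy_db, cutoff=0.3))
```

A higher cut-off leaves fewer candidates to score and return, which is consistent with the abstract's remark that response times depend on the chosen cut-off.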
Seeing and Being Seen: Predictors of Accurate Perceptions about Classmates’ Relationships
Neal, Jennifer Watling; Neal, Zachary P.; Cappella, Elise
2015-01-01
This study examines predictors of observer accuracy (i.e. seeing) and target accuracy (i.e. being seen) in perceptions of classmates’ relationships in a predominantly African American sample of 420 second through fourth graders (ages 7 – 11). Girls, children in higher grades, and children in smaller classrooms were more accurate observers. Targets (i.e. pairs of children) were more accurately observed when they occurred in smaller classrooms of higher grades and involved same-sex, high-popularity, and similar-popularity children. Moreover, relationships between pairs of girls were more accurately observed than relationships between pairs of boys. As a set, these findings suggest the importance of both observer and target characteristics for children’s accurate perceptions of classroom relationships. Moreover, the substantial variation in observer accuracy and target accuracy has methodological implications for both peer-reported assessments of classroom relationships and the use of stochastic actor-based models to understand peer selection and socialization processes. PMID:26347582
When Gravity Fails: Local Search Topology
NASA Technical Reports Server (NTRS)
Frank, Jeremy; Cheeseman, Peter; Stutz, John; Lau, Sonie (Technical Monitor)
1997-01-01
Local search algorithms for combinatorial search problems frequently encounter a sequence of states in which it is impossible to improve the value of the objective function; moves through these regions, called plateau moves, dominate the time spent in local search. We analyze and characterize plateaus for three different classes of randomly generated Boolean Satisfiability problems. We identify several interesting features of plateaus that impact the performance of local search algorithms. We show that local minima tend to be small but occasionally may be very large. We also show that local minima can be escaped without unsatisfying a large number of clauses, but that systematically searching for an escape route may be computationally expensive if the local minimum is large. We show that plateaus with exits, called benches, tend to be much larger than minima, and that some benches have very few exit states which local search can use to escape. We show that the solutions (i.e. global minima) of randomly generated problem instances form clusters, which behave similarly to local minima. We revisit several enhancements of local search algorithms and explain their performance in light of our results. Finally we discuss strategies for creating the next generation of local search algorithms.
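The distinction between local minima and benches can be made concrete with a small Python sketch. It assumes the usual SAT local-search objective (the number of unsatisfied clauses) and single-variable flips as moves. The clause encoding (signed integers) and the function names are choices made here for illustration, not taken from the report, and the exhaustive plateau exploration is only practical for tiny instances.

```python
def unsatisfied(clauses, assignment):
    """Count clauses with no true literal; literal k > 0 means variable k is True."""
    return sum(
        not any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def flip(assignment, var):
    """Return a copy of the assignment with one variable flipped."""
    flipped = dict(assignment)
    flipped[var] = not flipped[var]
    return flipped


def classify_plateau(clauses, start):
    """Explore the plateau containing `start` and return (kind, size).

    kind is 'solution' if no clause is unsatisfied, 'bench' if some plateau
    state has a strictly better neighbour (an exit), else 'local minimum'.
    """
    level = unsatisfied(clauses, start)

    def key(assignment):
        return tuple(sorted(assignment.items()))

    seen = {key(start)}
    frontier = [start]
    has_exit = False
    while frontier:
        state = frontier.pop()
        for var in state:
            neighbour = flip(state, var)
            value = unsatisfied(clauses, neighbour)
            if value < level:
                has_exit = True  # an improving move leaves the plateau
            elif value == level and key(neighbour) not in seen:
                seen.add(key(neighbour))  # a plateau move stays on it
                frontier.append(neighbour)
    if level == 0:
        return "solution", len(seen)
    return ("bench" if has_exit else "local minimum"), len(seen)


if __name__ == "__main__":
    all_pairs = [(1, 2), (-1, 2), (1, -2), (-1, -2)]
    print(classify_plateau(all_pairs, {1: True, 2: True}))
```

In the demo instance every assignment violates exactly one clause, so all four states form a single plateau with no exit.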
Patient-Centered Tools for Medication Information Search.
Wilcox, Lauren; Feiner, Steven; Elhadad, Noémie; Vawdrey, David; Tran, Tran H
2014-05-20
Recent research focused on online health information seeking highlights a heavy reliance on general-purpose search engines. However, current general-purpose search interfaces do not necessarily provide adequate support for non-experts in identifying suitable sources of health information. Popular search engines have recently introduced search tools in their user interfaces for a range of topics. In this work, we explore how such tools can support non-expert, patient-centered health information search. Scoping the current work to medication-related search, we report on findings from a formative study focused on the design of patient-centered, medication-information search tools. Our study included qualitative interviews with patients, family members, and domain experts, as well as observations of their use of Remedy, a technology probe embodying a set of search tools. Post-operative cardiothoracic surgery patients and their visiting family members used the tools to find information about their hospital medications and were interviewed before and after their use. Domain experts conducted similar search tasks and provided qualitative feedback on their preferences and recommendations for designing these tools. Findings from our study suggest the importance of four valuation principles underlying our tools: credibility, readability, consumer perspective, and topical relevance.
VaST: A variability search toolkit
NASA Astrophysics Data System (ADS)
Sokolovsky, K. V.; Lebedev, A. A.
2018-01-01
Variability Search Toolkit (VaST) is a software package designed to find variable objects in a series of sky images. It can be run from a script or interactively using its graphical interface. VaST relies on source list matching as opposed to image subtraction. SExtractor is used to generate source lists and perform aperture or PSF-fitting photometry (with PSFEx). Variability indices that characterize scatter and smoothness of a lightcurve are computed for all objects. Candidate variables are identified as objects having high variability index values compared to other objects of similar brightness. The two distinguishing features of VaST are its ability to perform accurate aperture photometry of images obtained with non-linear detectors and handle complex image distortions. The software has been successfully applied to images obtained with telescopes ranging from 0.08 to 2.5 m in diameter equipped with a variety of detectors including CCD, CMOS, MIC and photographic plates. About 1800 variable stars have been discovered with VaST. It is used as a transient detection engine in the New Milky Way (NMW) nova patrol. The code is written in C and can be easily compiled on the majority of UNIX-like systems. VaST is free software available at http://scan.sai.msu.ru/vast/.
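The selection rule described above, flagging objects whose variability index is high relative to objects of similar brightness, can be sketched as follows. This is a simplified Python illustration and not VaST's code (VaST is written in C and computes several indices). The population standard deviation as the only index, the `neighbours` window and the `factor` threshold are all assumptions introduced here.

```python
from statistics import median, pstdev


def find_candidates(lightcurves, neighbours=2, factor=3.0):
    """Flag objects whose scatter exceeds `factor` times the median scatter
    of the nearest `neighbours` objects on each side in mean brightness.

    lightcurves: dict mapping an object name to a list of magnitudes
    (assumed input layout).
    """
    # One (mean magnitude, scatter, name) record per object, sorted by brightness.
    stats = sorted(
        (sum(mags) / len(mags), pstdev(mags), name)
        for name, mags in lightcurves.items()
    )
    candidates = []
    for i, (_, scatter, name) in enumerate(stats):
        lo = max(0, i - neighbours)
        hi = min(len(stats), i + neighbours + 1)
        reference = [s for j, (_, s, _) in enumerate(stats[lo:hi], start=lo) if j != i]
        if reference and scatter > factor * median(reference):
            candidates.append(name)
    return candidates


if __name__ == "__main__":
    curves = {
        "a": [12.0, 12.01, 11.99, 12.0],
        "b": [12.1, 12.11, 12.09, 12.1],
        "v": [12.2, 12.6, 11.8, 12.2],
        "c": [12.3, 12.31, 12.29, 12.3],
        "d": [12.4, 12.41, 12.39, 12.4],
    }
    print(find_candidates(curves))
```

Comparing each object only with neighbours of similar brightness matters because photometric scatter generally grows for fainter sources, so a single global threshold would be too strict for bright stars or too loose for faint ones.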
The Sweet Spot of a Nonacademic Job Search
ERIC Educational Resources Information Center
Lord, Alexandra M.
2012-01-01
Because academic culture frowns on Ph.D.'s who consider leaving the ivory tower, most of those who jump ship find themselves at a loss as to where and how to begin a job search. Yet a nonacademic job search is actually quite similar to a standard research project. Both require advance planning, substantial research, collating evidence for an…
SANSparallel: interactive homology search against Uniprot
Somervuo, Panu; Holm, Liisa
2015-01-01
Proteins evolve by mutations and natural selection. The network of sequence similarities is a rich source for mining homologous relationships that inform on protein structure and function. There are many servers available to browse the network of homology relationships but one has to wait up to a minute for results. The SANSparallel webserver provides protein sequence database searches with immediate response and professional alignment visualization by third-party software. The output is a list, pairwise alignment or stacked alignment of sequence-similar proteins from Uniprot, UniRef90/50, Swissprot or Protein Data Bank. The stacked alignments are viewed in Jalview or as sequence logos. The database search uses the suffix array neighborhood search (SANS) method, which has been re-implemented as a client-server, improved and parallelized. The method is extremely fast and as sensitive as BLAST above 50% sequence identity. Benchmarks show that the method is highly competitive compared to previously published fast database search programs: UBLAST, DIAMOND, LAST, LAMBDA, RAPSEARCH2 and BLAT. The web server can be accessed interactively or programmatically at http://ekhidna2.biocenter.helsinki.fi/cgi-bin/sans/sans.cgi. It can be used to make protein functional annotation pipelines more efficient, and it is useful in interactive exploration of the detailed evidence supporting the annotation of particular proteins of interest. PMID:25855811
ERIC Educational Resources Information Center
Fattal, Laura Felleman
2004-01-01
Practical and academic, the interrelationship of the visual and performing arts opens unique frontiers to aesthetic pioneers. Divergent in aim from the historic search for similar tonalities between the Synchronists and Stravinsky or atonal musicians of the 1950s-70s and minimalist painters and sculptors, the present use of the visual arts as a…
Scene analysis for effective visual search in rough three-dimensional-modeling scenes
NASA Astrophysics Data System (ADS)
Wang, Qi; Hu, Xiaopeng
2016-11-01
Visual search is a fundamental technology in the computer vision community. It is difficult to find an object in complex scenes when there exist similar distracters in the background. We propose a target search method in rough three-dimensional-modeling scenes based on a vision salience theory and a camera imaging model. We give the definition of salience of objects (or features) and explain the way that salience measurements of objects are calculated. Also, we present one type of search path that leads to the target through salient objects. Along the search path, when the previous objects are localized, the search region of each subsequent object decreases, which is calculated through the imaging model and an optimization method. The experimental results indicate that the proposed method is capable of resolving the ambiguities resulting from distracters containing similar visual features with the target, leading to an improvement of search speed by over 50%.
Semi-automating the manual literature search for systematic reviews increases efficiency.
Chapman, Andrea L; Morgan, Laura C; Gartlehner, Gerald
2010-03-01
To minimise retrieval bias, manual literature searches are a key part of the search process of any systematic review. Considering the need to have accurate information, valid results of the manual literature search are essential to ensure scientific standards; likewise, efficient approaches that minimise the amount of personnel time required to conduct a manual literature search are of great interest. The objective of this project was to determine the validity and efficiency of a new manual search method that utilises the Scopus database. We used the traditional manual search approach as the gold standard to determine the validity and efficiency of the proposed Scopus method. Outcome measures included completeness of article detection and personnel time involved. Using both methods independently, we compared the results in terms of accuracy (validity) and time spent conducting the search (efficiency). Regarding accuracy, the Scopus method identified the same studies as the traditional approach, indicating its validity. In terms of efficiency, using Scopus led to a time saving of 62.5% compared with the traditional approach (3 h versus 8 h). The Scopus method can significantly improve the efficiency of manual searches and thus of systematic reviews.
Similarity of Cortical Activity Patterns Predicts Generalization Behavior
Engineer, Crystal T.; Perez, Claudia A.; Carraway, Ryan S.; Chang, Kevin Q.; Roland, Jarod L.; Sloan, Andrew M.; Kilgard, Michael P.
2013-01-01
Humans and animals readily generalize previously learned knowledge to new situations. Determining similarity is critical for assigning category membership to a novel stimulus. We tested the hypothesis that category membership is initially encoded by the similarity of the activity pattern evoked by a novel stimulus to the patterns from known categories. We provide behavioral and neurophysiological evidence that activity patterns in primary auditory cortex contain sufficient information to explain behavioral categorization of novel speech sounds by rats. Our results suggest that category membership might be encoded by the similarity of the activity pattern evoked by a novel speech sound to the patterns evoked by known sounds. Categorization based on featureless pattern matching may represent a general neural mechanism for ensuring accurate generalization across sensory and cognitive systems. PMID:24147140
Scaling, Similarity, and the Fourth Paradigm for Hydrology
NASA Technical Reports Server (NTRS)
Peters-Lidard, Christa D.; Clark, Martyn; Samaniego, Luis; Verhoest, Niko E. C.; van Emmerik, Tim; Uijlenhoet, Remko; Achieng, Kevin; Franz, Trenton E.; Woods, Ross
2017-01-01
In this synthesis paper addressing hydrologic scaling and similarity, we posit that the search for universal laws of hydrology is hindered by our focus on computational simulation (the third paradigm), and assert that it is time for hydrology to embrace a fourth paradigm of data-intensive science. Advances in information-based hydrologic science, coupled with an explosion of hydrologic data and advances in parameter estimation and modelling, have laid the foundation for a data-driven framework for scrutinizing hydrological scaling and similarity hypotheses. We summarize important scaling and similarity concepts (hypotheses) that require testing, describe a mutual information framework for testing these hypotheses, describe boundary condition, state flux, and parameter data requirements across scales to support testing these hypotheses, and discuss some challenges to overcome while pursuing the fourth hydrological paradigm. We call upon the hydrologic sciences community to develop a focused effort towards adopting the fourth paradigm and apply this to outstanding challenges in scaling and similarity.
SA-Search: a web tool for protein structure mining based on a Structural Alphabet
Guyon, Frédéric; Camproux, Anne-Claude; Hochez, Joëlle; Tufféry, Pierre
2004-01-01
SA-Search is a web tool that can be used to mine for protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits the performance of fast 3D similarity searches such as the extraction of exact words using a suffix tree approach, and the search for fuzzy words viewed as a simple 1D sequence alignment problem. SA-Search is available at http://bioserv.rpbs.jussieu.fr/cgi-bin/SA-Search. PMID:15215446
SA-Search: a web tool for protein structure mining based on a Structural Alphabet.
Guyon, Frédéric; Camproux, Anne-Claude; Hochez, Joëlle; Tufféry, Pierre
2004-07-01
SA-Search is a web tool that can be used to mine for protein structures and extract structural similarities. It is based on a hidden Markov model derived Structural Alphabet (SA) that allows the compression of three-dimensional (3D) protein conformations into a one-dimensional (1D) representation using a limited number of prototype conformations. Using such a representation, classical methods developed for amino acid sequences can be employed. Currently, SA-Search permits the performance of fast 3D similarity searches such as the extraction of exact words using a suffix tree approach, and the search for fuzzy words viewed as a simple 1D sequence alignment problem. SA-Search is available at http://bioserv.rpbs.jussieu.fr/cgi-bin/SA-Search.
NASA Astrophysics Data System (ADS)
Trakumas, S.; Salter, E.
2009-02-01
Adverse health effects due to exposure to airborne particles are associated with particle deposition within the human respiratory tract. Particle size, shape, chemical composition, and the individual physiological characteristics of each person determine to what depth inhaled particles may penetrate and deposit within the respiratory tract. Various particle inertial classification devices are available to fractionate airborne particles according to their aerodynamic size to approximate particle penetration through the human respiratory tract. Cyclones are most often used to sample thoracic or respirable fractions of inhaled particles. Extensive studies of different cyclonic samplers have shown, however, that the sampling characteristics of cyclones do not follow the entire selected convention accurately. In the search for a more accurate way to assess worker exposure to different fractions of inhaled dust, a novel sampler comprising several inertial impactors arranged in parallel was designed and tested. The new design includes a number of separated impactors arranged in parallel. Prototypes of respirable and thoracic samplers each comprising four impactors arranged in parallel were manufactured and tested. Results indicated that the prototype samplers followed closely the penetration characteristics for which they were designed. The new samplers were found to perform similarly for liquid and solid test particles; penetration characteristics remained unchanged even after prolonged exposure to coal mine dust at high concentration. The new parallel impactor design can be applied to approximate any monotonically decreasing penetration curve at a selected flow rate. Personal-size samplers that operate at a few L/min as well as area samplers that operate at higher flow rates can be made based on the suggested design. Performance of such samplers can be predicted with high accuracy employing well-established impaction theory.
Nowcasting Intraseasonal Recreational Fishing Harvest with Internet Search Volume
Carter, David W.; Crosson, Scott; Liese, Christopher
2015-01-01
Estimates of recreational fishing harvest are often unavailable until after a fishing season has ended. This lag in information complicates efforts to stay within the quota. The simplest way to monitor quota within the season is to use harvest information from the previous year. This works well when fishery conditions are stable, but is inaccurate when fishery conditions are changing. We develop regression-based models to “nowcast” intraseasonal recreational fishing harvest in the presence of changing fishery conditions. Our basic model accounts for seasonality, changes in the fishing season, and important events in the fishery. Our extended model uses Google Trends data on the internet search volume relevant to the fishery of interest. We demonstrate the model with the Gulf of Mexico red snapper fishery where the recreational sector has exceeded the quota nearly every year since 2007. Our results confirm that data for the previous year works well to predict intraseasonal harvest for a year (2012) where fishery conditions are consistent with historic patterns. However, for a year (2013) of unprecedented harvest and management activity our regression model using search volume for the term “red snapper season” generates intraseasonal nowcasts that are 27% more accurate than the basic model without the internet search information and 29% more accurate than the prediction based on the previous year. Reliable nowcasts of intraseasonal harvest could make in-season (or in-year) management feasible and increase the likelihood of staying within quota. Our nowcasting approach using internet search volume might have the potential to improve quota management in other fisheries where conditions change year-to-year. PMID:26348645
Pattern similarity study of functional sites in protein sequences: lysozymes and cystatins
Nakai, Shuryo; Li-Chan, Eunice CY; Dou, Jinglie
2005-01-01
Background: Although it is generally agreed that topography is more conserved than sequences, proteins sharing the same fold can have different functions, while there are protein families with low sequence similarity. An alternative method for profile analysis of characteristic conserved positions of the motifs within the 3D structures may be needed for functional annotation of protein sequences. Using the approach of quantitative structure-activity relationships (QSAR), we have proposed a new algorithm for postulating functional mechanisms on the basis of pattern similarity and average of property values of side-chains in segments within sequences. This approach was used to search for functional sites of proteins belonging to the lysozyme and cystatin families. Results: Hydrophobicity and β-turn propensity of reference segments with 3–7 residues were used for the homology similarity search (HSS) for active sites. Hydrogen bonding was used as the side-chain property for searching the binding sites of lysozymes. The profiles of similarity constants and average values of these parameters as functions of their positions in the sequences could identify both active and substrate binding sites of the lysozyme of Streptomyces coelicolor, which has been reported as a new fold enzyme (Cellosyl). The same approach was successfully applied to cystatins, especially for postulating the mechanisms of amyloidosis of human cystatin C as well as human lysozyme. Conclusion: Pattern similarity and average index values of structure-related properties of side chains in short segments of three residues or longer were, for the first time, successfully applied for predicting functional sites in sequences. This new approach may be applicable to studying functional sites in un-annotated proteins, for which complete 3D structures are not yet available. PMID:15904486
Self-similar slip distributions on irregular shaped faults
NASA Astrophysics Data System (ADS)
Herrero, A.; Murphy, S.
2018-06-01
We propose a strategy to place a self-similar slip distribution on a complex fault surface that is represented by an unstructured mesh. This is possible by applying a strategy based on the composite source model, which uses a hierarchical set of asperities, each with its own slip function that depends on the distance from the asperity centre. Central to this technique is the efficient, accurate computation of distance between two points on the fault surface. This is known as the geodetic distance problem. We propose a method to compute the distance across complex non-planar surfaces based on a corollary of Huygens' principle. The difference between this method and the other sample-based algorithms that precede it is the use of a curved front at a local level to calculate the distance. This technique produces a highly accurate computation of the distance, as the curvature of the front is linked to the distance from the source. Our local scheme is based on a sequence of two trilaterations, producing a robust algorithm which is highly precise. We test the strategy on a planar surface in order to assess its ability to keep the self-similarity properties of a slip distribution. We also present a synthetic self-similar slip distribution on a real slab topography for a M8.5 event. This method for computing distance may be extended to the estimation of first arrival times on complex 3D surfaces and in 3D volumes.
Pressure Probe and Isopiestic Psychrometer Measure Similar Turgor
Nonami, Hiroshi; Boyer, John S.; Steudle, Ernst
1987-01-01
Turgor measured with a miniature pressure probe was compared to that measured with an isopiestic thermocouple psychrometer in mature regions of soybean (Glycine max [L.] Merr.) stems. The probe measured turgor directly in cells of intact stems whereas the psychrometer measured the water potential and osmotic potential of excised stem segments and turgor was calculated by difference. When care was taken to prevent dehydration when working with the pressure probe, and diffusive resistance and dilution errors with the psychrometer, both methods gave similar values of turgor whether the plants were dehydrating or rehydrating. This finding, together with the previously demonstrated similarity in turgor measured with the isopiestic psychrometer and a pressure chamber, indicates that the pressure probe provides accurate measurements of turgor despite the need to penetrate the cell. On the other hand, it suggests that as long as precautions are taken to obtain accurate values for the water potential and osmotic potential, turgor can be determined by isopiestic psychrometry in tissues not accessible to the pressure probe for physical reasons. PMID:16665293
Object recognition based on Google's reverse image search and image similarity
NASA Astrophysics Data System (ADS)
Horváth, András.
2015-12-01
Image classification is one of the most challenging tasks in computer vision, and a general multiclass classifier could solve many different tasks in image processing. Classification is usually done by shallow learning for predefined objects, which is a difficult task and very different from human vision. Human vision is based on continuous learning of object classes, and a person requires years to learn a large taxonomy of objects that are neither disjoint nor independent. In this paper I present a system based on the Google image similarity algorithm and the Google image database, which can classify a large set of different objects in a human-like manner, identifying related classes and taxonomies.
Interactive searching of facial image databases
NASA Astrophysics Data System (ADS)
Nicholls, Robert A.; Shepherd, John W.; Shepherd, Jean
1995-09-01
A set of psychological facial descriptors has been devised to enable computerized searching of criminal photograph albums. The descriptors have been used to encode image databases of up to twelve thousand images. Using a system called FACES, the databases are searched by translating a witness' verbal description into corresponding facial descriptors. Trials of FACES have shown that this coding scheme is more productive and efficient than searching traditional photograph albums. An alternative method of searching the encoded database using a genetic algorithm is currently being tested. The genetic search method does not require the witness to verbalize a description of the target but merely to indicate a degree of similarity between the target and a limited selection of images from the database. The major drawback of FACES is that it requires manual encoding of images. Research is being undertaken to automate the process; however, it will require an algorithm which can predict human descriptive values. Alternatives to human-derived coding schemes exist using statistical classifications of images. Since databases encoded using statistical classifiers do not have an obvious direct mapping to human-derived descriptors, a search method which does not require the entry of human descriptors is required. A genetic search algorithm is being tested for such a purpose.
Brown, Peter; Pullan, Wayne; Yang, Yuedong; Zhou, Yaoqi
2016-02-01
The three-dimensional tertiary structure of a protein at near-atomic resolution provides insight into its function and evolution. As protein structure determines functionality, similarity in structure usually implies similarity in function. As such, structure alignment techniques are often useful in the classification of protein function. Given the rapidly growing rate of new, experimentally determined structures being made available from repositories such as the Protein Data Bank, fast and accurate computational structure comparison tools are required. This paper presents SPalignNS, a non-sequential protein structure alignment tool using a novel asymmetrical greedy search technique. The performance of SPalignNS was evaluated against existing sequential and non-sequential structure alignment methods by performing trials with commonly used datasets. These benchmark datasets used to gauge alignment accuracy include (i) 9538 pairwise alignments implied by the HOMSTRAD database of homologous proteins; (ii) a subset of 64 difficult alignments from set (i) that have low structure similarity; (iii) 199 pairwise alignments of proteins with similar structure but different topology; and (iv) a subset of 20 pairwise alignments from the RIPC set. SPalignNS is shown to achieve greater alignment accuracy (lower or comparable root-mean squared distance with increased structure overlap coverage) for all datasets, and the highest agreement with reference alignments from the challenging dataset (iv) above, when compared with both sequentially constrained alignments and other non-sequential alignments. SPalignNS was implemented in C++. The source code, binary executable, and a web server version are freely available at: http://sparks-lab.org. Contact: yaoqi.zhou@griffith.edu.au.
Quality of anaesthesia-related information accessed via Internet searches.
Caron, S; Berton, J; Beydon, L
2007-08-01
We conducted a study to examine the quality and stability of information available from the Internet on four anaesthesia-related topics. In January 2006, we searched using four key words (porphyria, scleroderma, transfusion risk, and epidural analgesia risk) with five search engines (Google, HotBot, AltaVista, Excite, and Yahoo). We used a published scoring system (NetScoring) to evaluate the first 15 sites identified by each of these 20 searches. We also used a simple four-point scale to assess the first 100 sites in the Google search on one of our four topics ('epidural analgesia risk'). In November 2006, we conducted a second evaluation, using three search engines (Google, AltaVista, and Yahoo) with 14 synonyms for 'epidural analgesia risk'. The five search engines performed similarly. NetScoring scores were lower for transfusion risk (P < 0.001). One or more high-quality sites was identified consistently among the first 15 sites in each search. Quality scored using the simple scale correlated closely with medical content and design by NetScoring and with the number of references (P < 0.05). Synonyms of 'epidural analgesia risk' yielded similar results. The quality of accessed information improved somewhat over the 11 month period with Yahoo and AltaVista, but declined with Google. The Internet is a valuable tool for obtaining medical information, but the quality of websites varies between different topics. A simple rating scale may facilitate the quality scoring on individual websites. Differences in precise search terms used for a given topic did not appear to affect the quality of the information obtained.
Patient-Centered Tools for Medication Information Search
Wilcox, Lauren; Feiner, Steven; Elhadad, Noémie; Vawdrey, David; Tran, Tran H.
2016-01-01
Recent research focused on online health information seeking highlights a heavy reliance on general-purpose search engines. However, current general-purpose search interfaces do not necessarily provide adequate support for non-experts in identifying suitable sources of health information. Popular search engines have recently introduced search tools in their user interfaces for a range of topics. In this work, we explore how such tools can support non-expert, patient-centered health information search. Scoping the current work to medication-related search, we report on findings from a formative study focused on the design of patient-centered, medication-information search tools. Our study included qualitative interviews with patients, family members, and domain experts, as well as observations of their use of Remedy, a technology probe embodying a set of search tools. Post-operative cardiothoracic surgery patients and their visiting family members used the tools to find information about their hospital medications and were interviewed before and after their use. Domain experts conducted similar search tasks and provided qualitative feedback on their preferences and recommendations for designing these tools. Findings from our study suggest the importance of four valuation principles underlying our tools: credibility, readability, consumer perspective, and topical relevance. PMID:28163972
Methods for Documenting Systematic Review Searches: A Discussion of Common Issues
ERIC Educational Resources Information Center
Rader, Tamara; Mann, Mala; Stansfield, Claire; Cooper, Chris; Sampson, Margaret
2014-01-01
Introduction: As standardized reporting requirements for systematic reviews are being adopted more widely, review authors are under greater pressure to accurately record their search process. With careful planning, documentation to fulfill the Preferred Reporting Items for Systematic Reviews and Meta-Analyses requirements can become a valuable…
Investigating Pharmacological Similarity by Charting Chemical Space.
Buonfiglio, Rosa; Engkvist, Ola; Várkonyi, Péter; Henz, Astrid; Vikeved, Elisabet; Backlund, Anders; Kogej, Thierry
2015-11-23
In this study, biologically relevant areas of the chemical space were analyzed using ChemGPS-NP. This application enables comparing groups of ligands within a multidimensional space based on principal components derived from physicochemical descriptors. Also, 3D visualization of the ChemGPS-NP global map can be used to conveniently evaluate bioactive compound similarity and visually distinguish between different types or groups of compounds. To further establish ChemGPS-NP as a method to accurately represent the chemical space, a comparison with structure-based fingerprints has been performed. Interesting complementarities between the two descriptions of molecules were observed. It has been shown that the accuracy of describing molecules with physicochemical descriptors, as in ChemGPS-NP, is similar to the accuracy of structural fingerprints in retrieving bioactive molecules. Lastly, pharmacological similarity of structurally diverse compounds has been investigated in ChemGPS-NP space. These results further strengthen the case for using ChemGPS-NP as a tool to explore and visualize chemical space.
A New Method for Measuring Text Similarity in Learning Management Systems Using WordNet
ERIC Educational Resources Information Center
Alkhatib, Bassel; Alnahhas, Ammar; Albadawi, Firas
2014-01-01
As text sources are getting broader, measuring text similarity is becoming more compelling. Automatic text classification, search engines and auto answering systems are samples of applications that rely on text similarity. Learning management systems (LMS) are becoming more important since electronic media is getting more publicly available. As…
NASA Astrophysics Data System (ADS)
Shakibay Senobari, N.; Funning, G.
2016-12-01
Repeating earthquakes (REs) are the regular or semi-regular failures of the same patch on a fault, producing near-identical waveforms at a given station. Sequences of REs are commonly interpreted as slip on small locked patches surrounded by large areas of fault that are creeping (Nadeau and McEvilly, 1999). Detecting them, therefore, places important constraints on the extent of fault creep at depth. In addition, the magnitude and recurrence interval of these RE sequences can be related to the creep rate and used as constraints on slip models. In this study we search for REs in northern California fault systems upon which creep is suspected, but not well constrained, including the Rodgers Creek, Maacama, Bartlett Springs, Concord-Green Valley, West Napa and Greenville faults, targeting events recorded at stations where the instrument was not changed for 10 years or more. A pair of events can be identified as REs based on a high cross-correlation coefficient (CCC) between their waveforms. Thus a fundamental step in RE searches is calculating the CCC for all event waveform pairs recorded at common stations. This becomes computationally expensive for large data sets. To expedite our search, we use a fast and accurate similarity search algorithm developed by the computer science community (Mueen et al., 2015; Zhu et al., 2016). Our initial tests on a data set including 1500 waveforms suggest it is around 40 times faster than the algorithm that we used previously (Shakibay Senobari and Funning, AGU Fall Meeting 2014). We search for event pairs with CCC>0.85 and cluster them based on their similarity. A second, location based filter, based on the differential S-P times for each event pair at 5 or more stations, is used as an independent check. We consider a cluster of events a RE sequence if the source location separation distance for each pair is less than the estimated circular size of the source (e.g. Chen et al., 2008); these are gathered into an RE catalogue. In
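The pairwise step described in the abstract above, computing a cross-correlation coefficient (CCC) for every waveform pair and keeping pairs with CCC > 0.85, can be illustrated with a minimal brute-force sketch. This is not the fast similarity search algorithm the authors cite; the function names `max_ccc` and `similar_pairs`, the lag window `max_lag`, and the assumption of equal-length traces are all hypothetical choices made for illustration.

```python
import math
from itertools import combinations


def max_ccc(a, b, max_lag):
    """Maximum normalized cross-correlation of two equal-length traces
    over integer lags in [-max_lag, max_lag] (illustrative, brute force)."""
    n = len(a)
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        # Take the overlapping samples of a and b for this lag.
        if lag >= 0:
            x, y = a[lag:], b[:n - lag]
        else:
            x, y = a[:n + lag], b[-lag:]
        if len(x) < 2:
            continue
        mx = sum(x) / len(x)
        my = sum(y) / len(y)
        num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        den = math.sqrt(sum((xi - mx) ** 2 for xi in x)
                        * sum((yi - my) ** 2 for yi in y))
        if den > 0:
            best = max(best, num / den)
    return best


def similar_pairs(waveforms, threshold=0.85, max_lag=5):
    """Return index pairs (i, j), i < j, whose maximum CCC exceeds threshold.
    The 0.85 default follows the abstract; max_lag is an assumed parameter."""
    return [(i, j)
            for (i, a), (j, b) in combinations(enumerate(waveforms), 2)
            if max_ccc(a, b, max_lag) > threshold]
```

Because every pair is compared at every lag, the cost grows quadratically with the number of waveforms, which is the expense that motivates the faster algorithm mentioned in the abstract.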
Influence of inter-item symmetry in visual search.
Roggeveen, Alexa B; Kingstone, Alan; Enns, James T
2004-01-01
Does visual search involve a serial inspection of individual items (Feature Integration Theory) or are items grouped and segregated prior to their consideration as a possible target (Attentional Engagement Theory)? For search items defined by motion and shape there is strong support for prior grouping (Kingstone and Bischof, 1999). The present study tested for grouping based on inter-item shape symmetry. Results showed that target-distractor symmetry strongly influenced search whereas distractor-distractor symmetry influenced search more weakly. This indicates that static shapes are evaluated for similarity to one another prior to their explicit identification as 'target' or 'distractor'. Possible reasons for the unequal contributions of target-distractor and distractor-distractor relations are discussed.
SANSparallel: interactive homology search against Uniprot.
Somervuo, Panu; Holm, Liisa
2015-07-01
Proteins evolve by mutations and natural selection. The network of sequence similarities is a rich source for mining homologous relationships that inform on protein structure and function. There are many servers available to browse the network of homology relationships but one has to wait up to a minute for results. The SANSparallel webserver provides protein sequence database searches with immediate response and professional alignment visualization by third-party software. The output is a list, pairwise alignment or stacked alignment of sequence-similar proteins from Uniprot, UniRef90/50, Swissprot or Protein Data Bank. The stacked alignments are viewed in Jalview or as sequence logos. The database search uses the suffix array neighborhood search (SANS) method, which has been re-implemented as a client-server, improved and parallelized. The method is extremely fast and as sensitive as BLAST above 50% sequence identity. Benchmarks show that the method is highly competitive compared to previously published fast database search programs: UBLAST, DIAMOND, LAST, LAMBDA, RAPSEARCH2 and BLAT. The web server can be accessed interactively or programmatically at http://ekhidna2.biocenter.helsinki.fi/cgi-bin/sans/sans.cgi. It can be used to make protein functional annotation pipelines more efficient, and it is useful in interactive exploration of the detailed evidence supporting the annotation of particular proteins of interest.
Ontology-Driven Search and Triage: Design of a Web-Based Visual Interface for MEDLINE
2017-01-01
Background: Diverse users need to search health and medical literature to satisfy open-ended goals such as making evidence-based decisions and updating their knowledge. However, doing so is challenging due to at least two major difficulties: (1) articulating information needs using accurate vocabulary and (2) dealing with large document sets returned from searches. Common search interfaces such as PubMed do not provide adequate support for exploratory search tasks. Objective: Our objective was to improve support for exploratory search tasks by combining two strategies in the design of an interactive visual interface by (1) using a formal ontology to help users build domain-specific knowledge and vocabulary and (2) providing multi-stage triaging support to help mitigate the information overload problem. Methods: We developed a Web-based tool, Ontology-Driven Visual Search and Triage Interface for MEDLINE (OVERT-MED), to test our design ideas. We implemented a custom searchable index of MEDLINE, which comprises approximately 25 million document citations. We chose a popular biomedical ontology, the Human Phenotype Ontology (HPO), to test our solution to the vocabulary problem. We implemented multistage triaging support in OVERT-MED, with the aid of interactive visualization techniques, to help users deal with large document sets returned from searches. Results: Formative evaluation suggests that the design features in OVERT-MED are helpful in addressing the two major difficulties described above. Using a formal ontology seems to help users articulate their information needs with more accurate vocabulary. In addition, multistage triaging combined with interactive visualizations shows promise in mitigating the information overload problem. Conclusions: Our strategies appear to be valuable in addressing the two major problems in exploratory search. Although we tested OVERT-MED with a particular ontology and document collection, we anticipate that our strategies can be
Baldi, Pierre
2010-01-01
As repositories of chemical molecules continue to expand and become more open, it becomes increasingly important to develop tools to search them efficiently and assess the statistical significance of chemical similarity scores. Here we develop a general framework for understanding, modeling, predicting, and approximating the distribution of chemical similarity scores and its extreme values in large databases. The framework can be applied to different chemical representations and similarity measures but is demonstrated here using the most common binary fingerprints with the Tanimoto similarity measure. After introducing several probabilistic models of fingerprints, including the Conditional Gaussian Uniform model, we show that the distribution of Tanimoto scores can be approximated by the distribution of the ratio of two correlated Normal random variables associated with the corresponding unions and intersections. This remains true when the distribution of similarity scores is conditioned on the size of the query molecules in order to derive more fine-grained results and improve chemical retrieval. The corresponding extreme value distributions for the maximum scores are approximated by Weibull distributions. From these various distributions and their analytical forms, Z-scores, E-values, and p-values are derived to assess the significance of similarity scores. In addition, the framework also allows one to predict the value of standard chemical retrieval metrics, such as Sensitivity and Specificity at fixed thresholds, or ROC (Receiver Operating Characteristic) curves at multiple thresholds, and to detect outliers in the form of atypical molecules. Numerous and diverse experiments carried out in part with large sets of molecules from the ChemDB show remarkable agreement between theory and empirical results. PMID:20540577
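For readers unfamiliar with the Tanimoto measure on binary fingerprints referred to above, it is the number of bits set in both fingerprints divided by the number of bits set in either. The sketch below is a minimal illustration, assuming fingerprints are stored as Python integers used as bit vectors; the function names and the convention of returning 0.0 for two empty fingerprints are assumptions, not part of the abstract.

```python
def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto similarity of two binary fingerprints held as integers:
    |A AND B| / |A OR B|. Returns 0.0 when both are empty (assumed convention)."""
    intersection = bin(fp_a & fp_b).count("1")
    union = bin(fp_a | fp_b).count("1")
    return intersection / union if union else 0.0


def rank_by_tanimoto(query: int, database: list) -> list:
    """Return (score, index) pairs for every database fingerprint,
    sorted from most to least similar to the query."""
    scored = [(tanimoto(query, fp), i) for i, fp in enumerate(database)]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))
```

The statistical framework in the abstract concerns the distribution of such scores across a database; this sketch only computes the raw scores that the significance estimates would be applied to.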
Folksonomical P2P File Sharing Networks Using Vectorized KANSEI Information as Search Tags
NASA Astrophysics Data System (ADS)
Ohnishi, Kei; Yoshida, Kaori; Oie, Yuji
We present the concept of folksonomical peer-to-peer (P2P) file sharing networks that allow participants (peers) to freely assign structured search tags to files. These networks are similar to folksonomies in the present Web from the point of view that users assign search tags to information distributed over a network. As a concrete example, we consider an unstructured P2P network using vectorized Kansei (human sensitivity) information as structured search tags for file search. Vectorized Kansei information as search tags indicates what participants feel about their files and is assigned by the participant to each of their files. A search query also has the same form of search tags and indicates what participants want to feel about files that they will eventually obtain. A method that enables file search using vectorized Kansei information is the Kansei query-forwarding method, which probabilistically propagates a search query to peers that are likely to hold more files having search tags that are similar to the query. The similarity between the search query and the search tags is measured in terms of their dot product. The simulation experiments examine if the Kansei query-forwarding method can provide equal search performance for all peers in a network in which only the Kansei information and the tendency with respect to file collection are different among all of the peers. The simulation results show that the Kansei query forwarding method and a random-walk-based query forwarding method, for comparison, work effectively in different situations and are complementary. Furthermore, the Kansei query forwarding method is shown, through simulations, to be superior to or equal to the random-walk based one in terms of search speed.
Trading efficiency for effectiveness in similarity-based indexing for image databases
NASA Astrophysics Data System (ADS)
Barros, Julio E.; French, James C.; Martin, Worthy N.; Kelly, Patrick M.
1995-11-01
Image databases typically manage feature data that can be viewed as points in a feature space. Some features, however, can be better expressed as a collection of points or described by a probability distribution function (PDF) rather than as a single point. In earlier work we introduced a similarity measure and a method for indexing and searching the PDF descriptions of these items that guarantees an answer equivalent to sequential search. Unfortunately, certain properties of the data can restrict the efficiency of that method. In this paper we extend that work and examine trade-offs between efficiency and answer quality or effectiveness. These trade-offs reduce the amount of work required during a search by reducing the number of undesired items fetched without excluding an excessive number of the desired ones.
On numerically accurate finite element
NASA Technical Reports Server (NTRS)
Nagtegaal, J. C.; Parks, D. M.; Rice, J. R.
1974-01-01
A general criterion for testing a mesh with topologically similar repeat units is given, and the analysis shows that only a few conventional element types and arrangements are, or can be made, suitable for computations in the fully plastic range. Further, a new variational principle, which can easily and simply be incorporated into an existing finite element program, is presented. This allows accurate computations to be made even for element designs that would not normally be suitable. Numerical results are given for three plane strain problems, namely pure bending of a beam, a thick-walled tube under pressure, and a deep double edge cracked tensile specimen. The effects of various element designs and of the new variational procedure are illustrated. Elastic-plastic computations at finite strain are discussed.
Link-Based Similarity Measures Using Reachability Vectors
Yoon, Seok-Ho; Kim, Ji-Soo; Ryu, Minsoo; Choi, Ho-Jin
2014-01-01
We present a novel approach for computing link-based similarities among objects accurately by utilizing the link information pertaining to the objects involved. We discuss the problems with previous link-based similarity measures and propose a novel approach for computing link-based similarities that does not suffer from these problems. In the proposed approach each target object is represented by a vector. Each element of the vector corresponds to one of the objects in the given data, and the value of each element denotes the weight for the corresponding object. As for this weight value, we propose to utilize the probability of reaching the specific object from the target object, computed using the “Random Walk with Restart” strategy. Then, we define the similarity between two objects as the cosine similarity of the two vectors. In this paper, we provide examples to show that our approach does not suffer from the aforementioned problems. We also evaluate the performance of the proposed methods in comparison with existing link-based measures, qualitatively and quantitatively, with respect to two kinds of data sets, scientific papers and Web documents. Our experimental results indicate that the proposed methods significantly outperform the existing measures. PMID:24701188
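The reachability-vector idea described in this abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the restart probability of 0.15, the fixed iteration count, the handling of nodes without outgoing links and the toy graph are all assumptions.

```python
import math


def rwr(adj: dict[str, list[str]], start: str, restart: float = 0.15,
        iters: int = 100) -> dict[str, float]:
    """Approximate Random Walk with Restart probabilities by power iteration."""
    p = {n: 0.0 for n in adj}
    p[start] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in adj}
        for node, out in adj.items():
            if not out:
                # Assumption: a node with no outgoing links sends its mass back to start.
                nxt[start] += (1 - restart) * p[node]
                continue
            share = (1 - restart) * p[node] / len(out)
            for target in out:
                nxt[target] += share
        nxt[start] += restart  # total probability mass is 1, so restart mass is `restart`
        p = nxt
    return p


def cosine(u: dict[str, float], v: dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


# Toy link graph (hypothetical): a and b both link to c; e and f only link to each other.
graph = {"a": ["c"], "b": ["c"], "c": ["d"], "d": [], "e": ["f"], "f": ["e"]}
vectors = {n: rwr(graph, n) for n in graph}

sim_ab = cosine(vectors["a"], vectors["b"])
sim_ae = cosine(vectors["a"], vectors["e"])
print(sim_ab, sim_ae)
```

Here a and b never link to each other, yet they come out as similar because both reach c and d, while a and e share no reachable objects and score zero.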
Fred L. Tobiason; Richard W. Hemingway
1994-01-01
A GMMX conformational search routine gives a family of conformations that reflects the Boltzmann-averaged heterocyclic ring conformation as evidenced by accurate prediction of all three coupling constants observed for tetra-O-methyl-(+)-catechin.
Fred L. Tobiason; Richard W. Hemingway
1994-01-01
A GMMXe conformational search routine gives a family of conformations that reflects the Boltzmann-averaged heterocyclic ring conformation as evidenced by accurate prediction of all three coupling constants observed for tetra-O-methyl-(+)-catechin.
NASA Astrophysics Data System (ADS)
Huang, Chuan; Guo, Peng; Yang, Aiying; Qiao, Yaojun
2018-07-01
In single-channel systems, the nonlinear phase noise only comes from the channel itself through self-phase modulation (SPM). In this paper, a fast nonlinear-effect estimation method based on the fractional Fourier transformation (FrFT) is proposed. The nonlinear phase noise caused by the SPM effect is accurately estimated for single model 10Gbaud OOK and RZ-QPSK signals with the fiber length range of 0-200 km and the launch power range of 1-10 mW. Pulse windowing is adopted to search for the optimum fractional order for the OOK and RZ-QPSK signals. Since the nonlinear phase shift caused by the SPM effect is very small, the accurate optimum fractional order of the signal cannot be found with the traditional method. A new method that magnifies the phase shift is therefore proposed to obtain the accurate optimum order, from which the nonlinear phase shift is calculated. The simulation results agree with the theoretical analysis, and the method is applicable to signals whose pulse shape has characteristics similar to those of a Gaussian pulse.
Faster sequence homology searches by clustering subsequences.
Suzuki, Shuji; Kakuta, Masanori; Ishida, Takashi; Akiyama, Yutaka
2015-04-15
Sequence homology searches are used in various fields. New sequencing technologies produce huge amounts of sequence data, which continuously increase the size of sequence databases. As a result, homology searches require large amounts of computational time, especially for metagenomic analysis. We developed a fast homology search method based on database subsequence clustering, and implemented it as GHOSTZ. This method clusters similar subsequences from a database to perform an efficient seed search and ungapped extension by reducing alignment candidates based on triangle inequality. The database subsequence clustering technique achieved an ∼2-fold increase in speed without a large decrease in search sensitivity. In measurements on metagenomic data, GHOSTZ was ∼2.2-2.8 times faster than RAPSearch and ∼185-261 times faster than BLASTX. The source code is freely available for download at http://www.bi.cs.titech.ac.jp/ghostz/. Contact: akiyama@cs.titech.ac.jp. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
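The candidate reduction by triangle inequality mentioned in this abstract can be illustrated with a short Python sketch. It is not GHOSTZ itself: Hamming distance between equal-length subsequences stands in for the tool's actual scoring, and the function names, cluster contents and threshold are assumptions made for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def search_cluster(query: str, representative: str,
                   members: list[tuple[str, int]], max_dist: int):
    """Return (hits, skipped) for one cluster.

    `members` holds (subsequence, precomputed distance to the representative).
    By the triangle inequality, d(query, member) >= |d(query, rep) - d(rep, member)|,
    so any member whose lower bound exceeds `max_dist` is skipped without comparison.
    """
    d_qr = hamming(query, representative)
    hits, skipped = [], 0
    for seq, d_rm in members:
        if abs(d_qr - d_rm) > max_dist:
            skipped += 1
            continue
        if hamming(query, seq) <= max_dist:
            hits.append(seq)
    return hits, skipped


# Toy cluster (hypothetical subsequences); distances to the representative are precomputed once.
rep = "ACDEFG"
cluster = [(s, hamming(rep, s)) for s in ["ACDEFG", "ACDEFH", "WWWWFG", "WWWWWW"]]

hits, skipped = search_cluster("ACDEFH", rep, cluster, max_dist=1)
print(hits, skipped)
```

The saving comes from computing one distance to the representative per cluster and reusing the stored member distances, so that distant members are never compared with the query directly.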
What Friends Are For: Collaborative Intelligence Analysis and Search
2014-06-01
14. SUBJECT TERMS Intelligence Community, information retrieval, recommender systems, search engines, social networks, user profiling, Lucene...improvements over existing search systems. The improvements are shown to be robust to high levels of human error and low similarity between users...precision NOLH nearly orthogonal Latin hypercubes P@ precision at documents RS recommender systems TREC Text REtrieval Conference USM user
Improving Zernike moments comparison for optimal similarity and rotation angle retrieval.
Revaud, Jérôme; Lavoué, Guillaume; Baskurt, Atilla
2009-04-01
Zernike moments constitute a powerful shape descriptor in terms of robustness and description capability. However the classical way of comparing two Zernike descriptors only takes into account the magnitude of the moments and loses the phase information. The novelty of our approach is to take advantage of the phase information in the comparison process while still preserving the invariance to rotation. This new Zernike comparator provides a more accurate similarity measure together with the optimal rotation angle between the patterns, while keeping the same complexity as the classical approach. This angle information is particularly of interest for many applications, including 3D scene understanding through images. Experiments demonstrate that our comparator outperforms the classical one in terms of similarity measure. In particular the robustness of the retrieval against noise and geometric deformation is greatly improved. Moreover, the rotation angle estimation is also more accurate than state-of-the-art algorithms.
ERIC Educational Resources Information Center
Sutton, Jennifer E.
2006-01-01
Children ages 2, 3 and 4 years participated in a novel hide-and-seek search task presented on a touchscreen monitor. On beacon trials, the target hiding place could be located using a beacon cue, but on landmark trials, searching required the use of a nearby landmark cue. In Experiment 1, 2-year-olds performed less accurately than older children…
Quantifying the Search Behaviour of Different Demographics Using Google Correlate
Letchford, Adrian; Preis, Tobias; Moat, Helen Susannah
2016-01-01
Vast records of our everyday interests and concerns are being generated by our frequent interactions with the Internet. Here, we investigate how the searches of Google users vary across U.S. states with different birth rates and infant mortality rates. We find that users in states with higher birth rates search for more information about pregnancy, while those in states with lower birth rates search for more information about cats. Similarly, we find that users in states with higher infant mortality rates search for more information about credit, loans and diseases. Our results provide evidence that Internet search data could offer new insight into the concerns of different demographics. PMID:26910464
Context matters: the structure of task goals affects accuracy in multiple-target visual search.
Clark, Kait; Cain, Matthew S; Adcock, R Alison; Mitroff, Stephen R
2014-05-01
Career visual searchers such as radiologists and airport security screeners strive to conduct accurate visual searches, but despite extensive training, errors still occur. A key difference between searches in radiology and airport security is the structure of the search task: Radiologists typically scan a certain number of medical images (fixed objective), and airport security screeners typically search X-rays for a specified time period (fixed duration). Might these structural differences affect accuracy? We compared performance on a search task administered either under constraints that approximated radiology or airport security. Some displays contained more than one target because the presence of multiple targets is an established source of errors for career searchers, and accuracy for additional targets tends to be especially sensitive to contextual conditions. Results indicate that participants searching within the fixed objective framework produced more multiple-target search errors; thus, adopting a fixed duration framework could improve accuracy for career searchers. Copyright © 2013 Elsevier Ltd and The Ergonomics Society. All rights reserved.
JSC Search System Usability Case Study
NASA Technical Reports Server (NTRS)
Meza, David; Berndt, Sarah
2014-01-01
The advanced nature of "search" has facilitated the movement from keyword match to the delivery of every conceivable information topic from career, commerce, entertainment, learning... the list is infinite. At NASA Johnson Space Center (JSC) the Search interface is an important means of knowledge transfer. By indexing multiple sources between directorates and organizations, the system's potential is culture-changing in that, through search, knowledge of the unique accomplishments in engineering and science can be seamlessly passed between generations. This paper reports the findings of an initial survey, the first of a four-part study to help determine user sentiment on the intranet, or local (JSC) enterprise search environment, as well as the larger NASA enterprise. The survey is a means through which end users provide direction on the development and transfer of knowledge by way of the search experience. The ideal is to identify what is working and what needs to be improved from the users' vantage point by documenting: (1) Where users are satisfied/dissatisfied (2) Perceived value of interface components (3) Gaps which cause any disappointment in search experience. The near-term goal is to inform JSC search in order to improve users' ability to utilize existing services and infrastructure to perform tasks with a shortened life cycle. Continuing steps include an agency-based focus with modified questions to accomplish a similar purpose.
Yom-Tov, Elad; Fernandez-Luque, Luis
2014-01-01
Vaccination campaigns are one of the most important and successful public health programs ever undertaken. People who want to learn about vaccines in order to make an informed decision on whether to vaccinate are faced with a wealth of information on the Internet, both for and against vaccinations. In this paper we develop an automated way to score Internet search queries and web pages as to the likelihood that a person making these queries or reading those pages would decide to vaccinate. We apply this method to data from a major Internet search engine, while people seek information about the Measles, Mumps and Rubella (MMR) vaccine. We show that our method is accurate, and use it to learn about the information acquisition process of people. Our results show that people who are pro-vaccination as well as people who are anti-vaccination seek similar information, but browsing this information has differing effects on their future browsing. These findings demonstrate the need for health authorities to tailor their information according to the current stance of users. PMID:25954435
Fatehi, Farhad; Gray, Leonard C; Wootton, Richard
2014-01-01
The way that PubMed results are displayed can be changed using the Display Settings drop-down menu in the result screen. There are three groups of options: Format, Items per page and Sort by, which allow a good deal of control. The results from several searches can be temporarily stored on the Clipboard. Records of interest can be selected on the results page using check boxes and can then be combined, for example to form a reference list. The Related Citations is a valuable feature of PubMed that can provide a set of similar articles when you have identified a record of interest among the results. You can easily search for RCTs or reviews using the appropriate filters or field tags. If you are interested in clinical articles, rather than basic science or health service research, then the Clinical Queries tool on the PubMed home page can be used to retrieve them.
Evaluating Open-Source Full-Text Search Engines for Matching ICD-10 Codes.
Jurcău, Daniel-Alexandru; Stoicu-Tivadar, Vasile
2016-01-01
This research presents the results of evaluating multiple free, open-source engines on matching ICD-10 diagnostic codes via full-text searches. The study investigates what it takes to get an accurate match when searching for a specific diagnostic code. For each code the evaluation starts by extracting the words that make up its text and continues with building full-text search queries from the combinations of these words. The queries are then run against all the ICD-10 codes until the code in question is returned as the match with the highest relative score. This method identifies the minimum number of words that must be provided in order for the search engines to choose the desired entry. The engines analyzed include a popular Java-based full-text search engine, a lightweight engine written in JavaScript which can even execute on the user's browser, and two popular open-source relational database management systems.
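The evaluation procedure described in this abstract (build queries from word combinations, stop at the smallest query that ranks the target code first) can be sketched as follows. This is a toy illustration under stated assumptions: the three codes and their descriptions are hypothetical rather than real ICD-10 entries, and a simple word-overlap count stands in for the relevance score of a real full-text engine.

```python
from itertools import combinations

# Toy catalogue: hypothetical codes and descriptions, not real ICD-10 entries.
CODES = {
    "X1": "acute viral hepatitis",
    "X2": "chronic viral hepatitis",
    "X3": "acute bronchitis",
}


def score(query_words, description: str) -> int:
    """Stand-in relevance score: number of query words present in the description."""
    doc_words = set(description.lower().split())
    return sum(1 for w in query_words if w in doc_words)


def top_match(query_words, codes: dict[str, str]):
    """Return the code with the strictly highest score, or None if tied or unmatched."""
    ranked = sorted(((score(query_words, text), code) for code, text in codes.items()),
                    reverse=True)
    best_score, best_code = ranked[0]
    if best_score == 0:
        return None
    if len(ranked) > 1 and ranked[1][0] == best_score:
        return None
    return best_code


def min_words(code: str, codes: dict[str, str]):
    """Smallest word combination from the code's own text that ranks it first."""
    words = list(dict.fromkeys(codes[code].lower().split()))  # unique, order kept
    for k in range(1, len(words) + 1):
        for combo in combinations(words, k):
            if top_match(combo, codes) == code:
                return k, combo
    return None


for code in CODES:
    print(code, min_words(code, CODES))
```

With this toy data, "acute viral hepatitis" needs two words because each single word also matches another entry, whereas "acute bronchitis" is identified by "bronchitis" alone.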
Searching Lost People with Uavs: the System and Results of the Close-Search Project
NASA Astrophysics Data System (ADS)
Molina, P.; Colomina, I.; Vitoria, T.; Silva, P. F.; Skaloud, J.; Kornus, W.; Prades, R.; Aguilera, C.
2012-07-01
This paper will introduce the goals, concept and results of the project named CLOSE-SEARCH, which stands for 'Accurate and safe EGNOS-SoL Navigation for UAV-based low-cost Search-And-Rescue (SAR) operations'. The main goal is to integrate a medium-size, helicopter-type Unmanned Aerial Vehicle (UAV), a thermal imaging sensor and an EGNOS-based multi-sensor navigation system, including an Autonomous Integrity Monitoring (AIM) capability, to support search operations in difficult-to-access areas and/or night operations. The focus of the paper is three-fold. Firstly, the operational and technical challenges of the proposed approach are discussed, such as an ultra-safe multi-sensor navigation system, the use of combined thermal and optical vision (infrared plus visible) for person recognition and Beyond-Line-Of-Sight communications, among others. Secondly, the implementation of the integrity concept for UAV platforms is discussed through the AIM approach. Based on the potential of the geodetic quality analysis and on the use of the European EGNOS system as a navigation performance starting point, AIM approaches integrity from the precision standpoint; that is, the derivation of Horizontal and Vertical Protection Levels (HPLs, VPLs) from a realistic precision estimation of the position parameters is performed and compared to predefined Alert Limits (ALs). Finally, some results from the project test campaigns are described to report on particular project achievements. Together with actual Search-and-Rescue teams, the system was operated in realistic, user-chosen test scenarios. In this context, and especially focusing on the EGNOS-based UAV navigation, the AIM capability and also the RGB/thermal imaging subsystem, a summary of the results is presented.
PISA and Scientific Literacy: Similarities and Differences between the Nordic Countries
ERIC Educational Resources Information Center
Kjaernsli, Marit; Lie, Svein
2004-01-01
In this paper we have set out to search for similarities and differences between the Nordic countries concerning patterns of competencies defined as scientific literacy in the Programme for International Student Assessment (PISA) study. The first part focuses on gender differences concerning the two types of competencies, understanding of…
Popularity versus similarity in growing networks
NASA Astrophysics Data System (ADS)
Krioukov, Dmitri; Papadopoulos, Fragkiskos; Kitsak, Maksim; Serrano, Mariangeles; Boguna, Marian
2012-02-01
Preferential attachment is a powerful mechanism explaining the emergence of scaling in growing networks. If new connections are established preferentially to more popular nodes in a network, then the network is scale-free. Here we show that not only popularity but also similarity is a strong force shaping the network structure and dynamics. We develop a framework where new connections, instead of preferring popular nodes, optimize certain trade-offs between popularity and similarity. The framework admits a geometric interpretation, in which preferential attachment emerges from local optimization processes. As opposed to preferential attachment, the optimization framework accurately describes large-scale evolution of technological (Internet), social (web of trust), and biological (E.coli metabolic) networks, predicting the probability of new links in them with a remarkable precision. The developed framework can thus be used for predicting new links in evolving networks, and provides a different perspective on preferential attachment as an emergent phenomenon.
Search and Graph Database Technologies for Biomedical Semantic Indexing: Experimental Analysis.
Segura Bedmar, Isabel; Martínez, Paloma; Carruana Martín, Adrián
2017-12-01
Biomedical semantic indexing is a very useful support tool for human curators in their efforts to index and catalog the biomedical literature. The aim of this study was to describe a system to automatically assign Medical Subject Headings (MeSH) to biomedical articles from MEDLINE. Our approach relies on the assumption that similar documents should be classified by similar MeSH terms. Although previous work has already exploited the document similarity by using a k-nearest neighbors algorithm, we represent documents as document vectors by search engine indexing and then compute the similarity between documents using cosine similarity. Once the most similar documents for a given input document are retrieved, we rank their MeSH terms to choose the most suitable set for the input document. To do this, we define a scoring function that takes into account the frequency of the term in the set of retrieved documents and the similarity between the input document and each retrieved document. In addition, we implement guidelines proposed by human curators to annotate MEDLINE articles; in particular, the heuristic that says if 3 MeSH terms are proposed to classify an article and they share the same ancestor, they should be replaced by this ancestor. The representation of the MeSH thesaurus as a graph database allows us to employ graph search algorithms to quickly and easily capture hierarchical relationships such as the lowest common ancestor between terms. Our experiments show promising results with an F1 of 69% on the test dataset. To the best of our knowledge, this is the first work that combines search and graph database technologies for the task of biomedical semantic indexing. Due to its horizontal scalability, ElasticSearch becomes a real solution to index large collections of documents (such as the bibliographic database MEDLINE). Moreover, the use of graph search algorithms for accessing MeSH information could provide a support tool for cataloging MEDLINE
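The retrieve-then-rank step described in this abstract can be sketched in Python as follows. The abstract does not give the exact scoring function, so the one used here (summing, for each term, the cosine similarities of the retrieved documents that carry it) is an assumption that merely combines term frequency with document similarity; the bag-of-words vectors, the toy corpus and the function names are likewise illustrative.

```python
import math
from collections import Counter, defaultdict


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank_terms(query_text: str, corpus: list[tuple[str, list[str]]], k: int = 2):
    """Rank the index terms of the k documents most similar to the query.

    Assumed score: sum of the similarities of the retrieved documents that carry the
    term, so terms that are frequent among close neighbours rank highest.
    """
    q = Counter(query_text.lower().split())
    neighbours = sorted(
        ((cosine(q, Counter(text.lower().split())), terms) for text, terms in corpus),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    scores = defaultdict(float)
    for sim, terms in neighbours:
        for term in terms:
            scores[term] += sim
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy corpus of (text, assigned terms); illustrative only.
corpus = [
    ("insulin therapy in diabetes patients", ["Diabetes Mellitus", "Insulin"]),
    ("diabetes risk and diet", ["Diabetes Mellitus", "Diet"]),
    ("galaxy cluster survey", ["Astronomy"]),
]

ranked = rank_terms("insulin dosing in diabetes", corpus, k=2)
print(ranked)
```

A production system would obtain the neighbours from a search engine index rather than by scanning the corpus, which is the role the abstract assigns to ElasticSearch.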
The effectiveness of position- and composition-specific gap costs for protein similarity searches.
Stojmirović, Aleksandar; Gertz, E Michael; Altschul, Stephen F; Yu, Yi-Kuo
2008-07-01
The flexibility in gap cost enjoyed by hidden Markov models (HMMs) is expected to afford them better retrieval accuracy than position-specific scoring matrices (PSSMs). We attempt to quantify the effect of more general gap parameters by separately examining the influence of position- and composition-specific gap scores, as well as by comparing the retrieval accuracy of the PSSMs constructed using an iterative procedure to that of the HMMs provided by Pfam and SUPERFAMILY, curated ensembles of multiple alignments. We found that position-specific gap penalties have an advantage over uniform gap costs. We did not explore optimizing distinct uniform gap costs for each query. For Pfam, PSSMs iteratively constructed from seeds based on HMM consensus sequences perform equivalently to HMMs that were adjusted to have constant gap transition probabilities, albeit with much greater variance. We observed no effect of composition-specific gap costs on retrieval performance. These results suggest possible improvements to the PSI-BLAST protein database search program. The scripts for performing evaluations are available upon request from the authors.
Seeking out SARI: an automated search of electronic health records.
O'Horo, John C; Dziadzko, Mikhail; Sakusic, Amra; Ali, Rashid; Sohail, M Rizwan; Kor, Daryl J; Gajic, Ognjen
2018-06-01
The definition of severe acute respiratory infection (SARI) - a respiratory illness with fever and cough, occurring within the past 10 days and requiring hospital admission - has not been evaluated for critically ill patients. Using integrated electronic health records data, we developed an automated search algorithm to identify SARI cases in a large cohort of critical care patients and evaluate patient outcomes. We conducted a retrospective cohort study of all admissions to a medical intensive care unit from August 2009 through March 2016. Subsets were randomly selected for deriving and validating a search algorithm, which was compared with temporal trends in laboratory-confirmed influenza to ensure that SARI was correlated with influenza. The algorithm was applied to the cohort to identify clinical differences for patients with and without SARI. For identifying SARI, the algorithm (sensitivity, 86.9%; specificity, 95.6%) outperformed billing-based searching (sensitivity, 73.8%; specificity, 78.8%). Automated searching correlated with peaks in laboratory-confirmed influenza. Adjusted for severity of illness, SARI was associated with more hospital, intensive care unit and ventilator days but not with death or dismissal to home. The search algorithm accurately identified SARI for epidemiologic study and surveillance.
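As a reminder of how the sensitivity and specificity figures quoted above are defined, here is a minimal Python sketch. The labels in the example are made up for illustration and are not data from the study.

```python
def sensitivity_specificity(predicted: list[bool], actual: list[bool]) -> tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    tp = sum(p and a for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fp = sum(p and (not a) for p, a in zip(predicted, actual))
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up example: 4 true cases and 6 non-cases, as judged by chart review.
actual = [True, True, True, True, False, False, False, False, False, False]
# Output of a hypothetical search algorithm on the same admissions.
predicted = [True, True, True, False, False, False, False, False, False, True]

sens, spec = sensitivity_specificity(predicted, actual)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```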
Chen, Chen Hsiu; Kuo, Su Ching; Tang, Siew Tzuh
2017-05-01
No systematic meta-analysis is available on the prevalence of cancer patients' accurate prognostic awareness and differences in accurate prognostic awareness by publication year, region, assessment method, and service received. To examine the prevalence of advanced/terminal cancer patients' accurate prognostic awareness and differences in accurate prognostic awareness by publication year, region, assessment method, and service received. Systematic review and meta-analysis. MEDLINE, Embase, The Cochrane Library, CINAHL, and PsycINFO were systematically searched on accurate prognostic awareness in adult patients with advanced/terminal cancer (1990-2014). Pooled prevalences were calculated for accurate prognostic awareness by a random-effects model. Differences in weighted estimates of accurate prognostic awareness were compared by meta-regression. In total, 34 articles were retrieved for systematic review and meta-analysis. At best, only about half of advanced/terminal cancer patients accurately understood their prognosis (49.1%; 95% confidence interval: 42.7%-55.5%; range: 5.4%-85.7%). Accurate prognostic awareness was independent of service received and publication year, but highest in Australia, followed by East Asia, North America, and southern Europe and the United Kingdom (67.7%, 60.7%, 52.8%, and 36.0%, respectively; p = 0.019). Accurate prognostic awareness was higher by clinician assessment than by patient report (63.2% vs 44.5%, p < 0.001). Less than half of advanced/terminal cancer patients accurately understood their prognosis, with significant variations by region and assessment method. Healthcare professionals should thoroughly assess advanced/terminal cancer patients' preferences for prognostic information and engage them in prognostic discussion early in the cancer trajectory, thus facilitating their accurate prognostic awareness and the quality of end-of-life care decision-making.
One Shot Detection with Laplacian Object and Fast Matrix Cosine Similarity.
Biswas, Sujoy Kumar; Milanfar, Peyman
2016-03-01
One shot, generic object detection involves searching for a single query object in a larger target image. Relevant approaches have benefited from features that typically model the local similarity patterns. In this paper, we combine local similarity (encoded by local descriptors) with a global context (i.e., a graph structure) of pairwise affinities among the local descriptors, embedding the query descriptors into a low dimensional but discriminatory subspace. Unlike principal components that preserve global structure of feature space, we actually seek a linear approximation to the Laplacian eigenmap that permits us a locality preserving embedding of high dimensional region descriptors. Our second contribution is an accelerated but exact computation of matrix cosine similarity as the decision rule for detection, obviating the computationally expensive sliding window search. We leverage the power of Fourier transform combined with integral image to achieve superior runtime efficiency that allows us to test multiple hypotheses (for pose estimation) within a reasonably short time. Our approach to one shot detection is training-free, and experiments on the standard data sets confirm the efficacy of our model. Besides, low computation cost of the proposed (codebook-free) object detector facilitates rather straightforward query detection in large data sets including movie videos.
Modeling and prediction of human word search behavior in interactive machine translation
NASA Astrophysics Data System (ADS)
Ji, Duo; Yu, Bai; Ma, Bin; Ye, Na
2017-12-01
As a computer-aided translation method, interactive machine translation reduces the repetitive, mechanical operations of manual translation through a variety of techniques, improving translation efficiency, and it plays an important role in practical translation work. In this paper, we take users' frequent word searches during translation as the research object and recast this behavior as a translation selection problem given the current translation. We present a prediction model of word search behavior that combines an alignment model, a translation model and a language model. The model predicts word search behavior with high accuracy and reduces switching between mouse and keyboard operations during the users' translation process.
Protein structure database search and evolutionary classification.
Yang, Jinn-Moon; Tung, Chi-Hua
2006-01-01
As more protein structures become available and structural genomics efforts provide structural models in a genome-wide strategy, there is a growing need for fast and accurate methods for discovering homologous proteins and evolutionary classifications of newly determined structures. We have developed 3D-BLAST, in part, to address these issues. 3D-BLAST is as fast as BLAST and calculates the statistical significance (E-value) of an alignment to indicate the reliability of the prediction. Using this method, we first identified 23 states of the structural alphabet that represent pattern profiles of the backbone fragments and then used them to represent protein structure databases as structural alphabet sequence databases (SADB). Our method enhanced BLAST as a search method, using a new structural alphabet substitution matrix (SASM) to find the longest common substructures with high-scoring structured segment pairs from an SADB database. Using personal computers with Intel Pentium4 (2.8 GHz) processors, our method searched more than 10 000 protein structures in 1.3 s and achieved a good agreement with search results from detailed structure alignment methods. [3D-BLAST is available at http://3d-blast.life.nctu.edu.tw].
Insights: Talent Searches from Parents' Perspectives
ERIC Educational Resources Information Center
Willis, Mariam
2012-01-01
Talent Searches offer an opportunity for gifted children to experience learning on prestigious college campuses around the nation, and as importantly, an opportunity to form relationships with like-minded, similar-age peers. Few opportunities open doors for intellectual, social, and emotional growth in gifted children as efficiently as…
Kim, Bong Jun; Lee, Sungsoo
2018-04-01
The huge improvements in the speed of data transmission and the increasing amount of data available as the Internet has expanded have made it easy to obtain information about any disease. Since pneumothorax frequently occurs in young adolescents, patients often search the Internet for information on pneumothorax. This study analyzed an Internet community for exchanging information on pneumothorax, with an emphasis on the importance of accurate information and doctors' role in providing such information. This study assessed 599,178 visitors to the Internet community from June 2008 to April 2017. There was an average of 190 visitors, 2.2 posts, and 4.5 replies per day. A total of 6,513 posts were made, and 63.3% of them included questions about the disease. The visitors mostly searched for terms such as 'pneumothorax,' 'recurrent pneumothorax,' 'pneumothorax operation,' and 'obtaining a medical certification of having been diagnosed with pneumothorax.' However, 22% of the pneumothorax-related posts by visitors contained inaccurate information. Internet communities can be an important source of information. However, incorrect information about a disease can be harmful for patients. We, as doctors, should try to provide more in-depth information about diseases to patients and to disseminate accurate information about diseases in Internet communities.
Visual search for verbal material in patients with obsessive-compulsive disorder.
Botta, Fabiano; Vibert, Nicolas; Harika-Germaneau, Ghina; Frasca, Mickaël; Rigalleau, François; Fakra, Eric; Ros, Christine; Rouet, Jean-François; Ferreri, Florian; Jaafari, Nematollah
2018-06-01
This study aimed at investigating attentional mechanisms in obsessive-compulsive disorder (OCD) by analysing how visual search processes are modulated by normal and obsession-related distracting information in OCD patients and whether these modulations differ from those observed in healthy people. OCD patients were asked to search for a target word within distractor words that could be orthographically similar to the target, semantically related to the target, semantically related to the most typical obsessions/compulsions observed in OCD patients, or unrelated to the target. Patients' performance and eye movements were compared with those of individually matched healthy controls. In controls, the distractors that were visually similar to the target mostly captured attention. Conversely, patients' attention was captured equally by all kinds of distractor words, whatever their similarity with the target, except obsession-related distractors that attracted patients' attention less than the other distractors. OCD had a major impact on the mostly subliminal mechanisms that guide attention within the search display, but had much less impact on the distractor rejection processes that take place when a distractor is fixated. Hence, visual search in OCD is characterized by abnormal subliminal, but not supraliminal, processing of obsession-related information and by an impaired ability to inhibit task-irrelevant inputs. Copyright © 2018 Elsevier B.V. All rights reserved.
Assigning statistical significance to proteotypic peptides via database searches
Alves, Gelio; Ogurtsov, Aleksey Y.; Yu, Yi-Kuo
2011-01-01
Querying MS/MS spectra against a database containing only proteotypic peptides reduces data analysis time due to reduction of database size. Despite the speed advantage, this search strategy is challenged by issues of statistical significance and coverage. The former requires separating systematically significant identifications from less confident identifications, while the latter arises when the underlying peptide is not present, due to single amino acid polymorphisms (SAPs) or post-translational modifications (PTMs), in the proteotypic peptide libraries searched. To address both issues simultaneously, we have extended RAId’s knowledge database to include proteotypic information, utilized RAId’s statistical strategy to assign statistical significance to proteotypic peptides, and modified RAId’s programs to allow for consideration of proteotypic information during database searches. The extended database alleviates the coverage problem since all annotated modifications, even those that occur within proteotypic peptides, may be considered. Taking into account the likelihoods of observation, the statistical strategy of RAId provides accurate E-value assignments regardless of whether a candidate peptide is proteotypic or not. The advantage of including proteotypic information is evidenced by its superior retrieval performance when compared to regular database searches. PMID:21055489
Fuzzy measures on the Gene Ontology for gene product similarity.
Popescu, Mihail; Keller, James M; Mitchell, Joyce A
2006-01-01
One of the most important objects in bioinformatics is a gene product (protein or RNA). For many gene products, functional information is summarized in a set of Gene Ontology (GO) annotations. For these genes, it is reasonable to include similarity measures based on the terms found in the GO or another taxonomy. In this paper, we introduce several novel measures for computing the similarity of two gene products annotated with GO terms. The fuzzy measure similarity (FMS) has the advantage that it takes into consideration the context of both complete sets of annotation terms when computing the similarity between two gene products. When the two gene products are not annotated by common taxonomy terms, we propose a method that avoids a zero similarity result. To account for variations in annotation reliability, we propose a similarity measure based on the Choquet integral. These similarity measures provide extra tools for the biologist in search of functional information for gene products. Initial testing on a group of 194 sequences representing three protein families shows a higher correlation of the FMS and Choquet similarities with the BLAST sequence similarities than that of traditional similarity measures such as pairwise average or pairwise maximum.
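For orientation, the two traditional baselines named in this abstract, pairwise average and pairwise maximum, can be sketched as follows. This is a minimal illustration and not the authors' code: the term labels, the `TOY_SIM` table, and the `toy_sim` function are hypothetical stand-ins for a real GO term-to-term similarity, and both annotation sets are assumed to be non-empty. The FMS and Choquet measures themselves are not reproduced here.

```python
from itertools import product

# Hypothetical term-to-term similarities; a real system would derive
# these from the Gene Ontology rather than from a hand-written table.
TOY_SIM = {
    frozenset(("t1", "t2")): 0.5,
    frozenset(("t1", "t3")): 0.2,
    frozenset(("t2", "t3")): 0.8,
}


def toy_sim(a, b):
    """Return the assumed similarity of two annotation terms."""
    if a == b:
        return 1.0
    return TOY_SIM.get(frozenset((a, b)), 0.0)


def pairwise_average(terms_a, terms_b, sim):
    """Mean similarity over all cross pairs of annotation terms."""
    scores = [sim(a, b) for a, b in product(terms_a, terms_b)]
    return sum(scores) / len(scores)


def pairwise_maximum(terms_a, terms_b, sim):
    """Best similarity over all cross pairs of annotation terms."""
    return max(sim(a, b) for a, b in product(terms_a, terms_b))


if __name__ == "__main__":
    gene_a = ["t1", "t2"]
    gene_b = ["t2", "t3"]
    print(pairwise_average(gene_a, gene_b, toy_sim))
    print(pairwise_maximum(gene_a, gene_b, toy_sim))
```

Both baselines reduce the two annotation sets to a single number from isolated term pairs, which is the limitation the abstract contrasts with the context-aware FMS.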
Semantic Clustering of Search Engine Results
Soliman, Sara Saad; El-Sayed, Maged F.; Hassan, Yasser F.
2015-01-01
This paper presents a novel approach for search engine results clustering that relies on the semantics of the retrieved documents rather than the terms in those documents. The proposed approach takes into consideration both lexical and semantic similarities among documents and applies a spreading activation technique to generate semantically meaningful clusters. This approach allows documents that are semantically similar to be clustered together, rather than clustering documents based on similar terms. A prototype was implemented and several experiments were conducted to test the proposed solution. The results of the experiments confirmed that the proposed solution achieves remarkable results in terms of precision. PMID:26933673
A Quantum-Based Similarity Method in Virtual Screening.
Al-Dabbagh, Mohammed Mumtaz; Salim, Naomie; Himmat, Mubarak; Ahmed, Ali; Saeed, Faisal
2015-10-02
One of the most widely used techniques for ligand-based virtual screening is similarity searching. This study adopts concepts from quantum mechanics to present a state-of-the-art molecular similarity method inspired by quantum theory. The representation of molecular compounds in a mathematical quantum space plays a vital role in the development of the quantum-based similarity approach. One of the key concepts of quantum theory is the use of complex numbers. Hence, this study proposes three different techniques to embed and re-represent molecular compounds in complex-number form. The quantum-based similarity method developed in this study, which relies on a complex pure Hilbert space of molecules, is called Standard Quantum-Based (SQB). Recall of retrieved active molecules was measured at the top 1% and top 5%, and a significance test was used to evaluate the proposed methods. The MDL Drug Data Report (MDDR), Maximum Unbiased Validation (MUV), and Directory of Useful Decoys (DUD) data sets were used for the experiments and were represented by 2D fingerprints. Simulated virtual screening experiments show that the effectiveness of the SQB method was significantly higher than that of the Tanimoto benchmark similarity measure, owing to the representational power of molecular compounds in complex-number form.
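The Tanimoto benchmark and the top-fraction recall mentioned in this abstract can be sketched as below. This is an illustrative sketch only: fingerprints are modeled as Python sets of "on" bit positions, the molecule names and bit sets are invented, and `recall_at_top` is a hypothetical helper name. The SQB method itself is not reproduced.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two binary fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        # Assumption: two empty fingerprints are treated as dissimilar.
        return 0.0
    return len(a & b) / union


def recall_at_top(query_fp, library, actives, fraction):
    """Fraction of known actives found in the top `fraction` of the ranking.

    `library` maps molecule name -> fingerprint; `actives` is the set of
    names known to be active (assumed non-empty).
    """
    ranked = sorted(
        library,
        key=lambda name: tanimoto(query_fp, library[name]),
        reverse=True,
    )
    cutoff = max(1, int(len(ranked) * fraction))
    hits = sum(1 for name in ranked[:cutoff] if name in actives)
    return hits / len(actives)


if __name__ == "__main__":
    # Invented toy library of four molecules.
    library = {"m1": {1, 2, 3}, "m2": {1, 2}, "m3": {7, 8}, "m4": {9}}
    query = {1, 2, 3}
    print(tanimoto({1, 2, 3}, {2, 3, 4}))
    print(recall_at_top(query, library, {"m1", "m2"}, 0.5))
```

In a screening experiment of the kind described, the same ranking would be computed over a large decoy-seeded library and recall read off at the 1% and 5% cutoffs.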
Search prefilters to assist in library searching of infrared spectra of automotive clear coats.
Lavine, Barry K; Fasasi, Ayuba; Mirjankar, Nikhil; White, Collin; Sandercock, Mark
2015-01-01
Clear coat searches of the infrared (IR) spectral library of the paint data query (PDQ) forensic database often generate an unusable number of hits that span multiple manufacturers, assembly plants, and years. To improve the accuracy of the hit list, pattern recognition methods have been used to develop search prefilters (i.e., principal component models) that differentiate between similar but non-identical IR spectra of clear coats on the basis of manufacturer (e.g., General Motors, Ford, Chrysler) or assembly plant. A two-step procedure was employed to develop these search prefilters. First, the discrete wavelet transform was used to decompose each IR spectrum into wavelet coefficients to enhance subtle but significant features in the spectral data. Second, a genetic algorithm for IR spectral pattern recognition was employed to identify wavelet coefficients characteristic of the manufacturer or assembly plant of the vehicle. Even in challenging trials where the paint samples evaluated were all from the same manufacturer (General Motors) within a limited production year range (2000-2006), the respective assembly plant of the vehicle was correctly identified. Search prefilters to identify assembly plants were successfully validated using 10 blind samples provided by the Royal Canadian Mounted Police (RCMP) as part of a study to populate PDQ to current production years, whereas the search prefilter to discriminate among automobile manufacturers was successfully validated using IR spectra obtained directly from the PDQ database. Copyright © 2014 Elsevier B.V. All rights reserved.
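The first step described above, decomposing a spectrum into wavelet coefficients, can be illustrated with a single-level discrete wavelet transform. The abstract does not name the wavelet family, so the Haar wavelet used here is an assumption chosen for brevity, and the four-point "spectrum" is invented data.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar discrete wavelet transform.

    Returns (approximation, detail) coefficient lists. The Haar wavelet
    is an assumption made for this sketch; the input length must be even.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * scale for i in range(0, len(signal), 2)]
    return approx, detail


if __name__ == "__main__":
    spectrum = [4.0, 4.0, 2.0, 0.0]  # invented absorbance values
    approx, detail = haar_dwt(spectrum)
    print(approx)
    print(detail)
```

The detail coefficients are zero where neighboring points agree and non-zero where the signal changes, which is the sense in which the transform can emphasize subtle local features before a feature-selection step such as the genetic algorithm mentioned above.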
Semantically Enriching the Search System of a Music Digital Library
NASA Astrophysics Data System (ADS)
de Juan, Paloma; Iglesias, Carlos
Traditional search systems are usually based on keywords, a very simple and convenient mechanism to express a need for information. This is the most popular way of searching the Web, although it is not always an easy task to accurately summarize a natural language query in a few keywords. Working with keywords means losing the context, which is the only thing that can help us deal with ambiguity. This is the biggest problem of keyword-based systems. Semantic Web technologies seem a perfect solution to this problem, since they make it possible to represent the semantics of a given domain. In this chapter, we present three projects, Harmos, Semusici and Cantiga, whose aim is to provide access to a music digital library. We will describe two search systems, a traditional one and a semantic one, developed in the context of these projects and compare them in terms of usability and effectiveness.
An advanced search engine for patent analytics in medicinal chemistry.
Pasche, Emilie; Gobeill, Julien; Teodoro, Douglas; Gaudinat, Arnaud; Vishnykova, Dina; Lovis, Christian; Ruch, Patrick
2012-01-01
Patent collections contain a substantial amount of medical knowledge, but existing tools have been reported to lack useful functionalities. We present the development of TWINC, an advanced search engine dedicated to patent retrieval in the domain of health and life sciences. Our tool embeds two search modes: an ad hoc search to retrieve relevant patents given a short query and a related patent search to retrieve similar patents given a patent. Both search modes rely on tuning experiments performed during several patent retrieval competitions. Moreover, TWINC is enhanced with interactive modules, such as chemical query expansion, which is of prime importance for coping with the various ways of naming biomedical entities. While the related patent search showed promising performance, the ad hoc search produced mixed results. Nonetheless, TWINC performed well during the Chemathlon task of the PatOlympics competition, and experts appreciated its usability.
Female cowbirds have more accurate spatial memory than males.
Guigueno, Mélanie F; Snow, Danielle A; MacDougall-Shackleton, Scott A; Sherry, David F
2014-02-01
Brown-headed cowbirds (Molothrus ater) are obligate brood parasites. Only females search for host nests and they find host nests one or more days before placing eggs in them. Past work has shown that females have a larger hippocampus than males, but sex differences in spatial cognition have not been extensively investigated. We tested cowbirds for sex and seasonal differences in spatial memory on a foraging task with an ecologically relevant retention interval. Birds were trained to find one rewarded location among 25 after 24 h. Females made significantly fewer errors than males and took more direct paths to the rewarded location than males. Females and males showed similar search times, indicating there was no sex difference in motivation. This sex difference in spatial cognition is the reverse of that observed in some polygynous mammals and is consistent with the hypothesis that spatial cognition is adaptively specialized in this brood-parasitic species.
Query Auto-Completion Based on Word2vec Semantic Similarity
NASA Astrophysics Data System (ADS)
Shao, Taihua; Chen, Honghui; Chen, Wanyu
2018-04-01
Query auto-completion (QAC) is the first step of information retrieval; it helps users formulate the entire query after they have typed only a short prefix. Traditional QAC models ignore the contribution of semantic relevance between queries, even though similar queries express very similar search intentions. In this paper, we propose a hybrid model, FS-QAC, based on query semantic similarity as well as query frequency. We use the word2vec method to measure the semantic similarity between intended queries and previously submitted queries. Our experiments show that, by combining both features, the FS-QAC model improves performance in predicting the user’s query intention and helping formulate the right query. Our experimental results show that the optimal hybrid model achieves a 7.54% improvement in MRR over a state-of-the-art baseline on the public AOL query logs.
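A hybrid of query frequency and embedding similarity, together with the MRR metric, might look like the sketch below. The abstract does not state how FS-QAC combines the two features, so the linear weighting with `lam`, the frequency normalization, and the function names are all assumptions; the two-dimensional vectors are toy stand-ins for word2vec embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def hybrid_rank(candidates, freq, vectors, context_vec, lam=0.5):
    """Rank completions by lam * normalized frequency + (1 - lam) * cosine.

    The linear mix is an assumed combination rule, not the published one.
    """
    max_freq = max(freq[c] for c in candidates)

    def score(c):
        return lam * freq[c] / max_freq + (1.0 - lam) * cosine(vectors[c], context_vec)

    return sorted(candidates, key=score, reverse=True)


def mean_reciprocal_rank(rankings, targets):
    """MRR: average of 1/rank of the intended query (0 if it is absent)."""
    total = 0.0
    for ranked, target in zip(rankings, targets):
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)
    return total / len(targets)


if __name__ == "__main__":
    candidates = ["apple pie", "apple store"]
    freq = {"apple pie": 10, "apple store": 8}
    vectors = {"apple pie": [1.0, 0.0], "apple store": [0.0, 1.0]}  # toy embeddings
    previous_query_vec = [0.0, 1.0]
    print(hybrid_rank(candidates, freq, vectors, previous_query_vec))
```

With `lam=1.0` the ranking falls back to pure frequency, which corresponds to the kind of traditional baseline the abstract says ignores semantic relevance.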
Ontology-Driven Search and Triage: Design of a Web-Based Visual Interface for MEDLINE.
Demelo, Jonathan; Parsons, Paul; Sedig, Kamran
2017-02-02
Diverse users need to search health and medical literature to satisfy open-ended goals such as making evidence-based decisions and updating their knowledge. However, doing so is challenging due to at least two major difficulties: (1) articulating information needs using accurate vocabulary and (2) dealing with large document sets returned from searches. Common search interfaces such as PubMed do not provide adequate support for exploratory search tasks. Our objective was to improve support for exploratory search tasks by combining two strategies in the design of an interactive visual interface by (1) using a formal ontology to help users build domain-specific knowledge and vocabulary and (2) providing multi-stage triaging support to help mitigate the information overload problem. We developed a Web-based tool, Ontology-Driven Visual Search and Triage Interface for MEDLINE (OVERT-MED), to test our design ideas. We implemented a custom searchable index of MEDLINE, which comprises approximately 25 million document citations. We chose a popular biomedical ontology, the Human Phenotype Ontology (HPO), to test our solution to the vocabulary problem. We implemented multistage triaging support in OVERT-MED, with the aid of interactive visualization techniques, to help users deal with large document sets returned from searches. Formative evaluation suggests that the design features in OVERT-MED are helpful in addressing the two major difficulties described above. Using a formal ontology seems to help users articulate their information needs with more accurate vocabulary. In addition, multistage triaging combined with interactive visualizations shows promise in mitigating the information overload problem. Our strategies appear to be valuable in addressing the two major problems in exploratory search. Although we tested OVERT-MED with a particular ontology and document collection, we anticipate that our strategies can be transferred successfully to other contexts.
Hu, Jialu; Kehr, Birte; Reinert, Knut
2014-02-15
Owing to recent advancements in high-throughput technologies, protein-protein interaction networks of more and more species are becoming available in public databases. The question of how to identify functionally conserved proteins across species attracts a lot of attention in computational biology. Network alignments provide a systematic way to solve this problem. However, most existing alignment tools encounter limitations in tackling this problem. Therefore, the demand for faster and more efficient alignment tools is growing. We present a fast and accurate algorithm, NetCoffee, which finds a global alignment of multiple protein-protein interaction networks. NetCoffee searches for a global alignment by maximizing a target function using simulated annealing on a set of weighted bipartite graphs that are constructed using a triplet approach similar to T-Coffee. To assess its performance, NetCoffee was applied to four real datasets. Our results suggest that NetCoffee remedies several limitations of previous algorithms, outperforms all existing alignment tools in terms of speed, and nevertheless identifies biologically meaningful alignments. The source code and data are freely available for download under the GNU GPL v3 license at https://code.google.com/p/netcoffee/.
Asking better questions: How presentation formats influence information search.
Wu, Charley M; Meder, Björn; Filimon, Flavia; Nelson, Jonathan D
2017-08-01
While the influence of presentation formats has been widely studied in Bayesian reasoning tasks, we present the first systematic investigation of how presentation formats influence information search decisions. Four experiments were conducted across different probabilistic environments, where subjects (N = 2,858) chose between 2 possible search queries, each with binary probabilistic outcomes, with the goal of maximizing classification accuracy. We studied 14 different numerical and visual formats for presenting information about the search environment, constructed across 6 design features that have been prominently related to improvements in Bayesian reasoning accuracy (natural frequencies, posteriors, complement, spatial extent, countability, and part-to-whole information). The posterior variants of the icon array and bar graph formats led to the highest proportion of correct responses, and were substantially better than the standard probability format. Results suggest that presenting information in terms of posterior probabilities and visualizing natural frequencies using spatial extent (a perceptual feature) were especially helpful in guiding search decisions, although environments with a mixture of probabilistic and certain outcomes were challenging across all formats. Subjects who made more accurate probability judgments did not perform better on the search task, suggesting that simple decision heuristics may be used to make search decisions without explicitly applying Bayesian inference to compute probabilities. We propose a new take-the-difference (TTD) heuristic that identifies the accuracy-maximizing query without explicit computation of posterior probabilities. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
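To make the task concrete, the explicit Bayesian computation that the abstract contrasts with heuristics can be sketched for a two-class environment and a query with a binary outcome. This is not the TTD heuristic, whose details are not given here; the function names and the probabilities in the example are invented for illustration, and each outcome is assumed to have non-zero probability.

```python
def posteriors(prior_c1, p_o_given_c1, p_o_given_c2):
    """Posterior probabilities of classes c1 and c2 after observing outcome o."""
    joint_c1 = prior_c1 * p_o_given_c1
    joint_c2 = (1.0 - prior_c1) * p_o_given_c2
    evidence = joint_c1 + joint_c2  # assumed to be non-zero
    return joint_c1 / evidence, joint_c2 / evidence


def expected_accuracy(prior_c1, p_pos_given_c1, p_pos_given_c2):
    """Expected classification accuracy after asking a binary-outcome query.

    For each outcome, the best guess is the class with the larger joint
    probability, so accuracy is the sum of those maxima over both outcomes.
    """
    outcomes = (
        (p_pos_given_c1, p_pos_given_c2),              # positive outcome
        (1.0 - p_pos_given_c1, 1.0 - p_pos_given_c2),  # negative outcome
    )
    return sum(
        max(prior_c1 * like_c1, (1.0 - prior_c1) * like_c2)
        for like_c1, like_c2 in outcomes
    )


if __name__ == "__main__":
    prior = 0.7  # invented prior for class c1
    query_a = expected_accuracy(prior, 0.9, 0.2)
    query_b = expected_accuracy(prior, 0.6, 0.5)
    print("query A" if query_a > query_b else "query B")
```

Choosing the query with the larger expected accuracy is the normative answer against which subjects' choices in such a task can be scored.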
Lavine, Barry K; White, Collin G; Allen, Matthew D; Weakley, Andrew
2017-03-01
Multilayered automotive paint fragments, which are one of the most complex materials encountered in the forensic science laboratory, provide crucial links in criminal investigations and prosecutions. To determine the origin of these paint fragments, forensic automotive paint examiners have turned to the paint data query (PDQ) database, which allows the forensic examiner to compare the layer sequence and color, texture, and composition of the sample to paint systems of the original equipment manufacturer (OEM). However, modern automotive paints have a thin color coat and this layer on a microscopic fragment is often too thin to obtain accurate chemical and topcoat color information. A search engine has been developed for the infrared (IR) spectral libraries of the PDQ database in an effort to improve discrimination capability and permit quantification of discrimination power for OEM automotive paint comparisons. The similarity of IR spectra of the corresponding layers of various records for original finishes in the PDQ database often results in poor discrimination using commercial library search algorithms. A pattern recognition approach employing pre-filters and a cross-correlation library search algorithm that performs both a forward and backward search has been used to significantly improve the discrimination of IR spectra in the PDQ database and thus improve the accuracy of the search. This improvement permits inter-comparison of OEM automotive paint layer systems using the IR spectra alone. Such information can serve to quantify the discrimination power of the original automotive paint encountered in casework and further efforts to succinctly communicate trace evidence to the courts.
Accurate forced-choice recognition without awareness of memory retrieval.
Voss, Joel L; Baym, Carol L; Paller, Ken A
2008-06-01
Recognition confidence and the explicit awareness of memory retrieval commonly accompany accurate responding in recognition tests. Memory performance in recognition tests is widely assumed to measure explicit memory, but the generality of this assumption is questionable. Indeed, whether recognition in nonhumans is always supported by explicit memory is highly controversial. Here we identified circumstances wherein highly accurate recognition was unaccompanied by hallmark features of explicit memory. When memory for kaleidoscopes was tested using a two-alternative forced-choice recognition test with similar foils, recognition was enhanced by an attentional manipulation at encoding known to degrade explicit memory. Moreover, explicit recognition was most accurate when the awareness of retrieval was absent. These dissociations between accuracy and phenomenological features of explicit memory are consistent with the notion that correct responding resulted from experience-dependent enhancements of perceptual fluency with specific stimuli--the putative mechanism for perceptual priming effects in implicit memory tests. This mechanism may contribute to recognition performance in a variety of frequently-employed testing circumstances. Our results thus argue for a novel view of recognition, in that analyses of its neurocognitive foundations must take into account the potential for both (1) recognition mechanisms allied with implicit memory and (2) recognition mechanisms allied with explicit memory.
Brain CT image similarity retrieval method based on uncertain location graph.
Pan, Haiwei; Li, Pengyuan; Li, Qing; Han, Qilong; Feng, Xiaoning; Gao, Linlin
2014-03-01
The many brain computed tomography (CT) images stored in hospitals contain valuable information and should be shared to support computer-aided diagnosis systems. Finding similar brain CT images in a brain CT image database can effectively help doctors make diagnoses based on earlier cases. However, similarity retrieval for brain CT images requires much higher accuracy than retrieval for general images. In this paper, a new uncertain location graph (ULG) model is presented for brain CT image modeling and similarity retrieval. According to the characteristics of brain CT images, we propose a novel method to model a brain CT image as a ULG based on its texture. Then, a scheme for ULG similarity retrieval is introduced. Furthermore, an effective index structure is applied to reduce the search time. Experimental results reveal that our method performs well on brain CT image similarity retrieval, with higher accuracy and efficiency.
Crowded visual search in children with normal vision and children with visual impairment.
Huurneman, Bianca; Cox, Ralf F A; Vlaskamp, Björn N S; Boonstra, F Nienke
2014-03-01
This study investigates the influence of oculomotor control, crowding, and attentional factors on visual search in children with normal vision ([NV], n=11), children with visual impairment without nystagmus ([VI-nys], n=11), and children with VI with accompanying nystagmus ([VI+nys], n=26). Exclusion criteria for children with VI were: multiple impairments and visual acuity poorer than 20/400 or better than 20/50. Three search conditions were presented: a row with homogeneous distractors, a matrix with homogeneous distractors, and a matrix with heterogeneous distractors. Element spacing was manipulated in 5 steps from 2 to 32 minutes of arc. Symbols were sized 2 times the threshold acuity to guarantee visibility for the VI groups. During simple row and matrix search with homogeneous distractors children in the VI+nys group were less accurate than children with NV at smaller spacings. Group differences were even more pronounced during matrix search with heterogeneous distractors. Search times were longer in children with VI compared to children with NV. The more extended impairments during serial search reveal greater dependence on oculomotor control during serial compared to parallel search. Copyright © 2014 Elsevier B.V. All rights reserved.
Semantic Similarity between Web Documents Using Ontology
NASA Astrophysics Data System (ADS)
Chahal, Poonam; Singh Tomer, Manjeet; Kumar, Suresh
2018-06-01
The World Wide Web is a source of information organized as interlinked web pages. However, extracting significant information with the help of a search engine is difficult, because web content is written mainly in natural language and intended for human readers. Several efforts have been made to compute semantic similarity between documents using words, concepts, and concept relationships, but the results still fall short of user requirements. This paper proposes a novel technique for computing semantic similarity between documents that takes into account not only the concepts present in the documents but also the relationships between those concepts. In our approach, documents are processed by building an ontology of each document using a base ontology and a dictionary of concept records. Each record consists of the probable words that represent a given concept. Finally, the document ontologies are compared to determine their semantic similarity, taking the relationships among concepts into account. Relevant concepts and the relations between them are explored by capturing author and user intention. The proposed semantic analysis technique provides improved results compared with existing techniques.
Playing shooter and driving videogames improves top-down guidance in visual search.
Wu, Sijing; Spence, Ian
2013-05-01
Playing action videogames is known to improve visual spatial attention and related skills. Here, we showed that playing action videogames also improves classic visual search, as well as the ability to locate targets in a dual search that mimics certain aspects of an action videogame. In Experiment 1A, first-person shooter (FPS) videogame players were faster than nonplayers in both feature search and conjunction search, and in Experiment 1B, they were faster and more accurate in a peripheral search and identification task while simultaneously performing a central search. In Experiment 2, we showed that 10 h of play could improve the performance of nonplayers on each of these tasks. Three different genres of videogames were used for training: two action games and a 3-D puzzle game. Participants who played an action game (either an FPS or a driving game) achieved greater gains on all search tasks than did those who trained using the puzzle game. Feature searches were faster after playing an action videogame, suggesting that players developed a better target template to guide search in a top-down manner. The results of the dual search suggest that, in addition to enhancing the ability to divide attention, playing an action game improves the top-down guidance of attention to possible target locations. The results have practical implications for the development of training tools to improve perceptual and cognitive skills.
Diffusion amid random overlapping obstacles: Similarities, invariants, approximations
Novak, Igor L.; Gao, Fei; Kraikivski, Pavel; Slepchenko, Boris M.
2011-01-01
Efficient and accurate numerical techniques are used to examine similarities of effective diffusion in a void between random overlapping obstacles: essential invariance of effective diffusion coefficients (Deff) with respect to obstacle shapes and applicability of a two-parameter power law over nearly the entire range of excluded volume fractions (ϕ), except for a small vicinity of the percolation threshold. It is shown that while neither of the properties is exact, deviations from them are remarkably small. This allows for quick estimation of void percolation thresholds and approximate reconstruction of Deff(ϕ) for obstacles of any given shape. In 3D, the similarities of effective diffusion yield a simple multiplication “rule” that provides a fast means of estimating Deff for a mixture of overlapping obstacles of different shapes with comparable sizes. PMID:21513372
Examining perceptual and conceptual set biases in multiple-target visual search.
Biggs, Adam T; Adamo, Stephen H; Dowd, Emma Wu; Mitroff, Stephen R
2015-04-01
Visual search is a common practice conducted countless times every day, and one important aspect of visual search is that multiple targets can appear in a single search array. For example, an X-ray image of airport luggage could contain both a water bottle and a gun. Searchers are more likely to miss additional targets after locating a first target in multiple-target searches, which presents a potential problem: If airport security officers were to find a water bottle, would they then be more likely to miss a gun? One hypothetical cause of multiple-target search errors is that searchers become biased to detect additional targets that are similar to a found target, and therefore become less likely to find additional targets that are dissimilar to the first target. This particular hypothesis has received theoretical, but little empirical, support. In the present study, we tested the bounds of this idea by utilizing "big data" obtained from the mobile application Airport Scanner. Multiple-target search errors were substantially reduced when the two targets were identical, suggesting that the first-found target did indeed create biases during subsequent search. Further analyses delineated the nature of the biases, revealing both a perceptual set bias (i.e., a bias to find additional targets with features similar to those of the first-found target) and a conceptual set bias (i.e., a bias to find additional targets with a conceptual relationship to the first-found target). These biases are discussed in terms of the implications for visual-search theories and applications for professional visual searchers.
Optimal Target Stars in the Search for Life
NASA Astrophysics Data System (ADS)
Lingam, Manasvi; Loeb, Abraham
2018-04-01
The selection of optimal targets in the search for life represents a highly important strategic issue. In this Letter, we evaluate the benefits of searching for life around a potentially habitable planet orbiting a star of arbitrary mass relative to a similar planet around a Sun-like star. If recent physical arguments implying that the habitability of planets orbiting low-mass stars is selectively suppressed are correct, we find that planets around solar-type stars may represent the optimal targets.
Analysis of Magnitude Correlations in a Self-Similar model of Seismicity
NASA Astrophysics Data System (ADS)
Zambrano, A.; Joern, D.
2017-12-01
A recent model of seismicity that incorporates a self-similar Omori-Utsu relation, which is used to describe the temporal evolution of earthquake triggering, has been shown to provide a more accurate description of seismicity in Southern California when compared to epidemic type aftershock sequence models. Forecasting of earthquakes is an active research area where one of the debated points is whether magnitude correlations of earthquakes exist within real world seismic data. Prior to this work, the analysis of magnitude correlations of the aforementioned self-similar model had not been addressed. Here we present statistical properties of the magnitude correlations for the self-similar model along with an analytical analysis of the branching ratio and criticality parameters.
Exploration and Exploitation During Sequential Search
Dam, Gregory; Körding, Konrad
2012-01-01
When we learn how to throw darts we adjust how we throw based on where the darts stick. Much of skill learning is computationally similar in that we learn using feedback obtained after the completion of individual actions. We can formalize such tasks as a search problem; among the set of all possible actions, find the action that leads to the highest reward. In such cases our actions have two objectives: we want to best utilize what we already know (exploitation), but we also want to learn to be more successful in the future (exploration). Here we tested how participants learn movement trajectories where feedback is provided as a monetary reward that depends on the chosen trajectory. We mathematically derived the optimal search policy for our experiment using decision theory. The search behavior of participants is well predicted by an ideal searcher model that optimally combines exploration and exploitation. PMID:21585479
The Collaborative Search by Tag-Based User Profile in Social Media
Li, Xiaodong; Li, Qing
2014-01-01
Recently, we have witnessed the popularity and proliferation of social media applications (e.g., Delicious, Flickr, and YouTube) in the web 2.0 era. The rapid growth of user-generated data results in the problem of information overload to users. Facing such a tremendous volume of data, it is a big challenge to assist the users to find their desired data. To attack this critical problem, we propose the collaborative search approach in this paper. The core idea is that similar users may have common interests so as to help users to find their demanded data. Similar research has been conducted on the user log analysis in web search. However, the rapid growth and change of user-generated data in social media require us to discover a brand-new approach to address the unsolved issues (e.g., how to profile users, how to measure the similar users, and how to depict user-generated resources) rather than adopting existing method from web search. Therefore, we investigate various metrics to identify the similar users (user community). Moreover, we conduct the experiment on two real-life data sets by comparing the Collaborative method with the latest baselines. The empirical results show the effectiveness of the proposed approach and validate our observations. PMID:25692176
Characterising dark matter searches at colliders and direct detection experiments: Vector mediators
Buchmueller, Oliver; Dolan, Matthew J.; Malik, Sarah A.; ...
2015-01-09
We introduce a Minimal Simplified Dark Matter (MSDM) framework to quantitatively characterise dark matter (DM) searches at the LHC. We study two MSDM models where the DM is a Dirac fermion which interacts with a vector and axial-vector mediator. The models are characterised by four parameters: m_DM, M_med, g_DM and g_q, the DM and mediator masses, and the mediator couplings to DM and quarks respectively. The MSDM models accurately capture the full event kinematics, and the dependence on all masses and couplings can be systematically studied. The interpretation of mono-jet searches in this framework can be used to establish an equal-footing comparison with direct detection experiments. For theories with a vector mediator, LHC mono-jet searches possess better sensitivity than direct detection searches for light DM masses (≲5 GeV). For axial-vector mediators, LHC and direct detection searches generally probe orthogonal directions in the parameter space. We explore the projected limits of these searches from the ultimate reach of the LHC and multi-ton xenon direct detection experiments, and find that the complementarity of the searches remains. In conclusion, we provide a comparison of limits in the MSDM and effective field theory (EFT) frameworks to highlight the deficiencies of the EFT framework, particularly when exploring the complementarity of mono-jet and direct detection searches.
Leynes, P Andrew; Brown, Jaime; Landau, Joshua D
2011-01-01
Memory blocks are a common experience characterised by inappropriate retrieval of information that impairs memory search processes. In five studies, memory blocks were induced via exposure to orthographically similar words (Smith & Tindell, 1997) while participants reported their subjective experiences to determine whether the memory block effect (MBE) paradigm produces a feeling of being blocked. Experiments 1 and 3 provided evidence that the MBE is associated with more blocked experiences. In Experiments 2 and 4 increased blocking experiences correlated with blocked fragments when the experimental manipulation was disguised, which demonstrates that ratings were not contaminated by demand characteristics. Experiment 5 demonstrated that blocking happens even when there is no study list. Collectively, the subjective retrieval ratings and the objective response data provide converging evidence that exposure to orthographically similar words induces a memory block characterised by an ineffective memory search that perseverates on interfering information.
A fuzzy-match search engine for physician directories.
Rastegar-Mojarad, Majid; Kadolph, Christopher; Ye, Zhan; Wall, Daniel; Murali, Narayana; Lin, Simon
2014-11-04
A search engine to find physicians' information is a basic but crucial function of a health care provider's website. Inefficient search engines, which return no results or incorrect results, can lead to patient frustration and potential customer loss. A search engine that can handle misspellings and spelling variations of names is needed, as the United States (US) has culturally, racially, and ethnically diverse names. The Marshfield Clinic website provides a search engine for users to search for physicians' names. The current search engine provides an auto-completion function, but it requires an exact match. We observed that 26% of all searches yielded no results. The goal was to design a fuzzy-match algorithm to aid users in finding physicians easier and faster. Instead of an exact match search, we used a fuzzy algorithm to find similar matches for searched terms. In the algorithm, we solved three types of search engine failures: "Typographic", "Phonetic spelling variation", and "Nickname". To solve these mismatches, we used a customized Levenshtein distance calculation that incorporated Soundex coding and a lookup table of nicknames derived from US census data. Using the "Challenge Data Set of Marshfield Physician Names," we evaluated the accuracy of fuzzy-match engine-top ten (90%) and compared it with exact match (0%), Soundex (24%), Levenshtein distance (59%), and fuzzy-match engine-top one (71%). We designed, created a reference implementation, and evaluated a fuzzy-match search engine for physician directories. The open-source code is available at the codeplex website and a reference implementation is available for demonstration at the datamarsh website.
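The edit-distance step at the core of such a fuzzy matcher can be sketched as follows. This is a minimal illustration using the plain Levenshtein distance only; it does not reproduce the authors' customized calculation (no Soundex coding, no nickname table), and the function names `levenshtein` and `fuzzy_top_k` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain edit distance: minimum number of single-character
    insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep) the character
            ))
        prev = curr
    return prev[-1]


def fuzzy_top_k(query, names, k=10):
    """Return the k names closest to the query (case-insensitive).
    Ties are broken alphabetically so the output is deterministic."""
    q = query.lower()
    return sorted(names, key=lambda n: (levenshtein(q, n.lower()), n))[:k]


if __name__ == "__main__":
    directory = ["Johnson", "Jensen", "Smith"]  # made-up names
    print(fuzzy_top_k("jonson", directory, k=2))
```

Returning a ranked top-k list rather than a single best match mirrors the "top ten" versus "top one" evaluation described in the abstract.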
Electroencephalography epilepsy classifications using hybrid cuckoo search and neural network
NASA Astrophysics Data System (ADS)
Pratiwi, A. B.; Damayanti, A.; Miswanto
2017-07-01
Epilepsy is a condition that affects the brain and causes repeated seizures. These seizures are episodes that can vary from nearly undetectable to long periods of vigorous shaking or brain contractions. Epilepsy can often be confirmed with electroencephalography (EEG). Neural networks have been used in biomedical signal analysis and have successfully classified biomedical signals such as EEG signals. In this paper, a hybrid of cuckoo search and a neural network is used to recognize EEG signals for epilepsy classification. The weights of the multilayer perceptron are optimized by the cuckoo search algorithm based on its error. The aim of this method is to make the network reach a local or global optimum faster, so that the classification becomes more accurate. Based on a comparison with the traditional multilayer perceptron, the hybrid cuckoo search and multilayer perceptron provides better performance in terms of error convergence and accuracy. The proposed method gives an MSE of 0.001 and an accuracy of 90.0%.
Real-time earthquake monitoring using a search engine method.
Zhang, Jie; Zhang, Haijiang; Chen, Enhong; Zheng, Yi; Kuang, Wenhuan; Zhang, Xiong
2014-12-04
When an earthquake occurs, seismologists want to use recorded seismograms to infer its location, magnitude and source-focal mechanism as quickly as possible. If such information could be determined immediately, timely evacuations and emergency actions could be undertaken to mitigate earthquake damage. Current advanced methods can report the initial location and magnitude of an earthquake within a few seconds, but estimating the source-focal mechanism may require minutes to hours. Here we present an earthquake search engine, similar to a web search engine, that we developed by applying a computer fast search method to a large seismogram database to find waveforms that best fit the input data. Our method is several thousand times faster than an exact search. For an Mw 5.9 earthquake on 8 March 2012 in Xinjiang, China, the search engine can infer the earthquake's parameters in <1 s after receiving the long-period surface wave data.
Predicting the evolution of complex networks via similarity dynamics
NASA Astrophysics Data System (ADS)
Wu, Tao; Chen, Leiting; Zhong, Linfeng; Xian, Xingping
2017-01-01
Almost all real-world networks are subject to constant evolution, and plenty of them have been investigated empirically to uncover the underlying evolution mechanism. However, the evolution prediction of dynamic networks still remains a challenging problem. The crux of this matter is to estimate the future network links of dynamic networks. This paper studies the evolution prediction of dynamic networks with the link prediction paradigm. To estimate the likelihood of the existence of links more accurately, an effective and robust similarity index is presented by exploiting network structure adaptively. Moreover, most of the existing link prediction methods do not make a clear distinction between future links and missing links. In order to predict the future links, the networks are regarded as dynamic systems in this paper, and a similarity updating method, the spatial-temporal position drift model, is developed to simulate the evolutionary dynamics of node similarity. Then the updated similarities are used as input information for the future links' likelihood estimation. Extensive experiments on real-world networks suggest that the proposed similarity index performs better than baseline methods and the position drift model performs well for evolution prediction in real-world evolving networks.
In Search of Speedier Searches.
ERIC Educational Resources Information Center
Peterson, Ivars
1984-01-01
Methods to make computer searching as simple and efficient as possible have led to the development of various data structures. Data structures specify the items involved in searching and what can be done to them. The nature and advantages of using "self-adjusting" data structures (self-adjusting binary search trees) are discussed. (JN)
Fast correspondences search in anatomical trees
NASA Astrophysics Data System (ADS)
dos Santos, Thiago R.; Gergel, Ingmar; Meinzer, Hans-Peter; Maier-Hein, Lena
2010-03-01
Registration of multiple medical images commonly comprises the steps of feature extraction, correspondence search and transformation computation. In this paper, we present a new method for a fast and pose-independent search of correspondences using as features anatomical trees such as the bronchial system in the lungs or the vessel system in the liver. Our approach scores the similarities between the trees' nodes (bifurcations), taking into account both topological properties extracted from their graph representations and anatomical properties extracted from the trees themselves. The node assignment maximizes the global similarity (sum of the scores of each pair of assigned nodes), assuring that the matches are distributed throughout the trees. Furthermore, the proposed method is able to deal with distortions in the data, such as noise, motion, artifacts, and problems associated with the extraction method, such as missing or false branches. According to an evaluation on swine lung data sets, the method requires less than one second on average to compute the matching and yields a high rate of correct matches compared to state of the art work.
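The idea of a node assignment that maximizes the summed pairwise scores can be illustrated with a small sketch. The abstract does not say how the optimization is carried out, so the exhaustive search below is an assumption chosen for clarity and is only practical for a handful of nodes; the score matrix and the name `best_assignment` are hypothetical.

```python
from itertools import permutations


def best_assignment(scores):
    """scores[i][j] is the similarity between node i of tree A and node j
    of tree B (requires len(A) <= len(B)). Returns (total, pairs) for the
    one-to-one assignment with the largest summed score, found by brute force."""
    n_a = len(scores)
    n_b = len(scores[0]) if scores else 0
    best_total, best_perm = float("-inf"), ()
    # Try every way of giving each A-node a distinct B-node.
    for perm in permutations(range(n_b), n_a):
        total = sum(scores[i][j] for i, j in enumerate(perm))
        if total > best_total:
            best_total, best_perm = total, perm
    return best_total, list(enumerate(best_perm))


if __name__ == "__main__":
    # Made-up scores: taking the single best pair first (0.9) would force
    # the poor pair (0.1); the global optimum avoids that.
    scores = [[0.9, 0.8],
              [0.85, 0.1]]
    print(best_assignment(scores))
```

Maximizing the global sum rather than picking matches one at a time is what keeps a single strong local score from forcing poor matches elsewhere.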
Dehaene, S
1989-07-01
Treisman and Gelade's (1980) feature-integration theory of attention states that a scene must be serially scanned before the objects in it can be accurately perceived. Is serial scanning compatible with the speed observed in the perception of real-world scenes? Most real scenes consist of many more dimensions (color, size, shape, depth, etc.) than those generally found in search paradigms. Furthermore, real objects differ from each other along many of these dimensions. The present experiment assessed the influence of the total number of dimensions and target/distractor discriminability (the number of dimensions that suffice to separate a target from distractors) on search times for a conjunction of features. Search was always found to be serial. However, for the most discriminable targets, search rate was so fast that search times were in the same range as pop-out detection times. Apparently, greater discriminability enables subjects to direct attention at a faster rate and at only a fraction of the items in a scene.
Accurate modeling and evaluation of microstructures in complex materials
NASA Astrophysics Data System (ADS)
Tahmasebi, Pejman
2018-02-01
Accurate characterization of heterogeneous materials is of great importance for different fields of science and engineering. Such a goal can be achieved through imaging. Acquiring three- or two-dimensional images under different conditions is not, however, always plausible. On the other hand, accurate characterization of complex and multiphase materials requires various digital images (I) under different conditions. An ensemble method is presented that can take one single (or a set of) I(s) and stochastically produce several similar models of the given disordered material. The method is based on a successive calculating of a conditional probability by which the initial stochastic models are produced. Then, a graph formulation is utilized for removing unrealistic structures. A distance transform function for the Is with highly connected microstructure and long-range features is considered which results in a new I that is more informative. Reproduction of the I is also considered through a histogram matching approach in an iterative framework. Such an iterative algorithm avoids reproduction of unrealistic structures. Furthermore, a multiscale approach, based on pyramid representation of the large Is, is presented that can produce materials with millions of pixels in a matter of seconds. Finally, the nonstationary systems—those for which the distribution of data varies spatially—are studied using two different methods. The method is tested on several complex and large examples of microstructures. The produced results are all in excellent agreement with the utilized Is and the similarities are quantified using various correlation functions.
Full-Text Searching on Major Supermarket Systems: Dialog, Data-Star, and Nexis.
ERIC Educational Resources Information Center
Tenopir, Carol; Berglund, Sharon
1993-01-01
Examines the similarities, differences, and full-text features of the three most-used online systems for full-text searching in general libraries: DIALOG, Data-Star, and NEXIS. Overlapping databases, unique sources, search features, proximity operators, set building, language enhancement and word equivalencies, and display features are discussed.…
Structural texture similarity metrics for image analysis and retrieval.
Zujovic, Jana; Pappas, Thrasyvoulos N; Neuhoff, David L
2013-07-01
We develop new metrics for texture similarity that account for human visual perception and the stochastic nature of textures. The metrics rely entirely on local image statistics and allow substantial point-by-point deviations between textures that according to human judgment are essentially identical. The proposed metrics extend the ideas of structural similarity and are guided by research in texture analysis-synthesis. They are implemented using a steerable filter decomposition and incorporate a concise set of subband statistics, computed globally or in sliding windows. We conduct systematic tests to investigate metric performance in the context of "known-item search," the retrieval of textures that are "identical" to the query texture. This eliminates the need for cumbersome subjective tests, thus enabling comparisons with human performance on a large database. Our experimental results indicate that the proposed metrics outperform peak signal-to-noise ratio (PSNR), structural similarity metric (SSIM) and its variations, as well as state-of-the-art texture classification metrics, using standard statistical measures.
Study of the similarity function in Indexing-First-One hashing
NASA Astrophysics Data System (ADS)
Lai, Y.-L.; Jin, Z.; Goi, B.-M.; Chai, T.-Y.
2017-06-01
The recently proposed Indexing-First-One (IFO) hashing is a technique adopted in particular for eye iris template protection, i.e. IrisCode. However, the Jaccard Similarity (JS) measure that IFO employs, which originates from Min-hashing, has yet to be adequately discussed. In this paper, we explore the nature of JS in the binary domain and further propose a mathematical formulation to generalize the usage of JS, which is subsequently verified using the CASIA v3-Interval iris database. Our study reveals that the JS applied in IFO hashing is a generalized version for measuring two input objects with respect to Min-Hashing, where the coefficient of JS is equal to one. With this understanding, IFO hashing can propagate the useful properties of Min-hashing, i.e. similarity preservation, and is thus favorable for similarity searching or recognition in binary space.
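The relationship between Jaccard similarity on binary vectors and Min-hashing can be sketched as follows. This is the textbook Min-hash construction, not the IFO scheme itself: under a random permutation of positions, the probability that two vectors have their first 1 at the same place equals their Jaccard similarity. The helper names and the example bit vectors are hypothetical.

```python
import random


def jaccard(a, b):
    """Jaccard similarity of two equal-length binary vectors:
    |positions set in both| / |positions set in either|."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 1.0  # convention for two all-zero vectors


def first_one_rank(bits, perm):
    """Rank (within the permuted order) of the first position holding a 1."""
    for rank, pos in enumerate(perm):
        if bits[pos]:
            return rank
    return len(perm)  # no 1 present


def minhash_estimate(a, b, n_perm=2000, seed=0):
    """Estimate the Jaccard similarity as the fraction of random
    permutations under which both vectors share the same first 1."""
    rng = random.Random(seed)
    positions = list(range(len(a)))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(positions)
        if first_one_rank(a, positions) == first_one_rank(b, positions):
            hits += 1
    return hits / n_perm


if __name__ == "__main__":
    a = [1, 1, 0, 0, 1, 0, 1, 0]
    b = [1, 0, 0, 1, 1, 0, 0, 0]
    print("exact:", jaccard(a, b))
    print("min-hash estimate:", minhash_estimate(a, b))
```

The estimate converges to the exact value as the number of permutations grows, which is the similarity-preservation property the abstract refers to.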
Training eye movements for visual search in individuals with macular degeneration
Janssen, Christian P.; Verghese, Preeti
2016-01-01
We report a method to train individuals with central field loss due to macular degeneration to improve the efficiency of visual search. Our method requires participants to make a same/different judgment on two simple silhouettes. One silhouette is presented in an area that falls within the binocular scotoma while they are fixating the center of the screen with their preferred retinal locus (PRL); the other silhouette is presented diametrically opposite within the intact visual field. Over the course of 480 trials (approximately 6 hr), we gradually reduced the amount of time that participants have to make a saccade and judge the similarity of stimuli. This requires that they direct their PRL first toward the stimulus that is initially hidden behind the scotoma. Results from nine participants show that all participants could complete the task faster with training without sacrificing accuracy on the same/different judgment task. Although a majority of participants were able to direct their PRL toward the initially hidden stimulus, the ability to do so varied between participants. Specifically, six of nine participants made faster saccades with training. A smaller set (four of nine) made accurate saccades inside or close to the target area and retained this strategy 2 to 3 months after training. Subjective reports suggest that training increased awareness of the scotoma location for some individuals. However, training did not transfer to a different visual search task. Nevertheless, our study suggests that increasing scotoma awareness and training participants to look toward their scotoma may help them acquire missing information. PMID:28027382
Exploring the Logic of Mobile Search
ERIC Educational Resources Information Center
Westlund, Oscar; Gomez-Barroso, Jose-Luis; Compano, Ramon; Feijoo, Claudio
2011-01-01
After more than a decade of development work and hopes, the usage of mobile Internet has finally taken off. Now, we are witnessing the first signs of evidence of what might become the explosion of mobile content and applications that will be shaping the (mobile) Internet of the future. Similar to the wired Internet, search will become very…
Grebner, Christoph; Becker, Johannes; Weber, Daniel; Bellinger, Daniel; Tafipolski, Maxim; Brückner, Charlotte; Engels, Bernd
2014-09-15
The presented program package, Conformational Analysis and Search Tool (CAST) allows the accurate treatment of large and flexible (macro) molecular systems. For the determination of thermally accessible minima CAST offers the newly developed TabuSearch algorithm, but algorithms such as Monte Carlo (MC), MC with minimization, and molecular dynamics are implemented as well. For the determination of reaction paths, CAST provides the PathOpt, the Nudge Elastic band, and the umbrella sampling approach. Access to free energies is possible through the free energy perturbation approach. Along with a number of standard force fields, a newly developed symmetry-adapted perturbation theory-based force field is included. Semiempirical computations are possible through DFTB+ and MOPAC interfaces. For calculations based on density functional theory, a Message Passing Interface (MPI) interface to the Graphics Processing Unit (GPU)-accelerated TeraChem program is available. The program is available on request. Copyright © 2014 Wiley Periodicals, Inc.
Alvarez, George A.; Nakayama, Ken; Konkle, Talia
2016-01-01
Visual search is a ubiquitous visual behavior, and efficient search is essential for survival. Different cognitive models have explained the speed and accuracy of search based either on the dynamics of attention or on similarity of item representations. Here, we examined the extent to which performance on a visual search task can be predicted from the stable representational architecture of the visual system, independent of attentional dynamics. Participants performed a visual search task with 28 conditions reflecting different pairs of categories (e.g., searching for a face among cars, body among hammers, etc.). The time it took participants to find the target item varied as a function of category combination. In a separate group of participants, we measured the neural responses to these object categories when items were presented in isolation. Using representational similarity analysis, we then examined whether the similarity of neural responses across different subdivisions of the visual system had the requisite structure needed to predict visual search performance. Overall, we found strong brain/behavior correlations across most of the higher-level visual system, including both the ventral and dorsal pathways when considering both macroscale sectors as well as smaller mesoscale regions. These results suggest that visual search for real-world object categories is well predicted by the stable, task-independent architecture of the visual system. NEW & NOTEWORTHY Here, we ask which neural regions have neural response patterns that correlate with behavioral performance in a visual processing task. We found that the representational structure across all of high-level visual cortex has the requisite structure to predict behavior. Furthermore, when directly comparing different neural regions, we found that they all had highly similar category-level representational structures. These results point to a ubiquitous and uniform representational structure in high
Similarity law and critical properties in ionic systems.
NASA Astrophysics Data System (ADS)
Desgranges, Caroline; Delhommelle, Jerome
2017-11-01
Using molecular simulations, we determine the locus of ideal compressibility, or Zeno line, for a series of ionic compounds. We find that the shape of this thermodynamic contour follows a linear law, leading to the determination of the Boyle parameters. We also show that a similarity law, based on the Boyle parameters, yields accurate critical data when compared to the experiment. Furthermore, we show that the Boyle density scales linearly with the size-asymmetry, providing a direct route to establish a correspondence between the thermodynamic properties of different ionic compounds.
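For reference, the linear law mentioned above is commonly written as shown below. The abstract does not state the equations, so the notation here is an assumption: Z is the compressibility factor, and T_B and ρ_B denote the Boyle temperature and Boyle density obtained as the intercepts of the line.

```latex
% Zeno line: the contour along which the compressibility factor equals one
Z \equiv \frac{P}{\rho k_B T} = 1

% Empirical linear form of that contour in the temperature-density plane
\frac{T}{T_B} + \frac{\rho}{\rho_B} = 1
```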
Comparison of PubMed and Google Scholar literature searches.
Anders, Michael E; Evans, Dennis P
2010-05-01
Literature searches are essential to evidence-based respiratory care. To conduct literature searches, respiratory therapists rely on search engines to retrieve information, but there is a dearth of literature on the comparative efficiencies of search engines for researching clinical questions in respiratory care. To compare PubMed and Google Scholar search results for clinical topics in respiratory care to that of a benchmark. We performed literature searches with PubMed and Google Scholar, on 3 clinical topics. In PubMed we used the Clinical Queries search filter. In Google Scholar we used the search filters in the Advanced Scholar Search option. We used the reference list of a related Cochrane Collaboration evidence-based systematic review as the benchmark for each of the search results. We calculated recall (sensitivity) and precision (positive predictive value) with 2 x 2 contingency tables. We compared the results with the chi-square test of independence and Fisher's exact test. PubMed and Google Scholar had similar recall for both overall search results (71% vs 69%) and full-text results (43% vs 51%). PubMed had better precision than Google Scholar for both overall search results (13% vs 0.07%, P < .001) and full-text results (8% vs 0.05%, P < .001). Our results suggest that PubMed searches with the Clinical Queries filter are more precise than with the Advanced Scholar Search in Google Scholar for respiratory care topics. PubMed appears to be more practical to conduct efficient, valid searches for informing evidence-based patient-care protocols, for guiding the care of individual patients, and for educational purposes.
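The recall and precision calculation described above can be sketched in a few lines, assuming each search result and the benchmark reference list are represented as sets of article identifiers. The function name and the identifiers in the example are hypothetical.

```python
def recall_precision(retrieved, benchmark):
    """Recall (sensitivity) = relevant items retrieved / all relevant items.
    Precision (positive predictive value) = relevant items retrieved / all items retrieved."""
    retrieved, benchmark = set(retrieved), set(benchmark)
    true_positives = len(retrieved & benchmark)
    recall = true_positives / len(benchmark) if benchmark else 0.0
    precision = true_positives / len(retrieved) if retrieved else 0.0
    return recall, precision


if __name__ == "__main__":
    # Made-up identifiers: 2 of the 3 benchmark articles were found,
    # among 4 retrieved results in total.
    found = {"a", "b", "c", "d"}
    reference_list = {"a", "b", "e"}
    print(recall_precision(found, reference_list))
```

A search that returns very many results can match another engine on recall while scoring far lower on precision, which is the pattern reported for Google Scholar here.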
Inferring gene ontologies from pairwise similarity data
Kramer, Michael; Dutkowski, Janusz; Yu, Michael; Bafna, Vineet; Ideker, Trey
2014-01-01
Motivation: While the manually curated Gene Ontology (GO) is widely used, inferring a GO directly from -omics data is a compelling new problem. Recognizing that ontologies are a directed acyclic graph (DAG) of terms and hierarchical relations, algorithms are needed that: analyze a full matrix of gene–gene pairwise similarities from -omics data; infer true hierarchical structure in these data rather than enforcing hierarchy as a computational artifact; and respect biological pleiotropy, by which a term in the hierarchy can relate to multiple higher level terms. Methods addressing these requirements are just beginning to emerge; none has been evaluated for GO inference. Methods: We consider two algorithms [Clique Extracted Ontology (CliXO), LocalFitness] that uniquely satisfy these requirements, compared with methods including standard clustering. CliXO is a new approach that finds maximal cliques in a network induced by progressive thresholding of a similarity matrix. We evaluate each method’s ability to reconstruct the GO biological process ontology from a similarity matrix based on (a) semantic similarities for GO itself or (b) three -omics datasets for yeast. Results: For task (a) using semantic similarity, CliXO accurately reconstructs GO (>99% precision, recall) and outperforms other approaches (<20% precision, <20% recall). For task (b) using -omics data, CliXO outperforms other methods using two -omics datasets and achieves ∼30% precision and recall using YeastNet v3, similar to an earlier approach (Network Extracted Ontology) and better than LocalFitness or standard clustering (20–25% precision, recall). Conclusion: This study provides algorithmic foundation for building gene ontologies by capturing hierarchical and pleiotropic structure embedded in biomolecular data. Contact: tideker@ucsd.edu PMID:24932003
A topic clustering approach to finding similar questions from large question and answer archives.
Zhang, Wei-Nan; Liu, Ting; Yang, Yang; Cao, Liujuan; Zhang, Yu; Ji, Rongrong
2014-01-01
With the blooming of Web 2.0, Community Question Answering (CQA) services such as Yahoo! Answers (http://answers.yahoo.com), WikiAnswer (http://wiki.answers.com), and Baidu Zhidao (http://zhidao.baidu.com), etc., have emerged as alternatives for knowledge and information acquisition. Over time, a large number of question and answer (Q&A) pairs with high quality devoted by human intelligence have been accumulated as a comprehensive knowledge base. Unlike the search engines, which return long lists of results, searching in the CQA services can obtain the correct answers to the question queries by automatically finding similar questions that have already been answered by other users. Hence, it greatly improves the efficiency of the online information retrieval. However, given a question query, finding the similar and well-answered questions is a non-trivial task. The main challenge is the word mismatch between question query (query) and candidate question for retrieval (question). To investigate this problem, in this study, we capture the word semantic similarity between query and question by introducing the topic modeling approach. We then propose an unsupervised machine-learning approach to finding similar questions on CQA Q&A archives. The experimental results show that our proposed approach significantly outperforms the state-of-the-art methods.
Real-time earthquake monitoring using a search engine method
Zhang, Jie; Zhang, Haijiang; Chen, Enhong; Zheng, Yi; Kuang, Wenhuan; Zhang, Xiong
2014-01-01
When an earthquake occurs, seismologists want to use recorded seismograms to infer its location, magnitude and source-focal mechanism as quickly as possible. If such information could be determined immediately, timely evacuations and emergency actions could be undertaken to mitigate earthquake damage. Current advanced methods can report the initial location and magnitude of an earthquake within a few seconds, but estimating the source-focal mechanism may require minutes to hours. Here we present an earthquake search engine, similar to a web search engine, that we developed by applying a computer fast search method to a large seismogram database to find waveforms that best fit the input data. Our method is several thousand times faster than an exact search. For an Mw 5.9 earthquake on 8 March 2012 in Xinjiang, China, the search engine can infer the earthquake’s parameters in <1 s after receiving the long-period surface wave data. PMID:25472861
Simrank: Rapid and sensitive general-purpose k-mer search tool
2011-01-01
Background: Terabyte-scale collections of string-encoded data are expected from consortia efforts such as the Human Microbiome Project http://nihroadmap.nih.gov/hmp. Intra- and inter-project data similarity searches are enabled by rapid k-mer matching strategies. Software applications for sequence database partitioning, guide tree estimation, molecular classification and alignment acceleration have benefited from embedded k-mer searches as sub-routines. However, a rapid, general-purpose, open-source, flexible, stand-alone k-mer tool has not been available. Results: Here we present a stand-alone utility, Simrank, which allows users to rapidly identify the database strings most similar to query strings. Performance testing of Simrank and related tools against DNA, RNA, protein and human-language strings found Simrank 10X to 928X faster depending on the dataset. Conclusions: Simrank provides molecular ecologists with a high-throughput, open source choice for comparing large sequence sets to find similarity. PMID:21524302
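As a rough illustration of k-mer matching in general (not Simrank's actual implementation or scoring, which the abstract does not specify), the following Python sketch ranks database strings by the fraction of the query's distinct k-mers that they share. The function names and the default k = 7 are assumptions made for this example.

```python
def kmers(seq, k):
    """Return the set of distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_similarity(query, target, k=7):
    """Fraction of the query's distinct k-mers that also occur in target."""
    q = kmers(query, k)
    if not q:
        # Query shorter than k: no k-mers to compare.
        return 0.0
    return len(q & kmers(target, k)) / len(q)


def rank_database(query, database, k=7):
    """Sort (name, score) pairs by shared k-mer fraction, best first."""
    scores = [(name, kmer_similarity(query, seq, k))
              for name, seq in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

A call such as rank_database("ACGTACGT", {"a": "ACGTACGTAA", "b": "TTTTTTTTTT"}, k=3) would place "a" first, since it contains every 3-mer of the query. A production tool would index the database k-mers once rather than recomputing them for each query.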
Visuospatial working memory mediates inhibitory and facilitatory guidance in preview search.
Barrett, Doug J K; Shimozaki, Steven S; Jensen, Silke; Zobay, Oliver
2016-10-01
Visual search is faster and more accurate when a subset of distractors is presented before the display containing the target. This "preview benefit" has been attributed to separate inhibitory and facilitatory guidance mechanisms during search. In the preview task the temporal cues thought to elicit inhibition and facilitation provide complementary sources of information about the likely location of the target. In this study, we use a Bayesian observer model to compare sensitivity when the temporal cues eliciting inhibition and facilitation produce complementary, and competing, sources of information. Observers searched for T-shaped targets among L-shaped distractors in 2 standard and 2 preview conditions. In the standard conditions, all the objects in the display appeared at the same time. In the preview conditions, the initial subset of distractors either stayed on the screen or disappeared before the onset of the search display, which contained the target when present. In the latter, the synchronous onset of old and new objects negates the predictive utility of stimulus-driven capture during search. The results indicate observers combine memory-driven inhibition and sensory-driven capture to reduce spatial uncertainty about the target's likely location during search. In the absence of spatially predictive onsets, memory-driven inhibition at old locations persists despite irrelevant sensory change at previewed locations. This result is consistent with a bias toward unattended objects during search via the active suppression of irrelevant capture at previously attended locations. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Interrupted Visual Searches Reveal Volatile Search Memory
ERIC Educational Resources Information Center
Shen, Y. Jeremy; Jiang, Yuhong V.
2006-01-01
This study investigated memory from interrupted visual searches. Participants conducted a change detection search task on polygons overlaid on scenes. Search was interrupted by various disruptions, including unfilled delay, passive viewing of other scenes, and additional search on new displays. Results showed that performance was unaffected by…
Accurate Classification of RNA Structures Using Topological Fingerprints
Li, Kejie; Gribskov, Michael
2016-01-01
While RNAs are well known to possess complex structures, functionally similar RNAs often have little sequence similarity. While the exact size and spacing of base-paired regions vary, functionally similar RNAs have pronounced similarity in the arrangement, or topology, of base-paired stems. Furthermore, predicted RNA structures often lack pseudoknots (a crucial aspect of biological activity), and are only partially correct, or incomplete. A topological approach addresses all of these difficulties. In this work we describe each RNA structure as a graph that can be converted to a topological spectrum (RNA fingerprint). The set of subgraphs in an RNA structure, its RNA fingerprint, can be compared with the fingerprints of other RNA structures to identify and correctly classify functionally related RNAs. Topologically similar RNAs can be identified even when a large fraction, up to 30%, of the stems are omitted, indicating that highly accurate structures are not necessary. We investigate the performance of the RNA fingerprint approach on a set of eight highly curated RNA families, with diverse sizes and functions, containing pseudoknots, and with little sequence similarity–an especially difficult test set. In spite of the difficult test set, the RNA fingerprint approach is very successful (ROC AUC > 0.95). Due to the inclusion of pseudoknots, the RNA fingerprint approach both covers a wider range of possible structures than methods based only on secondary structure, and its tolerance for incomplete structures suggests that it can be applied even to predicted structures. Source code is freely available at https://github.rcac.purdue.edu/mgribsko/XIOS_RNA_fingerprint. PMID:27755571
Searching for confining hidden valleys at LHCb, ATLAS, and CMS
NASA Astrophysics Data System (ADS)
Pierce, Aaron; Shakya, Bibhushan; Tsai, Yuhsin; Zhao, Yue
2018-05-01
We explore strategies for probing hidden valley scenarios exhibiting confinement. Such scenarios lead to a moderate multiplicity of light hidden hadrons for generic showering and hadronization similar to QCD. Their decays are typically soft and displaced, making them challenging to probe with traditional LHC searches. We show that the low trigger requirements and excellent track and vertex reconstruction at LHCb provide a favorable environment to search for such signals. We propose novel search strategies in both muonic and hadronic channels. We also study existing ATLAS and CMS searches and compare them with our proposals at LHCb. We find that the reach at LHCb is generically better in the parameter space we consider here, even with optimistic background estimations for ATLAS and CMS searches. We discuss potential modifications at ATLAS and CMS that might make these experiments competitive with the LHCb reach. Our proposed searches can be applied to general hidden valley models as well as exotic Higgs boson decays, such as in twin Higgs models.
Boyer, C; Baujard, V; Scherrer, J R
2001-01-01
Any new user of the Internet will think that retrieving a relevant document is an easy task, especially with the wealth of sources available on this medium, but this is not the case. Even experienced users have difficulty formulating the right query to make the most of a search tool and efficiently obtain an accurate result. The goal of this work is to reduce the time and energy needed to search for and locate medical and health information. To reach this goal we have developed HONselect [1]. The aim of HONselect is not only to improve efficiency in retrieving documents but also to respond to an increased need for a selection of relevant and accurate documents drawn from a range of knowledge databases, including scientific bibliographical references, clinical trials, daily news, multimedia illustrations, conferences, forums, Web sites, clinical cases, and others. The authors based their approach on knowledge representation using the National Library of Medicine's Medical Subject Headings (NLM, MeSH) vocabulary and classification [2,3]. The innovation is a multilingual "one-stop searching" service (one Web interface to databases currently in English, French and German) with full navigational and connectivity capabilities. From a given selection of related terms, the user may choose the one that best suits the search, navigate in the term's hierarchical tree, and directly access a selection of documents from high quality knowledge suppliers such as the MEDLINE database, the NLM's ClinicalTrials.gov server, the NewsPage's daily news, the HON's media gallery, conference listings and MedHunt's Web sites [4, 5, 6, 7, 8, 9]. HONselect, developed by HON, a non-profit organisation [10], is a free, multilingual online tool based on the MeSH thesaurus to index, select, retrieve and display accurate, up-to-date, high-level and quality documents.
Representational flexibility and response control in a multistep multilocation search task.
Zelazo, P D; Reznick, J S; Spinazzola, J
1998-03-01
Three experiments were conducted to explore the determinants of 2-year-olds' perseverative errors in a search task. In Experiment 1, children either retrieved an object during a preswitch phase or merely observed a hiding event. Active search produced perseveration on postswitch trials, but mere observation did not. In Experiment 2, similar results were found, even when active search occurred in the absence of observation. Finally, in Experiment 3, children observed a hiding event at 1 location on some pretest trials and simply retrieved an object at a different location on other trials. On test trials, in which an object was hidden at a 3rd location, children tended to search where they had searched previously. Together, the results indicate that active search is required to elicit perseveration, which points to failures of response control rather than representational inflexibility.
PubMed related articles: a probabilistic topic-based model for content similarity
Lin, Jimmy; Wilbur, W John
2007-01-01
Background: We present a probabilistic topic-based model for content similarity called pmra that underlies the related article search feature in PubMed. Whether or not a document is about a particular topic is computed from term frequencies, modeled as Poisson distributions. Unlike previous probabilistic retrieval models, we do not attempt to estimate relevance; rather, our focus is "relatedness", the probability that a user would want to examine a particular document given known interest in another. We also describe a novel technique for estimating parameters that does not require human relevance judgments; instead, the process is based on the existence of MeSH® in MEDLINE®. Results: The pmra retrieval model was compared against bm25, a competitive probabilistic model that shares theoretical similarities. Experiments using the test collection from the TREC 2005 genomics track show a small but statistically significant improvement of pmra over bm25 in terms of precision. Conclusion: Our experiments suggest that the pmra model provides an effective ranking algorithm for related article search. PMID:17971238
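For readers unfamiliar with the bm25 baseline named above, the following Python sketch scores tokenized documents with the commonly used Okapi BM25 formula. It is not the pmra model, whose parameters the abstract does not give. The defaults k1 = 1.2 and b = 0.75 and the "+1" smoothed idf variant are assumptions; the exact configuration used in the study is not stated.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in docs against query_terms (Okapi BM25).

    Assumes docs is a non-empty list of non-empty token lists.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed idf that stays non-negative (assumed variant).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1.0 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores
```

Documents that do not contain any query term score zero, and for documents of equal length a higher term frequency gives a higher score, with diminishing returns controlled by k1.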
The Perth Automated Supernova Search
NASA Astrophysics Data System (ADS)
Williams, A. J.
1997-12-01
An automated search for supernovae in late spiral galaxies has been established at Perth Observatory, Western Australia. This automated search uses three low-cost PC-clone computers, a liquid nitrogen cooled CCD camera built locally, and a 61-cm telescope automated for the search. The images are all analysed automatically in real-time by routines in Perth Vista, the image processing system ported to the PC architecture for the search system. The telescope control software written for the project, Teljoy, maintains open-loop position accuracy better than 30" of arc after hundreds of jumps over an entire night. Total capital cost to establish and run this supernova search over the seven years of development and operation was around US$30,000. To date, the system has discovered a total of 6 confirmed supernovae, made an independent detection of a seventh, and detected one unconfirmed event assumed to be a supernova. The various software and hardware components of the search system are described in detail, the analysis of the first three years of data is discussed, and results presented. We find a Type Ib/c rate of 0.43 +/- 0.43 SNu, and a Type IIP rate of 0.86 +/- 0.49 SNu, where SNu are 'supernova units', expressed in supernovae per 10^10 solar blue luminosity galaxy per century. These values are for a Hubble constant of 75 km/s per Mpc, and scale as (H0/75)^2. The small number of discoveries has left large statistical uncertainties, but our strategy of frequent observations has reduced systematic errors - altering detection threshold or peak supernova luminosity by +/- 0.5 mag changes estimated rates by only around 20%. Similarly, adoption of different light curve templates for Type Ia and Type IIP supernovae has a minimal effect on the final statistics (2% and 4% change, respectively).
A Full-Text-Based Search Engine for Finding Highly Matched Documents Across Multiple Categories
NASA Technical Reports Server (NTRS)
Nguyen, Hung D.; Steele, Gynelle C.
2016-01-01
This report demonstrates the full-text-based search engine that works on any Web-based mobile application. The engine has the capability to search databases across multiple categories based on a user's queries and identify the most relevant or similar. The search results presented here were found using an Android (Google Co.) mobile device; however, it is also compatible with other mobile phones.
Large-scale Cross-modality Search via Collective Matrix Factorization Hashing.
Ding, Guiguang; Guo, Yuchen; Zhou, Jile; Gao, Yue
2016-09-08
By transforming data into a binary representation, i.e., Hashing, we can perform high-speed search with low storage cost, and thus Hashing has attracted increasing research interest in recent years. How to generate Hashcodes for multimodal data (e.g., images with textual tags, documents with photos, etc.) for large-scale cross-modality search (e.g., searching semantically related images in a database for a document query) has recently become an important research issue because of the fast growth of multimodal data on the Web. To address this issue, a novel framework for multimodal Hashing is proposed, termed Collective Matrix Factorization Hashing (CMFH). The key idea of CMFH is to learn unified Hashcodes for different modalities of one multimodal instance in a shared latent semantic space in which different modalities can be effectively connected. Therefore, accurate cross-modality search is supported. Based on the general framework, we extend it to the unsupervised scenario, where it tries to preserve the Euclidean structure, and to the supervised scenario, where it fully exploits the label information of the data. The corresponding theoretical analysis and the optimization algorithms are given. We conducted comprehensive experiments on three benchmark datasets for cross-modality search. The experimental results demonstrate that CMFH can significantly outperform several state-of-the-art cross-modality Hashing methods, which validates the effectiveness of the proposed CMFH.
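The search step that binary codes make cheap can be sketched as follows. This Python example only illustrates ranking by Hamming distance once codes exist; it does not implement CMFH or any code-learning step. Storing each code as a Python integer and the function names are assumptions for illustration.

```python
def hamming(a, b):
    """Number of differing bits between two hash codes stored as ints."""
    return bin(a ^ b).count("1")


def hamming_search(query_code, database_codes, top_k=3):
    """Return indices of the top_k database codes closest to query_code."""
    order = sorted(range(len(database_codes)),
                   key=lambda i: hamming(query_code, database_codes[i]))
    return order[:top_k]
```

In a cross-modality setting, the query code would come from one modality (for example a document) and the database codes from another (for example images), which is only meaningful if both were mapped into the same code space.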
Race, self-selection, and the job search process.
Pager, Devah; Pedulla, David S
2015-01-01
While existing research has documented persistent barriers facing African-American job seekers, far less research has questioned how job seekers respond to this reality. Do minorities self-select into particular segments of the labor market to avoid discrimination? Such questions have remained unanswered due to the lack of data available on the positions to which job seekers apply. Drawing on two original data sets with application-specific information, we find little evidence that blacks target or avoid particular job types. Rather, blacks cast a wider net in their search than similarly situated whites, including a greater range of occupational categories and characteristics in their pool of job applications. Additionally, we show that perceptions of discrimination are associated with increased search breadth, suggesting that broad search among African-Americans represents an adaptation to labor market discrimination. Together these findings provide novel evidence on the role of race and self-selection in the job search process.
Mapping the Color Space of Saccadic Selectivity in Visual Search
ERIC Educational Resources Information Center
Xu, Yun; Higgins, Emily C.; Xiao, Mei; Pomplun, Marc
2007-01-01
Color coding is used to guide attention in computer displays for such critical tasks as baggage screening or air traffic control. It has been shown that a display object attracts more attention if its color is more similar to the color for which one is searching. However, what does "similar" precisely mean? Can we predict the amount of attention…
Accurate upwind methods for the Euler equations
NASA Technical Reports Server (NTRS)
Huynh, Hung T.
1993-01-01
A new class of piecewise linear methods for the numerical solution of the one-dimensional Euler equations of gas dynamics is presented. These methods are uniformly second-order accurate, and can be considered as extensions of Godunov's scheme. With an appropriate definition of monotonicity preservation for the case of linear convection, it can be shown that they preserve monotonicity. Similar to Van Leer's MUSCL scheme, they consist of two key steps: a reconstruction step followed by an upwind step. For the reconstruction step, a monotonicity constraint that preserves uniform second-order accuracy is introduced. Computational efficiency is enhanced by devising a criterion that detects the 'smooth' part of the data where the constraint is redundant. The concept and coding of the constraint are simplified by the use of the median function. A slope steepening technique, which has no effect at smooth regions and can resolve a contact discontinuity in four cells, is described. As for the upwind step, existing and new methods are applied in a manner slightly different from those in the literature. These methods are derived by approximating the Euler equations via linearization and diagonalization. At a 'smooth' interface, Harten, Lax, and Van Leer's one intermediate state model is employed. A modification for this model that can resolve contact discontinuities is presented. Near a discontinuity, either this modified model or a more accurate one, namely, Roe's flux-difference splitting, is used. The current presentation of Roe's method, via the conceptually simple flux-vector splitting, not only establishes a connection between the two splittings, but also leads to an admissibility correction with no conditional statement, and an efficient approximation to Osher's approximate Riemann solver. These reconstruction and upwind steps result in schemes that are uniformly second-order accurate and economical at smooth regions, and yield high resolution at discontinuities.
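To show how a median function can simplify the coding of a slope constraint, the Python sketch below writes the familiar minmod limiter as a median of three values. This is a generic minmod-limited reconstruction given for illustration; it is not claimed to be the specific monotonicity constraint of the report, which the abstract does not spell out. All function names are assumptions.

```python
def median(a, b, c):
    """Middle value of three numbers, computed without branching on order."""
    return max(min(a, b), min(max(a, b), c))


def minmod(x, y):
    """Smaller-magnitude argument if signs agree, else 0; equals median(x, y, 0)."""
    return median(x, y, 0.0)


def limited_slopes(u):
    """Minmod-limited slopes for the interior cells of a list of cell averages."""
    return [minmod(u[i] - u[i - 1], u[i + 1] - u[i])
            for i in range(1, len(u) - 1)]
```

For the cell averages [0.0, 1.0, 3.0, 2.0], the first interior cell keeps the smaller one-sided difference, while the second sits at a local maximum and gets a zero slope, so no new extremum is created.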
Intelligent navigation and accurate positioning of an assist robot in indoor environments
NASA Astrophysics Data System (ADS)
Hua, Bin; Rama, Endri; Capi, Genci; Jindai, Mitsuru; Tsuri, Yosuke
2017-12-01
Intact robot navigation and accurate positioning in indoor environments are still challenging tasks, especially in robot applications that assist disabled and/or elderly people in museum or art gallery environments. In this paper, we present a human-like navigation method, in which neural networks control the wheelchair robot to reach the goal location safely, by imitating the supervisor's motions, and to position itself in the intended location. In a museum-like environment, the mobile robot starts navigation from various positions, and uses a low-cost camera to track the target picture and a laser range finder to navigate safely. Results show that the neural controller with the Conjugate Gradient Backpropagation training algorithm gives a robust response to guide the mobile robot accurately to the goal position.
Routine development of objectively derived search strategies.
Hausner, Elke; Waffenschmidt, Siw; Kaiser, Thomas; Simon, Michael
2012-02-29
Over the past few years, information retrieval has become more and more professionalized, and information specialists are considered full members of a research team conducting systematic reviews. Research groups preparing systematic reviews and clinical practice guidelines have been the driving force in the development of search strategies, but open questions remain regarding the transparency of the development process and the available resources. An empirically guided approach to the development of a search strategy provides a way to increase transparency and efficiency. Our aim in this paper is to describe the empirically guided development process for search strategies as applied by the German Institute for Quality and Efficiency in Health Care (Institut für Qualität und Wirtschaftlichkeit im Gesundheitswesen, or "IQWiG"). This strategy consists of the following steps: generation of a test set, as well as the development, validation and standardized documentation of the search strategy. We illustrate our approach by means of an example, that is, a search for literature on brachytherapy in patients with prostate cancer. For this purpose, a test set was generated, including a total of 38 references from 3 systematic reviews. The development set for the generation of the strategy included 25 references. After application of textual analytic procedures, a strategy was developed that included all references in the development set. To test the search strategy on an independent set of references, the remaining 13 references in the test set (the validation set) were used. The validation set was also completely identified. Our conclusion is that an objectively derived approach similar to that used in search filter development is a feasible way to develop and validate reliable search strategies. Besides creating high-quality strategies, the widespread application of this approach will result in a substantial increase in the transparency of the development process of
Seasonal trends in sleep-disordered breathing: evidence from Internet search engine query data.
Ingram, David G; Matthews, Camilla K; Plante, David T
2015-03-01
The primary aim of the current study was to test the hypothesis that there is a seasonal component to snoring and obstructive sleep apnea (OSA) through the use of Google search engine query data. Internet search engine query data were retrieved from Google Trends from January 2006 to December 2012. Monthly normalized search volume was obtained over that 7-year period in the USA and Australia for the following search terms: "snoring" and "sleep apnea". Seasonal effects were investigated by fitting cosinor regression models. In addition, the search terms "snoring children" and "sleep apnea children" were evaluated to examine seasonal effects in pediatric populations. Statistically significant seasonal effects were found using cosinor analysis in both USA and Australia for "snoring" (p < 0.00001 for both countries). Similarly, seasonal patterns were observed for "sleep apnea" in the USA (p = 0.001); however, cosinor analysis was not significant for this search term in Australia (p = 0.13). Seasonal patterns for "snoring children" and "sleep apnea children" were observed in the USA (p = 0.002 and p < 0.00001, respectively), with insufficient search volume to examine these search terms in Australia. All searches peaked in the winter or early spring in both countries, with the magnitude of seasonal effect ranging from 5 to 50 %. Our findings indicate that there are significant seasonal trends for both snoring and sleep apnea internet search engine queries, with a peak in the winter and early spring. Further research is indicated to determine the mechanisms underlying these findings, whether they have clinical impact, and if they are associated with other comorbid medical conditions that have similar patterns of seasonal exacerbation.
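A cosinor model of the kind mentioned above can be fitted by ordinary least squares once the cosine is split into a cosine and a sine term. The Python sketch below is a minimal illustration under stated assumptions: a single 12-month period, no significance testing (the p-values reported in the study are not reproduced), and function names chosen for this example only.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_cosinor(times, values, period=12.0):
    """Least-squares fit of y = mesor + beta*cos(w t) + gamma*sin(w t)."""
    w = 2.0 * math.pi / period
    rows = [[1.0, math.cos(w * t), math.sin(w * t)] for t in times]
    # Normal equations: (X^T X) coef = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    mesor, beta, gamma = solve_linear(xtx, xty)
    amplitude = math.hypot(beta, gamma)
    # Time of the seasonal peak, in the same units as times.
    peak_time = (math.atan2(gamma, beta) / w) % period
    return mesor, amplitude, peak_time
```

Here mesor is the fitted mean level, amplitude is half the peak-to-trough seasonal swing, and peak_time locates the seasonal maximum within the period. With monthly search volumes indexed 0 to 83 (seven years), a peak_time near 0 or 1 would correspond to a winter peak in the Northern Hemisphere if month 0 is January.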
Dynamic search and working memory in social recall.
Hills, Thomas T; Pachur, Thorsten
2012-01-01
What are the mechanisms underlying search in social memory (e.g., remembering the people one knows)? Do the search mechanisms involve dynamic local-to-global transitions similar to semantic search, and are these transitions governed by the general control of attention, associated with working memory span? To find out, we asked participants to recall individuals from their personal social networks and measured each participant's working memory capacity. Additionally, participants provided social-category and contact-frequency information about the recalled individuals as well as information about the social proximity among the recalled individuals. On the basis of these data, we tested various computational models of memory search regarding their ability to account for the patterns in which participants recalled from social memory. Although recall patterns showed clustering based on social categories, models assuming dynamic transitions between representations cued by social proximity and frequency information predicted participants' recall patterns best; no additional explanatory power was gained from social-category information. Moreover, individual differences in the time between transitions were positively correlated with differences in working memory capacity. These results highlight the role of social proximity in structuring social memory and elucidate the role of working memory for maintaining search criteria during search within that structure.
Fuzzy similarity measures for ultrasound tissue characterization
NASA Astrophysics Data System (ADS)
Emara, Salem M.; Badawi, Ahmed M.; Youssef, Abou-Bakr M.
1995-03-01
Computerized ultrasound tissue characterization has become an objective means for diagnosis of diseases. It is difficult to differentiate diffuse liver diseases, namely cirrhotic and fatty liver, from a normal liver by visual inspection of ultrasound images. The visual criteria for differentiating diffuse diseases are rather confusing and highly dependent upon the sonographer's experience. The need for computerized tissue characterization is thus justified, to quantitatively assist the sonographer in accurate differentiation and to minimize the risk of erroneous interpretation. In this paper we used the fuzzy similarity measure as an approximate reasoning technique to find the maximum degree of matching between an unknown case defined by a feature vector and a family of prototypes (knowledge base). The feature vector used for the matching process contains 8 quantitative parameters (textural, acoustical, and speckle parameters) extracted from the ultrasound image. The steps taken to match an unknown case with the family of prototypes (cirr, fatty, normal) are: choosing the membership functions for each parameter; obtaining the fuzzification matrix for the unknown case and the family of prototypes; obtaining the similarity matrix by the linguistic evaluation of two fuzzy quantities; and obtaining the degree of similarity by a simple aggregation method and the fuzzy integrals. Finally, we find that the similarity measure results are comparable to those of neural network classification techniques, and that the measure can be used in medical diagnosis to determine the pathology of the liver and to monitor the extent of the disease.
Chen, Xi; Chen, Huajun; Bi, Xuan; Gu, Peiqin; Chen, Jiaoyan; Wu, Zhaohui
2014-01-01
Understanding the functional mechanisms of the complex biological system as a whole is drawing more and more attention in global health care management. Traditional Chinese Medicine (TCM), essentially different from Western Medicine (WM), is gaining increasing attention due to its emphasis on individual wellness and natural herbal medicine, which satisfies the goal of integrative medicine. However, with the explosive growth of biomedical data on the Web, biomedical researchers are now confronted with the problem of large-scale data analysis and data query. Besides that, biomedical data also has a wide coverage which usually comes from multiple heterogeneous data sources and has different taxonomies, making it hard to integrate and query the big biomedical data. Embedded with domain knowledge from different disciplines all regarding human biological systems, the heterogeneous data repositories are implicitly connected by human expert knowledge. Traditional search engines cannot provide accurate and comprehensive search results for the semantically associated knowledge since they only support keywords-based searches. In this paper, we present BioTCM-SE, a semantic search engine for the information retrieval of modern biology and TCM, which provides biologists with a comprehensive and accurate associated knowledge query platform to greatly facilitate the implicit knowledge discovery between WM and TCM. PMID:24772189
Demeter, Persephone, and the search for emergence in agent-based models.
DOE Office of Scientific and Technical Information (OSTI.GOV)
North, M. J.; Howe, T. R.; Collier, N. T.
2006-01-01
In Greek mythology, the earth goddess Demeter was unable to find her daughter Persephone after Persephone was abducted by Hades, the god of the underworld. Demeter is said to have embarked on a long and frustrating, but ultimately successful, search to find her daughter. Unfortunately, long and frustrating searches are not confined to Greek mythology. In modern times, agent-based modelers often face similar troubles when searching for agents that are to be connected to one another and when seeking appropriate target agents while defining agent behaviors. The result is a 'search for emergence' in that many emergent or potentially emergent behaviors in agent-based models of complex adaptive systems either implicitly or explicitly require search functions. This paper considers a new nested querying approach to simplifying such agent-based modeling and multi-agent simulation search problems.
Semantic search during divergent thinking.
Hass, Richard W
2017-09-01
Divergent thinking, as a method of examining creative cognition, has not been adequately analyzed in the context of modern cognitive theories. This article casts divergent thinking responding in the context of theories of memory search. First, it was argued that divergent thinking tasks are similar to semantic fluency tasks, but are more constrained, and less well structured. Next, response time distributions from 54 participants were analyzed for temporal and semantic clustering. Participants responded to two prompts from the alternative uses test: uses for a brick and uses for a bottle, for two minutes each. Participants' cumulative response curves were negatively accelerating, in line with theories of search of associative memory. However, results of analyses of semantic and temporal clustering suggested that clustering is less evident in alternative uses responding compared to semantic fluency tasks. This suggests that divergent thinking responding does not involve an exhaustive search through a clustered memory trace, but rather that the process is more exploratory, yielding fewer overall responses that tend to drift away from close associates of the divergent thinking prompt. Copyright © 2017 Elsevier B.V. All rights reserved.
Searching for displaced Higgs boson decays
NASA Astrophysics Data System (ADS)
Csáki, Csaba; Kuflik, Eric; Lombardo, Salvator; Slone, Oren
2015-10-01
We study a simplified model of the Standard Model (SM) Higgs boson decaying to a degenerate pair of scalars which travel a macroscopic distance before decaying to SM particles. This is the leading signal for many well-motivated solutions to the hierarchy problem that do not propose additional light colored particles. Bounds for displaced Higgs boson decays below 10 cm are found by recasting existing tracker searches from Run I. New tracker search strategies, sensitive to the characteristics of these models and similar decays, are proposed with sensitivities projected for Run II at $\sqrt{s} = 13$ TeV. With 20 fb$^{-1}$ of data, we find that Higgs branching ratios down to $2 \times 10^{-4}$ can be probed for centimeter decay lengths.
Lu, Fred Sun; Hou, Suqin; Baltrusaitis, Kristin; Shah, Manan; Leskovec, Jure; Sosic, Rok; Hawkins, Jared; Brownstein, John; Conidi, Giuseppe; Gunn, Julia; Gray, Josh; Zink, Anna
2018-01-01
Background Influenza outbreaks pose major challenges to public health around the world, leading to thousands of deaths a year in the United States alone. Accurate systems that track influenza activity at the city level are necessary to provide actionable information that can be used for clinical, hospital, and community outbreak preparation. Objective Although Internet-based real-time data sources such as Google searches and tweets have been successfully used to produce influenza activity estimates ahead of traditional health care–based systems at national and state levels, influenza tracking and forecasting at finer spatial resolutions, such as the city level, remain an open question. Our study aimed to present a precise, near real-time methodology capable of producing influenza estimates ahead of those collected and published by the Boston Public Health Commission (BPHC) for the Boston metropolitan area. This approach has great potential to be extended to other cities with access to similar data sources. Methods We first tested the ability of Google searches, Twitter posts, electronic health records, and a crowd-sourced influenza reporting system to detect influenza activity in the Boston metropolis separately. We then adapted a multivariate dynamic regression method named ARGO (autoregression with general online information), designed for tracking influenza at the national level, and showed that it effectively uses the above data sources to monitor and forecast influenza at the city level 1 week ahead of the current date. Finally, we presented an ensemble-based approach capable of combining information from models based on multiple data sources to more robustly nowcast as well as forecast influenza activity in the Boston metropolitan area. The performances of our models were evaluated in an out-of-sample fashion over 4 influenza seasons within 2012-2016, as well as a holdout validation period from 2016 to 2017. Results Our ensemble-based methods incorporating
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sun, Jianwei; Remsing, Richard C.; Zhang, Yubo
2016-06-13
One atom or molecule binds to another through various types of bond, the strengths of which range from several meV to several eV. Although some computational methods can provide accurate descriptions of all bond types, those methods are not efficient enough for many studies (for example, large systems, ab initio molecular dynamics and high-throughput searches for functional materials). Here, we show that the recently developed non-empirical strongly constrained and appropriately normed (SCAN) meta-generalized gradient approximation (meta-GGA) within the density functional theory framework predicts accurate geometries and energies of diversely bonded molecules and materials (including covalent, metallic, ionic, hydrogen and van der Waals bonds). This represents a significant improvement at comparable efficiency over its predecessors, the GGAs that currently dominate materials computation. Often, SCAN matches or improves on the accuracy of a computationally expensive hybrid functional, at almost-GGA cost. SCAN is therefore expected to have a broad impact on chemistry and materials science.
Sun, Jianwei; Remsing, Richard C; Zhang, Yubo; Sun, Zhaoru; Ruzsinszky, Adrienn; Peng, Haowei; Yang, Zenghui; Paul, Arpita; Waghmare, Umesh; Wu, Xifan; Klein, Michael L; Perdew, John P
2016-09-01
One atom or molecule binds to another through various types of bond, the strengths of which range from several meV to several eV. Although some computational methods can provide accurate descriptions of all bond types, those methods are not efficient enough for many studies (for example, large systems, ab initio molecular dynamics and high-throughput searches for functional materials). Here, we show that the recently developed non-empirical strongly constrained and appropriately normed (SCAN) meta-generalized gradient approximation (meta-GGA) within the density functional theory framework predicts accurate geometries and energies of diversely bonded molecules and materials (including covalent, metallic, ionic, hydrogen and van der Waals bonds). This represents a significant improvement at comparable efficiency over its predecessors, the GGAs that currently dominate materials computation. Often, SCAN matches or improves on the accuracy of a computationally expensive hybrid functional, at almost-GGA cost. SCAN is therefore expected to have a broad impact on chemistry and materials science.
Searching for the right word: Hybrid visual and memory search for words.
Boettcher, Sage E P; Wolfe, Jeremy M
2015-05-01
In "hybrid search" (Wolfe Psychological Science, 23(7), 698-703, 2012), observers search through visual space for any of multiple targets held in memory. With photorealistic objects as the stimuli, response times (RTs) increase linearly with the visual set size and logarithmically with the memory set size, even when over 100 items are committed to memory. It is well-established that pictures of objects are particularly easy to memorize (Brady, Konkle, Alvarez, & Oliva Proceedings of the National Academy of Sciences, 105, 14325-14329, 2008). Would hybrid-search performance be similar if the targets were words or phrases, in which word order can be important, so that the processes of memorization might be different? In Experiment 1, observers memorized 2, 4, 8, or 16 words in four different blocks. After passing a memory test, confirming their memorization of the list, the observers searched for these words in visual displays containing two to 16 words. Replicating Wolfe (Psychological Science, 23(7), 698-703, 2012), the RTs increased linearly with the visual set size and logarithmically with the length of the word list. The word lists of Experiment 1 were random. In Experiment 2, words were drawn from phrases that observers reported knowing by heart (e.g., "London Bridge is falling down"). Observers were asked to provide four phrases, ranging in length from two words to no less than 20 words (range 21-86). All words longer than two characters from the phrase, constituted the target list. Distractor words were matched for length and frequency. Even with these strongly ordered lists, the results again replicated the curvilinear function of memory set size seen in hybrid search. One might expect to find serial position effects, perhaps reducing the RTs for the first (primacy) and/or the last (recency) members of a list (Atkinson & Shiffrin, 1968; Murdock Journal of Experimental Psychology, 64, 482-488, 1962). Surprisingly, we showed no reliable effects of word order
NASA Astrophysics Data System (ADS)
Prabhat, Prashant; Peet, Michael; Erdogan, Turan
2016-03-01
In order to design a fluorescence experiment, typically the spectra of a fluorophore and of a filter set are overlaid on a single graph and the spectral overlap is evaluated intuitively. However, in a typical fluorescence imaging system the fluorophores and optical filters are not the only wavelength dependent variables - even the excitation light sources have been changing. For example, LED Light Engines may have a significantly different spectral response compared to the traditional metal-halide lamps. Therefore, for a more accurate assessment of fluorophore-to-filter-set compatibility, all sources of spectral variation should be taken into account simultaneously. Additionally, intuitive or qualitative evaluation of many spectra does not necessarily provide a realistic assessment of the system performance. "SearchLight" is a freely available web-based spectral plotting and analysis tool that can be used to address the need for accurate, quantitative spectral evaluation of fluorescence measurement systems. This tool is available at: http://searchlight.semrock.com/. Based on a detailed mathematical framework [1], SearchLight calculates signal, noise, and signal-to-noise ratio for multiple combinations of fluorophores, filter sets, light sources and detectors. SearchLight allows for qualitative and quantitative evaluation of the compatibility of filter sets with fluorophores, analysis of bleed-through, identification of optimized spectral edge locations for a set of filters under specific experimental conditions, and guidance regarding labeling protocols in multiplexing imaging assays. Entire SearchLight sessions can be shared with colleagues and collaborators and saved for future reference. [1] Anderson, N., Prabhat, P. and Erdogan, T., Spectral Modeling in Fluorescence Microscopy, http://www.semrock.com (2010).
Frontal–Occipital Connectivity During Visual Search
Pantazatos, Spiro P.; Yanagihara, Ted K.; Zhang, Xian; Meitzler, Thomas
2012-01-01
Although expectation- and attention-related interactions between ventral and medial prefrontal cortex and stimulus category-selective visual regions have been identified during visual detection and discrimination, it is not known if similar neural mechanisms apply to other tasks such as visual search. The current work tested the hypothesis that high-level frontal regions, previously implicated in expectation and visual imagery of object categories, interact with visual regions associated with object recognition during visual search. Using functional magnetic resonance imaging, subjects searched for a specific object that varied in size and location within a complex natural scene. A model-free, spatial-independent component analysis isolated multiple task-related components, one of which included visual cortex, as well as a cluster within ventromedial prefrontal cortex (vmPFC), consistent with the engagement of both top-down and bottom-up processes. Analyses of psychophysiological interactions showed increased functional connectivity between vmPFC and object-sensitive lateral occipital cortex (LOC), and results from dynamic causal modeling and Bayesian Model Selection suggested bidirectional connections between vmPFC and LOC that were positively modulated by the task. Using image-guided diffusion-tensor imaging, functionally seeded, probabilistic white-matter tracts between vmPFC and LOC, which presumably underlie this effective interconnectivity, were also observed. These connectivity findings extend previous models of visual search processes to include specific frontal–occipital neuronal interactions during a natural and complex search task. PMID:22708993
NASA Astrophysics Data System (ADS)
Eliazar, Iddo
2017-12-01
Search processes play key roles in various scientific fields. A widespread and effective search-process scheme, which we term Restart Search, is based on the following restart algorithm: i) set a timer and initiate a search task; ii) if the task was completed before the timer expired, then stop; iii) if the timer expired before the task was completed, then go back to the first step and restart the search process anew. In this paper a branching feature is added to the restart algorithm: at every transition from the algorithm's third step to its first step branching takes place, thus multiplying the search effort. This branching feature yields a search-process scheme which we term Branching Search. The running time of Branching Search is analyzed, closed-form results are established, and these results are compared to the corresponding running-time results of Restart Search.
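The restart algorithm in the abstract above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the paper's analysis: the function names, the `sample_duration` callback, and the reading of "branching" as multiplying the number of independent parallel searchers by a fixed factor at every restart are all hypothetical choices made for this sketch.

```python
import random


def restart_search(sample_duration, timer, max_rounds=10_000):
    """Total running time of one Restart Search (hypothetical helper).

    sample_duration() returns the time a freshly started task would need.
    """
    elapsed = 0.0
    for _ in range(max_rounds):
        duration = sample_duration()   # i) set the timer, start a fresh task
        if duration < timer:           # ii) task completed before the timer expired
            return elapsed + duration
        elapsed += timer               # iii) timer expired: restart anew
    raise RuntimeError("search did not complete within max_rounds")


def branching_search(sample_duration, timer, branching=2, max_rounds=20):
    """Restart Search with branching (assumed interpretation).

    Assumption: at every restart the number of independent searchers is
    multiplied by `branching`; a round ends when the fastest one finishes.
    """
    elapsed = 0.0
    searchers = 1
    for _ in range(max_rounds):
        fastest = min(sample_duration() for _ in range(searchers))
        if fastest < timer:
            return elapsed + fastest
        elapsed += timer
        searchers *= branching         # branching multiplies the search effort
    raise RuntimeError("search did not complete within max_rounds")


if __name__ == "__main__":
    rng = random.Random(0)
    draw = lambda: rng.expovariate(0.2)  # assumed task-duration distribution
    runs = 2000
    mean_restart = sum(restart_search(draw, 1.0) for _ in range(runs)) / runs
    mean_branch = sum(branching_search(draw, 1.0) for _ in range(runs)) / runs
    print(f"mean restart time:   {mean_restart:.3f}")
    print(f"mean branching time: {mean_branch:.3f}")
```

The exponential duration distribution in the usage section is arbitrary; any zero-argument callable returning a non-negative duration can be substituted to compare the two schemes empirically.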
Visibiome: an efficient microbiome search engine based on a scalable, distributed architecture.
Azman, Syafiq Kamarul; Anwar, Muhammad Zohaib; Henschel, Andreas
2017-07-24
Given the current influx of 16S rRNA profiles of microbiota samples, it is conceivable that large amounts of them eventually are available for search, comparison and contextualization with respect to novel samples. This process facilitates the identification of similar compositional features in microbiota elsewhere and therefore can help to understand driving factors for microbial community assembly. We present Visibiome, a microbiome search engine that can perform exhaustive, phylogeny based similarity search and contextualization of user-provided samples against a comprehensive dataset of 16S rRNA profiles from diverse environments, while tackling several computational challenges. In order to scale to high demands, we developed a distributed system that combines web framework technology, task queueing and scheduling, cloud computing and a dedicated database server. To further ensure speed and efficiency, we have deployed Nearest Neighbor search algorithms, capable of sublinear searches in high-dimensional metric spaces in combination with an optimized Earth Mover Distance based implementation of weighted UniFrac. The search also incorporates pairwise (adaptive) rarefaction and optionally, 16S rRNA copy number correction. The result of a query microbiome sample is the contextualization against a comprehensive database of microbiome samples from a diverse range of environments, visualized through a rich set of interactive figures and diagrams, including barchart-based compositional comparisons and ranking of the closest matches in the database. Visibiome is a convenient, scalable and efficient framework to search microbiomes against a comprehensive database of environmental samples. The search engine leverages a popular but computationally expensive, phylogeny based distance metric, while providing numerous advantages over the current state-of-the-art tool.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sheng, Zheng, E-mail: 19994035@sina.com; Wang, Jun; Zhou, Bihua
2014-03-15
This paper introduces a novel hybrid optimization algorithm to establish the parameters of chaotic systems. In order to deal with the weaknesses of the traditional cuckoo search algorithm, the proposed adaptive cuckoo search with simulated annealing algorithm is presented, which incorporates the adaptive parameters adjusting operation and the simulated annealing operation in the cuckoo search algorithm. Normally, the parameters of the cuckoo search algorithm are kept constant, which may decrease the efficiency of the algorithm. To balance and enhance the accuracy and convergence rate of the cuckoo search algorithm, the adaptive operation is presented to tune the parameters properly. Besides, the local search capability of the cuckoo search algorithm is relatively weak, which may decrease the quality of optimization. So the simulated annealing operation is merged into the cuckoo search algorithm to enhance the local search ability and improve the accuracy and reliability of the results. The functionality of the proposed hybrid algorithm is investigated through the Lorenz chaotic system under noiseless and noisy conditions, respectively. The numerical results demonstrate that the method can estimate parameters efficiently and accurately in both the noiseless and noisy conditions. Finally, the results are compared with the traditional cuckoo search algorithm, genetic algorithm, and particle swarm optimization algorithm. Simulation results demonstrate the effectiveness and superior performance of the proposed algorithm.
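To make the parameter-estimation setting above concrete, the sketch below shows the kind of objective such an optimizer could minimize for the Lorenz system, together with a textbook simulated-annealing acceptance rule. It does not reproduce the paper's adaptive cuckoo search. The fourth-order Runge-Kutta integrator, the parameter values (10, 28, 8/3), the initial state, the step size, and all function names are assumptions made for illustration.

```python
import math
import random


def lorenz_rhs(state, sigma, rho, beta):
    """Right-hand side of the Lorenz equations."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, params, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = lorenz_rhs(state, *params)
    k2 = lorenz_rhs(shifted(state, k1, dt / 2), *params)
    k3 = lorenz_rhs(shifted(state, k2, dt / 2), *params)
    k4 = lorenz_rhs(shifted(state, k3, dt), *params)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate(params, x0, dt, steps):
    """Trajectory of steps + 1 states starting from x0."""
    states = [x0]
    for _ in range(steps):
        states.append(rk4_step(states[-1], params, dt))
    return states


def objective(params, observed, x0, dt):
    """Mean squared error between a candidate trajectory and the data."""
    simulated = simulate(params, x0, dt, len(observed) - 1)
    total = sum((a - b) ** 2
                for s, o in zip(simulated, observed)
                for a, b in zip(s, o))
    return total / len(observed)


def sa_accept(delta, temperature, rng):
    """Metropolis rule: always accept improvements, sometimes accept worse."""
    return delta <= 0 or rng.random() < math.exp(-delta / temperature)


if __name__ == "__main__":
    true_params = (10.0, 28.0, 8.0 / 3.0)   # assumed "true" parameters
    x0, dt = (1.0, 1.0, 1.0), 0.01
    data = simulate(true_params, x0, dt, 200)
    print(objective((9.5, 28.0, 8.0 / 3.0), data, x0, dt))
```

Any population-based or local optimizer can then be run against `objective`; the acceptance rule is where a simulated-annealing step would decide whether to keep a perturbed candidate.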
Extracting TSK-type Neuro-Fuzzy model using the Hunting search algorithm
NASA Astrophysics Data System (ADS)
Bouzaida, Sana; Sakly, Anis; M'Sahli, Faouzi
2014-01-01
This paper proposes a Takagi-Sugeno-Kang (TSK) type Neuro-Fuzzy model tuned by a novel metaheuristic optimization algorithm called Hunting Search (HuS). The HuS algorithm is derived based on a model of group hunting of animals such as lions, wolves, and dolphins when looking for a prey. In this study, the structure and parameters of the fuzzy model are encoded into a particle. Thus, the optimal structure and parameters are achieved simultaneously. The proposed method was demonstrated through modeling and control problems, and the results have been compared with other optimization techniques. The comparisons indicate that the proposed method represents a powerful search approach and an effective optimization technique as it can extract the accurate TSK fuzzy model with an appropriate number of rules.
Liang, Zhongwei; Zhou, Liang; Liu, Xiaochu; Wang, Xiaogang
2014-01-01
Tablet image tracking strongly affects the efficiency and reliability of high-speed drug mass production, and in recent years it has also become a difficult problem and a focus of production monitoring, because the objects to be searched for are highly similar in shape and randomly distributed in position. To track randomly distributed tablets accurately, a calibrated surface of reflected light intensity energy is established using a surface fitting approach and transitional vector determination; this surface describes the shape topology and topographic details of the target tablet. On this basis, the mathematical properties of the established surfaces are proposed, and an artificial neural network (ANN) is then employed to classify the moving target tablets by recognizing their different surface properties, so that the instantaneous coordinate positions of the tablets in one image frame can be determined. By repeating the same pattern recognition on the next image frame, the real-time movements of the target tablet templates were successfully tracked in sequence. This paper provides reliable references and new research ideas for real-time object tracking in drug production practice. PMID:25143781
Similarity Rules for Scaling Solar Sail Systems
NASA Technical Reports Server (NTRS)
Canfield, Stephen L.; Beard, James W., III; Peddieson, John; Ewing, Anthony; Garbe, Greg
2004-01-01
Future science missions will require solar sails on the order of 10,000 sq m (or larger). However, ground and flight demonstrations must be conducted at significantly smaller sizes (400 sq m for ground demo) due to limitations of ground-based facilities and cost and availability of flight opportunities. For this reason, the ability to understand the process of scalability, as it applies to solar sail system models and test data, is crucial to the advancement of this technology. This report will address issues of scaling in solar sail systems, focusing on structural characteristics, by developing a set of similarity or similitude functions that will guide the scaling process. The primary goal of these similarity functions (process invariants) that collectively form a set of scaling rules or guidelines is to establish valid relationships between models and experiments that are performed at different orders of scale. In the near term, such an effort will help guide the size and properties of a flight validation sail that will need to be flown to accurately represent a large, mission-level sail.
Explicit awareness supports conditional visual search in the retrieval guidance paradigm.
Buttaccio, Daniel R; Lange, Nicholas D; Hahn, Sowon; Thomas, Rick P
2014-01-01
In four experiments we explored whether participants would be able to use probabilistic prompts to simplify perceptually demanding visual search in a task we call the retrieval guidance paradigm. On each trial a memory prompt appeared prior to (and during) the search task and the diagnosticity of the prompt(s) was manipulated to provide complete, partial, or non-diagnostic information regarding the target's color on each trial (Experiments 1-3). In Experiment 1 we found that more diagnostic prompts were associated with faster visual search performance. However, similar visual search behavior was observed in Experiment 2 when the diagnosticity of the prompts was eliminated, suggesting that participants in Experiment 1 were merely relying on base rate information to guide search and were not utilizing the prompts. In Experiment 3 participants were informed of the relationship between the prompts and the color of the target and this was associated with faster search performance relative to Experiment 1, suggesting that the participants were using the prompts to guide search. Additionally, in Experiment 3 a knowledge test was implemented and performance in this task was associated with qualitative differences in search behavior such that participants who were able to name the color(s) most associated with the prompts were faster to find the target than participants who were unable to do so. However, in Experiments 1-3 diagnosticity of the memory prompt was manipulated via base rate information, making it possible that participants were merely relying on base rate information to inform search in Experiment 3. In Experiment 4 we manipulated diagnosticity of the prompts without manipulating base rate information and found a similar pattern of results as Experiment 3. Together, the results emphasize the importance of base rate and diagnosticity information in visual search behavior. In the General discussion section we explore how a recent computational model of
Memory for found targets interferes with subsequent performance in multiple-target visual search.
Cain, Matthew S; Mitroff, Stephen R
2013-10-01
Multiple-target visual searches--when more than 1 target can appear in a given search display--are commonplace in radiology, airport security screening, and the military. Whereas 1 target is often found accurately, additional targets are more likely to be missed in multiple-target searches. To better understand this decrement in 2nd-target detection, here we examined 2 potential forms of interference that can arise from finding a 1st target: interference from the perceptual salience of the 1st target (a now highly relevant distractor in a known location) and interference from a newly created memory representation for the 1st target. Here, we found that removing found targets from the display or making them salient and easily segregated color singletons improved subsequent search accuracy. However, replacing found targets with random distractor items did not improve subsequent search accuracy. Removing and highlighting found targets likely reduced both a target's visual salience and its memory load, whereas replacing a target removed its visual salience but not its representation in memory. Collectively, the current experiments suggest that the working memory load of a found target has a larger effect on subsequent search accuracy than does its perceptual salience. PsycINFO Database Record (c) 2013 APA, all rights reserved.
Similarity solution of the Boussinesq equation
NASA Astrophysics Data System (ADS)
Lockington, D. A.; Parlange, J.-Y.; Parlange, M. B.; Selker, J.
Similarity transforms of the Boussinesq equation in a semi-infinite medium are available when the boundary conditions are a power of time. The Boussinesq equation is reduced from a partial differential equation to a boundary-value problem. Chen et al. [Trans Porous Media 1995;18:15-36] use a hodograph method to derive an integral equation formulation of the new differential equation which they solve by numerical iteration. In the present paper, the convergence of their scheme is improved such that numerical iteration can be avoided for all practical purposes. However, a simpler analytical approach is also presented which is based on Shampine's transformation of the boundary value problem to an initial value problem. This analytical approximation is remarkably simple and yet more accurate than the analytical hodograph approximations.
Data-driven model-independent searches for long-lived particles at the LHC
NASA Astrophysics Data System (ADS)
Coccaro, Andrea; Curtin, David; Lubatti, H. J.; Russell, Heather; Shelton, Jessie
2016-12-01
Neutral long-lived particles (LLPs) are highly motivated by many beyond the Standard Model scenarios, such as theories of supersymmetry, baryogenesis, and neutral naturalness, and present both tremendous discovery opportunities and experimental challenges for the LHC. A major bottleneck for current LLP searches is the prediction of Standard Model backgrounds, which are often impossible to simulate accurately. In this paper, we propose a general strategy for obtaining differential, data-driven background estimates in LLP searches, thereby notably extending the range of LLP masses and lifetimes that can be discovered at the LHC. We focus on LLPs decaying in the ATLAS muon system, where triggers providing both signal and control samples are available at LHC run 2. While many existing searches require two displaced decays, a detailed knowledge of backgrounds will allow for very inclusive searches that require just one detected LLP decay. As we demonstrate for the $h \to XX$ signal model of LLP pair production in exotic Higgs decays, this results in dramatic sensitivity improvements for proper lifetimes $\gtrsim 10$ m. In theories of neutral naturalness, this extends reach to glueball masses far below the $b\bar{b}$ threshold. Our strategy readily generalizes to other signal models and other detector subsystems. This framework therefore lends itself to the development of a systematic, model-independent LLP search program, in analogy to the highly successful simplified-model framework of prompt searches.
To search or to like: Mapping fixations to differentiate two forms of incidental scene memory.
Choe, Kyoung Whan; Kardan, Omid; Kotabe, Hiroki P; Henderson, John M; Berman, Marc G
2017-10-01
We employed eye-tracking to investigate how performing different tasks on scenes (e.g., intentionally memorizing them, searching for an object, evaluating aesthetic preference) can affect eye movements during encoding and subsequent scene memory. We found that scene memorability decreased after visual search (one incidental encoding task) compared to intentional memorization, and that preference evaluation (another incidental encoding task) produced better memory, similar to the incidental memory boost previously observed for words and faces. By analyzing fixation maps, we found that although fixation map similarity could explain how eye movements during visual search impair incidental scene memory, it could not explain the incidental memory boost from aesthetic preference evaluation, implying that implicit mechanisms were at play. We conclude that not all incidental encoding tasks should be taken to be similar, as different mechanisms (e.g., explicit or implicit) lead to memory enhancements or decrements for different incidental encoding tasks.
Dynamic Grover search: applications in recommendation systems and optimization problems
NASA Astrophysics Data System (ADS)
Chakrabarty, Indranil; Khan, Shahzor; Singh, Vanshdeep
2017-06-01
In recent years, the Grover search algorithm (Proceedings, 28th annual ACM symposium on the theory of computing, pp. 212-219, 1996), by using quantum parallelism, has revolutionized the solution of a huge class of NP problems in comparison to classical systems. In this work, we explore the idea of extending the Grover search algorithm to approximate algorithms. Here we try to analyze the applicability of Grover search to process an unstructured database with a dynamic selection function in contrast to the static selection function used in the original work (Grover in Proceedings, 28th annual ACM symposium on the theory of computing, pp. 212-219, 1996). We show that this alteration allows us to extend the application of Grover search to the field of randomized search algorithms. Further, we use the dynamic Grover search algorithm to define the goals for a recommendation system, based on which we propose a recommendation algorithm that uses a binomial similarity distribution space, giving us a quadratic speedup over traditional classical unstructured recommendation systems. Finally, we see how dynamic Grover search can be used to tackle a wide range of optimization problems where we improve complexity over existing optimization algorithms.
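For readers unfamiliar with the baseline, the following sketch classically simulates the standard Grover iteration with a static selection function (a fixed set of marked indices). It does not implement the dynamic selection function proposed in the abstract above. The function name and the state-vector representation as a plain list of real amplitudes are choices made for this illustration.

```python
import math


def grover_success_probability(n_items, marked, iterations=None):
    """Probability of measuring a marked item after Grover iterations.

    Classical state-vector simulation; real amplitudes suffice here.
    """
    marked = set(marked)
    if iterations is None:
        # Standard choice: about (pi / 4) * sqrt(N / M) iterations.
        iterations = math.floor(math.pi / 4 * math.sqrt(n_items / len(marked)))
    amplitudes = [1 / math.sqrt(n_items)] * n_items   # uniform superposition
    for _ in range(iterations):
        # Oracle: flip the sign of every marked amplitude.
        amplitudes = [-a if i in marked else a
                      for i, a in enumerate(amplitudes)]
        # Diffusion: reflect every amplitude about the mean.
        mean = sum(amplitudes) / n_items
        amplitudes = [2 * mean - a for a in amplitudes]
    return sum(amplitudes[i] ** 2 for i in marked)


if __name__ == "__main__":
    for n in (16, 256, 1024):
        p = grover_success_probability(n, {7})
        print(f"N={n:5d}  success probability = {p:.4f}")
```

The number of iterations grows with the square root of the database size, which is the quadratic speedup the abstract refers to; a classical scan needs on the order of N lookups.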
Visual Search Elicits the Electrophysiological Marker of Visual Working Memory
Emrich, Stephen M.; Al-Aidroos, Naseem; Pratt, Jay; Ferber, Susanne
2009-01-01
Background Although limited in capacity, visual working memory (VWM) plays an important role in many aspects of visually-guided behavior. Recent experiments have demonstrated an electrophysiological marker of VWM encoding and maintenance, the contralateral delay activity (CDA), which has been shown in multiple tasks that have both explicit and implicit memory demands. Here, we investigate whether the CDA is evident during visual search, a thoroughly-researched task that is a hallmark of visual attention but has no explicit memory requirements. Methodology/Principal Findings The results demonstrate that the CDA is present during a lateralized search task, and that it is similar in amplitude to the CDA observed in a change-detection task, but peaks slightly later. The changes in CDA amplitude during search were strongly correlated with VWM capacity, as well as with search efficiency. These results were paralleled by behavioral findings showing a strong correlation between VWM capacity and search efficiency. Conclusions/Significance We conclude that the activity observed during visual search was generated by the same neural resources that subserve VWM, and that this activity reflects the maintenance of previously searched distractors. PMID:19956663
Infant search and object permanence: a meta-analysis of the A-not-B error.
Wellman, H M; Cross, D; Bartsch, K
1987-01-01
Research on Piaget's stage 4 object concept has failed to reveal a clear or consistent pattern of results. Piaget found that 8-12-month-old infants would make perseverative errors; his explanation for this phenomenon was that the infant's concept of the object was contextually dependent on his or her actions. Some studies designed to test Piaget's explanation have replicated Piaget's basic finding, yet many have found no preference for the A location or the B location or an actual preference for the B location. More recently, researchers have attempted to uncover the causes for these results concerning the A-not-B error. Again, however, different studies have yielded different results, and qualitative reviews have failed to yield a consistent explanation for the results of the individual studies. This state of affairs suggests that the phenomenon may simply be too complex to be captured by individual studies varying 1 factor at a time and by reviews based on similar qualitative considerations. Therefore, the current investigation undertook a meta-analysis, a synthesis capturing the quantitative information across the now sizable number of studies. We entered several important factors into the meta-analysis, including the effects of age, the number of A trials, the length of delay between hiding and search, the number of locations, the distances between locations, and the distinctive visual properties of the hiding arrays. Of these, the analysis consistently indicated that age, delay, and number of hiding locations strongly influence infants' search. The pattern of specific findings also yielded new information about infant search. A general characterization of the results is that, at every age, both above-chance and below-chance performance was observed. That is, at each age at least 1 combination of delay and number of locations yielded above-chance A-not-B errors or significant perseverative search. At the same time, at each age at least 1 alternative
Stride search: A general algorithm for storm detection in high resolution climate data
Bosler, Peter Andrew; Roesler, Erika Louise; Taylor, Mark A.; ...
2015-09-08
This article discusses the problem of identifying extreme climate events such as intense storms within large climate data sets. The basic storm detection algorithm is reviewed, which splits the problem into two parts: a spatial search followed by a temporal correlation problem. Two specific implementations of the spatial search algorithm are compared. The commonly used grid point search algorithm is reviewed, and a new algorithm called Stride Search is introduced. Stride Search is designed to work at all latitudes, while grid point searches may fail in polar regions. Results from the two algorithms are compared for the application of tropical cyclone detection, and shown to produce similar results for the same set of storm identification criteria. The time required for both algorithms to search the same data set is compared. Furthermore, Stride Search's ability to search extreme latitudes is demonstrated for the case of polar low detection.
Pillai, Nikhil; Craig, Morgan; Dokoumetzidis, Aristeidis; Schwartz, Sorell L; Bies, Robert; Freedman, Immanuel
2018-06-19
In mathematical pharmacology, models are constructed to confer a robust method for optimizing treatment. The predictive capability of pharmacological models depends heavily on the ability to track the system and to accurately determine parameters with reference to the sensitivity in projected outcomes. To closely track chaotic systems, one may choose to apply chaos synchronization. An advantageous byproduct of this methodology is the ability to quantify model parameters. In this paper, we illustrate the use of chaos synchronization combined with Nelder-Mead search to estimate parameters of the well-known Kirschner-Panetta model of IL-2 immunotherapy from noisy data. Chaos synchronization with Nelder-Mead search is shown to provide more accurate and reliable estimates than Nelder-Mead search based on an extended least squares (ELS) objective function. Our results underline the strength of this approach to parameter estimation and provide a broader framework of parameter identification for nonlinear models in pharmacology. Copyright © 2018 Elsevier Ltd. All rights reserved.
Mining Social Media and Web Searches For Disease Detection
Yang, Y. Tony; Horneffer, Michael; DiLisio, Nicole
2013-01-01
Web-based social media is increasingly being used across different settings in the health care industry. The increased frequency in the use of the Internet via computer or mobile devices provides an opportunity for social media to be the medium through which people can be provided with valuable health information quickly and directly. While traditional methods of detection relied predominately on hierarchical or bureaucratic lines of communication, these often failed to yield timely and accurate epidemiological intelligence. New web-based platforms promise increased opportunities for a more timely and accurate spreading of information and analysis. This article aims to provide an overview and discussion of the availability of timely and accurate information. It is especially useful for the rapid identification of an outbreak of an infectious disease that is necessary to promptly and effectively develop public health responses. These web-based platforms include search queries, data mining of web and social media, process and analysis of blogs containing epidemic key words, text mining, and geographical information system data analyses. These new sources of analysis and information are intended to complement traditional sources of epidemic intelligence. Despite the attractiveness of these new approaches, further study is needed to determine the accuracy of blogger statements, as increases in public participation may not necessarily mean the information provided is more accurate. PMID:25170475
Mining social media and web searches for disease detection.
Yang, Y Tony; Horneffer, Michael; DiLisio, Nicole
2013-04-28
Web-based social media is increasingly being used across different settings in the health care industry. The increased frequency in the use of the Internet via computer or mobile devices provides an opportunity for social media to be the medium through which people can be provided with valuable health information quickly and directly. While traditional methods of detection relied predominately on hierarchical or bureaucratic lines of communication, these often failed to yield timely and accurate epidemiological intelligence. New web-based platforms promise increased opportunities for a more timely and accurate spreading of information and analysis. This article aims to provide an overview and discussion of the availability of timely and accurate information. It is especially useful for the rapid identification of an outbreak of an infectious disease that is necessary to promptly and effectively develop public health responses. These web-based platforms include search queries, data mining of web and social media, process and analysis of blogs containing epidemic key words, text mining, and geographical information system data analyses. These new sources of analysis and information are intended to complement traditional sources of epidemic intelligence. Despite the attractiveness of these new approaches, further study is needed to determine the accuracy of blogger statements, as increases in public participation may not necessarily mean the information provided is more accurate.
The determination of accurate dipole polarizabilities alpha and gamma for the noble gases
NASA Technical Reports Server (NTRS)
Rice, Julia E.; Taylor, Peter R.; Lee, Timothy J.; Almlof, Jan
1991-01-01
Accurate static dipole polarizabilities alpha and gamma of the noble gases He through Xe were determined using wave functions of similar quality for each system. Good agreement with experimental data for the static polarizability gamma was obtained for Ne and Xe, but not for Ar and Kr. Calculations suggest that the experimental values for these latter ions are too low.
Accurate Prediction of Motor Failures by Application of Multi CBM Tools: A Case Study
NASA Astrophysics Data System (ADS)
Dutta, Rana; Singh, Veerendra Pratap; Dwivedi, Jai Prakash
2018-02-01
Motor failures are very difficult to predict accurately with a single condition-monitoring tool because the electrical and mechanical systems are closely related. Electrical problems, such as phase unbalance and stator winding insulation failure, can at times lead to vibration problems, while mechanical failures, such as bearing failure, can lead to rotor eccentricity. In this case study of a 550 kW blower motor, a rotor bar crack was detected by current signature analysis and confirmed by vibration monitoring. Some months later, in a similar motor, vibration monitoring predicted a bearing failure and current signature analysis confirmed it. In both cases, the predictions were found to be accurate after the motor was dismantled. This paper discusses the accurate prediction of motor failures through the use of multiple condition-monitoring tools, illustrated with two case studies.
OpenSearch technology for geospatial resources discovery
NASA Astrophysics Data System (ADS)
Papeschi, Fabrizio; Enrico, Boldrini; Mazzetti, Paolo
2010-05-01
In 2005, the term Web 2.0 was coined by Tim O'Reilly to describe a quickly growing set of Web-based applications that share a common philosophy of "mutually maximizing collective intelligence and added value for each participant by formalized and dynamic information sharing". Around the same period, OpenSearch, a new Web 2.0 technology, was developed. More properly, OpenSearch is a collection of technologies that allow publishing of search results in a format suitable for syndication and aggregation. It is a way for websites and search engines to publish search results in a standard and accessible format. Due to its strong impact on the way the Web is perceived by users and also due to its relevance for businesses, Web 2.0 has attracted the attention of both mass media and the scientific community. This explosive growth in popularity of Web 2.0 technologies like OpenSearch, and practical applications of Service Oriented Architecture (SOA), resulted in an increased interest in similarities, convergence, and a potential synergy of these two concepts. SOA is considered as the philosophy of encapsulating application logic in services with a uniformly defined interface and making these publicly available via discovery mechanisms. Service consumers may then retrieve these services, compose and use them according to their current needs. A great degree of similarity between SOA and Web 2.0 may be leading to a convergence between the two paradigms. They also expose divergent elements, such as the Web 2.0 support for human interaction in opposition to the typical SOA machine-to-machine interaction. According to these considerations, the Geospatial Information (GI) domain is also taking its first steps towards a new approach to data publishing and discovery, in particular taking advantage of the OpenSearch technology. A specific GI niche is represented by the OGC Catalog Service for Web (CSW) that is part of the OGC Web Services (OWS) specifications suite, which provides a
Fast, Inclusive Searches for Geographic Names Using Digraphs
Donato, David I.
2008-01-01
An algorithm specifies how to quickly identify names that approximately match any specified name when searching a list or database of geographic names. Based on comparisons of the digraphs (ordered letter pairs) contained in geographic names, this algorithmic technique identifies approximately matching names by applying an artificial but useful measure of name similarity. A digraph index enables computer name searches that are carried out using this technique to be fast enough for deployment in a Web application. This technique, which is a member of the class of n-gram algorithms, is related to, but distinct from, the soundex, PHONIX, and metaphone phonetic algorithms. Despite this technique's tendency to return some counterintuitive approximate matches, it is an effective aid for fast, inclusive searches for geographic names when the exact name sought, or its correct spelling, is unknown.
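The digraph-index idea described above can be sketched as follows. This is a minimal illustration, not the author's implementation: the function names, the Dice coefficient used as the similarity score, and the 0.5 threshold are all assumptions chosen for the example, since the abstract only says that the measure is based on shared digraphs.

```python
from collections import defaultdict


def digraphs(name):
    """Return the set of ordered letter pairs in a name.

    Assumption: case is ignored and non-letters (spaces, punctuation)
    are dropped before pairing, so pairs can span word boundaries.
    """
    letters = [c for c in name.lower() if c.isalpha()]
    return {a + b for a, b in zip(letters, letters[1:])}


def build_index(names):
    """Map each digraph to the set of names that contain it."""
    index = defaultdict(set)
    for name in names:
        for dg in digraphs(name):
            index[dg].add(name)
    return index


def search(query, index, threshold=0.5):
    """Return names whose digraph overlap with the query meets the threshold.

    The score is the Dice coefficient over digraph sets (an assumed
    stand-in for the paper's similarity measure). Only names sharing
    at least one digraph with the query are ever scored.
    """
    q = digraphs(query)
    if not q:
        return []
    shared_counts = defaultdict(int)
    for dg in q:
        for name in index.get(dg, ()):
            shared_counts[name] += 1
    scored = []
    for name, shared in shared_counts.items():
        score = 2 * shared / (len(q) + len(digraphs(name)))
        if score >= threshold:
            scored.append((score, name))
    # Best score first; ties broken alphabetically.
    return [name for score, name in sorted(scored, key=lambda t: (-t[0], t[1]))]


index = build_index(["Mount Rainier", "Mount Baker", "Lake Tahoe"])
results = search("Mount Ranier", index)  # misspelled query
```

Because the index is keyed by digraph, a query touches only the names that share at least one letter pair with it, which is what keeps the lookup fast on a large gazetteer.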
Zelinsky, Gregory J; Peng, Yifan; Berg, Alexander C; Samaras, Dimitris
2013-10-08
Search is commonly described as a repeating cycle of guidance to target-like objects, followed by the recognition of these objects as targets or distractors. Are these indeed separate processes using different visual features? We addressed this question by comparing observer behavior to that of support vector machine (SVM) models trained on guidance and recognition tasks. Observers searched for a categorically defined teddy bear target in four-object arrays. Target-absent trials consisted of random category distractors rated in their visual similarity to teddy bears. Guidance, quantified as first-fixated objects during search, was strongest for targets, followed by target-similar, medium-similarity, and target-dissimilar distractors. False positive errors to first-fixated distractors also decreased with increasing dissimilarity to the target category. To model guidance, nine teddy bear detectors, using features ranging in biological plausibility, were trained on unblurred bears then tested on blurred versions of the same objects appearing in each search display. Guidance estimates were based on target probabilities obtained from these detectors. To model recognition, nine bear/nonbear classifiers, trained and tested on unblurred objects, were used to classify the object that would be fixated first (based on the detector estimates) as a teddy bear or a distractor. Patterns of categorical guidance and recognition accuracy were modeled almost perfectly by an HMAX model in combination with a color histogram feature. We conclude that guidance and recognition in the context of search are not separate processes mediated by different features, and that what the literature knows as guidance is really recognition performed on blurred objects viewed in the visual periphery.
Lie algebraic similarity transformed Hamiltonians for lattice model systems
NASA Astrophysics Data System (ADS)
Wahlen-Strothman, Jacob M.; Jiménez-Hoyos, Carlos A.; Henderson, Thomas M.; Scuseria, Gustavo E.
2015-01-01
We present a class of Lie algebraic similarity transformations generated by exponentials of two-body on-site Hermitian operators whose Hausdorff series can be summed exactly without truncation. The correlators are defined over the entire lattice and include the Gutzwiller factor $n_{i\uparrow} n_{i\downarrow}$, and two-site products of density ($n_{i\uparrow} + n_{i\downarrow}$) and spin ($n_{i\uparrow} - n_{i\downarrow}$) operators. The resulting non-Hermitian many-body Hamiltonian can be solved in a biorthogonal mean-field approach with polynomial computational cost. The proposed similarity transformation generates locally weighted orbital transformations of the reference determinant. Although the energy of the model is unbound, projective equations in the spirit of coupled cluster theory lead to well-defined solutions. The theory is tested on the one- and two-dimensional repulsive Hubbard model where it yields accurate results for small and medium sized interaction strengths.
A lunar-based detector to search for relic supernovae antineutrinos
NASA Astrophysics Data System (ADS)
Mann, A. K.; Zhang, W.
1990-03-01
Observations of the relic supernovae antineutrino flux are argued to be possible near the lowest theoretical estimates of the flux by means of a suitable detector located on the moon. The status of the search for the relic flux is discussed with illustrations of the data obtained by terrestrial searches. The detector concept is described, and the advantages are found to include the fact that a lunar detector would not detect the electron-type antineutrinos related to nuclear reactors. Similarly, the lunar detector would not be affected by the flux of neutrinos and antineutrinos generated by the cosmic-ray proton flux in the atmosphere. The relative abundance of radioisotopes on the moon is similar to that found on earth, so that the background lunar radioactivity would have little effect on the detection of antineutrinos.
Searching for the right word: Hybrid visual and memory search for words
Boettcher, Sage E. P.; Wolfe, Jeremy M.
2016-01-01
In “Hybrid Search” (Wolfe 2012) observers search through visual space for any of multiple targets held in memory. With photorealistic objects as stimuli, response times (RTs) increase linearly with the visual set size and logarithmically with memory set size even when over 100 items are committed to memory. It is well established that pictures of objects are particularly easy to memorize (Brady, Konkle, Alvarez, & Oliva, 2008). Would hybrid search performance be similar if the targets were words or phrases where word order can be important and where the processes of memorization might be different? In Experiment One, observers memorized 2, 4, 8, or 16 words in 4 different blocks. After passing a memory test, confirming memorization of the list, observers searched for these words in visual displays containing 2 to 16 words. Replicating Wolfe (2012), RTs increased linearly with the visual set size and logarithmically with the length of the word list. The word lists of Experiment One were random. In Experiment Two, words were drawn from phrases that observers reported knowing by heart (e.g., “London Bridge is falling down”). Observers were asked to provide four phrases ranging in length from 2 words to a phrase of no less than 20 words (range 21–86). Words longer than 2 characters from the phrase constituted the target list. Distractor words were matched for length and frequency. Even with these strongly ordered lists, results again replicated the curvilinear function of memory set size seen in hybrid search. One might expect serial position effects, perhaps reducing RTs for the first (primacy) and/or last (recency) members of a list (Atkinson & Shiffrin 1968; Murdock, 1962). Surprisingly, we found no reliable effects of word order. Thus, in “London Bridge is falling down”, “London” and “down” are found no faster than “falling”. PMID:25788035
Zheluk, Andrey; Gillespie, James A; Quinn, Casey
2012-12-13
,403 for "Khimki" in Yandex. We found Google potentially provides timely search results, whereas Yandex provides more accurate geographic localization. The correlation was moderate to strong between search terms representing the Bychkov episode and terms representing salient drug issues in Yandex: "illicit drug treatment" (r(s) = .90, P < .001), "illicit drugs" (r(s) = .76, P < .001), and "drug addiction" (r(s) = .74, P < .001). Google correlations were weaker or absent: "illicit drug treatment" (r(s) = .12, P = .58), "illicit drugs" (r(s) = -.29, P = .17), and "drug addiction" (r(s) = .68, P < .001). This study contributes to the methodological literature on the analysis of search patterns for public health. This paper investigated the relationship between Google and Yandex, and contributed to the broader methods literature by highlighting both the potential and limitations of these two search providers. We believe that Yandex Wordstat is a potentially valuable and underused data source for researchers working on Russian-related illicit drug policy and other public health problems. The Russian Federation, with its large, geographically dispersed, and politically engaged online population presents unique opportunities for studying the evolving influence of the Internet on politics and policy, using low cost methods resilient against potential increases in censorship.
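For readers unfamiliar with the statistic, the r(s) values above are read here as Spearman rank correlations (an assumption based on the notation). The sketch below shows how such a coefficient can be computed between two search-volume series using only the standard library; the function names and the two weekly series are hypothetical illustrations, not data from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical weekly search volumes for two terms (made-up numbers).
episode_term = [12, 30, 95, 60, 41, 22]
issue_term = [8, 15, 70, 66, 20, 11]
r_s = spearman(episode_term, issue_term)
```

Working on ranks rather than raw counts makes the measure insensitive to the very different absolute volumes that two search engines may report.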
2015-01-01
the Protein Data Bank (http://www.rcsb.org/pdb/). These structures are the most accurate and can be used for molecular docking. Target flexibility is...crystallized with the different ligands. In total, 240 files with the structures of 37 proteins were downloaded from PDB and used for docking...total, 240 files with protein structures were downloaded from the PDB and used for protein–ligand docking. It is widely accepted that ligand binding
Passage-Based Bibliographic Coupling: An Inter-Article Similarity Measure for Biomedical Articles
Liu, Rey-Long
2015-01-01
Biomedical literature is an essential source of biomedical evidence. To translate the evidence for biomedicine study, researchers often need to carefully read multiple articles about specific biomedical issues. These articles thus need to be highly related to each other. They should share similar core contents, including research goals, methods, and findings. However, given an article r, it is challenging for search engines to retrieve highly related articles for r. In this paper, we present a technique PBC (Passage-based Bibliographic Coupling) that estimates inter-article similarity by seamlessly integrating bibliographic coupling with the information collected from context passages around important out-link citations (references) in each article. Empirical evaluation shows that PBC can significantly improve the retrieval of those articles that biomedical experts believe to be highly related to specific articles about gene-disease associations. PBC can thus be used to improve search engines in retrieving the highly related articles for any given article r, even when r is cited by very few (or even no) articles. The contribution is essential for those researchers and text mining systems that aim at cross-validating the evidence about specific gene-disease associations. PMID:26440794
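As background for the measure above, plain bibliographic coupling scores two articles by the references they share. The sketch below shows only that baseline, normalized as a Jaccard ratio; it does not reproduce PBC's use of citation context passages, and the function names and reference identifiers are hypothetical.

```python
def coupling_similarity(refs_a, refs_b):
    """Baseline bibliographic coupling: shared references / all references.

    Returns 0.0 when either article has no references.
    """
    a, b = set(refs_a), set(refs_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_related(target_refs, candidates):
    """Order candidate articles by coupling similarity to the target.

    `candidates` maps an article id to its reference list.
    """
    scored = [(coupling_similarity(target_refs, refs), art)
              for art, refs in candidates.items()]
    return [art for score, art in sorted(scored, key=lambda t: (-t[0], t[1]))]


# Hypothetical reference lists keyed by made-up article ids.
target = ["ref1", "ref2", "ref3"]
candidates = {
    "articleA": ["ref2", "ref3", "ref4"],
    "articleB": ["ref9"],
}
ranking = rank_related(target, candidates)
```

Because the score depends only on each article's own reference list, it works even for an article that has not yet been cited by anyone, which is the situation the abstract highlights.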
Passage-Based Bibliographic Coupling: An Inter-Article Similarity Measure for Biomedical Articles.
Liu, Rey-Long
2015-01-01
Biomedical literature is an essential source of biomedical evidence. To translate the evidence for biomedicine study, researchers often need to carefully read multiple articles about specific biomedical issues. These articles thus need to be highly related to each other. They should share similar core contents, including research goals, methods, and findings. However, given an article r, it is challenging for search engines to retrieve highly related articles for r. In this paper, we present a technique PBC (Passage-based Bibliographic Coupling) that estimates inter-article similarity by seamlessly integrating bibliographic coupling with the information collected from context passages around important out-link citations (references) in each article. Empirical evaluation shows that PBC can significantly improve the retrieval of those articles that biomedical experts believe to be highly related to specific articles about gene-disease associations. PBC can thus be used to improve search engines in retrieving the highly related articles for any given article r, even when r is cited by very few (or even no) articles. The contribution is essential for those researchers and text mining systems that aim at cross-validating the evidence about specific gene-disease associations.
Web server to identify similarity of amino acid motifs to compounds (SAAMCO).
Casey, Fergal P; Davey, Norman E; Baran, Ivan; Varekova, Radka Svobodova; Shields, Denis C
2008-07-01
Protein-protein interactions are fundamental in mediating biological processes including metabolism, cell growth, and signaling. To be able to selectively inhibit or induce protein activity or complex formation is a key feature in controlling disease. For those situations in which protein-protein interactions derive substantial affinity from short linear peptide sequences, or motifs, we can develop search algorithms for peptidomimetic compounds that resemble the short peptide's structure but are not compromised by poor pharmacological properties. SAAMCO is a Web service (http://bioware.ucd.ie/~saamco) that facilitates the screening of motifs with known structures against bioactive compound databases. It is built on an algorithm that defines compound similarity based on the presence of appropriate amino acid side chain fragments and a favorable Root Mean Squared Deviation (RMSD) between compound and motif structure. The methodology is efficient as the available compound databases are preprocessed and fast regular expression searches filter potential matches before time-intensive 3D superposition is performed. The required input information is minimal, and the compound databases have been selected to maximize the availability of information on biological activity. "Hits" are accompanied with a visualization window and links to source database entries. Motif matching can be defined on partial or full similarity which will increase or reduce respectively the number of potential mimetic compounds. The Web server provides the functionality for rapid screening of known or putative interaction motifs against prepared compound libraries using a novel search algorithm. The tabulated results can be analyzed by linking to appropriate databases and by visualization.
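The two-stage screen described above (a cheap regular-expression filter followed by a structural comparison) can be sketched as below. This is an illustration under strong assumptions: the fragment strings, the pattern, and the 1.0 cutoff are hypothetical, and the coordinates are assumed to be already superposed and paired atom by atom, so the expensive 3D superposition step itself is not shown.

```python
import math
import re


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two paired 3D point lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal length")
    total = sum((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
                for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def screen(motif_pattern, motif_coords, compounds, max_rmsd=1.0):
    """Filter compounds by fragment pattern first, then by RMSD.

    `compounds` is a list of (name, fragment_string, coords) tuples, where
    fragment_string is a hypothetical encoding of the side-chain-like
    fragments present in the compound.
    """
    pattern = re.compile(motif_pattern)
    hits = []
    for name, fragments, coords in compounds:
        if not pattern.search(fragments):
            continue  # cheap rejection before any geometry is computed
        score = rmsd(motif_coords, coords)
        if score <= max_rmsd:
            hits.append((name, score))
    return sorted(hits, key=lambda h: h[1])


motif = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
library = [
    ("cmpd1", "F.L.W", [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    ("cmpd2", "A.G", [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    ("cmpd3", "FW", [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]),
]
hits = screen(r"F.*W", motif, library)
```

Ordering the stages this way means the geometric comparison runs only on the small fraction of compounds that pass the text filter.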
Areal Feature Matching Based on Similarity Using Critic Method
NASA Astrophysics Data System (ADS)
Kim, J.; Yu, K.
2015-10-01
In this paper, we propose an areal feature matching method that can be applied for many-to-many matching, which involves matching a simple entity with an aggregate of several polygons or two aggregates of several polygons, with less user intervention. To this end, an affine transformation is applied to two datasets by using polygon pairs for which the building name is the same. Then the two datasets are overlaid, and intersecting polygon pairs are selected as candidate matching pairs. If many polygons intersect at this time, we calculate the inclusion function between such polygons. When the value is more than 0.4, the polygons are aggregated into a single polygon by using a convex hull. Finally, the shape similarity between the candidate pairs is calculated as the weighted linear sum of the position similarity, shape ratio similarity, and overlap similarity, with weights computed by the CRITIC method. The candidate pairs for which the value of the shape similarity is more than 0.7 are determined as matching pairs. We applied the method to two geospatial datasets: the digital topographic map and the KAIS map in South Korea. As a result, the visual evaluation showed two polygons that had been well detected by using the proposed method. The statistical evaluation indicates that the proposed method is accurate on our test dataset, with a high F-measure of 0.91.
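The weighting step can be sketched as follows, assuming the standard formulation of the CRITIC method, in which each criterion's weight is proportional to its standard deviation times its summed conflict (one minus correlation) with the other criteria. The similarity scores are made-up values in [0, 1], the function names are hypothetical, and the 0.7 cutoff is the one quoted in the abstract.

```python
import statistics


def pearson(x, y):
    """Pearson correlation; assumes neither list is constant."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def critic_weights(columns):
    """CRITIC weights for a list of criteria columns.

    Each column holds one criterion's normalized scores across all
    candidate pairs. Information content of criterion j is
    sigma_j * sum_k (1 - r_jk); weights are the normalized contents.
    """
    info = []
    for cj in columns:
        sigma = statistics.pstdev(cj)
        conflict = sum(1 - pearson(cj, ck) for ck in columns)  # k == j adds 0
        info.append(sigma * conflict)
    total = sum(info)
    return [c / total for c in info]


def combined_similarity(scores, weights):
    """Weighted linear sum of one candidate pair's criterion scores."""
    return sum(w * s for w, s in zip(weights, scores))


# Made-up scores for four candidate pairs on the three criteria.
position = [0.9, 0.2, 0.8, 0.4]
shape_ratio = [0.8, 0.3, 0.9, 0.5]
overlap = [0.7, 0.6, 0.2, 0.9]
weights = critic_weights([position, shape_ratio, overlap])

first_pair = combined_similarity([position[0], shape_ratio[0], overlap[0]], weights)
is_match = first_pair > 0.7  # threshold quoted in the abstract
```

A criterion that varies little across candidates, or that largely duplicates another criterion, receives a small weight, so no weights have to be set by hand.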
Hunting for a headhunter. How to select a physician search firm.
Walker, R
1989-05-01
Many healthcare facilities in search of a physician are bombarded with offers from physician search firms to drum up potential candidates. Determining which firm has the right stuff for the job takes considerable time and skill. More than 60 companies belong to the National Association of Physician Recruiters, and their methods, policies, and results may vary widely. Administrators can begin getting basic information by contacting firms and requesting written material. During the initial telephone call, the administrator in charge of the search should speak with a consultant or principal of the firm (whoever would be doing the search) and find out what experience that person has had with searches for facilities in similar geographical areas, his or her success in placing physicians who specialize in the specialty needed, how many searches the consultant undertakes at one time, whether the firm guarantees its services, and an outline of its fee structure. After evaluating written material, the administrator should choose two or three search firms to make personal presentations. These presentations should follow a logical sequence and include statistics, completion times, ratios, and specific deadlines for various parts of the search process.
2017-01-01
Background: Patient and consumer access to eHealth information is of crucial importance because of its role in patient-centered medicine and to improve knowledge about general aspects of health and medical topics. Objectives: The objectives were to analyze and compare eHealth search patterns in a private (United States) and a public (United Kingdom) health care market. Methods: A new taxonomy of eHealth websites is proposed to organize the largest eHealth websites. An online measurement framework is developed that provides a precise and detailed measurement system. Online panel data are used to accurately track and analyze detailed search behavior across 100 of the largest eHealth websites in the US and UK health care markets. Results: The health, medical, and lifestyle categories account for approximately 90% of online activity, and e-pharmacies, social media, and professional categories account for the remaining 10% of online activity. Overall search penetration of eHealth websites is significantly higher in the private (United States) than the public market (United Kingdom). Almost twice the number of eHealth users in the private market have adopted online search in the health and lifestyle categories and also spend more time per website than those in the public market. The use of medical websites for specific conditions is almost identical in both markets. The allocation of search effort across categories is similar in both the markets. For all categories, the vast majority of eHealth users only access one website within each category. Those that conduct a search of two or more websites display very narrow search patterns. All users spend relatively little time on eHealth, that is, 3-7 minutes per website. Conclusions: The proposed online measurement framework exploits online panel data to provide a powerful and objective method of analyzing and exploring eHealth behavior. The private health care system does appear to have an influence on eHealth search behavior in
Schneider, Janina Anne; Holland, Christopher Patrick
2017-04-13
Patient and consumer access to eHealth information is of crucial importance because of its role in patient-centered medicine and to improve knowledge about general aspects of health and medical topics. The objectives were to analyze and compare eHealth search patterns in a private (United States) and a public (United Kingdom) health care market. A new taxonomy of eHealth websites is proposed to organize the largest eHealth websites. An online measurement framework is developed that provides a precise and detailed measurement system. Online panel data are used to accurately track and analyze detailed search behavior across 100 of the largest eHealth websites in the US and UK health care markets. The health, medical, and lifestyle categories account for approximately 90% of online activity, and e-pharmacies, social media, and professional categories account for the remaining 10% of online activity. Overall search penetration of eHealth websites is significantly higher in the private (United States) than the public market (United Kingdom). Almost twice the number of eHealth users in the private market have adopted online search in the health and lifestyle categories and also spend more time per website than those in the public market. The use of medical websites for specific conditions is almost identical in both markets. The allocation of search effort across categories is similar in both the markets. For all categories, the vast majority of eHealth users only access one website within each category. Those that conduct a search of two or more websites display very narrow search patterns. All users spend relatively little time on eHealth, that is, 3-7 minutes per website. The proposed online measurement framework exploits online panel data to provide a powerful and objective method of analyzing and exploring eHealth behavior. The private health care system does appear to have an influence on eHealth search behavior in terms of search penetration and time spent per
Scale-Similar Models for Large-Eddy Simulations
NASA Technical Reports Server (NTRS)
Sarghini, F.
1999-01-01
Scale-similar models employ multiple filtering operations to identify the smallest resolved scales, which have been shown to be the most active in the interaction with the unresolved subgrid scales. They do not assume that the principal axes of the strain-rate tensor are aligned with those of the subgrid-scale stress (SGS) tensor, and allow the explicit calculation of the SGS energy. They can provide backscatter in a numerically stable and physically realistic manner, and predict SGS stresses in regions that are well correlated with the locations where large Reynolds stress occurs. In this paper, eddy viscosity and mixed models, which include an eddy-viscosity part as well as a scale-similar contribution, are applied to the simulation of two flows, a high Reynolds number plane channel flow, and a three-dimensional, nonequilibrium flow. The results show that simulations without models or with the Smagorinsky model are unable to predict nonequilibrium effects. Dynamic models provide an improvement of the results: the adjustment of the coefficient results in more accurate prediction of the perturbation from equilibrium. The Lagrangian-ensemble approach [Meneveau et al., J. Fluid Mech. 319, 353 (1996)] is found to be very beneficial. Models that included a scale-similar term and a dissipative one, as well as the Lagrangian ensemble averaging, gave results in the best agreement with the direct simulation and experimental data.
Visual search and emotion: how children with autism spectrum disorders scan emotional scenes.
Maccari, Lisa; Pasini, Augusto; Caroli, Emanuela; Rosa, Caterina; Marotta, Andrea; Martella, Diana; Fuentes, Luis J; Casagrande, Maria
2014-11-01
This study assessed visual search abilities, tested through the flicker task, in children diagnosed with autism spectrum disorders (ASDs). Twenty-two children diagnosed with ASD and 22 matched typically developing (TD) children were told to detect changes in objects of central interest or objects of marginal interest (MI) embedded in either emotion-laden (positive or negative) or neutral real-world pictures. The results showed that emotion-laden pictures equally interfered with performance of both ASD and TD children, slowing down reaction times compared with neutral pictures. Children with ASD were faster than TD children, particularly in detecting changes in MI objects, the most difficult condition. However, their performance was less accurate than performance of TD children just when the pictures were negative. These findings suggest that children with ASD have better visual search abilities than TD children only when the search is particularly difficult and requires strong serial search strategies. The emotional-social impairment that is usually considered as a typical feature of ASD seems to be limited to processing of negative emotional information.
Gentle Masking of Low-Complexity Sequences Improves Homology Search
Frith, Martin C.
2011-01-01
Detection of sequences that are homologous, i.e. descended from a common ancestor, is a fundamental task in computational biology. This task is confounded by low-complexity tracts (such as atatatatatat), which arise frequently and independently, causing strong similarities that are not homologies. There has been much research on identifying low-complexity tracts, but little research on how to treat them during homology search. We propose to find homologies by aligning sequences with “gentle” masking of low-complexity tracts. Gentle masking means that the match score involving a masked letter is min(0, S), where S is the unmasked score. Gentle masking slightly but noticeably improves the sensitivity of homology search (compared to “harsh” masking), without harming specificity. We show examples in three useful homology search problems: detection of NUMTs (nuclear copies of mitochondrial DNA), recruitment of metagenomic DNA reads to reference genomes, and pseudogene detection. Gentle masking is currently the best way to treat low-complexity tracts during homology search. PMID:22205972
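The scoring rule described in this abstract can be illustrated with a minimal sketch. It takes the gentle-masked score to be min(0, S) for an unmasked score S. The match/mismatch values, the use of lowercase letters to mark masked positions, and the function names are assumptions made for illustration and are not taken from the paper.

```python
def pair_score(a: str, b: str, match: int = 1, mismatch: int = -1) -> int:
    """Unmasked score S for aligning letters a and b (case-insensitive).

    The +1/-1 values are illustrative, not the paper's scoring matrix.
    """
    return match if a.upper() == b.upper() else mismatch


def gentle_score(a: str, b: str) -> int:
    """Score for one aligned pair under gentle masking.

    Assumption: a lowercase letter marks a masked (low-complexity) position.
    """
    s = pair_score(a, b)
    if a.islower() or b.islower():
        # A masked pair can never add to the score, but a mismatch still costs.
        return min(0, s)
    return s


def ungapped_score(x: str, y: str) -> int:
    """Sum of gentle-masked pair scores over an ungapped alignment."""
    return sum(gentle_score(a, b) for a, b in zip(x, y))
```

Under this rule a fully masked tract aligned to itself contributes nothing to the total, so it cannot by itself produce a high-scoring alignment, while unmasked flanking sequence is scored as usual.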
Race, Self-Selection, and the Job Search Process
Pager, Devah; Pedulla, David S.
2015-01-01
While existing research has documented persistent barriers facing African American job seekers, far less research has questioned how job seekers respond to this reality. Do minorities self-select into particular segments of the labor market to avoid discrimination? Such questions have remained unanswered due to the lack of data available on the positions to which job seekers apply. Drawing on two original datasets with application-specific information, we find little evidence that blacks target or avoid particular job types. Rather, blacks cast a wider net in their search than similarly situated whites, including a greater range of occupational categories and characteristics in their pool of job applications. Finally, we show that perceptions of discrimination are associated with increased search breadth, suggesting that broad search among African Americans represents an adaptation to labor market discrimination. Together these findings provide novel evidence on the role of race and self-selection in the job search process. PMID:26046224
Conjunctive visual search in individuals with and without mental retardation.
Carlin, Michael; Chrysler, Christina; Sullivan, Kate
2007-01-01
A comprehensive understanding of the basic visual and cognitive abilities of individuals with mental retardation is critical for understanding the basis of mental retardation and for the design of remediation programs. We assessed visual search abilities in individuals with mild mental retardation and in MA- and CA-matched comparison groups. Our goal was to determine the effect of decreasing target-distracter disparities on visual search efficiency. Results showed that search rates for the group with mental retardation and the MA-matched comparisons were more negatively affected by decreasing disparities than were those of the CA-matched group. The group with mental retardation and the MA-matched group performed similarly on all tasks. Implications for theory and application are discussed.
NNLOPS accurate associated HW production
NASA Astrophysics Data System (ADS)
Astill, William; Bizon, Wojciech; Re, Emanuele; Zanderighi, Giulia
2016-06-01
We present a next-to-next-to-leading order accurate description of associated HW production consistently matched to a parton shower. The method is based on reweighting events obtained with the HW plus one jet NLO accurate calculation implemented in POWHEG, extended with the MiNLO procedure, to reproduce NNLO accurate Born distributions. Since the Born kinematics is more complex than the cases treated before, we use a parametrization of the Collins-Soper angles to reduce the number of variables required for the reweighting. We present phenomenological results at 13 TeV, with cuts suggested by the Higgs Cross section Working Group.
Popularity versus similarity in growing networks.
Papadopoulos, Fragkiskos; Kitsak, Maksim; Serrano, M Ángeles; Boguñá, Marián; Krioukov, Dmitri
2012-09-27
The principle that 'popularity is attractive' underlies preferential attachment, which is a common explanation for the emergence of scaling in growing networks. If new connections are made preferentially to more popular nodes, then the resulting distribution of the number of connections possessed by nodes follows power laws, as observed in many real networks. Preferential attachment has been directly validated for some real networks (including the Internet), and can be a consequence of different underlying processes based on node fitness, ranking, optimization, random walks or duplication. Here we show that popularity is just one dimension of attractiveness; another dimension is similarity. We develop a framework in which new connections optimize certain trade-offs between popularity and similarity, instead of simply preferring popular nodes. The framework has a geometric interpretation in which popularity preference emerges from local optimization. As opposed to preferential attachment, our optimization framework accurately describes the large-scale evolution of technological (the Internet), social (trust relationships between people) and biological (Escherichia coli metabolic) networks, predicting the probability of new links with high precision. The framework that we have developed can thus be used for predicting new links in evolving networks, and provides a different perspective on preferential attachment as an emergent phenomenon.
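A toy sketch of a popularity versus similarity trade-off in a growing network follows. The specific rule used here (each new node links to the m existing nodes that minimise birth order multiplied by angular distance on a circle), along with all function and parameter names, is an assumption chosen for illustration; the abstract does not state the exact optimisation.

```python
import math
import random


def angular_distance(a: float, b: float) -> float:
    """Shortest angular separation between two angles, in [0, pi]."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def grow_network(n: int, m: int, seed: int = 0):
    """Grow a network of n nodes, each linking to at most m earlier nodes.

    Birth order s (older = smaller s) stands in for popularity, and angular
    distance stands in for dissimilarity. Both proxies are assumptions.
    """
    rng = random.Random(seed)
    angles = []  # angles[s - 1] is the angle of node s
    edges = []   # (older node, newer node) pairs
    for t in range(1, n + 1):
        theta = rng.uniform(0, 2 * math.pi)
        # Rank existing nodes by the trade-off: old and similar nodes come first.
        ranked = sorted(
            range(1, t),
            key=lambda s: s * angular_distance(angles[s - 1], theta),
        )
        edges.extend((s, t) for s in ranked[:m])
        angles.append(theta)
    return angles, edges
```

Setting every angular distance to the same constant would reduce the ranking to age alone, which is one way to see popularity as a single dimension of the trade-off.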
Hiding and finding: the relationship between visual concealment and visual search.
Smilek, Daniel; Weinheimer, Laura; Kwan, Donna; Reynolds, Mike; Kingstone, Alan
2009-11-01
As an initial step toward developing a theory of visual concealment, we assessed whether people would use factors known to influence visual search difficulty when the degree of concealment of objects among distractors was varied. In Experiment 1, participants arranged search objects (shapes, emotional faces, and graphemes) to create displays in which the targets were in plain sight but were either easy or hard to find. Analyses of easy and hard displays created during Experiment 1 revealed that the participants reliably used factors known to influence search difficulty (e.g., eccentricity, target-distractor similarity, presence/absence of a feature) to vary the difficulty of search across displays. In Experiment 2, a new participant group searched for the targets in the displays created by the participants in Experiment 1. Results indicated that search was more difficult in the hard than in the easy condition. In Experiments 3 and 4, participants used presence versus absence of a feature to vary search difficulty with several novel stimulus sets. Taken together, the results reveal a close link between the factors that govern concealment and the factors known to influence search difficulty, suggesting that a visual search theory can be extended to form the basis of a theory of visual concealment.
Crescendo: A Protein Sequence Database Search Engine for Tandem Mass Spectra.
Wang, Jianqi; Zhang, Yajie; Yu, Yonghao
2015-07-01
A search engine that reliably discovers more peptides is essential to progress in computational proteomics. We propose two new scoring functions (L- and P-scores), which aim to capture similar characteristics of a peptide-spectrum match (PSM) as Sequest and Comet do. Crescendo, introduced here, is a software program that implements these two scores for peptide identification. We applied Crescendo to test datasets and compared its performance with widely used search engines, including Mascot, Sequest, and Comet. The results indicate that Crescendo identifies a similar or larger number of peptides at various predefined false discovery rates (FDR). Importantly, it also provides a better separation between the true and decoy PSMs, warranting the future development of a companion post-processing filtering algorithm.
Parallel seed-based approach to multiple protein structure similarities detection
Chapuis, Guillaume; Le Boudic-Jamin, Mathilde; Andonov, Rumen; ...
2015-01-01
Finding similarities between protein structures is a crucial task in molecular biology. Most of the existing tools require proteins to be aligned in an order-preserving way and only find single alignments even when multiple similar regions exist. We propose a new seed-based approach that discovers multiple pairs of similar regions. Its computational complexity is polynomial and it comes with a quality guarantee: the returned alignments have both root mean squared deviations (coordinate-based as well as internal-distances based) lower than a given threshold, if such exist. We do not require the alignments to be order preserving (i.e., we consider nonsequential alignments), which makes our algorithm suitable for detecting similar domains when comparing multidomain proteins as well as for detecting structural repetitions within a single protein. Because the search space for nonsequential alignments is much larger than for sequential ones, the computational burden is addressed by extensive use of parallel computing techniques: a coarse-grain level parallelism making use of available CPU cores for computation and a fine-grain level parallelism exploiting bit-level concurrency as well as vector instructions.
An Accurate Temperature Correction Model for Thermocouple Hygrometers
Savage, Michael J.; Cass, Alfred; de Jager, James M.
1982-01-01
Numerous water relation studies have used thermocouple hygrometers routinely. However, the accurate temperature correction of hygrometer calibration curve slopes seems to have been largely neglected in both psychrometric and dewpoint techniques. In the case of thermocouple psychrometers, two temperature correction models are proposed, each based on measurement of the thermojunction radius and calculation of the theoretical voltage sensitivity to changes in water potential. The first model relies on calibration at a single temperature and the second at two temperatures. Both these models were more accurate than the temperature correction models currently in use for four psychrometers calibrated over a range of temperatures (15-38°C). The model based on calibration at two temperatures is superior to that based on only one calibration. The model proposed for dewpoint hygrometers is similar to that for psychrometers. It is based on the theoretical voltage sensitivity to changes in water potential. Comparison with empirical data from three dewpoint hygrometers calibrated at four different temperatures indicates that these instruments need only be calibrated at, e.g. 25°C, if the calibration slopes are corrected for temperature. PMID:16662241
D-score: a search engine independent MD-score.
Vaudel, Marc; Breiter, Daniela; Beck, Florian; Rahnenführer, Jörg; Martens, Lennart; Zahedi, René P
2013-03-01
While peptides carrying PTMs are routinely identified in gel-free MS, the localization of the PTMs onto the peptide sequences remains challenging. Search engine scores of secondary peptide matches have been used in different approaches in order to infer the quality of site inference, by penalizing the localization whenever the search engine similarly scored two candidate peptides with different site assignments. In the present work, we show how the estimation of posterior error probabilities for peptide candidates allows the estimation of a PTM score called the D-score, for multiple search engine studies. We demonstrate the applicability of this score to three popular search engines: Mascot, OMSSA, and X!Tandem, and evaluate its performance using an already published high resolution data set of synthetic phosphopeptides. For those peptides with phosphorylation site inference uncertainty, the number of spectrum matches with correctly localized phosphorylation increased by up to 25.7% when compared to using Mascot alone, although the actual increase depended on the fragmentation method used. Since this method relies only on search engine scores, it can be readily applied to the scoring of the localization of virtually any modification at no additional experimental or in silico cost. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Predicting Airport Screening Officers' Visual Search Competency With a Rapid Assessment.
Mitroff, Stephen R; Ericson, Justin M; Sharpe, Benjamin
2018-03-01
Objective: The study's objective was to assess a new personnel selection and assessment tool for aviation security screeners. A mobile app was modified to create a tool, and the question was whether it could predict professional screeners' on-job performance. Background: A variety of professions (airport security, radiology, the military, etc.) rely on visual search performance, that is, being able to detect targets. Given the importance of such professions, it is necessary to maximize performance, and one means to do so is to select individuals who excel at visual search. A critical question is whether it is possible to predict search competency within a professional search environment. Method: Professional searchers from the USA Transportation Security Administration (TSA) completed a rapid assessment on a tablet-based X-ray simulator (XRAY Screener, derived from the mobile technology app Airport Scanner; Kedlin Company). The assessment contained 72 trials that were simulated X-ray images of bags. Participants searched for prohibited items and tapped on them with their finger. Results: Performance on the assessment significantly related to on-job performance measures for the TSA officers such that those who were better XRAY Screener performers were both more accurate and faster at the actual airport checkpoint. Conclusion: XRAY Screener successfully predicted on-job performance for professional aviation security officers. While questions remain about the underlying cognitive mechanisms, this quick assessment was found to significantly predict on-job success for a task that relies on visual search performance. Application: It may be possible to quickly assess an individual's visual search competency, which could help organizations select new hires and assess their current workforce.
Search Strategy to Identify Dental Survival Analysis Articles Indexed in MEDLINE.
Layton, Danielle M; Clarke, Michael
2016-01-01
Articles reporting survival outcomes (time-to-event outcomes) in patients over time are challenging to identify in the literature. Research shows the words authors use to describe their dental survival analyses vary, and that allocation of medical subject headings by MEDLINE indexers is inconsistent. Together, this undermines accurate article identification. The present study aims to develop and validate a search strategy to identify dental survival analyses indexed in MEDLINE (Ovid). A gold standard cohort of articles was identified to derive the search terms, and an independent gold standard cohort of articles was identified to test and validate the proposed search strategies. The first cohort included all 6,955 articles published in the 50 dental journals with the highest impact factors in 2008, of which 95 articles were dental survival articles. The second cohort included all 6,514 articles published in the 50 dental journals with the highest impact factors for 2012, of which 148 were dental survival articles. Each cohort was identified by a systematic hand search. Performance parameters of sensitivity, precision, and number needed to read (NNR) for the search strategies were calculated. Sensitive, precise, and optimized search strategies were developed and validated. The performances of the search strategy maximizing sensitivity were 92% sensitivity, 14% precision, and 7.11 NNR; the performances of the strategy maximizing precision were 93% precision, 10% sensitivity, and 1.07 NNR; and the performances of the strategy optimizing the balance between sensitivity and precision were 83% sensitivity, 24% precision, and 4.13 NNR. The methods used to identify search terms were objective, not subjective. The search strategies were validated in an independent group of articles that included different journals and different publication years. Across the three search strategies, dental survival articles can be identified with sensitivity up to 92%, precision up to 93
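The performance parameters named in this abstract can be computed as in the sketch below. It assumes the usual definitions: sensitivity is the fraction of gold-standard articles retrieved, precision is the fraction of retrieved articles that are in the gold standard, and number needed to read (NNR) is the reciprocal of precision. The abstract does not give the formulas, although its reported figures are consistent with NNR = 1/precision (for example, 1/0.93 is about 1.07). The function name and the example identifiers are hypothetical.

```python
def search_metrics(retrieved: set, relevant: set) -> dict:
    """Sensitivity, precision, and NNR of a search against a gold standard.

    retrieved: identifiers returned by the search strategy.
    relevant:  identifiers in the hand-searched gold standard.
    """
    if not retrieved or not relevant:
        raise ValueError("both sets must be non-empty")
    hits = len(retrieved & relevant)
    sensitivity = hits / len(relevant)
    precision = hits / len(retrieved)
    # NNR is undefined when nothing relevant is retrieved; report infinity.
    nnr = 1 / precision if precision > 0 else float("inf")
    return {"sensitivity": sensitivity, "precision": precision, "nnr": nnr}


# Made-up identifiers: 4 gold-standard articles, 6 retrieved, 3 in common.
example = search_metrics({1, 2, 3, 10, 11, 12}, {1, 2, 3, 4})
```

In this made-up example the search finds three of four relevant articles (sensitivity 0.75) among six retrieved (precision 0.5), so a reader would screen two records on average per relevant article found.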
FlavonoidSearch: A system for comprehensive flavonoid annotation by mass spectrometry.
Akimoto, Nayumi; Ara, Takeshi; Nakajima, Daisuke; Suda, Kunihiro; Ikeda, Chiaki; Takahashi, Shingo; Muneto, Reiko; Yamada, Manabu; Suzuki, Hideyuki; Shibata, Daisuke; Sakurai, Nozomu
2017-04-28
Currently, in mass spectrometry-based metabolomics, limited reference mass spectra are available for flavonoid identification. In the present study, a database of probable mass fragments for 6,867 known flavonoids (FsDatabase) was manually constructed based on new structure- and fragmentation-related rules using new heuristics to overcome flavonoid complexity. We developed the FlavonoidSearch system for flavonoid annotation, which consists of the FsDatabase and a computational tool (FsTool) to automatically search the FsDatabase using the mass spectra of metabolite peaks as queries. This system showed the highest identification accuracy for the flavonoid aglycone when compared to existing tools and revealed accurate discrimination between the flavonoid aglycone and other compounds. Sixteen new flavonoids were found from parsley, and the diversity of the flavonoid aglycone among different fruits and vegetables was investigated.
End-user search behaviors and their relationship to search effectiveness.
Wildemuth, B M; Moore, M E
1995-01-01
One hundred sixty-one MEDLINE searches conducted by third-year medical students were analyzed and evaluated to determine which search moves were used, whether those individual moves were effective, and whether there was a relationship between specific search behaviors and the effectiveness of the search strategy as a whole. The typical search included fourteen search statements, used seven terms or "limit" commands, and resulted in the display of eleven citations. The most common moves were selection of a database, entering single-word terms and free-text term phrases, and combining sets of terms. Syntactic errors were also common. Overall, librarians judged the searches to be adequate, and students were quite satisfied with their own searches. However, librarians also identified many missed opportunities in the search strategies, including underutilization of the controlled vocabulary, subheadings, and synonyms for search concepts. No strong relationships were found between specific search behaviors and search effectiveness (as measured by the librarians' or students' evaluations). Implications of these findings for system design and user education are discussed. PMID:7581185
Similar herpes zoster incidence across Europe: results from a systematic literature review.
Pinchinat, Sybil; Cebrián-Cuenca, Ana M; Bricout, Hélène; Johnson, Robert W
2013-04-10
Herpes zoster (HZ) is caused by reactivation of the varicella-zoster virus (VZV) and mainly affects individuals aged ≥50 years. The forthcoming European launch of a vaccine against HZ (Zostavax®) prompts the need for a better understanding of the epidemiology of HZ in Europe. Therefore the aim of this systematic review was to summarize the available data on HZ incidence in Europe and to describe age-specific incidence. The Medline database of the National Library of Medicine was used to conduct a comprehensive literature search of population-based studies of HZ incidence published between 1960 and 2010 carried out in the 27 member countries of the European Union, Iceland, Norway and Switzerland. The identified articles were reviewed and scored according to a reading grid including various quality criteria, and HZ incidence data were extracted and presented by country. The search identified 21 studies, and revealed a similar annual HZ incidence throughout Europe, varying by country from 2.0 to 4.6/1 000 person-years with no clearly observed geographic trend. Despite the fact that age groups differed from one study to another, age-specific HZ incidence rates seemed to hold steady during the review period, at around 1/1 000 children <10 years, around 2/1 000 adults aged <40 years, and around 1-4/1 000 adults aged 40-50 years. They then increased rapidly after age 50 years to around 7-8/1 000, up to 10/1 000 after 80 years of age. Our review confirms that in Europe HZ incidence increases with age, and quite drastically after 50 years of age. In all of the 21 studies included in the present review, incidence rates were higher among women than men, and this difference increased with age. This review also highlights the need to identify standardized surveillance methods to improve the comparability of data within European Union Member States and to monitor the impact of VZV immunization on the epidemiology of HZ. Available data in Europe have shortcomings which
NASA Astrophysics Data System (ADS)
Abbott, B. P.; Abbott, R.; Abbott, T. D.; Abernathy, M. R.; Acernese, F.; Ackley, K.; Adams, C.; Adams, T.; Addesso, P.; Adhikari, R. X.; Adya, V. B.; Affeldt, C.; Agathos, M.; Agatsuma, K.; Aggarwal, N.; Aguiar, O. D.; Aiello, L.; Ain, A.; Ajith, P.; Allen, B.; Allocca, A.; Altin, P. A.; Anderson, S. B.; Anderson, W. G.; Arai, K.; Araya, M. C.; Arceneaux, C. C.; Areeda, J. S.; Arnaud, N.; Arun, K. G.; Ascenzi, S.; Ashton, G.; Ast, M.; Aston, S. M.; Astone, P.; Aufmuth, P.; Aulbert, C.; Babak, S.; Bacon, P.; Bader, M. K. M.; Baker, P. T.; Baldaccini, F.; Ballardin, G.; Ballmer, S. W.; Barayoga, J. C.; Barclay, S. E.; Barish, B. C.; Barker, D.; Barone, F.; Barr, B.; Barsotti, L.; Barsuglia, M.; Barta, D.; Bartlett, J.; Bartos, I.; Bassiri, R.; Basti, A.; Batch, J. C.; Baune, C.; Bavigadda, V.; Bazzan, M.; Behnke, B.; Bejger, M.; Bell, A. S.; Bell, C. J.; Berger, B. K.; Bergman, J.; Bergmann, G.; Berry, C. P. L.; Bersanetti, D.; Bertolini, A.; Betzwieser, J.; Bhagwat, S.; Bhandare, R.; Bilenko, I. A.; Billingsley, G.; Birch, J.; Birney, R.; Biscans, S.; Bisht, A.; Bitossi, M.; Biwer, C.; Bizouard, M. A.; Blackburn, J. K.; Blair, C. D.; Blair, D. G.; Blair, R. M.; Bloemen, S.; Bock, O.; Bodiya, T. P.; Boer, M.; Bogaert, G.; Bogan, C.; Bohe, A.; Bojtos, P.; Bond, C.; Bondu, F.; Bonnand, R.; Boom, B. A.; Bork, R.; Boschi, V.; Bose, S.; Bouffanais, Y.; Bozzi, A.; Bradaschia, C.; Brady, P. R.; Braginsky, V. B.; Branchesi, M.; Brau, J. E.; Briant, T.; Brillet, A.; Brinkmann, M.; Brisson, V.; Brockill, P.; Brooks, A. F.; Brown, D. A.; Brown, D. D.; Brown, N. M.; Buchanan, C. C.; Buikema, A.; Bulik, T.; Bulten, H. J.; Buonanno, A.; Buskulic, D.; Buy, C.; Byer, R. L.; Cadonati, L.; Cagnoli, G.; Cahillane, C.; Calderón Bustillo, J.; Callister, T.; Calloni, E.; Camp, J. B.; Cannon, K. C.; Cao, J.; Capano, C. D.; Capocasa, E.; Carbognani, F.; Caride, S.; Casanueva Diaz, J.; Casentini, C.; Caudill, S.; Cavaglià, M.; Cavalier, F.; Cavalieri, R.; Cella, G.; Cepeda, C. 
B.; Cerboni Baiardi, L.; Cerretani, G.; Cesarini, E.; Chakraborty, R.; Chalermsongsak, T.; Chamberlin, S. J.; Chan, M.; Chao, S.; Charlton, P.; Chassande-Mottin, E.; Chen, H. Y.; Chen, Y.; Cheng, C.; Chincarini, A.; Chiummo, A.; Cho, H. S.; Cho, M.; Chow, J. H.; Christensen, N.; Chu, Q.; Chua, S.; Chung, S.; Ciani, G.; Clara, F.; Clark, J. A.; Cleva, F.; Coccia, E.; Cohadon, P.-F.; Colla, A.; Collette, C. G.; Cominsky, L.; Constancio, M.; Conte, A.; Conti, L.; Cook, D.; Corbitt, T. R.; Cornish, N.; Corsi, A.; Cortese, S.; Costa, C. A.; Coughlin, M. W.; Coughlin, S. B.; Coulon, J.-P.; Countryman, S. T.; Couvares, P.; Coward, D. M.; Cowart, M. J.; Coyne, D. C.; Coyne, R.; Craig, K.; Creighton, J. D. E.; Cripe, J.; Crowder, S. G.; Cumming, A.; Cunningham, L.; Cuoco, E.; Dal Canton, T.; Danilishin, S. L.; D'Antonio, S.; Danzmann, K.; Darman, N. S.; Dattilo, V.; Dave, I.; Daveloza, H. P.; Davier, M.; Davies, G. S.; Daw, E. J.; Day, R.; DeBra, D.; Debreczeni, G.; Degallaix, J.; De Laurentis, M.; Deléglise, S.; Del Pozzo, W.; Denker, T.; Dent, T.; Dergachev, V.; De Rosa, R.; DeRosa, R. T.; DeSalvo, R.; Dhurandhar, S.; Díaz, M. C.; Di Fiore, L.; Di Giovanni, M.; Di Girolamo, T.; Di Lieto, A.; Di Pace, S.; Di Palma, I.; Di Virgilio, A.; Dojcinoski, G.; Dolique, V.; Donovan, F.; Dooley, K. L.; Doravari, S.; Douglas, R.; Downes, T. P.; Drago, M.; Drever, R. W. P.; Driggers, J. C.; Du, Z.; Ducrot, M.; Dwyer, S. E.; Edo, T. B.; Edwards, M. C.; Effler, A.; Eggenstein, H.-B.; Ehrens, P.; Eichholz, J.; Eikenberry, S. S.; Engels, W.; Essick, R. C.; Etzel, T.; Evans, M.; Evans, T. M.; Everett, R.; Factourovich, M.; Fafone, V.; Fair, H.; Fairhurst, S.; Fan, X.; Fang, Q.; Farinon, S.; Farr, B.; Farr, W. M.; Favata, M.; Fays, M.; Fehrmann, H.; Fejer, M. M.; Ferrante, I.; Ferreira, E. C.; Ferrini, F.; Fidecaro, F.; Fiori, I.; Fiorucci, D.; Fisher, R. P.; Flaminio, R.; Fletcher, M.; Fournier, J.-D.; Frasca, S.; Frasconi, F.; Frei, Z.; Freise, A.; Frey, R.; Frey, V.; Fricke, T. 
T.; Fritschel, P.; Frolov, V. V.; Fulda, P.; Fyffe, M.; Gabbard, H. A. G.; Gair, J. R.; Gammaitoni, L.; Gaonkar, S. G.; Garufi, F.; Gaur, G.; Gehrels, N.; Gemme, G.; Genin, E.; Gennai, A.; George, J.; Gergely, L.; Germain, V.; Ghosh, Archisman; Ghosh, S.; Giaime, J. A.; Giardina, K. D.; Giazotto, A.; Gill, K.; Glaefke, A.; Goetz, E.; Goetz, R.; Gondan, L.; González, G.; Gonzalez Castro, J. M.; Gopakumar, A.; Gordon, N. A.; Gorodetsky, M. L.; Gossan, S. E.; Gosselin, M.; Gouaty, R.; Grado, A.; Graef, C.; Graff, P. B.; Granata, M.; Grant, A.; Gras, S.; Gray, C.; Greco, G.; Green, A. C.; Groot, P.; Grote, H.; Grunewald, S.; Guidi, G. M.; Guo, X.; Gupta, A.; Gupta, M. K.; Gushwa, K. E.; Gustafson, E. K.; Gustafson, R.; Hacker, J. J.; Hall, B. R.; Hall, E. D.; Hammond, G.; Haney, M.; Hanke, M. M.; Hanks, J.; Hanna, C.; Hannam, M. D.; Hanson, J.; Hardwick, T.; Harms, J.; Harry, G. M.; Harry, I. W.; Hart, M. J.; Hartman, M. T.; Haster, C.-J.; Haughian, K.; Heidmann, A.; Heintze, M. C.; Heitmann, H.; Hello, P.; Hemming, G.; Hendry, M.; Heng, I. S.; Hennig, J.; Heptonstall, A. W.; Heurs, M.; Hild, S.; Hoak, D.; Hodge, K. A.; Hofman, D.; Hollitt, S. E.; Holt, K.; Holz, D. E.; Hopkins, P.; Hosken, D. J.; Hough, J.; Houston, E. A.; Howell, E. J.; Hu, Y. M.; Huang, S.; Huerta, E. A.; Huet, D.; Hughey, B.; Husa, S.; Huttner, S. H.; Huynh-Dinh, T.; Idrisy, A.; Indik, N.; Ingram, D. R.; Inta, R.; Isa, H. N.; Isac, J.-M.; Isi, M.; Islas, G.; Isogai, T.; Iyer, B. R.; Izumi, K.; Jacqmin, T.; Jang, H.; Jani, K.; Jaranowski, P.; Jawahar, S.; Jiménez-Forteza, F.; Johnson, W. W.; Jones, D. I.; Jones, R.; Jonker, R. J. G.; Ju, L.; Haris, K.; Kalaghatgi, C. V.; Kalogera, V.; Kandhasamy, S.; Kang, G.; Kanner, J. B.; Karki, S.; Kasprzack, M.; Katsavounidis, E.; Katzman, W.; Kaufer, S.; Kaur, T.; Kawabe, K.; Kawazoe, F.; Kéfélian, F.; Kehl, M. S.; Keitel, D.; Kelley, D. B.; Kells, W.; Kennedy, R.; Key, J. S.; Khalaidovski, A.; Khalili, F. Y.; Khan, I.; Khan, S.; Khan, Z.; Khazanov, E. 
A.; Kijbunchoo, N.; Kim, Chunglee; Kim, J.; Kim, K.; Kim, Nam-Gyu; Kim, Namjun; Kim, Y.-M.; King, E. J.; King, P. J.; Kinzel, D. L.; Kissel, J. S.; Kleybolte, L.; Klimenko, S.; Koehlenbeck, S. M.; Kokeyama, K.; Koley, S.; Kondrashov, V.; Kontos, A.; Korobko, M.; Korth, W. Z.; Kowalska, I.; Kozak, D. B.; Kringel, V.; Królak, A.; Krueger, C.; Kuehn, G.; Kumar, P.; Kuo, L.; Kutynia, A.; Lackey, B. D.; Landry, M.; Lange, J.; Lantz, B.; Lasky, P. D.; Lazzarini, A.; Lazzaro, C.; Leaci, P.; Leavey, S.; Lebigot, E. O.; Lee, C. H.; Lee, H. K.; Lee, H. M.; Lee, K.; Lenon, A.; Leonardi, M.; Leong, J. R.; Leroy, N.; Letendre, N.; Levin, Y.; Levine, B. M.; Li, T. G. F.; Libson, A.; Littenberg, T. B.; Lockerbie, N. A.; Logue, J.; Lombardi, A. L.; Lord, J. E.; Lorenzini, M.; Loriette, V.; Lormand, M.; Losurdo, G.; Lough, J. D.; Lück, H.; Lundgren, A. P.; Luo, J.; Lynch, R.; Ma, Y.; MacDonald, T.; Machenschalk, B.; MacInnis, M.; Macleod, D. M.; Magaña-Sandoval, F.; Magee, R. M.; Mageswaran, M.; Majorana, E.; Maksimovic, I.; Malvezzi, V.; Man, N.; Mandic, V.; Mangano, V.; Mansell, G. L.; Manske, M.; Mantovani, M.; Marchesoni, F.; Marion, F.; Márka, S.; Márka, Z.; Markosyan, A. S.; Maros, E.; Martelli, F.; Martellini, L.; Martin, I. W.; Martin, R. M.; Martynov, D. V.; Marx, J. N.; Mason, K.; Masserot, A.; Massinger, T. J.; Masso-Reid, M.; Mastrogiovanni, S.; Matichard, F.; Matone, L.; Mavalvala, N.; Mazumder, N.; Mazzolo, G.; McCarthy, R.; McClelland, D. E.; McCormick, S.; McGuire, S. C.; McIntyre, G.; McIver, J.; McManus, D. J.; McWilliams, S. T.; Meacher, D.; Meadors, G. D.; Meidam, J.; Melatos, A.; Mendell, G.; Mendoza-Gandara, D.; Mercer, R. A.; Merilh, E. L.; Merzougui, M.; Meshkov, S.; Messenger, C.; Messick, C.; Metzdorff, R.; Meyers, P. M.; Mezzani, F.; Miao, H.; Michel, C.; Middleton, H.; Mikhailov, E. E.; Milano, L.; Miller, A. L.; Miller, J.; Millhouse, M.; Minenkov, Y.; Ming, J.; Mirshekari, S.; Mishra, C.; Mitra, S.; Mitrofanov, V. 
P.; Mitselmakher, G.; Mittleman, R.; Moggi, A.; Mohan, M.; Mohapatra, S. R. P.; Montani, M.; Moore, B. C.; Moore, C. J.; Moraru, D.; Moreno, G.; Morriss, S. R.; Mossavi, K.; Mours, B.; Mow-Lowry, C. M.; Mueller, C. L.; Mueller, G.; Muir, A. W.; Mukherjee, Arunava; Mukherjee, D.; Mukherjee, S.; Mukund, K. N.; Mullavey, A.; Munch, J.; Murphy, D. J.; Murray, P. G.; Mytidis, A.; Nardecchia, I.; Naticchioni, L.; Nayak, R. K.; Necula, V.; Nedkova, K.; Nelemans, G.; Neri, M.; Neunzert, A.; Newton, G.; Nguyen, T. T.; Nielsen, A. B.; Nissanke, S.; Nitz, A.; Nocera, F.; Nolting, D.; Normandin, M. E. N.; Nuttall, L. K.; Oberling, J.; Ochsner, E.; O'Dell, J.; Oelker, E.; Ogin, G. H.; Oh, J. J.; Oh, S. H.; Ohme, F.; Oliver, M.; Oppermann, P.; Oram, Richard J.; O'Reilly, B.; O'Shaughnessy, R.; Ott, C. D.; Ottaway, D. J.; Ottens, R. S.; Overmier, H.; Owen, B. J.; Pai, A.; Pai, S. A.; Palamos, J. R.; Palashov, O.; Palomba, C.; Pal-Singh, A.; Pan, H.; Pankow, C.; Pannarale, F.; Pant, B. C.; Paoletti, F.; Paoli, A.; Papa, M. A.; Paris, H. R.; Parker, W.; Pascucci, D.; Pasqualetti, A.; Passaquieti, R.; Passuello, D.; Patricelli, B.; Patrick, Z.; Pearlstone, B. L.; Pedraza, M.; Pedurand, R.; Pekowsky, L.; Pele, A.; Penn, S.; Pereira, R.; Perreca, A.; Phelps, M.; Piccinni, O. J.; Pichot, M.; Piergiovanni, F.; Pierro, V.; Pillant, G.; Pinard, L.; Pinto, I. M.; Pitkin, M.; Pletsch, H. J.; Poggiani, R.; Popolizio, P.; Post, A.; Powell, J.; Prasad, J.; Predoi, V.; Premachandra, S. S.; Prestegard, T.; Price, L. R.; Prijatelj, M.; Principe, M.; Privitera, S.; Prodi, G. A.; Prokhorov, L.; Puncken, O.; Punturo, M.; Puppo, P.; Pürrer, M.; Qi, H.; Qin, J.; Quetschke, V.; Quintero, E. A.; Quitzow-James, R.; Raab, F. J.; Rabeling, D. S.; Radkins, H.; Raffai, P.; Raja, S.; Rakhmanov, M.; Rapagnani, P.; Raymond, V.; Razzano, M.; Re, V.; Read, J.; Reed, C. M.; Regimbau, T.; Rei, L.; Reid, S.; Reitze, D. H.; Rew, H.; Ricci, F.; Riles, K.; Robertson, N. 
A.; Robie, R.; Robinet, F.; Rocchi, A.; Rolland, L.; Rollins, J. G.; Roma, V. J.; Romano, J. D.; Romano, R.; Romanov, G.; Romie, J. H.; Rosińska, D.; Rowan, S.; Rüdiger, A.; Ruggi, P.; Ryan, K.; Sachdev, S.; Sadecki, T.; Sadeghian, L.; Salconi, L.; Saleem, M.; Salemi, F.; Samajdar, A.; Sammut, L.; Sanchez, E. J.; Sandberg, V.; Sandeen, B.; Sanders, J. R.; Sassolas, B.; Sathyaprakash, B. S.; Saulson, P. R.; Sauter, O. E. S.; Savage, R. L.; Sawadsky, A.; Schale, P.; Schilling, R.; Schmidt, J.; Schmidt, P.; Schnabel, R.; Schofield, R. M. S.; Schönbeck, A.; Schreiber, E.; Schuette, D.; Schutz, B. F.; Scott, J.; Scott, S. M.; Sellers, D.; Sentenac, D.; Sequino, V.; Sergeev, A.; Serna, G.; Setyawati, Y.; Sevigny, A.; Shaddock, D. A.; Shahriar, M. S.; Shaltev, M.; Shao, Z.; Shapiro, B.; Shawhan, P.; Sheperd, A.; Shoemaker, D. H.; Shoemaker, D. M.; Siellez, K.; Siemens, X.; Sieniawska, M.; Sigg, D.; Silva, A. D.; Simakov, D.; Singer, A.; Singer, L. P.; Singh, A.; Singh, R.; Singhal, A.; Sintes, A. M.; Slagmolen, B. J. J.; Smith, J. R.; Smith, N. D.; Smith, R. J. E.; Son, E. J.; Sorazu, B.; Sorrentino, F.; Souradeep, T.; Srivastava, A. K.; Staley, A.; Steinke, M.; Steinlechner, J.; Steinlechner, S.; Steinmeyer, D.; Stephens, B. C.; Stiles, D.; Stone, R.; Strain, K. A.; Straniero, N.; Stratta, G.; Strauss, N. A.; Strigin, S.; Sturani, R.; Stuver, A. L.; Summerscales, T. Z.; Sun, L.; Sutton, P. J.; Swinkels, B. L.; Szczepańczyk, M. J.; Tacca, M.; Talukder, D.; Tanner, D. B.; Tápai, M.; Tarabrin, S. P.; Taracchini, A.; Taylor, R.; Theeg, T.; Thirugnanasambandam, M. P.; Thomas, E. G.; Thomas, M.; Thomas, P.; Thorne, K. A.; Thrane, E.; Tiwari, S.; Tiwari, V.; Tokmakov, K. V.; Tomlinson, C.; Tonelli, M.; Torres, C. V.; Torrie, C. I.; Töyrä, D.; Travasso, F.; Traylor, G.; Trifirò, D.; Tringali, M. C.; Trozzo, L.; Tse, M.; Turconi, M.; Tuyenbayev, D.; Ugolini, D.; Unnikrishnan, C. S.; Urban, A. L.; Usman, S. 
A.; Vahlbruch, H.; Vajente, G.; Valdes, G.; van Bakel, N.; van Beuzekom, M.; van den Brand, J. F. J.; Van Den Broeck, C.; Vander-Hyde, D. C.; van der Schaaf, L.; van Heijningen, J. V.; van Veggel, A. A.; Vardaro, M.; Vass, S.; Vasúth, M.; Vaulin, R.; Vecchio, A.; Vedovato, G.; Veitch, J.; Veitch, P. J.; Venkateswara, K.; Verkindt, D.; Vetrano, F.; Viceré, A.; Vinciguerra, S.; Vine, D. J.; Vinet, J.-Y.; Vitale, S.; Vo, T.; Vocca, H.; Vorvick, C.; Voss, D. V.; Vousden, W. D.; Vyatchanin, S. P.; Wade, A. R.; Wade, L. E.; Wade, M.; Walker, M.; Wallace, L.; Walsh, S.; Wang, G.; Wang, H.; Wang, M.; Wang, X.; Wang, Y.; Ward, R. L.; Warner, J.; Was, M.; Weaver, B.; Wei, L.-W.; Weinert, M.; Weinstein, A. J.; Weiss, R.; Welborn, T.; Wen, L.; Weßels, P.; Westphal, T.; Wette, K.; Whelan, J. T.; Whitcomb, S. E.; White, D. J.; Whiting, B. F.; Williams, R. D.; Williamson, A. R.; Willis, J. L.; Willke, B.; Wimmer, M. H.; Winkler, W.; Wipf, C. C.; Wittel, H.; Woan, G.; Worden, J.; Wright, J. L.; Wu, G.; Yablon, J.; Yam, W.; Yamamoto, H.; Yancey, C. C.; Yap, M. J.; Yu, H.; Yvert, M.; ZadroŻny, A.; Zangrando, L.; Zanolin, M.; Zendri, J.-P.; Zevin, M.; Zhang, F.; Zhang, L.; Zhang, M.; Zhang, Y.; Zhao, C.; Zhou, M.; Zhou, Z.; Zhu, X. J.; Zucker, M. E.; Zuraw, S. E.; Zweizig, J.; Archibald, A. M.; Banaszak, S.; Berndsen, A.; Boyles, J.; Cardoso, R. F.; Chawla, P.; Cherry, A.; Dartez, L. P.; Day, D.; Epstein, C. R.; Ford, A. J.; Flanigan, J.; Garcia, A.; Hessels, J. W. T.; Hinojosa, J.; Jenet, F. A.; Karako-Argaman, C.; Kaspi, V. M.; Keane, E. F.; Kondratiev, V. I.; Kramer, M.; Leake, S.; Lorimer, D.; Lunsford, G.; Lynch, R. S.; Martinez, J. G.; Mata, A.; McLaughlin, M. A.; McPhee, C. A.; Penucci, T.; Ransom, S.; Roberts, M. S. E.; Rohr, M. D. W.; Stairs, I. H.; Stovall, K.; van Leeuwen, J.; Walker, A. N.; Wells, B. L.; LIGO Scientific Collaboration; Virgo Collaboration
2016-06-01
We present an archival search for transient gravitational-wave bursts in coincidence with 27 single-pulse triggers from Green Bank Telescope pulsar surveys, using the LIGO, Virgo, and GEO interferometer network. We also discuss a check for gravitational-wave signals in coincidence with Parkes fast radio bursts using similar methods. Data analyzed in these searches were collected between 2007 and 2013. Possible sources of emission of both short-duration radio signals and transient gravitational-wave emission include starquakes on neutron stars, binary coalescence of neutron stars, and cosmic string cusps. While no evidence for gravitational-wave emission in coincidence with these radio transients was found, the current analysis serves as a prototype for similar future searches using more sensitive second-generation interferometers.
Stereotypes of Age Differences in Personality Traits: Universal and Accurate?
Chan, Wayne; McCrae, Robert R.; De Fruyt, Filip; Jussim, Lee; Löckenhoff, Corinna E.; De Bolle, Marleen; Costa, Paul T.; Sutin, Angelina R.; Realo, Anu; Allik, Jüri; Nakazato, Katsuharu; Shimonaka, Yoshiko; Hřebíčková, Martina; Kourilova, Sylvie; Yik, Michelle; Ficková, Emília; Brunner-Sciarra, Marina; de Figueora, Nora Leibovich; Schmidt, Vanina; Ahn, Chang-kyu; Ahn, Hyun-nie; Aguilar-Vafaie, Maria E.; Siuta, Jerzy; Szmigielska, Barbara; Cain, Thomas R.; Crawford, Jarret T.; Mastor, Khairul Anwar; Rolland, Jean-Pierre; Nansubuga, Florence; Miramontez, Daniel R.; Benet-Martínez, Veronica; Rossier, Jérôme; Bratko, Denis; Halberstadt, Jamin; Yamaguchi, Mami; Knežević, Goran; Martin, Thomas A.; Gheorghiu, Mirona; Smith, Peter B.; Barbaranelli, Claduio; Wang, Lei; Shakespeare-Finch, Jane; Lima, Margarida P.; Klinkosz, Waldemar; Sekowski, Andrzej; Alcalay, Lidia; Simonetti, Franco; Avdeyeva, Tatyana V.; Pramila, V. S.; Terracciano, Antonio
2012-01-01
Age trajectories for personality traits are known to be similar across cultures. To address whether stereotypes of age groups reflect these age-related changes in personality, we asked participants in 26 countries (N = 3,323) to rate typical adolescents, adults, and old persons in their own country. Raters across nations tended to share similar beliefs about different age groups; adolescents were seen as impulsive, rebellious, undisciplined, preferring excitement and novelty, whereas old people were consistently considered lower on impulsivity, activity, antagonism, and Openness. These consensual age group stereotypes correlated strongly with published age differences on the five major dimensions of personality and most of 30 specific traits, using as criteria of accuracy both self-reports and observer ratings, different survey methodologies, and data from up to 50 nations. However, personal stereotypes were considerably less accurate, and consensual stereotypes tended to exaggerate differences across age groups. PMID:23088227
Path Searching Based Crease Detection for Large Scale Scanned Document Images
NASA Astrophysics Data System (ADS)
Zhang, Jifu; Li, Yi; Li, Shutao; Sun, Bin; Sun, Jun
2017-12-01
Because large documents are usually folded for preservation, creases appear in their scanned images. In this paper, a crease detection method is proposed to locate the crease pixels for further processing. In the imaging process of contactless scanners, the shading on the two sides of a crease usually differs considerably. Based on this observation, a convex hull based algorithm is adopted to extract the shading information of the scanned image. A candidate crease path is then obtained by applying a vertical filter and morphological operations to the shading image. Finally, the accurate crease is detected via Dijkstra path searching. Experimental results on a dataset of real scanned newspapers demonstrate that the proposed method can obtain accurate locations of the creases in large document images.
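The Dijkstra path searching step named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the cost grid, the function name `min_cost_vertical_path`, and the rule that a path moves one row down per step (same or adjacent column) are assumptions chosen for illustration. The idea is that pixels likely to lie on a crease get a low cost, and the cheapest top-to-bottom path is taken as the crease.

```python
import heapq


def min_cost_vertical_path(cost):
    """Dijkstra search for the cheapest top-to-bottom path in a 2D grid.

    cost[r][c] is a non-negative per-pixel cost (hypothetical: low where
    a crease is likely). A step moves to the next row, staying in the
    same column or shifting by one column.
    Returns (total_cost, [(row, col), ...]).
    """
    rows, cols = len(cost), len(cost[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    prev = {}
    heap = []
    # Every pixel in the top row is a possible start.
    for c in range(cols):
        dist[0][c] = cost[0][c]
        heapq.heappush(heap, (cost[0][c], 0, c))
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale queue entry
        if r == rows - 1:
            # The first bottom-row pixel popped has the minimal total cost.
            path = [(r, c)]
            while (r, c) in prev:
                r, c = prev[(r, c)]
                path.append((r, c))
            return d, path[::-1]
        for dc in (-1, 0, 1):
            nc = c + dc
            if 0 <= nc < cols:
                nd = d + cost[r + 1][nc]
                if nd < dist[r + 1][nc]:
                    dist[r + 1][nc] = nd
                    prev[(r + 1, nc)] = (r, c)
                    heapq.heappush(heap, (nd, r + 1, nc))
    return float("inf"), []


grid = [[5, 1, 5],
        [5, 1, 5],
        [5, 5, 1]]
total, path = min_cost_vertical_path(grid)
print(total, path)
```

In a real pipeline the cost grid would presumably be derived from the filtered shading image; how the paper builds it is not stated in the abstract.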
Region segmentation and contextual cuing in visual search.
Conci, Markus; von Mühlenen, Adrian
2009-10-01
Contextual information provides an important source for behavioral orienting. For instance, in the contextual-cuing paradigm, repetitions of the spatial layout of elements in a search display can guide attention to the target location. The present study explored how this contextual-cuing effect is influenced by the grouping of search elements. In Experiment 1, four nontarget items could be arranged collinearly to form an imaginary square. The presence of such a square eliminated the contextual-cuing effect, despite the fact that the square's location still had a predictive value for the target location. Three follow-up experiments demonstrated that other types of grouping abolished contextual cuing in a similar way and that the mere presence of a task-irrelevant singleton had only a diminishing effect (by half) on contextual cuing. These findings suggest that a segmented, salient region can interfere with contextual cuing, reducing its predictive impact on search.
The accurate assessment of small-angle X-ray scattering data
Grant, Thomas D.; Luft, Joseph R.; Carter, Lester G.; ...
2015-01-23
Small-angle X-ray scattering (SAXS) has grown in popularity in recent times with the advent of bright synchrotron X-ray sources, powerful computational resources and algorithms enabling the calculation of increasingly complex models. However, the lack of standardized data-quality metrics presents difficulties for the growing user community in accurately assessing the quality of experimental SAXS data. Here, a series of metrics to quantitatively describe SAXS data in an objective manner using statistical evaluations are defined. These metrics are applied to identify the effects of radiation damage, concentration dependence and interparticle interactions on SAXS data from a set of 27 previously described targets for which high-resolution structures have been determined via X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. Studies show that these metrics are sufficient to characterize SAXS data quality on a small sample set with statistical rigor and sensitivity similar to or better than manual analysis. The development of data-quality analysis strategies such as these initial efforts is needed to enable the accurate and unbiased assessment of SAXS data quality.
Impact of Glaucoma and Dry Eye on Text-Based Searching.
Sun, Michelle J; Rubin, Gary S; Akpek, Esen K; Ramulu, Pradeep Y
2017-06-01
We determine if visual field loss from glaucoma and/or measures of dry eye severity are associated with difficulty searching, as judged by slower search times on a text-based search task. Glaucoma patients with bilateral visual field (VF) loss, patients with clinically significant dry eye, and normally-sighted controls were enrolled from the Wilmer Eye Institute clinics. Subjects searched three Yellow Pages excerpts for a specific phone number, and search time was recorded. A total of 50 glaucoma subjects, 40 dry eye subjects, and 45 controls completed study procedures. On average, glaucoma patients exhibited 57% longer search times compared to controls (95% confidence interval [CI], 26%-96%, P < 0.001), and longer search times were noted among subjects with greater VF loss (P < 0.001), worse contrast sensitivity (P < 0.001), and worse visual acuity (P = 0.026). Dry eye subjects demonstrated similar search times compared to controls, though worse Ocular Surface Disease Index (OSDI) vision-related subscores were associated with longer search times (P < 0.01). Search times showed no association with OSDI symptom subscores (P = 0.20) or objective measures of dry eye (P > 0.08 for Schirmer's testing without anesthesia, corneal fluorescein staining, and tear film breakup time). Text-based visual search is slower for glaucoma patients with greater levels of VF loss and dry eye patients with greater self-reported visual difficulty, and these difficulties may contribute to decreased quality of life in these groups. Visual search is impaired in glaucoma and dry eye groups compared to controls, highlighting the need for compensatory strategies and tools to assist individuals in overcoming their deficiencies.
Impact of Glaucoma and Dry Eye on Text-Based Searching
Sun, Michelle J.; Rubin, Gary S.; Akpek, Esen K.; Ramulu, Pradeep Y.
2017-01-01
Purpose We determine if visual field loss from glaucoma and/or measures of dry eye severity are associated with difficulty searching, as judged by slower search times on a text-based search task. Methods Glaucoma patients with bilateral visual field (VF) loss, patients with clinically significant dry eye, and normally-sighted controls were enrolled from the Wilmer Eye Institute clinics. Subjects searched three Yellow Pages excerpts for a specific phone number, and search time was recorded. Results A total of 50 glaucoma subjects, 40 dry eye subjects, and 45 controls completed study procedures. On average, glaucoma patients exhibited 57% longer search times compared to controls (95% confidence interval [CI], 26%–96%, P < 0.001), and longer search times were noted among subjects with greater VF loss (P < 0.001), worse contrast sensitivity (P < 0.001), and worse visual acuity (P = 0.026). Dry eye subjects demonstrated similar search times compared to controls, though worse Ocular Surface Disease Index (OSDI) vision-related subscores were associated with longer search times (P < 0.01). Search times showed no association with OSDI symptom subscores (P = 0.20) or objective measures of dry eye (P > 0.08 for Schirmer's testing without anesthesia, corneal fluorescein staining, and tear film breakup time). Conclusions Text-based visual search is slower for glaucoma patients with greater levels of VF loss and dry eye patients with greater self-reported visual difficulty, and these difficulties may contribute to decreased quality of life in these groups. Translational Relevance Visual search is impaired in glaucoma and dry eye groups compared to controls, highlighting the need for compensatory strategies and tools to assist individuals in overcoming their deficiencies. PMID:28670502
Individual differences in working memory capacity and search efficiency.
Miller, Ashley L; Unsworth, Nash
2018-05-29
In two experiments, we examined how various learning conditions impact the relation between working memory capacity (WMC) and memory search abilities. Experiment 1 employed a delayed free recall task with semantically related words to induce the buildup of proactive interference (PI) and revealed that the buildup of PI differentially impacted recall accuracy and recall latency for low-WMC and high-WMC individuals. Namely, the buildup of PI impaired recall accuracy and slowed recall latency for low-WMC individuals to a greater extent than what was observed for high-WMC individuals. To provide a circumstance in which previously learned information remains relevant over the course of learning, Experiment 2 required participants to complete a multitrial delayed free recall task with unrelated words. Results revealed that with increased practice with the same word list, WMC-related differences were eventually eliminated in interresponse times (IRTs) and recall accuracy, but not recall latency. Thus, despite still accumulating larger search sets, low-WMC individuals searched LTM as efficiently as high-WMC individuals. Collectively, these results are consistent with the notion that under normal free recall conditions, low-WMC individuals search LTM less efficiently than do high-WMC individuals because of their reliance on noisy temporal-contextual cues at retrieval. However, it appears that under conditions in which previously learned items remain relevant at recall, this tendency to rely on vague self-generated retrieval cues can actually facilitate the ability to accurately and quickly recall information.
Accurate Arabic Script Language/Dialect Classification
2014-01-01
Army Research Laboratory. Accurate Arabic Script Language/Dialect Classification, by Stephen C. Tratz (Computational and Information Sciences...). ARL-TR-6761, January 2014. Final report. Approved for public...
Dermatological image search engines on the Internet: do they work?
Cutrone, M; Grimalt, R
2007-02-01
Atlases on CD-ROM first substituted the use of paediatric dermatology atlases printed on paper. This permitted a faster search and a practical comparison of differential diagnoses. The third step in the evolution of clinical atlases was the onset of the online atlas. Many doctors now use the Internet image search engines to obtain clinical images directly. The aim of this study was to test the reliability of the image search engines compared to the online atlases. We tested seven Internet image search engines with three paediatric dermatology diseases. In general, the service offered by the search engines is good, and continues to be free of charge. The coincidence between what we searched for and what we found was generally excellent, and contained no advertisements. Most Internet search engines provided similar results but some were more user friendly than others. It is not necessary to repeat the same research with Picsearch, Lycos and MSN, as the response would be the same; there is a possibility that they might share software. Image search engines are a useful, free and precise method to obtain paediatric dermatology images for teaching purposes. There is still the matter of copyright to be resolved. What are the legal uses of these 'free' images? How do we define 'teaching purposes'? New watermark methods and encrypted electronic signatures might solve these problems and answer these questions.
Can computerized tomography accurately stage childhood renal tumors?
Abdelhalim, Ahmed; Helmy, Tamer E; Harraz, Ahmed M; Abou-El-Ghar, Mohamed E; Dawaba, Mohamed E; Hafez, Ashraf T
2014-07-01
Staging of childhood renal tumors is crucial for treatment planning and outcome prediction. We sought to identify whether computerized tomography could accurately predict the local stage of childhood renal tumors. We retrospectively reviewed our database for patients diagnosed with childhood renal tumors and treated surgically between 1990 and 2013. Inability to retrieve preoperative computerized tomography, intraoperative tumor spillage and nonWilms childhood renal tumors were exclusion criteria. Local computerized tomography stage was assigned by a single experienced pediatric radiologist blinded to the pathological stage, using a consensus similar to the Children's Oncology Group Wilms tumor staging system. Tumors were stratified into up-front surgery and preoperative chemotherapy groups. The radiological stage of each tumor was compared to the pathological stage. A total of 189 tumors in 179 patients met inclusion criteria. Computerized tomography staging matched pathological staging in 68% of up-front surgery (70 of 103), 31.8% of pre-chemotherapy (21 of 66) and 48.8% of post-chemotherapy scans (42 of 86). Computerized tomography over staged 21.4%, 65.2% and 46.5% of tumors in the up-front surgery, pre-chemotherapy and post-chemotherapy scans, respectively, and under staged 10.7%, 3% and 4.7%. Computerized tomography staging was more accurate in tumors managed by up-front surgery (p <0.001) and those without extracapsular extension (p <0.001). The validity of computerized tomography staging of childhood renal tumors remains doubtful. This staging is more accurate for tumors treated with up-front surgery and those without extracapsular extension. Preoperative computerized tomography can help to exclude capsular breach. Treatment strategy should be based on surgical and pathological staging to avoid the hazards of inaccurate staging. Copyright © 2014 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.
Enhancing visual search abilities of people with intellectual disabilities.
Li-Tsang, Cecilia W P; Wong, Jackson K K
2009-01-01
This study aimed to evaluate the effects of cueing in a visual search paradigm for people with and without intellectual disabilities (ID). A total of 36 subjects (18 persons with ID and 18 persons with normal intelligence) were recruited using a convenience sampling method. A series of experiments were conducted to compare guided cue strategies using either motion contrast or an additional cue to a basic search task. Repeated measures ANOVA and post hoc multiple comparison tests were used to compare each cue strategy. Results showed that the use of guided strategies was able to capture focal attention in an autonomic manner in the ID group (Pillai's Trace=5.99, p<0.0001). Both guided cue and guided motion search tasks demonstrated functionally similar effects that confirmed the non-specific character of salience. These findings suggested that the visual search efficiency of people with ID was greatly improved if the target was made salient using a cueing effect when the complexity of the display increased (i.e. set size increased). This study could have an important implication for the design of the visual searching format of any computerized programs developed for people with ID in learning new tasks.
Zelinsky, Gregory J.; Peng, Yifan; Berg, Alexander C.; Samaras, Dimitris
2013-01-01
Search is commonly described as a repeating cycle of guidance to target-like objects, followed by the recognition of these objects as targets or distractors. Are these indeed separate processes using different visual features? We addressed this question by comparing observer behavior to that of support vector machine (SVM) models trained on guidance and recognition tasks. Observers searched for a categorically defined teddy bear target in four-object arrays. Target-absent trials consisted of random category distractors rated in their visual similarity to teddy bears. Guidance, quantified as first-fixated objects during search, was strongest for targets, followed by target-similar, medium-similarity, and target-dissimilar distractors. False positive errors to first-fixated distractors also decreased with increasing dissimilarity to the target category. To model guidance, nine teddy bear detectors, using features ranging in biological plausibility, were trained on unblurred bears then tested on blurred versions of the same objects appearing in each search display. Guidance estimates were based on target probabilities obtained from these detectors. To model recognition, nine bear/nonbear classifiers, trained and tested on unblurred objects, were used to classify the object that would be fixated first (based on the detector estimates) as a teddy bear or a distractor. Patterns of categorical guidance and recognition accuracy were modeled almost perfectly by an HMAX model in combination with a color histogram feature. We conclude that guidance and recognition in the context of search are not separate processes mediated by different features, and that what the literature knows as guidance is really recognition performed on blurred objects viewed in the visual periphery. PMID:24105460
Kann, Maricel G.; Sheetlin, Sergey L.; Park, Yonil; Bryant, Stephen H.; Spouge, John L.
2007-01-01
The sequencing of complete genomes has created a pressing need for automated annotation of gene function. Because domains are the basic units of protein function and evolution, a gene can be annotated from a domain database by aligning domains to the corresponding protein sequence. Ideally, complete domains are aligned to protein subsequences, in a ‘semi-global alignment’. Local alignment, which aligns pieces of domains to subsequences, is common in high-throughput annotation applications, however. It is a mature technique, with the heuristics and accurate E-values required for screening large databases and evaluating the screening results. Hidden Markov models (HMMs) provide an alternative theoretical framework for semi-global alignment, but their use is limited because they lack heuristic acceleration and accurate E-values. Our new tool, GLOBAL, overcomes some limitations of previous semi-global HMMs: it has accurate E-values and the possibility of the heuristic acceleration required for high-throughput applications. Moreover, according to a standard of truth based on protein structure, two semi-global HMM alignment tools (GLOBAL and HMMer) had comparable performance in identifying complete domains, but distinctly outperformed two tools based on local alignment. When searching for complete protein domains, therefore, GLOBAL avoids disadvantages commonly associated with HMMs, yet maintains their superior retrieval performance. PMID:17596268
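The distinction the abstract above draws between local and semi-global alignment can be made concrete with a small dynamic-programming sketch. This is not GLOBAL or HMMer, and it uses no HMM: it is a plain score-based alignment with made-up scores (match +1, mismatch -1, gap -1), and the function name `semi_global_score` is hypothetical. It only shows the defining constraint, namely that the whole domain must be aligned while leading and trailing parts of the protein are skipped at no cost.

```python
def semi_global_score(domain, protein, match=1, mismatch=-1, gap=-1):
    """Best score for aligning ALL of `domain` to some substring of `protein`.

    Returns (score, end), where `end` is the number of protein residues
    consumed up to the end of the aligned substring.
    Scoring values are illustrative assumptions, not taken from any tool.
    """
    m, n = len(domain), len(protein)
    # Row 0 is all zeros: skipping any protein prefix is free.
    prev = [0] * (n + 1)
    for i in range(1, m + 1):
        # Column 0: domain residues with no protein left must be gapped.
        cur = [i * gap] + [0] * n
        for j in range(1, n + 1):
            s = match if domain[i - 1] == protein[j - 1] else mismatch
            cur[j] = max(prev[j - 1] + s,   # align two residues
                         prev[j] + gap,     # gap in the protein
                         cur[j - 1] + gap)  # gap in the domain
        prev = cur
    # The protein suffix is also free: take the best cell in the last row.
    best_end = max(range(n + 1), key=lambda j: prev[j])
    return prev[best_end], best_end


print(semi_global_score("ACD", "XXACDXX"))
```

A local alignment would instead reset negative cells to zero and take the best cell anywhere in the matrix, which is why it can report only a piece of a domain.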
[Advanced online search techniques and dedicated search engines for physicians].
Nahum, Yoav
2008-02-01
In recent years search engines have become an essential tool in the work of physicians. This article will review advanced search techniques from the world of information specialists, as well as some advanced search engine operators that may help physicians improve their online search capabilities, and maximize the yield of their searches. This article also reviews popular dedicated scientific and biomedical literature search engines.
Gillespie, James A; Quinn, Casey
2012-01-01
48,084 for “Egor Bychkov”, compared to 53,403 for “Khimki” in Yandex. We found Google potentially provides timely search results, whereas Yandex provides more accurate geographic localization. The correlation was moderate to strong between search terms representing the Bychkov episode and terms representing salient drug issues in Yandex: “illicit drug treatment” (r_s = .90, P < .001), “illicit drugs” (r_s = .76, P < .001), and “drug addiction” (r_s = .74, P < .001). Google correlations were weaker or absent: “illicit drug treatment” (r_s = .12, P = .58), “illicit drugs” (r_s = -0.29, P = .17), and “drug addiction” (r_s = .68, P < .001). Conclusions This study contributes to the methodological literature on the analysis of search patterns for public health. This paper investigated the relationship between Google and Yandex, and contributed to the broader methods literature by highlighting both the potential and limitations of these two search providers. We believe that Yandex Wordstat is a potentially valuable, and underused data source for researchers working on Russian-related illicit drug policy and other public health problems. The Russian Federation, with its large, geographically dispersed, and politically engaged online population presents unique opportunities for studying the evolving influence of the Internet on politics and policy, using low cost methods resilient against potential increases in censorship. PMID:23238600
Diverse Food Items Are Similarly Categorized by 8- to 13-Year-Old Children
ERIC Educational Resources Information Center
Beltran, Alicia; Knight Sepulveda, Karina; Watson, Kathy; Baranowski, Tom; Baranowski, Janice; Islam, Noemi; Missaghian, Mariam
2008-01-01
Objective: Assess how 8- to 13-year-old children categorized and labeled food items for possible use as part of a food search strategy in a computerized 24-hour dietary recall. Design: A set of 62 cards with pictures and names of food items from 18 professionally defined food groups was sorted by each child into piles of similar food items.…
Fast and accurate reference-free alignment of subtomograms.
Chen, Yuxiang; Pfeffer, Stefan; Hrabe, Thomas; Schuller, Jan Michael; Förster, Friedrich
2013-06-01
In cryoelectron tomography, alignment and averaging of subtomograms, each depicting the same macromolecule, improves the resolution compared to the individual subtomogram. Major challenges of subtomogram alignment are noise enhancement due to overfitting, the bias of an initial reference in the iterative alignment process, and the computational cost of processing increasingly large amounts of data. Here, we propose an efficient and accurate alignment algorithm via a generalized convolution theorem, which allows computation of a constrained correlation function using spherical harmonics. This formulation increases computational speed of rotational matching dramatically compared to rotation search in Cartesian space without sacrificing accuracy in contrast to other spherical harmonic based approaches. Using this sampling method, a reference-free alignment procedure is proposed to tackle reference bias and overfitting, which also includes contrast transfer function correction by Wiener filtering. Application of the method to simulated data allowed us to obtain resolutions near the ground truth. For two experimental datasets, ribosomes from yeast lysate and purified 20S proteasomes, we achieved reconstructions of approximately 20Å and 16Å, respectively. The software is ready-to-use and made public to the community. Copyright © 2013 Elsevier Inc. All rights reserved.
Gohlke, Bjoern-Oliver; Overkamp, Tim; Richter, Anja; Richter, Antje; Daniel, Peter T; Gillissen, Bernd; Preissner, Robert
2015-09-24
Searching for two-dimensional (2D) structural similarities is a useful tool to identify new active compounds in drug-discovery programs. However, as 2D similarity measures neglect important structural and functional features, similarity by 2D might be underestimated. In the present study, we used combined 2D and three-dimensional (3D) similarity comparisons to reveal possible new functions and/or side-effects of known bioactive compounds. We utilised more than 10,000 compounds from the SuperTarget database with known inhibition values for twelve different anti-cancer targets. We performed all-against-all comparisons resulting in 2D similarity landscapes. Among the regions with low 2D similarity scores are inhibitors of vascular endothelial growth factor receptor (VEGFR) and inhibitors of poly ADP-ribose polymerase (PARP). To demonstrate that 3D landscape comparison can identify similarities, which are untraceable in 2D similarity comparisons, we analysed this region in more detail. This 3D analysis showed the unexpected structural similarity between inhibitors of VEGFR and inhibitors of PARP. Among the VEGFR inhibitors that show similarities to PARP inhibitors was Vatalanib, an oral "multi-targeted" small molecule protein kinase inhibitor being studied in phase-III clinical trials in cancer therapy. An in silico docking simulation and an in vitro HT universal colorimetric PARP assay confirmed that the VEGFR inhibitor Vatalanib exhibits off-target activity as a PARP inhibitor, broadening its mode of action. In contrast to the 2D-similarity search, the 3D-similarity landscape comparison identifies new functions and side effects of the known VEGFR inhibitor Vatalanib.
Energy Consumption Forecasting Using Semantic-Based Genetic Programming with Local Search Optimizer.
Castelli, Mauro; Trujillo, Leonardo; Vanneschi, Leonardo
2015-01-01
Energy consumption forecasting (ECF) is an important policy issue in today's economies. An accurate ECF has great benefits for electric utilities and both negative and positive errors lead to increased operating costs. The paper proposes a semantic based genetic programming framework to address the ECF problem. In particular, we propose a system that finds (quasi-)perfect solutions with high probability and that generates models able to produce near optimal predictions also on unseen data. The framework blends a recently developed version of genetic programming that integrates semantic genetic operators with a local search method. The main idea in combining semantic genetic programming and a local searcher is to couple the exploration ability of the former with the exploitation ability of the latter. Experimental results confirm the suitability of the proposed method in predicting the energy consumption. In particular, the system produces a lower error with respect to the existing state-of-the-art techniques used on the same dataset. More importantly, this case study has shown that including a local searcher in the geometric semantic genetic programming system can speed up the search process and can result in fitter models that are able to produce accurate forecasts also on unseen data.
Accuracy of Binary Black Hole waveforms for Advanced LIGO searches
NASA Astrophysics Data System (ADS)
Kumar, Prayush; Barkett, Kevin; Bhagwat, Swetha; Chu, Tony; Fong, Heather; Brown, Duncan; Pfeiffer, Harald; Scheel, Mark; Szilagyi, Bela
2015-04-01
Coalescing binaries of compact objects are flagship sources for the first direct detection of gravitational waves with LIGO-Virgo observatories. Matched-filtering based detection searches aimed at binaries of black holes will use aligned spin waveforms as filters, and their efficiency hinges on the accuracy of the underlying waveform models. A number of gravitational waveform models are available in the literature, e.g. the Effective-One-Body, Phenomenological, and traditional post-Newtonian ones. While Numerical Relativity (NR) simulations provide for the most accurate modeling of gravitational radiation from compact binaries, their computational cost limits their application in large scale searches. In this talk we assess the accuracy of waveform models in two regions of parameter space, which have only been explored cursorily in the past: the high mass-ratio regime as well as the comparable mass-ratio + high spin regime. Using the SpEC code, six q = 7 simulations with aligned-spins and lasting 60 orbits, and tens of q ∈ [1,3] simulations with high black hole spins were performed. We use them to study the accuracy and intrinsic parameter biases of different waveform families, and assess their viability for Advanced LIGO searches.
Searching Fragment Spaces with feature trees.
Lessel, Uta; Wellenzohn, Bernd; Lilienthal, Markus; Claussen, Holger
2009-02-01
Virtual combinatorial chemistry easily produces billions of compounds, for which conventional virtual screening cannot be performed even with the fastest methods available. An efficient solution for such a scenario is the generation of Fragment Spaces, which encode huge numbers of virtual compounds by their fragments/reagents and rules of how to combine them. Similarity-based searches can be performed in such spaces without ever fully enumerating all virtual products. Here we describe the generation of a huge Fragment Space encoding about 5 × 10^11 compounds based on established in-house synthesis protocols for combinatorial libraries, i.e., we encode practically evaluated combinatorial chemistry protocols in a machine readable form, rendering them accessible to in silico search methods. We show how such searches in this Fragment Space can be integrated as a first step in an overall workflow. It reduces the extremely large number of virtual products by several orders of magnitude so that the resulting list of molecules becomes more manageable for further, more elaborate and time-consuming analysis steps. Results of a case study are presented and discussed, which lead to some general conclusions for an efficient expansion of the chemical space to be screened in pharmaceutical companies.
Children's Search Engines from an Information Search Process Perspective.
ERIC Educational Resources Information Center
Broch, Elana
2000-01-01
Describes cognitive and affective characteristics of children and teenagers that may affect their Web searching behavior. Reviews literature on children's searching in online public access catalogs (OPACs) and using digital libraries. Profiles two Web search engines. Discusses some of the difficulties children have searching the Web, in the…
Measuring the self-similarity exponent in Lévy stable processes of financial time series
NASA Astrophysics Data System (ADS)
Fernández-Martínez, M.; Sánchez-Granero, M. A.; Trinidad Segovia, J. E.
2013-11-01
Geometric method-based procedures, which will be called GM algorithms herein, were introduced in [M.A. Sánchez Granero, J.E. Trinidad Segovia, J. García Pérez, Some comments on Hurst exponent and the long memory processes on capital markets, Phys. A 387 (2008) 5543-5551], to efficiently calculate the self-similarity exponent of a time series. In that paper, the authors showed empirically that these algorithms, based on a geometrical approach, are more accurate than the classical algorithms, especially with short length time series. The authors checked that GM algorithms are good when working with (fractional) Brownian motions. Moreover, in [J.E. Trinidad Segovia, M. Fernández-Martínez, M.A. Sánchez-Granero, A note on geometric method-based procedures to calculate the Hurst exponent, Phys. A 391 (2012) 2209-2214], a mathematical background for the validity of such procedures to estimate the self-similarity index of any random process with stationary and self-affine increments was provided. In particular, they proved theoretically that GM algorithms are also valid to explore long-memory in (fractional) Lévy stable motions. In this paper, we prove empirically by Monte Carlo simulation that GM algorithms are able to calculate accurately the self-similarity index in Lévy stable motions and find empirical evidence that they are more precise than the absolute value exponent (denoted by AVE onwards) and the multifractal detrended fluctuation analysis (MF-DFA) algorithms, especially with a short length time series. We also compare them with the generalized Hurst exponent (GHE) algorithm and conclude that both GM2 and GHE algorithms are the most accurate to study financial series. In addition to that, we provide empirical evidence, based on the accuracy of GM algorithms to estimate the self-similarity index in Lévy motions, that the evolution of the stocks of some international market indices, such as U.S. Small Cap and Nasdaq100, cannot be modelized by means of a
Failures of Perception in the Low-Prevalence Effect: Evidence From Active and Passive Visual Search
Hout, Michael C.; Walenchok, Stephen C.; Goldinger, Stephen D.; Wolfe, Jeremy M.
2017-01-01
In visual search, rare targets are missed disproportionately often. This low-prevalence effect (LPE) is a robust problem with demonstrable societal consequences. What is the source of the LPE? Is it a perceptual bias against rare targets or a later process, such as premature search termination or motor response errors? In 4 experiments, we examined the LPE using standard visual search (with eye tracking) and 2 variants of rapid serial visual presentation (RSVP) in which observers made present/absent decisions after sequences ended. In all experiments, observers looked for 2 target categories (teddy bear and butterfly) simultaneously. To minimize simple motor errors, caused by repetitive absent responses, we held overall target prevalence at 50%, with 1 low-prevalence and 1 high-prevalence target type. Across conditions, observers either searched for targets among other real-world objects or searched for specific bears or butterflies among within-category distractors. We report 4 main results: (a) In standard search, high-prevalence targets were found more quickly and accurately than low-prevalence targets. (b) The LPE persisted in RSVP search, even though observers never terminated search on their own. (c) Eye-tracking analyses showed that high-prevalence targets elicited better attentional guidance and faster perceptual decisions. And (d) even when observers looked directly at low-prevalence targets, they often (12%–34% of trials) failed to detect them. These results strongly argue that low-prevalence misses represent failures of perception when early search termination or motor errors are controlled. PMID:25915073
How Users Search the Library from a Single Search Box
ERIC Educational Resources Information Center
Lown, Cory; Sierra, Tito; Boyer, Josh
2013-01-01
Academic libraries are turning increasingly to unified search solutions to simplify search and discovery of library resources. Unfortunately, very little research has been published on library user search behavior in single search box environments. This study examines how users search a large public university library using a prominent, single…
King, Brian R; Aburdene, Maurice; Thompson, Alex; Warres, Zach
2014-01-01
Digital signal processing (DSP) techniques for biological sequence analysis continue to grow in popularity due to the inherent digital nature of these sequences. DSP methods have demonstrated early success for detection of coding regions in a gene. Recently, these methods are being used to establish DNA gene similarity. We present the inter-coefficient difference (ICD) transformation, a novel extension of the discrete Fourier transformation, which can be applied to any DNA sequence. The ICD method is a mathematical, alignment-free DNA comparison method that generates a genetic signature for any DNA sequence that is used to generate relative measures of similarity among DNA sequences. We demonstrate our method on a set of insulin genes obtained from an evolutionarily wide range of species, and on a set of avian influenza viral sequences, which represents a set of highly similar sequences. We compare phylogenetic trees generated using our technique against trees generated using traditional alignment techniques for similarity and demonstrate that the ICD method produces a highly accurate tree without requiring an alignment prior to establishing sequence similarity.
Using Search Engine Data as a Tool to Predict Syphilis.
Young, Sean D; Torrone, Elizabeth A; Urata, John; Aral, Sevgi O
2018-07-01
Researchers have suggested that social media and online search data might be used to monitor and predict syphilis and other sexually transmitted diseases. Because people at risk for syphilis might seek sexual health and risk-related information on the internet, we investigated associations between internet state-level search query data (e.g., Google Trends) and reported weekly syphilis cases. We obtained weekly counts of reported primary and secondary syphilis for 50 states from 2012 to 2014 from the US Centers for Disease Control and Prevention. We collected weekly internet search query data regarding 25 risk-related keywords from 2012 to 2014 for 50 states using Google Trends. We joined 155 weeks of Google Trends data with 1-week lag to weekly syphilis data for a total of 7750 data points. Using the least absolute shrinkage and selection operator, we trained three linear mixed models on the first 10 weeks of each year. We validated models for 2012 and 2013 for the following 52 weeks and the 2014 model for the following 42 weeks. The models, consisting of different sets of keyword predictors for each year, accurately predicted 144 weeks of primary and secondary syphilis counts for each state, with an overall average R of 0.9 and overall average root mean squared error of 4.9. We used Google Trends search data from the prior week to predict cases of syphilis in the following weeks for each state. Further research could explore how search data could be integrated into public health monitoring systems.
Using Internet Search Engines to Obtain Medical Information: A Comparative Study
Wang, Liupu; Wang, Juexin; Wang, Michael; Li, Yong; Liang, Yanchun
2012-01-01
results highly overlapped between the search engines, and the overlap between any two search engines was about half or more. On the other hand, each search engine emphasized various types of content differently. In terms of user satisfaction analysis, volunteer users scored Bing the highest for its usefulness, followed by Yahoo!, Google, and Ask.com. Conclusions: Google, Yahoo!, Bing, and Ask.com are by and large effective search engines for helping lay users get health and medical information. Nevertheless, the current ranking methods have some pitfalls and there is room for improvement to help users get more accurate and useful information. We suggest that search engine users explore multiple search engines to search different types of health information and medical knowledge for their own needs and get a professional consultation if necessary. PMID:22672889
Using Internet search engines to obtain medical information: a comparative study.
Wang, Liupu; Wang, Juexin; Wang, Michael; Li, Yong; Liang, Yanchun; Xu, Dong
2012-05-16
search engines, and the overlap between any two search engines was about half or more. On the other hand, each search engine emphasized various types of content differently. In terms of user satisfaction analysis, volunteer users scored Bing the highest for its usefulness, followed by Yahoo!, Google, and Ask.com. Google, Yahoo!, Bing, and Ask.com are by and large effective search engines for helping lay users get health and medical information. Nevertheless, the current ranking methods have some pitfalls and there is room for improvement to help users get more accurate and useful information. We suggest that search engine users explore multiple search engines to search different types of health information and medical knowledge for their own needs and get a professional consultation if necessary.
Seasonal variation in internet keyword searches: a proxy assessment of sex mating behaviors.
Markey, Patrick M; Markey, Charlotte N
2013-05-01
The current study investigated seasonal variation in internet searches regarding sex and mating behaviors. Harmonic analyses were used to examine the seasonal trends of Google keyword searches during the past 5 years for topics related to pornography, prostitution, and mate-seeking. Results indicated a consistent 6-month harmonic cycle with the peaks of keyword searches related to sex and mating behaviors occurring most frequently during winter and early summer. Such results complement past research that has found similar seasonal trends of births, sexually transmitted infections, condom sales, and abortions.
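As a rough illustration of what a harmonic analysis of this kind involves, the sketch below fits a single sinusoid of known period to an evenly sampled series. It is a minimal example on synthetic data, not the authors' procedure: the function name `harmonic_fit`, the weekly sampling, and the 26-week period are assumptions made for illustration, and the projection formula is an exact least-squares fit only when the series spans a whole number of periods.

```python
import math

def harmonic_fit(series, period):
    """Fit mean + A*cos(w*t - phase) to an evenly sampled series.

    `period` is in samples. The projection below equals the least-squares
    solution only when len(series) covers a whole number of periods.
    """
    n = len(series)
    mean = sum(series) / n
    w = 2 * math.pi / period
    # Project the mean-removed series onto cosine and sine at frequency w.
    a = 2 / n * sum((y - mean) * math.cos(w * t) for t, y in enumerate(series))
    b = 2 / n * sum((y - mean) * math.sin(w * t) for t, y in enumerate(series))
    amplitude = math.hypot(a, b)
    phase = math.atan2(b, a)
    return mean, amplitude, phase

# Synthetic weekly series: 5 years (260 weeks) with a 26-week (6-month) cycle.
weeks = range(260)
synthetic = [100 + 5 * math.cos(2 * math.pi * t / 26 - 0.3) for t in weeks]
mean, amplitude, phase = harmonic_fit(synthetic, period=26)
```

On real search-volume data one would typically fit several candidate periods (for example 26 and 52 weeks) and compare the fitted amplitudes.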
38 CFR 4.46 - Accurate measurement.
Code of Federal Regulations, 2011 CFR
2011-07-01
... RATING DISABILITIES Disability Ratings The Musculoskeletal System § 4.46 Accurate measurement. Accurate... indispensable in examinations conducted within the Department of Veterans Affairs. Muscle atrophy must also be...
38 CFR 4.46 - Accurate measurement.
Code of Federal Regulations, 2014 CFR
2014-07-01
... RATING DISABILITIES Disability Ratings The Musculoskeletal System § 4.46 Accurate measurement. Accurate... indispensable in examinations conducted within the Department of Veterans Affairs. Muscle atrophy must also be...
38 CFR 4.46 - Accurate measurement.
Code of Federal Regulations, 2010 CFR
2010-07-01
... RATING DISABILITIES Disability Ratings The Musculoskeletal System § 4.46 Accurate measurement. Accurate... indispensable in examinations conducted within the Department of Veterans Affairs. Muscle atrophy must also be...
38 CFR 4.46 - Accurate measurement.
Code of Federal Regulations, 2013 CFR
2013-07-01
... RATING DISABILITIES Disability Ratings The Musculoskeletal System § 4.46 Accurate measurement. Accurate... indispensable in examinations conducted within the Department of Veterans Affairs. Muscle atrophy must also be...
38 CFR 4.46 - Accurate measurement.
Code of Federal Regulations, 2012 CFR
2012-07-01
... RATING DISABILITIES Disability Ratings The Musculoskeletal System § 4.46 Accurate measurement. Accurate... indispensable in examinations conducted within the Department of Veterans Affairs. Muscle atrophy must also be...
ERIC Educational Resources Information Center
Badami, Rokhsareh; VaezMousavi, Mohammad; Wulf, Gabriele; Namazizadeh, Mahdi
2012-01-01
One purpose of the present study was to examine whether self-confidence or anxiety would be differentially affected by feedback from more accurate rather than less accurate trials. The second purpose was to determine whether arousal variations (activation) would predict performance. On Day 1, participants performed a golf putting task under one of…
Liquid electrolyte informatics using an exhaustive search with linear regression.
Sodeyama, Keitaro; Igarashi, Yasuhiko; Nakayama, Tomofumi; Tateyama, Yoshitaka; Okada, Masato
2018-06-14
Exploring new liquid electrolyte materials is a fundamental target for developing new high-performance lithium-ion batteries. In contrast to solid materials, disordered liquid solution properties have been less studied by data-driven information techniques. Here, we examined the estimation accuracy and efficiency of three information techniques, multiple linear regression (MLR), least absolute shrinkage and selection operator (LASSO), and exhaustive search with linear regression (ES-LiR), by using coordination energy and melting point as test liquid properties. We then confirmed that ES-LiR gives the most accurate estimation among the techniques. We also found that ES-LiR can provide the relationship between the "prediction accuracy" and "calculation cost" of the properties via a weight diagram of descriptors. This technique makes it possible to choose the balance between "accuracy" and "cost" when searching a very large number of new materials.
Stride search: A general algorithm for storm detection in high-resolution climate data
Bosler, Peter A.; Roesler, Erika L.; Taylor, Mark A.; ...
2016-04-13
This study discusses the problem of identifying extreme climate events such as intense storms within large climate data sets. The basic storm detection algorithm is reviewed, which splits the problem into two parts: a spatial search followed by a temporal correlation problem. Two specific implementations of the spatial search algorithm are compared: the commonly used grid point search algorithm is reviewed, and a new algorithm called Stride Search is introduced. The Stride Search algorithm is defined independently of the spatial discretization associated with a particular data set. Results from the two algorithms are compared for the application of tropical cyclone detection, and shown to produce similar results for the same set of storm identification criteria. Differences between the two algorithms arise for some storms due to their different definition of search regions in physical space. The physical space associated with each Stride Search region is constant, regardless of data resolution or latitude, and Stride Search is therefore capable of searching all regions of the globe in the same manner. Stride Search's ability to search high latitudes is demonstrated for the case of polar low detection. Wall clock time required for Stride Search is shown to be smaller than a grid point search of the same data, and the relative speed up associated with Stride Search increases as resolution increases.
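The idea of search regions with constant physical size can be sketched as follows. This is an illustrative reading of the abstract, not the published Stride Search implementation: the function names, the spherical-Earth radius, and the rule of widening the longitude stride by 1/cos(latitude) are assumptions chosen so that neighbouring region centers stay roughly one search radius apart at every latitude.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed spherical Earth

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    h = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def sector_centers(radius_km, south=-90.0, north=90.0):
    """Region centers spaced about radius_km apart in physical distance."""
    lat_stride = math.degrees(radius_km / EARTH_RADIUS_KM)
    centers = []
    for i in range(int((north - south) / lat_stride) + 1):
        lat = south + i * lat_stride
        coslat = math.cos(math.radians(lat))
        if coslat * 360.0 <= lat_stride:
            # Close to a pole one center covers the whole latitude circle.
            centers.append((lat, 0.0))
            continue
        # Widen the longitude stride so the spacing in km stays constant.
        n_lon = math.ceil(360.0 * coslat / lat_stride)
        centers.extend((lat, j * 360.0 / n_lon) for j in range(n_lon))
    return centers

def points_in_sector(center, points, radius_km):
    """Data points (lat, lon) within radius_km of a region center."""
    return [p for p in points if great_circle_km(*center, *p) <= radius_km]

centers = sector_centers(500.0)
```

Because the centers depend only on the search radius, the same list can be reused for data sets on any grid; only `points_in_sector` touches the data coordinates.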
Liu, Qiaoxia; Zhou, Binbin; Wang, Xinliang; Ke, Yanxiong; Jin, Yu; Yin, Lihui; Liang, Xinmiao
2012-12-01
A search library of benzylisoquinoline alkaloids was established based on preparation of alkaloid fractions from Rhizoma coptidis, Cortex phellodendri, and Rhizoma corydalis. In this work, two alkaloid fractions from each herbal medicine were first prepared based on selective separation on the "click" binaphthyl column. These alkaloid fractions were then analyzed on a C18 column by liquid chromatography coupled with tandem mass spectrometry. Many structure-related compounds were included in these alkaloid fractions, which led to easy separation and good MS response in further work. Therefore, a search library of 52 benzylisoquinoline alkaloids was established, which included eight aporphine, 19 tetrahydroprotoberberine, two protopine, two benzyltetrahydroisoquinoline, and 21 protoberberine alkaloids. The information of the search library contained compound names, structures, retention times, accurate masses, fragmentation pathways of benzylisoquinoline alkaloids, and their sources from three herbal medicines. Using such a library, the alkaloids, especially trace and unknown components in some herbal medicines, could be accurately and quickly identified. In addition, the distribution of benzylisoquinoline alkaloids in the herbal medicines could also be summarized by searching the source samples in the library. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
ClinicalKey: a point-of-care search engine.
Vardell, Emily
2013-01-01
ClinicalKey is a new point-of-care resource for health care professionals. Through controlled vocabulary, ClinicalKey offers a cross section of resources on diseases and procedures, from journals to e-books and practice guidelines to patient education. A sample search was conducted to demonstrate the features of the database, and a comparison with similar tools is presented.
How accurately can other people infer your thoughts—And does culture matter?
Valanides, Constantinos; Sheppard, Elizabeth; Mitchell, Peter
2017-01-01
This research investigated how accurately people infer what others are thinking after observing a brief sample of their behaviour and whether culture/similarity is a relevant factor. Target participants (14 British and 14 Mediterraneans) were cued to think about either positive or negative events they had experienced. Subsequently, perceiver participants (16 British and 16 Mediterraneans) watched videos of the targets thinking about these things. Perceivers (both groups) were significantly accurate in judging when targets had been cued to think of something positive versus something negative, indicating notable inferential ability. Additionally, Mediterranean perceivers were better than British perceivers in making such inferences, irrespective of nationality of the targets, something that was statistically accounted for by corresponding group differences in levels of independently measured collectivism. The results point to the need for further research to investigate the possibility that being reared in a collectivist culture fosters ability in interpreting others’ behaviour. PMID:29112972
Search times and probability of detection in time-limited search
NASA Astrophysics Data System (ADS)
Wilson, David; Devitt, Nicole; Maurer, Tana
2005-05-01
When modeling the search and target acquisition process, probability of detection as a function of time is important to war games and physical entity simulations. Recent US Army RDECOM CERDEC Night Vision and Electronics Sensor Directorate modeling of search and detection has focused on time-limited search. Developing the relationship between detection probability and time of search as a differential equation is explored. One of the parameters in the current formula for probability of detection in time-limited search corresponds to the mean time to detect in time-unlimited search. However, the mean time to detect in time-limited search is shorter than the mean time to detect in time-unlimited search and the relationship between them is a mathematical relationship between these two mean times. This simple relationship is derived.
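A short worked derivation makes the claim about the two mean times concrete. It assumes the exponential form commonly used for search models, with asymptotic detection probability P_inf and time constant tau; the abstract does not state its formula, so the symbols and the final expression below are an illustrative sketch and may differ from the relationship derived in the paper.

```latex
% Assumed model: detection probability grows toward P_\infty at rate 1/\tau.
\frac{dP}{dt} = \frac{P_\infty - P(t)}{\tau}, \quad P(0) = 0
\;\Longrightarrow\;
P(t) = P_\infty \left( 1 - e^{-t/\tau} \right)

% Mean time to detect, given detection, with unlimited search time:
\bar{t}_\infty = \int_0^\infty \frac{t}{\tau}\, e^{-t/\tau}\, dt = \tau

% Mean time to detect, given detection within a time limit T:
\bar{t}_T
  = \frac{\int_0^T \frac{t}{\tau}\, e^{-t/\tau}\, dt}{1 - e^{-T/\tau}}
  = \tau - \frac{T}{e^{T/\tau} - 1}
  < \tau \quad \text{for all finite } T > 0
```

Under these assumptions the time-limited mean approaches T/2 for T much smaller than tau and approaches tau as T grows, which is consistent with the statement that the time-limited mean is the shorter of the two.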
Clune, Jeff; Goldsby, Heather J; Ofria, Charles; Pennock, Robert T
2011-03-07
Inclusive fitness theory predicts that natural selection will favour altruist genes that are more accurate in targeting altruism only to copies of themselves. In this paper, we provide evidence from digital evolution in support of this prediction by competing multiple altruist-targeting mechanisms that vary in their accuracy in determining whether a potential target for altruism carries a copy of the altruist gene. We compete altruism-targeting mechanisms based on (i) kinship (kin targeting), (ii) genetic similarity at a level greater than that expected of kin (similarity targeting), and (iii) perfect knowledge of the presence of an altruist gene (green beard targeting). Natural selection always favoured the most accurate targeting mechanism available. Our investigations also revealed that evolution did not increase the altruism level when all green beard altruists used the same phenotypic marker. The green beard altruism levels stably increased only when mutations that changed the altruism level also changed the marker (e.g. beard colour), such that beard colour reliably indicated the altruism level. For kin- and similarity-targeting mechanisms, we found that evolution was able to stably adjust altruism levels. Our results confirm that natural selection favours altruist genes that are increasingly accurate in targeting altruism to only their copies. Our work also emphasizes that the concept of targeting accuracy must include both the presence of an altruist gene and the level of altruism it produces.
Interaction between numbers and size during visual search.
Krause, Florian; Bekkering, Harold; Pratt, Jay; Lindemann, Oliver
2017-05-01
The current study investigates an interaction between numbers and physical size (i.e. size congruity) in visual search. In three experiments, participants had to detect a physically large (or small) target item among physically small (or large) distractors in a search task comprising single-digit numbers. The relative numerical size of the digits was varied, such that the target item was either among the numerically large or small numbers in the search display and the relation between numerical and physical size was either congruent or incongruent. Perceptual differences of the stimuli were controlled by a condition in which participants had to search for a differently coloured target item with the same physical size and by the usage of LCD-style numbers that were matched in visual similarity by shape transformations. The results of all three experiments consistently revealed that detecting a physically large target item is significantly faster when the numerical size of the target item is large as well (congruent), compared to when it is small (incongruent). This novel finding of a size congruity effect in visual search demonstrates an interaction between numerical and physical size in an experimental setting beyond typically used binary comparison tasks, and provides important new evidence for the notion of shared cognitive codes for numbers and sensorimotor magnitudes. Theoretical consequences for recent models on attention, magnitude representation and their interactions are discussed.
Search Parameter Optimization for Discrete, Bayesian, and Continuous Search Algorithms
2017-09-01
Naval Postgraduate School, Monterey, California. Thesis: Search Parameter Optimization for Discrete, Bayesian, and Continuous Search Algorithms (... to 09-22-2017) ... simple search and rescue acts to prosecuting aerial/surface/submersible targets on mission. This research looks at varying the known discrete and
Branch length similarity entropy-based descriptors for shape representation
NASA Astrophysics Data System (ADS)
Kwon, Ohsung; Lee, Sang-Hee
2017-11-01
In previous studies, we showed that the branch length similarity (BLS) entropy profile could be successfully used to recognize shapes such as battle tanks, facial expressions, and butterflies. In the present study, we propose three new descriptors for shape recognition (roundness, symmetry, and surface roughness) that are more accurate and faster to compute than the previous descriptors. Roundness represents how closely a shape resembles a circle, symmetry characterizes how similar one shape is to another when the shape is flipped, and surface roughness quantifies the degree of vertical deviations of a shape boundary. To evaluate the performance of the descriptors, we used a database of leaf images from 12 species. Each species consisted of 10 - 20 leaf images and the total number of images was 160. The evaluation showed that the new descriptors successfully discriminated the leaf species. We believe that the descriptors can be a useful tool in the field of pattern recognition.
Can Wearable Devices Accurately Measure Heart Rate Variability? A Systematic Review.
Georgiou, Konstantinos; Larentzakis, Andreas V; Khamis, Nehal N; Alsuhaibani, Ghadah I; Alaska, Yasser A; Giallafos, Elias J
2018-03-01
A growing number of wearable devices claim to provide accurate, cheap and easily applicable heart rate variability (HRV) indices. This is mainly accomplished by using wearable photoplethysmography (PPG) and/or electrocardiography (ECG), through simple and non-invasive techniques, as a substitute for the gold standard RR interval estimation through electrocardiogram. Although the agreement between pulse rate variability (PRV) and HRV has been evaluated in the literature, the reported results are still inconclusive, especially when using wearable devices. The purpose of this systematic review is to investigate whether wearable devices provide a reliable and precise measurement of classic HRV parameters at rest as well as during exercise. A search strategy was implemented to retrieve relevant articles from the MEDLINE and SCOPUS databases, as well as through internet search. The 308 articles retrieved were reviewed for further evaluation according to the predetermined inclusion/exclusion criteria. Eighteen studies were included. Sixteen of them integrated ECG - HRV technology and two of them PPG - PRV technology. All of them examined wearable devices' accuracy in RV detection during rest, while only eight of them did so during exercise. The correlation between classic ECG derived HRV and the wearable RV ranged from very good to excellent during rest, yet it declined progressively as exercise level increased. Wearable devices may provide a promising alternative solution for measuring RV. However, more robust studies in non-stationary conditions are needed using appropriate methodology in terms of number of subjects involved, acquisition and analysis techniques applied.
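For readers unfamiliar with what "classic HRV parameters" are computed from, the sketch below derives two widely used time-domain indices, SDNN and RMSSD, from a list of RR intervals. The abstract does not say which indices the reviewed studies reported, so the choice of these two, the function names, and the toy interval values are assumptions for illustration only.

```python
import math

def sdnn(rr_ms):
    """Sample standard deviation of RR intervals, in milliseconds."""
    n = len(rr_ms)
    mean = sum(rr_ms) / n
    return math.sqrt(sum((x - mean) ** 2 for x in rr_ms) / (n - 1))

def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences, in milliseconds."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Toy RR series in milliseconds (hypothetical values, not study data).
rr = [800, 810, 790, 800]
print(f"SDNN = {sdnn(rr):.2f} ms, RMSSD = {rmssd(rr):.2f} ms")
```

Comparing a wearable against a reference ECG then amounts to computing the same index from both RR (or pulse-interval) series over the same window and examining their agreement.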
Comparing NEO Search Telescopes
NASA Astrophysics Data System (ADS)
Myhrvold, Nathan
2016-04-01
Multiple terrestrial and space-based telescopes have been proposed for detecting and tracking near-Earth objects (NEOs). Detailed simulations of the search performance of these systems have used complex computer codes that are not widely available, which hinders accurate cross-comparison of the proposals and obscures whether they have consistent assumptions. Moreover, some proposed instruments would survey infrared (IR) bands, whereas others would operate in the visible band, and differences among asteroid thermal and visible-light models used in the simulations further complicate like-to-like comparisons. I use simple physical principles to estimate basic performance metrics for the ground-based Large Synoptic Survey Telescope and three space-based instruments—Sentinel, NEOCam, and a Cubesat constellation. The performance is measured against two different NEO distributions, the Bottke et al. distribution of general NEOs, and the Veres et al. distribution of Earth-impacting NEO. The results of the comparison show simplified relative performance metrics, including the expected number of NEOs visible in the search volumes and the initial detection rates expected for each system. Although these simplified comparisons do not capture all of the details, they give considerable insight into the physical factors limiting performance. Multiple asteroid thermal models are considered, including FRM, NEATM, and a new generalized form of FRM. I describe issues with how IR albedo and emissivity have been estimated in previous studies, which may render them inaccurate. A thermal model for tumbling asteroids is also developed and suggests that tumbling asteroids may be surprisingly difficult for IR telescopes to observe.