optimizing histogram queries: Topics by Science.gov

Sample records for optimizing histogram queries

Approximate Algorithms for Computing Spatial Distance Histograms with Accuracy Guarantees

PubMed Central

Grupcev, Vladimir; Yuan, Yongke; Tu, Yi-Cheng; Huang, Jin; Chen, Shaoping; Pandit, Sagar; Weng, Michael

2014-01-01

Particle simulation has become an important research tool in many scientific and engineering fields. Data generated by such simulations impose great challenges to database storage and query processing. One of the queries against particle simulation data, the spatial distance histogram (SDH) query, is the building block of many high-level analytics, and requires quadratic time to compute using a straightforward algorithm. Previous work has developed efficient algorithms that compute exact SDHs. While beating the naive solution, such algorithms are still not practical in processing SDH queries against large-scale simulation data. In this paper, we take a different path to tackle this problem by focusing on approximate algorithms with provable error bounds. We first present a solution derived from the aforementioned exact SDH algorithm, and this solution has running time that is unrelated to the system size N. We also develop a mathematical model to analyze the mechanism that leads to errors in the basic approximate algorithm. Our model provides insights on how the algorithm can be improved to achieve higher accuracy and efficiency. Such insights give rise to a new approximate algorithm with improved time/accuracy tradeoff. Experimental results confirm our analysis. PMID:24693210
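For orientation, the quadratic-time baseline that the abstract calls the straightforward algorithm can be sketched as follows. This is only an illustrative sketch, not the authors' exact or approximate algorithm; the function name `naive_sdh` and its parameters (`bucket_width`, `num_buckets`) are assumptions introduced here.

```python
import math
from itertools import combinations


def naive_sdh(points, bucket_width, num_buckets):
    """Brute-force spatial distance histogram over all point pairs.

    Hypothetical helper: visits every unordered pair once, so the cost
    grows quadratically with the number of points.
    """
    hist = [0] * num_buckets
    for p, q in combinations(points, 2):
        d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
        # Distances beyond the last bucket edge are clamped into the last bucket.
        k = min(int(d // bucket_width), num_buckets - 1)
        hist[k] += 1
    return hist


if __name__ == "__main__":
    particles = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
    print(naive_sdh(particles, bucket_width=4.0, num_buckets=3))
```

Because every one of the N(N-1)/2 pairs is visited, doubling N roughly quadruples the work, which is the cost the exact and approximate algorithms in the record aim to avoid.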
Approach to Privacy-Preserve Data in Two-Tiered Wireless Sensor Network Based on Linear System and Histogram

NASA Astrophysics Data System (ADS)

Dang, Van H.; Wohlgemuth, Sven; Yoshiura, Hiroshi; Nguyen, Thuc D.; Echizen, Isao

Wireless sensor networks (WSNs) have been one of the key technologies for the future, with broad applications from the military to everyday life [1,2,3,4,5]. There are two kinds of WSN model: models with sensors for sensing data and a sink for receiving and processing queries from users; and models with special additional nodes capable of storing large amounts of data from sensors and processing queries from the sink. Among the latter type, a two-tiered model [6,7] has been widely adopted because of its storage and energy saving benefits for weak sensors, as shown by the advent of commercial storage node products such as Stargate [8] and RISE. However, by concentrating storage in certain nodes, this model becomes more vulnerable to attack. Our novel technique, called zip-histogram, contributes to solving the problems of previous studies [6,7] by protecting the stored data's confidentiality and integrity (including data from the sensor and queries from the sink) against attackers who might target storage nodes in two-tiered WSNs.
Naturalness preservation image contrast enhancement via histogram modification

NASA Astrophysics Data System (ADS)

Tian, Qi-Chong; Cohen, Laurent D.

2018-04-01

Contrast enhancement is a technique for enhancing image contrast to obtain better visual quality. Since many existing contrast enhancement algorithms produce over-enhanced results, naturalness preservation needs to be considered in the framework of image contrast enhancement. This paper proposes a naturalness-preserving contrast enhancement method, which adopts histogram matching to improve the contrast and uses image quality assessment to automatically select the optimal target histogram. The contrast improvement and the naturalness preservation are both considered in the target histogram, so this method can avoid the over-enhancement problem. In the proposed method, the optimal target histogram is a weighted sum of the original histogram, the uniform histogram, and the Gaussian-shaped histogram. The structural metric and the statistical naturalness metric are then used to determine the weights of the corresponding histograms. Finally, the contrast-enhanced image is obtained by matching the optimal target histogram. The experiments demonstrate that the proposed method outperforms the compared histogram-based contrast enhancement algorithms.
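The weighted-sum target histogram and the matching step described in this record can be illustrated with a minimal sketch. The weights here are fixed by hand, whereas the paper selects them with image quality metrics; the Gaussian parameters and all function names (`target_histogram`, `matching_lut`) are assumptions made for illustration only.

```python
import math


def normalize(hist):
    """Scale a histogram so that its entries sum to 1."""
    total = float(sum(hist))
    return [v / total for v in hist]


def target_histogram(orig, w_orig, w_uniform, w_gauss, sigma=50.0):
    """Weighted sum of the original, uniform and Gaussian-shaped histograms.

    The weights and sigma are hand-picked assumptions in this sketch.
    """
    levels = len(orig)
    mu = (levels - 1) / 2.0
    p_orig = normalize(orig)
    p_uni = [1.0 / levels] * levels
    p_gau = normalize([math.exp(-0.5 * ((k - mu) / sigma) ** 2)
                       for k in range(levels)])
    mixed = [w_orig * a + w_uniform * b + w_gauss * c
             for a, b, c in zip(p_orig, p_uni, p_gau)]
    return normalize(mixed)


def cdf(p):
    """Cumulative sums of a normalized histogram."""
    out, acc = [], 0.0
    for v in p:
        acc += v
        out.append(acc)
    return out


def matching_lut(orig, target):
    """Lookup table mapping each source gray level to a target gray level."""
    c_src, c_tgt = cdf(normalize(orig)), cdf(target)
    lut, j = [], 0
    for c in c_src:
        # Advance to the first target level whose CDF reaches the source CDF.
        while j < len(c_tgt) - 1 and c_tgt[j] < c:
            j += 1
        lut.append(j)
    return lut


if __name__ == "__main__":
    # Low-contrast example: all pixels fall in gray levels 100..140.
    hist = [0] * 256
    for level in range(100, 141):
        hist[level] = 10
    lut = matching_lut(hist, target_histogram(hist, 0.4, 0.3, 0.3))
    print(lut[100], lut[140])
```

Keeping a share of the original histogram in the mixture is what limits how far the mapping can depart from the input, which is the intuition behind avoiding over-enhancement.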
Using an image-extended relational database to support content-based image retrieval in a PACS.

PubMed

Traina, Caetano; Traina, Agma J M; Araújo, Myrian R B; Bueno, Josiane M; Chino, Fabio J T; Razente, Humberto; Azevedo-Marques, Paulo M

2005-12-01

This paper presents a new Picture Archiving and Communication System (PACS), called cbPACS, which has content-based image retrieval capabilities. The cbPACS answers range and k-nearest-neighbor similarity queries, employing a relational database manager extended to support images. The images are compared through their features, which are extracted by an image-processing module and stored in the extended relational database. The database extensions were developed to answer similarity queries efficiently by taking advantage of specialized indexing methods. The main concept supporting the extensions is the definition, inside the relational manager, of distance functions based on features extracted from the images. An extension to the SQL language enables the construction of an interpreter that intercepts the extended commands and translates them to standard SQL, allowing any relational database server to be used. At present, the implemented system works on features based on the color distribution of the images through normalized histograms as well as metric histograms. Metric histograms are invariant to scale, translation and rotation of images and also to brightness transformations. The cbPACS is prepared to integrate new image features, based on texture and shape of the main objects in the image.
A database system to support image algorithm evaluation

NASA Technical Reports Server (NTRS)

Lien, Y. E.

1977-01-01

The design is given of an interactive image database system IMDB, which allows the user to create, retrieve, store, display, and manipulate images through the facility of a high-level, interactive image query (IQ) language. The query language IQ permits the user to define false color functions, pixel value transformations, overlay functions, zoom functions, and windows. The user manipulates the images through generic functions. The user can direct images to display devices for visual and qualitative analysis. Image histograms and pixel value distributions can also be computed to obtain a quantitative analysis of images.
Particle swarm optimization-based local entropy weighted histogram equalization for infrared image enhancement

NASA Astrophysics Data System (ADS)

Wan, Minjie; Gu, Guohua; Qian, Weixian; Ren, Kan; Chen, Qian; Maldague, Xavier

2018-06-01

Infrared image enhancement plays a significant role in intelligent urban surveillance systems for smart city applications. Unlike existing methods that only exaggerate the global contrast, we propose a particle swarm optimization-based local entropy weighted histogram equalization which involves the enhancement of both local details and foreground and background contrast. First of all, a novel local entropy weighted histogram depicting the distribution of detail information is calculated based on a modified hyperbolic tangent function. Then, the histogram is divided into two parts via a threshold maximizing the inter-class variance in order to improve the contrasts of foreground and background, respectively. To avoid over-enhancement and noise amplification, double plateau thresholds of the presented histogram are formulated by means of a particle swarm optimization algorithm. Lastly, each sub-image is equalized independently according to the constrained sub-local entropy weighted histogram. Comparative experiments implemented on real infrared images show that our algorithm outperforms other state-of-the-art methods in terms of both visual and quantized evaluations.
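The step that splits a histogram at the threshold maximizing the inter-class variance corresponds to the classic Otsu criterion. The sketch below applies it to a plain gray-level histogram; the paper applies the split to its local entropy weighted histogram, which is not reproduced here, and the function name `otsu_threshold` is an assumption.

```python
def otsu_threshold(hist):
    """Return the threshold t that maximizes the between-class variance.

    Levels below t form one class and levels at or above t the other.
    """
    total = float(sum(hist))
    p = [h / total for h in hist]
    mean_all = sum(k * pk for k, pk in enumerate(p))

    best_t, best_var = 0, -1.0
    w0 = 0.0  # probability mass of the lower class
    m0 = 0.0  # unnormalized mean of the lower class
    for t in range(1, len(p)):
        w0 += p[t - 1]
        m0 += (t - 1) * p[t - 1]
        w1 = 1.0 - w0
        if w0 <= 0.0 or w1 <= 0.0:
            continue  # one class is empty, so the variance is undefined
        mu0 = m0 / w0
        mu1 = (mean_all - m0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


if __name__ == "__main__":
    hist = [0] * 256
    hist[10] = 50    # dark background pixels
    hist[200] = 50   # bright foreground pixels
    print(otsu_threshold(hist))
```

Each of the two resulting sub-histograms could then be equalized separately, which is the general idea behind bi-histogram equalization schemes.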
Stochastic HKMDHE: A multi-objective contrast enhancement algorithm

NASA Astrophysics Data System (ADS)

Pratiher, Sawon; Mukhopadhyay, Sabyasachi; Maity, Srideep; Pradhan, Asima; Ghosh, Nirmalya; Panigrahi, Prasanta K.

2018-02-01

This contribution proposes a novel extension of the existing "Hyper Kurtosis based Modified Duo-Histogram Equalization" (HKMDHE) algorithm, for multi-objective contrast enhancement of biomedical images. A novel modified objective function has been formulated by joint optimization of the individual histogram equalization objectives. The optimal adequacy of the proposed methodology with respect to image quality metrics such as brightness preserving abilities, peak signal-to-noise ratio (PSNR), Structural Similarity Index (SSIM) and universal image quality metric has been experimentally validated. The performance analysis of the proposed Stochastic HKMDHE with existing histogram equalization methodologies like Global Histogram Equalization (GHE) and Contrast Limited Adaptive Histogram Equalization (CLAHE) has been given for comparative evaluation.
Representation and alignment of sung queries for music information retrieval

NASA Astrophysics Data System (ADS)

Adams, Norman H.; Wakefield, Gregory H.

2005-09-01

The pursuit of robust and rapid query-by-humming systems, which search melodic databases using sung queries, is a common theme in music information retrieval. The retrieval aspect of this database problem has received considerable attention, whereas the front-end processing of sung queries and the data structure to represent melodies has been based on musical intuition and historical momentum. The present work explores three time series representations for sung queries: a sequence of notes, a "smooth" pitch contour, and a sequence of pitch histograms. The performance of the three representations is compared using a collection of naturally sung queries. It is found that the most robust performance is achieved by the representation with highest dimension, the smooth pitch contour, but that this representation presents a formidable computational burden. For all three representations, it is necessary to align the query and target in order to achieve robust performance. The computational cost of the alignment is quadratic, hence it is necessary to keep the dimension small for rapid retrieval. Accordingly, iterative deepening is employed to achieve both robust performance and rapid retrieval. Finally, the conventional iterative framework is expanded to adapt the alignment constraints based on previous iterations, further expediting retrieval without degrading performance.
Heuristic query optimization for query multiple table and multiple clausa on mobile finance application

NASA Astrophysics Data System (ADS)

Indrayana, I. N. E.; P, N. M. Wirasyanti D.; Sudiartha, I. KG

2018-01-01

Mobile applications allow many users to access data without being limited by space and time. Over time, the data population of such an application will increase. Data access time becomes a problem once the data has reached tens of thousands to millions of records. The objective of this research is to maintain the performance of data execution for large data records. One effort to maintain data access time performance is to apply a query optimization method. The optimization used in this research is the heuristic query optimization method. The built application is a mobile-based financial application using a MySQL database with stored procedures. This application is used by more than one business entity in one database, which leads to rapid data growth. The stored procedure contains a query optimized using the heuristic method. Query optimization is performed on a “Select” query that involves more than one table with multiple clauses. Evaluation is done by calculating the average access time using optimized and unoptimized queries. Access time is also measured as the data population in the database increases. The evaluation results show that data execution with heuristic query optimization is relatively faster than data execution without query optimization.
Querying Patterns in High-Dimensional Heterogenous Datasets

ERIC Educational Resources Information Center

Singh, Vishwakarma

2012-01-01

The recent technological advancements have led to the availability of a plethora of heterogenous datasets, e.g., images tagged with geo-location and descriptive keywords. An object in these datasets is described by a set of high-dimensional feature vectors. For example, a keyword-tagged image is represented by a color-histogram and a…
Evaluation of the effectiveness of color attributes for video indexing

NASA Astrophysics Data System (ADS)

Chupeau, Bertrand; Forest, Ronan

2001-10-01

Color features are reviewed and their effectiveness assessed in the application framework of key-frame clustering for abstracting unconstrained video. Existing color spaces and associated quantization schemes are first studied. Description of global color distribution by means of histograms is then detailed. In our work, 12 combinations of color space and quantization were selected, together with 12 histogram metrics. Their respective effectiveness with respect to picture similarity measurement was evaluated through a query-by-example scenario. For that purpose, a set of still-picture databases was built by extracting key frames from several video clips, including news, documentaries, sports and cartoons. Classical retrieval performance evaluation criteria were adapted to the specificity of our testing methodology.
Evaluation of the effectiveness of color attributes for video indexing

NASA Astrophysics Data System (ADS)

Chupeau, Bertrand; Forest, Ronan

2001-01-01

Color features are reviewed and their effectiveness assessed in the application framework of key-frame clustering for abstracting unconstrained video. Existing color spaces and associated quantization schemes are first studied. Description of global color distribution by means of histograms is then detailed. In our work, twelve combinations of color space and quantization were selected, together with twelve histogram metrics. Their respective effectiveness with respect to picture similarity measurement was evaluated through a query-by-example scenario. For that purpose, a set of still-picture databases was built by extracting key-frames from several video clips, including news, documentaries, sports and cartoons. Classical retrieval performance evaluation criteria were adapted to the specificity of our testing methodology.
Evaluation of the effectiveness of color attributes for video indexing

NASA Astrophysics Data System (ADS)

Chupeau, Bertrand; Forest, Ronan

2000-12-01

Color features are reviewed and their effectiveness assessed in the application framework of key-frame clustering for abstracting unconstrained video. Existing color spaces and associated quantization schemes are first studied. Description of global color distribution by means of histograms is then detailed. In our work, twelve combinations of color space and quantization were selected, together with twelve histogram metrics. Their respective effectiveness with respect to picture similarity measurement was evaluated through a query-by-example scenario. For that purpose, a set of still-picture databases was built by extracting key-frames from several video clips, including news, documentaries, sports and cartoons. Classical retrieval performance evaluation criteria were adapted to the specificity of our testing methodology.
On algorithmic optimization of histogramming functions for GEM systems

NASA Astrophysics Data System (ADS)

Krawczyk, Rafał D.; Czarski, Tomasz; Kolasinski, Piotr; Poźniak, Krzysztof T.; Linczuk, Maciej; Byszuk, Adrian; Chernyshova, Maryna; Juszczyk, Bartlomiej; Kasprowicz, Grzegorz; Wojenski, Andrzej; Zabolotny, Wojciech

2015-09-01

This article concerns optimization methods for data analysis for the X-ray GEM detector system. The offline analysis of collected samples was optimized for MATLAB computations. Compiled functions written in C were used with the MEX library. Significant speedup was achieved both for ordering-preprocessing and for histogramming of samples. The techniques used and the results obtained are presented.
Evolution of Query Optimization Methods

NASA Astrophysics Data System (ADS)

Hameurlain, Abdelkader; Morvan, Franck

Query optimization is the most critical phase in query processing. In this paper, we try to describe synthetically the evolution of query optimization methods from uniprocessor relational database systems to data Grid systems through parallel, distributed and data integration systems. We point out a set of parameters to characterize and compare query optimization methods, mainly: (i) size of the search space, (ii) type of method (static or dynamic), (iii) modification types of execution plans (re-optimization or re-scheduling), (iv) level of modification (intra-operator and/or inter-operator), (v) type of event (estimation errors, delay, user preferences), and (vi) nature of decision-making (centralized or decentralized control).
Multidimensional indexing structure for use with linear optimization queries

NASA Technical Reports Server (NTRS)

Bergman, Lawrence David (Inventor); Castelli, Vittorio (Inventor); Chang, Yuan-Chi (Inventor); Li, Chung-Sheng (Inventor); Smith, John Richard (Inventor)

2002-01-01

Linear optimization queries, which usually arise in various decision support and resource planning applications, are queries that retrieve top N data records (where N is an integer greater than zero) which satisfy a specific optimization criterion. The optimization criterion is to either maximize or minimize a linear equation. The coefficients of the linear equation are given at query time. Methods and apparatus are disclosed for constructing, maintaining and utilizing a multidimensional indexing structure of database records to improve the execution speed of linear optimization queries. Database records with numerical attributes are organized into a number of layers and each layer represents a geometric structure called convex hull. Such linear optimization queries are processed by searching from the outer-most layer of this multi-layer indexing structure inwards. At least one record per layer will satisfy the query criterion and the number of layers needed to be searched depends on the spatial distribution of records, the query-issued linear coefficients, and N, the number of records to be returned. When N is small compared to the total size of the database, answering the query typically requires searching only a small fraction of all relevant records, resulting in a tremendous speedup as compared to linearly scanning the entire dataset.
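The layered convex hull idea can be sketched in two dimensions as follows. This is an illustrative simplification, not the disclosed apparatus: the record describes a multidimensional index, while the sketch below handles only 2D points, rebuilds the layers from scratch, and uses hypothetical function names (`convex_hull`, `onion_layers`, `top_n`). It relies on the property that the N best records for a linear criterion lie within the outermost N layers.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices without collinear points."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def onion_layers(points):
    """Peel convex hulls repeatedly, outermost layer first."""
    remaining = list(set(points))
    layers = []
    while remaining:
        hull = convex_hull(remaining)
        layers.append(hull)
        on_hull = set(hull)
        remaining = [p for p in remaining if p not in on_hull]
    return layers


def top_n(layers, coeffs, n):
    """Top-n records maximizing a*x + b*y, scanning only the first n layers."""
    a, b = coeffs
    candidates = [p for layer in layers[:n] for p in layer]
    candidates.sort(key=lambda p: a * p[0] + b * p[1], reverse=True)
    return candidates[:n]


if __name__ == "__main__":
    records = [(1, 1), (9, 2), (5, 5), (2, 8), (8, 9), (4, 4), (6, 3)]
    layers = onion_layers(records)
    print(top_n(layers, coeffs=(1, 2), n=2))
```

For a minimization query, the same search can be run with negated coefficients. The layers are built once and reused for any coefficients supplied at query time, which is where the speedup over a full linear scan comes from when N is small.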
Optimizing a Query by Transformation and Expansion.

PubMed

Glocker, Katrin; Knurr, Alexander; Dieter, Julia; Dominick, Friederike; Forche, Melanie; Koch, Christian; Pascoe Pérez, Analie; Roth, Benjamin; Ückert, Frank

2017-01-01

In the biomedical sector not only the amount of information produced and uploaded into the web is enormous, but also the number of sources where these data can be found. Clinicians and researchers spend huge amounts of time on trying to access this information and to filter the most important answers to a given question. As the formulation of these queries is crucial, automated query expansion is an effective tool to optimize a query and receive the best possible results. In this paper we introduce the concept of a workflow for an optimization of queries in the medical and biological sector by using a series of tools for expansion and transformation of the query. After the definition of attributes by the user, the query string is compared to previous queries in order to add semantic co-occurring terms to the query. Additionally, the query is enlarged by an inclusion of synonyms. The translation into database specific ontologies ensures the optimal query formulation for the chosen database(s). As this process can be performed in various databases at once, the results are ranked and normalized in order to achieve a comparable list of answers for a question.
A novel adaptive Cuckoo search for optimal query plan generation.

PubMed

Gomathi, Ramalingam; Sharmila, Dhandapani

2014-01-01

The emergence of multiple web pages day by day leads to the development of the semantic web technology. A World Wide Web Consortium (W3C) standard for storing semantic web data is the resource description framework (RDF). To improve the execution time of queries over large RDF graphs, evolving metaheuristic algorithms have become an alternative to the traditional query optimization methods. This paper focuses on the problem of query optimization of semantic web data. An efficient algorithm called adaptive Cuckoo search (ACS) for querying and generating optimal query plans for large RDF graphs is designed in this research. Experiments were conducted on different datasets with varying numbers of predicates. The experimental results show that the proposed approach provides significant results in terms of query execution time. The extent to which the algorithm is efficient is tested and the results are documented.
Distributed query plan generation using multiobjective genetic algorithm.

PubMed

Panicker, Shina; Kumar, T V Vijay

2014-01-01

A distributed query processing strategy, which is a key performance determinant in accessing distributed databases, aims to minimize the total query processing cost. One way to achieve this is by generating efficient distributed query plans that involve fewer sites for processing a query. In the case of distributed relational databases, the number of possible query plans increases exponentially with respect to the number of relations accessed by the query and the number of sites where these relations reside. Consequently, computing optimal distributed query plans becomes a complex problem. This distributed query plan generation (DQPG) problem has already been addressed using single objective genetic algorithm, where the objective is to minimize the total query processing cost comprising the local processing cost (LPC) and the site-to-site communication cost (CC). In this paper, this DQPG problem is formulated and solved as a biobjective optimization problem with the two objectives being minimize total LPC and minimize total CC. These objectives are simultaneously optimized using a multiobjective genetic algorithm NSGA-II. Experimental comparison of the proposed NSGA-II based DQPG algorithm with the single objective genetic algorithm shows that the former performs comparatively better and converges quickly towards optimal solutions for an observed crossover and mutation probability.
Distributed Query Plan Generation Using Multiobjective Genetic Algorithm

PubMed Central

Panicker, Shina; Vijay Kumar, T. V.

2014-01-01

A distributed query processing strategy, which is a key performance determinant in accessing distributed databases, aims to minimize the total query processing cost. One way to achieve this is by generating efficient distributed query plans that involve fewer sites for processing a query. In the case of distributed relational databases, the number of possible query plans increases exponentially with respect to the number of relations accessed by the query and the number of sites where these relations reside. Consequently, computing optimal distributed query plans becomes a complex problem. This distributed query plan generation (DQPG) problem has already been addressed using single objective genetic algorithm, where the objective is to minimize the total query processing cost comprising the local processing cost (LPC) and the site-to-site communication cost (CC). In this paper, this DQPG problem is formulated and solved as a biobjective optimization problem with the two objectives being minimize total LPC and minimize total CC. These objectives are simultaneously optimized using a multiobjective genetic algorithm NSGA-II. Experimental comparison of the proposed NSGA-II based DQPG algorithm with the single objective genetic algorithm shows that the former performs comparatively better and converges quickly towards optimal solutions for an observed crossover and mutation probability. PMID:24963513

Information-theoretic CAD system in mammography: Entropy-based indexing for computational efficiency and robust performance

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tourassi, Georgia D.; Harrawood, Brian; Singh, Swatee

2007-08-15

We have previously presented a knowledge-based computer-assisted detection (KB-CADe) system for the detection of mammographic masses. The system is designed to compare a query mammographic region with mammographic templates of known ground truth. The templates are stored in an adaptive knowledge database. Image similarity is assessed with information theoretic measures (e.g., mutual information) derived directly from the image histograms. A previous study suggested that the diagnostic performance of the system steadily improves as the knowledge database is initially enriched with more templates. However, as the database increases in size, an exhaustive comparison of the query case with each stored template becomes computationally burdensome. Furthermore, blind storing of new templates may result in redundancies that do not necessarily improve diagnostic performance. To address these concerns we investigated an entropy-based indexing scheme for improving the speed of analysis and for satisfying database storage restrictions without compromising the overall diagnostic performance of our KB-CADe system. The indexing scheme was evaluated on two different datasets as (i) a search mechanism to sort through the knowledge database, and (ii) a selection mechanism to build a smaller, concise knowledge database that is easier to maintain but still effective. There were two important findings in the study. First, entropy-based indexing is an effective strategy to quickly identify a subset of templates that are most relevant to a given query. Only this subset could be analyzed in more detail using mutual information for optimized decision making regarding the query. Second, a selective entropy-based deposit strategy may be preferable where only high entropy cases are maintained in the knowledge database. Overall, the proposed entropy-based indexing scheme was shown to reduce the computational cost of our KB-CADe system by 55% to 80% while maintaining the system's diagnostic performance.
Multipurpose contrast enhancement on epiphyseal plates and ossification centers for bone age assessment

PubMed Central

2013-01-01

Background The high variations of background luminance, low contrast and excessively enhanced contrast of hand bone radiograph often impede the bone age assessment rating system in evaluating the degree of epiphyseal plates and ossification centers development. The Global Histogram equalization (GHE) has been the most frequently adopted image contrast enhancement technique but the performance is not satisfying. A brightness and detail preserving histogram equalization method with good contrast enhancement effect has been a goal of much recent research in histogram equalization. Nevertheless, producing a well-balanced histogram equalized radiograph in terms of its brightness preservation, detail preservation and contrast enhancement is deemed to be a daunting task. Method In this paper, we propose a novel framework of histogram equalization with the aim of taking several desirable properties into account, namely the Multipurpose Beta Optimized Bi-Histogram Equalization (MBOBHE). This method performs the histogram optimization separately in both sub-histograms after the segmentation of histogram using an optimized separating point determined based on the regularization function constituted by three components. The result is then assessed by the qualitative and quantitative analysis to evaluate the essential aspects of histogram equalized image using a total of 160 hand radiographs that are implemented in testing and analyses which are acquired from hand bone online database. Result From the qualitative analysis, we found that basic bi-histogram equalizations are not capable of displaying the small features in image due to incorrect selection of separating point by focusing on only certain metric without considering the contrast enhancement and detail preservation. From the quantitative analysis, we found that MBOBHE correlates well with human visual perception, and this improvement shortens the evaluation time taken by inspector in assessing the bone age. 
Conclusions The proposed MBOBHE outperforms other existing methods regarding comprehensive performance of histogram equalization. All the features which are pertinent to bone age assessment are more prominent relative to other methods; this has shortened the required evaluation time in manual bone age assessment using the TW method, while the accuracy remains unaffected or is slightly better than with the unprocessed original image. The holistic properties in terms of brightness preservation, detail preservation and contrast enhancement are simultaneously taken into consideration, and thus the visual effect is conducive to manual inspection. PMID:23565999
RCQ-GA: RDF Chain Query Optimization Using Genetic Algorithms

NASA Astrophysics Data System (ADS)

Hogenboom, Alexander; Milea, Viorel; Frasincar, Flavius; Kaymak, Uzay

The application of Semantic Web technologies in an Electronic Commerce environment implies a need for good support tools. Fast query engines are needed for efficient querying of large amounts of data, usually represented using RDF. We focus on optimizing a special class of SPARQL queries, the so-called RDF chain queries. For this purpose, we devise a genetic algorithm called RCQ-GA that determines the order in which joins need to be performed for an efficient evaluation of RDF chain queries. The approach is benchmarked against a two-phase optimization algorithm, previously proposed in literature. The more complex a query is, the more RCQ-GA outperforms the benchmark in solution quality, execution time needed, and consistency of solution quality. When the algorithms are constrained by a time limit, the overall performance of RCQ-GA compared to the benchmark further improves.
optBINS: Optimal Binning for histograms

NASA Astrophysics Data System (ADS)

Knuth, Kevin H.

2018-03-01

optBINS (optimal binning) determines the optimal number of bins in a uniform bin-width histogram by deriving the posterior probability for the number of bins in a piecewise-constant density model after assigning a multinomial likelihood and a non-informative prior. The maximum of the posterior probability occurs at a point where the prior probability and the joint likelihood are balanced. The interplay between these opposing factors effectively implements Occam's razor by selecting the simplest model that best describes the data.
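A minimal sketch of this kind of bin-count selection is given below. It is not the optBINS package itself. The log-posterior expression is an assumption, written in the form commonly cited for Knuth's method (up to an additive constant), and the names `log_posterior`, `opt_bins` and `max_bins` are hypothetical.

```python
import math


def log_posterior(data, m):
    """Relative log posterior for m equal-width bins.

    Assumed form: N log M + log G(M/2) - M log G(1/2) - log G(N + M/2)
    + sum_k log G(n_k + 1/2), where G is the gamma function.
    Requires at least two distinct data values.
    """
    n = len(data)
    lo, hi = min(data), max(data)
    counts = [0] * m
    for x in data:
        # Map x to a bin index; the maximum value goes into the last bin.
        k = min(int((x - lo) / (hi - lo) * m), m - 1)
        counts[k] += 1
    return (n * math.log(m)
            + math.lgamma(m / 2.0)
            - m * math.lgamma(0.5)
            - math.lgamma(n + m / 2.0)
            + sum(math.lgamma(c + 0.5) for c in counts))


def opt_bins(data, max_bins=50):
    """Number of bins in 1..max_bins with the largest log posterior."""
    return max(range(1, max_bins + 1), key=lambda m: log_posterior(data, m))


if __name__ == "__main__":
    # 99 values packed into [0, 0.4] plus a single outlier at 1.0.
    data = [0.4 * i / 98 for i in range(99)] + [1.0]
    print(opt_bins(data))
```

Under this form, the likelihood term rewards bins that capture structure in the data, while the remaining terms penalize additional bins, so the maximum reflects the balance described in the record.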
Enabling Incremental Query Re-Optimization.

PubMed

Liu, Mengmeng; Ives, Zachary G; Loo, Boon Thau

2016-01-01

As declarative query processing techniques expand to the Web, data streams, network routers, and cloud platforms, there is an increasing need to re-plan execution in the presence of unanticipated performance changes. New runtime information may affect which query plan we prefer to run. Adaptive techniques require innovation both in terms of the algorithms used to estimate costs, and in terms of the search algorithm that finds the best plan. We investigate how to build a cost-based optimizer that recomputes the optimal plan incrementally given new cost information, much as a stream engine constantly updates its outputs given new data. Our implementation especially shows benefits for stream processing workloads. It lays the foundations upon which a variety of novel adaptive optimization algorithms can be built. We start by leveraging the recently proposed approach of formulating query plan enumeration as a set of recursive datalog queries; we develop a variety of novel optimization approaches to ensure effective pruning in both static and incremental cases. We further show that the lessons learned in the declarative implementation can be equally applied to more traditional optimizer implementations.
Enabling Incremental Query Re-Optimization

PubMed Central

Liu, Mengmeng; Ives, Zachary G.; Loo, Boon Thau

2017-01-01

As declarative query processing techniques expand to the Web, data streams, network routers, and cloud platforms, there is an increasing need to re-plan execution in the presence of unanticipated performance changes. New runtime information may affect which query plan we prefer to run. Adaptive techniques require innovation both in terms of the algorithms used to estimate costs, and in terms of the search algorithm that finds the best plan. We investigate how to build a cost-based optimizer that recomputes the optimal plan incrementally given new cost information, much as a stream engine constantly updates its outputs given new data. Our implementation especially shows benefits for stream processing workloads. It lays the foundations upon which a variety of novel adaptive optimization algorithms can be built. We start by leveraging the recently proposed approach of formulating query plan enumeration as a set of recursive datalog queries; we develop a variety of novel optimization approaches to ensure effective pruning in both static and incremental cases. We further show that the lessons learned in the declarative implementation can be equally applied to more traditional optimizer implementations. PMID:28659658
Optimization of the Controlled Evaluation of Closed Relational Queries

NASA Astrophysics Data System (ADS)

Biskup, Joachim; Lochner, Jan-Hendrik; Sonntag, Sebastian

For relational databases, controlled query evaluation is an effective inference control mechanism preserving confidentiality regarding a previously declared confidentiality policy. Implementations of controlled query evaluation usually lack efficiency due to costly theorem prover calls. Suitably constrained controlled query evaluation can be implemented efficiently, but is not flexible enough from the perspective of database users and security administrators. In this paper, we propose an optimized framework for controlled query evaluation in relational databases, being efficiently implementable on the one hand and relaxing the constraints of previous approaches on the other hand.
Improved image retrieval based on fuzzy colour feature vector

NASA Astrophysics Data System (ADS)

Ben-Ahmeida, Ahlam M.; Ben Sasi, Ahmed Y.

2013-03-01

Content-Based Image Retrieval (CBIR) is an efficient image indexing technique that retrieves images from an image database automatically, based on visual contents such as colour, texture, and shape. This paper discusses how to apply CBIR using colour feature extraction and similarity checking. The query image and all images in the database are divided into pieces; the features of each part are extracted separately, and corresponding portions are compared in order to increase retrieval accuracy. The proposed approach is based on the use of fuzzy sets to overcome the curse of dimensionality. The colour contribution of each pixel is associated with all the bins in the histogram using fuzzy-set membership functions. As a result, the Fuzzy Colour Histogram (FCH) outperformed the Conventional Colour Histogram (CCH) in image retrieval owing to its speed: images were represented as signatures that took less memory, depending on the number of divisions. The results also showed that FCH is less sensitive and more robust to brightness changes than the CCH, with better retrieval recall values.
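A minimal sketch of fuzzy binning follows, assuming triangular membership functions centred on evenly spaced bin centres over a single intensity channel. The abstract does not state which membership function or colour space the authors used, so `fuzzy_histogram` and its parameters are hypothetical. With triangular memberships each value contributes to at most two neighbouring bins; a wider membership function would spread it further.

```python
def fuzzy_histogram(values, n_bins=8, lo=0.0, hi=255.0):
    """Normalised histogram where each value is shared between nearby bins.

    Assumption: triangular memberships (not specified in the abstract);
    requires n_bins >= 2 and values inside [lo, hi].
    """
    step = (hi - lo) / (n_bins - 1)  # distance between adjacent bin centres
    hist = [0.0] * n_bins
    for v in values:
        for k in range(n_bins):
            centre = lo + k * step
            # Membership falls linearly from 1 at the centre to 0 one step away.
            hist[k] += max(0.0, 1.0 - abs(v - centre) / step)
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

Because a small brightness shift moves only part of a pixel's contribution into the neighbouring bin, a histogram of this kind changes gradually, which is consistent with the robustness to brightness changes reported above.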
Mining Longitudinal Web Queries: Trends and Patterns.

ERIC Educational Resources Information Center

Wang, Peiling; Berry, Michael W.; Yang, Yiheng

2003-01-01

Analyzed user queries submitted to an academic Web site during a four-year period, using a relational database, to examine users' query behavior, to identify problems they encounter, and to develop techniques for optimizing query analysis and mining. Linguistic analyses focus on query structures, lexicon, and word associations using statistical…
Concept locator: a client-server application for retrieval of UMLS metathesaurus concepts through complex boolean query.

PubMed

Nadkarni, P M

1997-08-01

Concept Locator (CL) is a client-server application that accesses a Sybase relational database server containing a subset of the UMLS Metathesaurus for the purpose of retrieval of concepts corresponding to one or more query expressions supplied to it. CL's query grammar permits complex Boolean expressions, wildcard patterns, and parenthesized (nested) subexpressions. CL translates the query expressions supplied to it into one or more SQL statements that actually perform the retrieval. The generated SQL is optimized by the client to take advantage of the strengths of the server's query optimizer, and sidesteps its weaknesses, so that execution is reasonably efficient.
Applying Semantic Web Concepts to Support Net-Centric Warfare Using the Tactical Assessment Markup Language (TAML)

DTIC Science & Technology

2006-06-01

SPARQL SPARQL Protocol and RDF Query Language SQL Structured Query Language SUMO Suggested Upper Merged Ontology SW... Query optimization algorithms are implemented in the Pellet reasoner in order to ensure querying a knowledge base is efficient. These algorithms...memory as a treelike structure in order for the data to be queried. XML Query (XQuery) is the standard language used when querying XML
Improved dose-volume histogram estimates for radiopharmaceutical therapy by optimizing quantitative SPECT reconstruction parameters

NASA Astrophysics Data System (ADS)

Cheng, Lishui; Hobbs, Robert F.; Segars, Paul W.; Sgouros, George; Frey, Eric C.

2013-06-01

In radiopharmaceutical therapy, an understanding of the dose distribution in normal and target tissues is important for optimizing treatment. Three-dimensional (3D) dosimetry takes into account patient anatomy and the nonuniform uptake of radiopharmaceuticals in tissues. Dose-volume histograms (DVHs) provide a useful summary representation of the 3D dose distribution and have been widely used for external beam treatment planning. Reliable 3D dosimetry requires an accurate 3D radioactivity distribution as the input. However, activity distribution estimates from SPECT are corrupted by noise and partial volume effects (PVEs). In this work, we systematically investigated OS-EM based quantitative SPECT (QSPECT) image reconstruction in terms of its effect on DVH estimates. A modified 3D NURBS-based Cardiac-Torso (NCAT) phantom that incorporated a non-uniform kidney model and clinically realistic organ activities and biokinetics was used. Projections were generated using a Monte Carlo (MC) simulation; noise effects were studied using 50 noise realizations with clinical count levels. Activity images were reconstructed using QSPECT with compensation for attenuation, scatter and collimator-detector response (CDR). Dose rate distributions were estimated by convolution of the activity image with a voxel S kernel. Cumulative DVHs were calculated from the phantom and QSPECT images and compared both qualitatively and quantitatively. We found that noise, PVEs, and ringing artifacts due to CDR compensation all degraded histogram estimates. Low-pass filtering and early termination of the iterative process were needed to reduce the effects of noise and ringing artifacts on DVHs, but resulted in increased degradations due to PVEs. Large objects with few features, such as the liver, had more accurate histogram estimates and required fewer iterations and more smoothing for optimal results.
Smaller objects with fine details, such as the kidneys, required more iterations and less smoothing at early time points post-radiopharmaceutical administration but more smoothing and fewer iterations at later time points when the total organ activity was lower. The results of this study demonstrate the importance of using optimal reconstruction and regularization parameters. Optimal results were obtained with different parameters at each time point, but using a single set of parameters for all time points produced near-optimal dose-volume histograms.
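For readers unfamiliar with cumulative DVHs, the sketch below computes one from a flat list of voxel doses: for each dose level it reports the fraction of the organ volume receiving at least that dose. It assumes equal-volume voxels and a non-empty dose list; the function name and arguments are hypothetical and do not come from the study's software.

```python
def cumulative_dvh(voxel_doses, dose_levels):
    """Fraction of voxels receiving at least each dose level.

    Assumption: all voxels have the same volume, so voxel fractions
    equal volume fractions.
    """
    n = len(voxel_doses)
    return [sum(1 for d in voxel_doses if d >= level) / n
            for level in dose_levels]
```

Noise and partial volume effects both change individual voxel doses, so they alter this curve even when the mean organ dose is estimated well.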
DOE Office of Scientific and Technical Information (OSTI.GOV)

Sugano, Yasutaka; Mizuta, Masahiro; Takao, Seishin

Purpose: Radiotherapy of solid tumors has been performed with various fractionation regimens such as multi- and hypofractionations. However, the ability to optimize the fractionation regimen considering the physical dose distribution remains insufficient. This study aims to optimize the fractionation regimen, in which the authors propose a graphical method for selecting the optimal number of fractions (n) and dose per fraction (d) based on dose–volume histograms for tumor and normal tissues of organs around the tumor. Methods: Modified linear-quadratic models were employed to estimate the radiation effects on the tumor and an organ at risk (OAR), where the repopulation of the tumor cells and the linearity of the dose-response curve in the high dose range of the surviving fraction were considered. The minimization problem for the damage effect on the OAR was solved under the constraint that the radiation effect on the tumor is fixed by a graphical method. Here, the damage effect on the OAR was estimated based on the dose–volume histogram. Results: It was found that the optimization of fractionation scheme incorporating the dose–volume histogram is possible by employing appropriate cell surviving models. The graphical method considering the repopulation of tumor cells and a rectilinear response in the high dose range enables them to derive the optimal number of fractions and dose per fraction. For example, in the treatment of prostate cancer, the optimal fractionation was suggested to lie in the range of 8–32 fractions with a daily dose of 2.2–6.3 Gy. Conclusions: It is possible to optimize the number of fractions and dose per fraction based on the physical dose distribution (i.e., dose–volume histogram) by the graphical method considering the effects on tumor and OARs around the tumor. This method may stipulate a new guideline to optimize the fractionation regimen for physics-guided fractionation.
Color image enhancement based on particle swarm optimization with Gaussian mixture

NASA Astrophysics Data System (ADS)

Kattakkalil Subhashdas, Shibudas; Choi, Bong-Seok; Yoo, Ji-Hoon; Ha, Yeong-Ho

2015-01-01

This paper proposes a Gaussian mixture based image enhancement method which uses particle swarm optimization (PSO) to have an edge over other contemporary methods. The proposed method uses the Gaussian mixture model to model the lightness histogram of the input image in CIE L*a*b* space. The intersection points of the Gaussian components in the model are used to partition the lightness histogram. The enhanced lightness image is generated by transforming the lightness value in each interval to the appropriate output interval according to a transformation function that depends on PSO-optimized parameters, the weight and standard deviation of the Gaussian component, and the cumulative distribution of the input histogram interval. In addition, chroma compensation is applied to the resulting image to reduce washout appearance. Experimental results show that the proposed method produces a better enhanced image compared to the traditional methods. Moreover, the enhanced image is free from several side effects such as washout appearance, information loss and gradation artifacts.
Query construction, entropy, and generalization in neural-network models

NASA Astrophysics Data System (ADS)

Sollich, Peter

1994-05-01

We study query construction algorithms, which aim at improving the generalization ability of systems that learn from examples by choosing optimal, nonredundant training sets. We set up a general probabilistic framework for deriving such algorithms from the requirement of optimizing a suitable objective function; specifically, we consider the objective functions entropy (or information gain) and generalization error. For two learning scenarios, the high-low game and the linear perceptron, we evaluate the generalization performance obtained by applying the corresponding query construction algorithms and compare it to training on random examples. We find qualitative differences between the two scenarios due to the different structure of the underlying rules (nonlinear and "noninvertible" versus linear); in particular, for the linear perceptron, random examples lead to the same generalization ability as a sequence of queries in the limit of an infinite number of examples. We also investigate learning algorithms which are ill matched to the learning environment and find that, in this case, minimum entropy queries can in fact yield a lower generalization ability than random examples. Finally, we study the efficiency of single queries and its dependence on the learning history, i.e., on whether the previous training examples were generated randomly or by querying, and the difference between globally and locally optimal query construction.
Evolutionary Multiobjective Query Workload Optimization of Cloud Data Warehouses

PubMed Central

Dokeroglu, Tansel; Sert, Seyyit Alper; Cinar, Muhammet Serkan

2014-01-01

With the advent of Cloud databases, query optimizers need to find Pareto-optimal solutions in terms of response time and monetary cost. Our novel approach minimizes both objectives by deploying alternative virtual resources and query plans making use of the virtual resource elasticity of the Cloud. We propose an exact multiobjective branch-and-bound and a robust multiobjective genetic algorithm for the optimization of distributed data warehouse query workloads on the Cloud. In order to investigate the effectiveness of our approach, we incorporate the devised algorithms into a prototype system. Finally, through several experiments that we have conducted with different workloads and virtual resource configurations, we report remarkable findings on alternative deployments as well as the advantages and disadvantages of the multiobjective algorithms we propose. PMID:24892048
Whole-Lesion Histogram Analysis of Apparent Diffusion Coefficient for the Assessment of Cervical Cancer.

PubMed

Guan, Yue; Shi, Hua; Chen, Ying; Liu, Song; Li, Weifeng; Jiang, Zhuoran; Wang, Huanhuan; He, Jian; Zhou, Zhengyang; Ge, Yun

2016-01-01

The aim of this study was to explore the application of whole-lesion histogram analysis of apparent diffusion coefficient (ADC) values of cervical cancer. A total of 54 women (mean age, 53 years) with cervical cancers underwent 3-T diffusion-weighted imaging with b values of 0 and 800 s/mm² prospectively. Whole-lesion histogram analysis of ADC values was performed. Paired sample t test was used to compare differences in ADC histogram parameters between cervical cancers and normal cervical tissues. Receiver operating characteristic curves were constructed to identify the optimal threshold of each parameter. All histogram parameters in this study including ADCmean, ADCmin, ADC10%-ADC90%, mode, skewness, and kurtosis of cervical cancers were significantly lower than those of normal cervical tissues (all P < 0.0001). ADC90% had the largest area under receiver operating characteristic curve of 0.996. Whole-lesion histogram analysis of ADC maps is useful in the assessment of cervical cancer.
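The histogram parameters named above can be illustrated with a short sketch. It assumes nearest-rank percentiles, population (divide-by-n) moments, and non-excess kurtosis; the abstract does not say which conventions the authors used, and `histogram_parameters` is a hypothetical name. The mode is omitted because it depends on the chosen bin width.

```python
import math

def histogram_parameters(values):
    """Summary statistics of a list of per-voxel values (e.g. ADC).

    Assumptions: nearest-rank percentiles, population moments, and at
    least two distinct values (otherwise the variance is zero).
    """
    xs = sorted(values)
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # variance
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n

    def percentile(p):
        rank = max(1, math.ceil(p / 100 * n))  # nearest-rank definition
        return xs[rank - 1]

    return {
        "mean": mean,
        "min": xs[0],
        "p10": percentile(10),
        "p90": percentile(90),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,  # equals 3 for a normal distribution
    }
```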
Research of image retrieval technology based on color feature

NASA Astrophysics Data System (ADS)

Fu, Yanjun; Jiang, Guangyu; Chen, Fengying

2009-10-01

With the development of communication and computer technology, and with improvements in storage technology and the capability of digital imaging equipment, more image resources are available than ever before. A way to locate the right image quickly and accurately is therefore needed. The early method was to set up keywords for searching the database, but this has become very difficult as the number of pictures to be searched has grown. To overcome the limitations of traditional search methods, content-based image retrieval technology emerged, and it is now a hot research subject. Color image retrieval is an important part of it, and color is the most important feature for color image retrieval. Three key questions on how to make use of color characteristics are discussed in the paper: the representation of color, the extraction of color characteristics, and the measurement of similarity based on color. On this basis, the extraction of color histogram features is discussed in particular. Considering the advantages and disadvantages of the overall histogram and the partition histogram, a new method based on the partition-overall histogram is proposed. The basic idea is to divide the image space according to a certain strategy and then calculate the color histogram of each block as the color feature of that block. Users choose the blocks that contain important spatial information and confirm their weights. The system calculates the distance between the corresponding blocks that users chose. The other blocks are merged into partial overall histograms, and their distance is calculated as well. All the distances are then accumulated as the real distance between two pictures. The partition-overall histogram combines the advantages of the two methods above: choosing blocks makes the feature contain more spatial information, which can improve performance, and the distances between partition-overall histograms are invariant to rotation and translation. The HSV color space, which suits the characteristics of human vision, is used to represent the color characteristics of an image. Taking advantage of human color perception, the method quantizes the color space with unequal intervals to obtain a feature vector. Finally, it matches image similarity using histogram intersection and the partition-overall histogram. Users can choose an example image to express the visual query, and can also adjust several weights through relevance feedback to obtain the best search result. An image retrieval system based on these approaches is presented. The experimental results show that image retrieval based on the partition-overall histogram can keep the spatial distribution information while extracting color features efficiently, and that it is superior to normal color histograms in precision rate when searching. The query precision rate is more than 95%. In addition, the efficient block representation lowers the complexity of the images to be searched, and thus increases search efficiency. The image retrieval algorithm based on the partition-overall histogram proposed in the paper is efficient and effective.
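The matching step described above can be sketched as follows. The first function is the standard histogram intersection, normalised here by the query histogram; the second accumulates a weighted distance over corresponding blocks. How the paper actually weights and merges blocks is not given in the abstract, so `block_distance` and its `weights` argument are hypothetical.

```python
def histogram_intersection(h_query, h_image):
    """Similarity in [0, 1] for two non-negative histograms of equal length."""
    overlap = sum(min(q, i) for q, i in zip(h_query, h_image))
    return overlap / sum(h_query)

def block_distance(blocks_a, blocks_b, weights):
    """Weighted sum of (1 - intersection) over corresponding blocks.

    Assumption: each selected block has a user-assigned weight; the
    abstract does not give the exact accumulation rule.
    """
    return sum(w * (1.0 - histogram_intersection(a, b))
               for a, b, w in zip(blocks_a, blocks_b, weights))
```

For example, `histogram_intersection([2, 2], [1, 3])` evaluates to 0.75, because the bin-wise minima are 1 and 2 and the query histogram sums to 4.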
Using Generalized Annotated Programs to Solve Social Network Diffusion Optimization Problems

DTIC Science & Technology

2013-01-01

as follows: —Let kall be the k value for the SNDOP-ALL query and for each SNDOP query i, let ki be the k for that query. For each query i, set ki... kall − 1. —Number each element of vi ∈ V such that gI(vi) and V C(vi) are true. For the ith SNDOP query, let vi be the corresponding element of V —Let...vertices of S. PROOF. We set up |V | SNDOP-queries as follows: —Let kall be the k value for the SNDOP-ALL query and for each SNDOP-query i, let ki be
Dynamic Histogram Analysis To Determine Free Energies and Rates from Biased Simulations.

PubMed

Stelzl, Lukas S; Kells, Adam; Rosta, Edina; Hummer, Gerhard

2017-12-12

We present an algorithm to calculate free energies and rates from molecular simulations on biased potential energy surfaces. As input, it uses the accumulated times spent in each state or bin of a histogram and counts of transitions between them. Optimal unbiased equilibrium free energies for each of the states/bins are then obtained by maximizing the likelihood of a master equation (i.e., first-order kinetic rate model). The resulting free energies also determine the optimal rate coefficients for transitions between the states or bins on the biased potentials. Unbiased rates can be estimated, e.g., by imposing a linear free energy condition in the likelihood maximization. The resulting "dynamic histogram analysis method extended to detailed balance" (DHAMed) builds on the DHAM method. It is also closely related to the transition-based reweighting analysis method (TRAM) and the discrete TRAM (dTRAM). However, in the continuous-time formulation of DHAMed, the detailed balance constraints are more easily accounted for, resulting in compact expressions amenable to efficient numerical treatment. DHAMed produces accurate free energies in cases where the common weighted-histogram analysis method (WHAM) for umbrella sampling fails because of slow dynamics within the windows. Even in the limit of completely uncorrelated data, where WHAM is optimal in the maximum-likelihood sense, DHAMed results are nearly indistinguishable. We illustrate DHAMed with applications to ion channel conduction, RNA duplex formation, α-helix folding, and rate calculations from accelerated molecular dynamics. DHAMed can also be used to construct Markov state models from biased or replica-exchange molecular dynamics simulations. By using binless WHAM formulated as a numerical minimization problem, the bias factors for the individual states can be determined efficiently in a preprocessing step and, if needed, optimized globally afterward.

Automatic classification and detection of clinically relevant images for diabetic retinopathy

NASA Astrophysics Data System (ADS)

Xu, Xinyu; Li, Baoxin

2008-03-01

We proposed a novel approach to automatic classification of Diabetic Retinopathy (DR) images and retrieval of clinically-relevant DR images from a database. Given a query image, our approach first classifies the image into one of the three categories: microaneurysm (MA), neovascularization (NV) and normal, and then it retrieves DR images that are clinically-relevant to the query image from an archival image database. In the classification stage, the query DR images are classified by the Multi-class Multiple-Instance Learning (McMIL) approach, where images are viewed as bags, each of which contains a number of instances corresponding to non-overlapping blocks, and each block is characterized by low-level features including color, texture, histogram of edge directions, and shape. McMIL first learns a collection of instance prototypes for each class that maximizes the Diverse Density function using the Expectation-Maximization algorithm. A nonlinear mapping is then defined using the instance prototypes and maps every bag to a point in a new multi-class bag feature space. Finally a multi-class Support Vector Machine is trained in the multi-class bag feature space. In the retrieval stage, we retrieve images from the archival database that bear the same label as the query image and that are among the top K nearest neighbors of the query image in terms of similarity in the multi-class bag feature space. The classification approach achieves high classification accuracy, and the retrieval of clinically-relevant images not only facilitates utilization of the vast amount of hidden diagnostic knowledge in the database, but also improves the efficiency and accuracy of DR lesion diagnosis and assessment.
Query Auto-Completion Based on Word2vec Semantic Similarity

NASA Astrophysics Data System (ADS)

Shao, Taihua; Chen, Honghui; Chen, Wanyu

2018-04-01

Query auto-completion (QAC) is the first step of information retrieval, which helps users formulate the entire query after inputting only a few prefixes. Regarding the models of QAC, the traditional method ignores the contribution from the semantic relevance between queries. However, similar queries always express extremely similar search intention. In this paper, we propose a hybrid model FS-QAC based on query semantic similarity as well as the query frequency. We choose word2vec method to measure the semantic similarity between intended queries and pre-submitted queries. By combining both features, our experiments show that FS-QAC model improves the performance when predicting the user’s query intention and helping formulate the right query. Our experimental results show that the optimal hybrid model contributes to a 7.54% improvement in terms of MRR against a state-of-the-art baseline using the public AOL query logs.
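The reported metric, mean reciprocal rank (MRR), averages 1/rank of the intended query within each ranked completion list and counts a miss as 0. A minimal sketch follows; the function and argument names are hypothetical and not taken from the FS-QAC code.

```python
def mean_reciprocal_rank(ranked_lists, targets):
    """Average of 1/rank of each target in its candidate list (0 if absent)."""
    total = 0.0
    for candidates, target in zip(ranked_lists, targets):
        if target in candidates:
            total += 1.0 / (candidates.index(target) + 1)
    return total / len(targets)
```

For example, if the intended queries appear at rank 2, at rank 1, and not at all across three completion lists, the MRR is (0.5 + 1 + 0) / 3 = 0.5.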
Semi-automatic feedback using concurrence between mixture vectors for general databases

NASA Astrophysics Data System (ADS)

Larabi, Mohamed-Chaker; Richard, Noel; Colot, Olivier; Fernandez-Maloigne, Christine

2001-12-01

This paper describes how a query system can exploit basic knowledge by employing semi-automatic relevance feedback to refine queries and runtimes. For general databases, it is often useless to call on complex attributes, because we do not have sufficient information about the images in the database. Moreover, these images can be topologically very different from one another, and an attribute that is powerful for one database category may be very weak for the other categories. The idea is to use very simple features, such as color histograms, correlograms, and Color Coherence Vectors (CCV), to fill out the signature vector. Then, a number of mixture vectors are prepared depending on the number of very distinctive categories in the database. A mixture vector is a vector containing the weight of each attribute that will be used to compute a similarity distance. We post a query to the database using all the previously defined mixture vectors in succession. We then retain the first N images for each vector in order to build a mapping using the following information: Is image I present in the results of several mixture vectors? What is its rank in the results? This information allows us to switch the system to unsupervised relevance feedback or to user feedback (supervised feedback).
Query-Time Optimization Techniques for Structured Queries in Information Retrieval

ERIC Educational Resources Information Center

Cartright, Marc-Allen

2013-01-01

The use of information retrieval (IR) systems is evolving towards larger, more complicated queries. Both the IR industrial and research communities have generated significant evidence indicating that in order to continue improving retrieval effectiveness, increases in retrieval model complexity may be unavoidable. From an operational perspective,…
Estimating Influenza Outbreaks Using Both Search Engine Query Data and Social Media Data in South Korea.

PubMed

Woo, Hyekyung; Cho, Youngtae; Shim, Eunyoung; Lee, Jong-Koo; Lee, Chang-Gun; Kim, Seong Hwan

2016-07-04

As suggested as early as in 2006, logs of queries submitted to search engines seeking information could be a source for detection of emerging influenza epidemics if changes in the volume of search queries are monitored (infodemiology). However, selecting queries that are most likely to be associated with influenza epidemics is a particular challenge when it comes to generating better predictions. In this study, we describe a methodological extension for detecting influenza outbreaks using search query data; we provide a new approach for query selection through the exploration of contextual information gleaned from social media data. Additionally, we evaluate whether it is possible to use these queries for monitoring and predicting influenza epidemics in South Korea. Our study was based on freely available weekly influenza incidence data and query data originating from the search engine on the Korean website Daum between April 3, 2011 and April 5, 2014. To select queries related to influenza epidemics, several approaches were applied: (1) exploring influenza-related words in social media data, (2) identifying the chief concerns related to influenza, and (3) using Web query recommendations. Optimal feature selection by least absolute shrinkage and selection operator (Lasso) and support vector machine for regression (SVR) were used to construct a model predicting influenza epidemics. In total, 146 queries related to influenza were generated through our initial query selection approach. A considerable proportion of optimal features for final models were derived from queries with reference to the social media data. The SVR model performed well: the prediction values were highly correlated with the recent observed influenza-like illness (r=.956; P<.001) and virological incidence rate (r=.963; P<.001). These results demonstrate the feasibility of using search queries to enhance influenza surveillance in South Korea. 
In addition, an approach for query selection using social media data seems ideal for supporting influenza surveillance based on search query data.
Estimating Influenza Outbreaks Using Both Search Engine Query Data and Social Media Data in South Korea

PubMed Central

Woo, Hyekyung; Shim, Eunyoung; Lee, Jong-Koo; Lee, Chang-Gun; Kim, Seong Hwan

2016-01-01

Background As suggested as early as in 2006, logs of queries submitted to search engines seeking information could be a source for detection of emerging influenza epidemics if changes in the volume of search queries are monitored (infodemiology). However, selecting queries that are most likely to be associated with influenza epidemics is a particular challenge when it comes to generating better predictions. Objective In this study, we describe a methodological extension for detecting influenza outbreaks using search query data; we provide a new approach for query selection through the exploration of contextual information gleaned from social media data. Additionally, we evaluate whether it is possible to use these queries for monitoring and predicting influenza epidemics in South Korea. Methods Our study was based on freely available weekly influenza incidence data and query data originating from the search engine on the Korean website Daum between April 3, 2011 and April 5, 2014. To select queries related to influenza epidemics, several approaches were applied: (1) exploring influenza-related words in social media data, (2) identifying the chief concerns related to influenza, and (3) using Web query recommendations. Optimal feature selection by least absolute shrinkage and selection operator (Lasso) and support vector machine for regression (SVR) were used to construct a model predicting influenza epidemics. Results In total, 146 queries related to influenza were generated through our initial query selection approach. A considerable proportion of optimal features for final models were derived from queries with reference to the social media data. The SVR model performed well: the prediction values were highly correlated with the recent observed influenza-like illness (r=.956; P<.001) and virological incidence rate (r=.963; P<.001). Conclusions These results demonstrate the feasibility of using search queries to enhance influenza surveillance in South Korea. 
In addition, an approach for query selection using social media data seems ideal for supporting influenza surveillance based on search query data. PMID:27377323
A Framework for WWW Query Processing

NASA Technical Reports Server (NTRS)

Wu, Binghui Helen; Wharton, Stephen (Technical Monitor)

2000-01-01

Query processing is the most common operation in a DBMS. Sophisticated query processing has been mainly targeted at a single enterprise environment providing centralized control over data and metadata. Queries submitted by anonymous users on the web are different in that load balancing or DBMS access control becomes the key issue. This paper provides a solution by introducing a framework for WWW query processing. The success of this framework lies in the utilization of query optimization techniques and the ontological approach. This methodology has proved to be cost effective at the NASA Goddard Space Flight Center Distributed Active Archive Center (GDAAC).
Agent-Based Framework for Discrete Entity Simulations

DTIC Science & Technology

2006-11-01

Postgres database server for environment queries of neighbors and continuum data. As expected for raw database queries (no database optimizations in...form. Eventually the code was ported to GNU C++ on the same single Intel Pentium 4 CPU running RedHat Linux 9.0 and Postgres database server...Again Postgres was used for environmental queries, and the tool remained relatively slow because of the immense number of queries necessary to assess
A New Framework for Textual Information Mining over Parse Trees. CRESST Report 805

ERIC Educational Resources Information Center

Mousavi, Hamid; Kerr, Deirdre; Iseli, Markus R.

2011-01-01

Textual information mining is a challenging problem that has resulted in the creation of many different rule-based linguistic query languages. However, these languages generally are not optimized for the purpose of text mining. In other words, they usually consider queries as individuals and only return raw results for each query. Moreover they…
Histogram Analysis of Diffusion Weighted Imaging at 3T is Useful for Prediction of Lymphatic Metastatic Spread, Proliferative Activity, and Cellularity in Thyroid Cancer.

PubMed

Schob, Stefan; Meyer, Hans Jonas; Dieckow, Julia; Pervinder, Bhogal; Pazaitis, Nikolaos; Höhn, Anne Kathrin; Garnov, Nikita; Horvath-Rizea, Diana; Hoffmann, Karl-Titus; Surov, Alexey

2017-04-12

Pre-surgical diffusion weighted imaging (DWI) is increasingly important in the context of thyroid cancer for identification of the optimal treatment strategy. It has exemplarily been shown that DWI at 3T can distinguish undifferentiated from well-differentiated thyroid carcinoma, which has decisive implications for the magnitude of surgery. This study used DWI histogram analysis of whole tumor apparent diffusion coefficient (ADC) maps. The primary aim was to discriminate thyroid carcinomas which had already gained the capacity to metastasize lymphatically from those not yet being able to spread via the lymphatic system. The secondary aim was to reflect prognostically important tumor-biological features like cellularity and proliferative activity with ADC histogram analysis. Fifteen patients with follicular-cell derived thyroid cancer were enrolled. Lymph node status, extent of infiltration of surrounding tissue, and Ki-67 and p53 expression were assessed in these patients. DWI was obtained in a 3T system using b values of 0, 400, and 800 s/mm². Whole tumor ADC volumes were analyzed using a histogram-based approach. Several ADC parameters showed significant correlations with immunohistopathological parameters. Most importantly, ADC histogram skewness and ADC histogram kurtosis were able to differentiate between nodal negative and nodal positive thyroid carcinoma. Histogram analysis of whole ADC tumor volumes has the potential to provide valuable information on tumor biology in thyroid carcinoma. However, further studies are warranted.
Histogram Analysis of Diffusion Weighted Imaging at 3T is Useful for Prediction of Lymphatic Metastatic Spread, Proliferative Activity, and Cellularity in Thyroid Cancer

PubMed Central

Schob, Stefan; Meyer, Hans Jonas; Dieckow, Julia; Pervinder, Bhogal; Pazaitis, Nikolaos; Höhn, Anne Kathrin; Garnov, Nikita; Horvath-Rizea, Diana; Hoffmann, Karl-Titus; Surov, Alexey

2017-01-01

Pre-surgical diffusion weighted imaging (DWI) is increasingly important in the context of thyroid cancer for identification of the optimal treatment strategy. It has exemplarily been shown that DWI at 3T can distinguish undifferentiated from well-differentiated thyroid carcinoma, which has decisive implications for the magnitude of surgery. This study used DWI histogram analysis of whole tumor apparent diffusion coefficient (ADC) maps. The primary aim was to discriminate thyroid carcinomas which had already gained the capacity to metastasize lymphatically from those not yet being able to spread via the lymphatic system. The secondary aim was to reflect prognostically important tumor-biological features like cellularity and proliferative activity with ADC histogram analysis. Fifteen patients with follicular-cell derived thyroid cancer were enrolled. Lymph node status, extent of infiltration of surrounding tissue, and Ki-67 and p53 expression were assessed in these patients. DWI was obtained in a 3T system using b values of 0, 400, and 800 s/mm². Whole tumor ADC volumes were analyzed using a histogram-based approach. Several ADC parameters showed significant correlations with immunohistopathological parameters. Most importantly, ADC histogram skewness and ADC histogram kurtosis were able to differentiate between nodal negative and nodal positive thyroid carcinoma. Conclusions: Histogram analysis of whole ADC tumor volumes has the potential to provide valuable information on tumor biology in thyroid carcinoma. However, further studies are warranted. PMID:28417929
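The skewness and kurtosis parameters named in this abstract can be illustrated with a minimal sketch. The abstract does not say which estimator the authors used, so the population-moment definitions below, the function name `histogram_moments`, and the sample values are all assumptions made for illustration.

```python
def histogram_moments(values):
    """Return (mean, skewness, excess kurtosis) from population moments.

    Assumption: plain population moments; the study may have used a
    different (e.g. bias-corrected) estimator.
    """
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return mean, skewness, excess_kurtosis


# Hypothetical voxel values standing in for a whole-tumor ADC volume.
adc_values = [0.8, 0.9, 1.0, 1.0, 1.1, 1.2, 2.0]
mean, skewness, excess_kurtosis = histogram_moments(adc_values)
```

A positive skewness indicates a longer tail toward high values; how either parameter relates to nodal status is a finding of the study and is not encoded here.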
Facilitating Cohort Discovery by Enhancing Ontology Exploration, Query Management and Query Sharing for Large Clinical Data Repositories.

PubMed

Tao, Shiqiang; Cui, Licong; Wu, Xi; Zhang, Guo-Qiang

2017-01-01

To help researchers better access clinical data, we developed a prototype query engine called DataSphere for exploring large-scale integrated clinical data repositories. DataSphere expedites data importing using a NoSQL data management system and dynamically renders its user interface for concept-based querying tasks. DataSphere provides an interactive query-building interface together with query translation and optimization strategies, which enable users to build and execute queries effectively and efficiently. We successfully loaded a dataset of one million patients for University of Kentucky (UK) Healthcare into DataSphere with more than 300 million clinical data records. We evaluated DataSphere by comparing it with an instance of i2b2 deployed at UK Healthcare, demonstrating that DataSphere provides enhanced user experience for both query building and execution.
Facilitating Cohort Discovery by Enhancing Ontology Exploration, Query Management and Query Sharing for Large Clinical Data Repositories

PubMed Central

Tao, Shiqiang; Cui, Licong; Wu, Xi; Zhang, Guo-Qiang

2017-01-01

To help researchers better access clinical data, we developed a prototype query engine called DataSphere for exploring large-scale integrated clinical data repositories. DataSphere expedites data importing using a NoSQL data management system and dynamically renders its user interface for concept-based querying tasks. DataSphere provides an interactive query-building interface together with query translation and optimization strategies, which enable users to build and execute queries effectively and efficiently. We successfully loaded a dataset of one million patients for University of Kentucky (UK) Healthcare into DataSphere with more than 300 million clinical data records. We evaluated DataSphere by comparing it with an instance of i2b2 deployed at UK Healthcare, demonstrating that DataSphere provides enhanced user experience for both query building and execution. PMID:29854239
Poster — Thur Eve — 69: Computational Study of DVH-guided Cancer Treatment Planning Optimization Methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ghomi, Pooyan Shirvani; Zinchenko, Yuriy

2014-08-15

Purpose: To compare methods to incorporate the Dose Volume Histogram (DVH) curves into the treatment planning optimization. Method: The performance of three methods, namely, the conventional Mixed Integer Programming (MIP) model, a convex moment-based constrained optimization approach, and an unconstrained convex moment-based penalty approach, is compared using anonymized data of a prostate cancer patient. Three plans were generated using the corresponding optimization models. Four Organs at Risk (OARs) and one Tumor were involved in the treatment planning. The OARs and Tumor were discretized into a total of 50,221 voxels. The number of beamlets was 943. We used the commercially available optimization software Gurobi and Matlab to solve the models. Plan comparison was done by recording the model runtime followed by visual inspection of the resulting dose volume histograms. Conclusion: We demonstrate the effectiveness of the moment-based approaches to replicate the set of prescribed DVH curves. The unconstrained convex moment-based penalty approach is concluded to have the greatest potential to reduce the computational effort and holds a promise of substantial computational speed up.
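For readers unfamiliar with the DVH curves compared in this record, the following is a minimal sketch of a cumulative dose-volume histogram. It assumes equal-volume voxels; the function name `cumulative_dvh` and the dose values are hypothetical, and the sketch shows only how a DVH is read off a dose distribution, not any of the three optimization models.

```python
def cumulative_dvh(voxel_doses, dose_levels):
    """Fraction of a structure's voxels receiving at least each dose level.

    Assumption: every voxel has the same volume, so volume fraction
    equals voxel-count fraction.
    """
    n = len(voxel_doses)
    return [sum(1 for d in voxel_doses if d >= level) / n
            for level in dose_levels]


# Hypothetical doses (Gy) for a four-voxel structure.
doses = [10.0, 20.0, 30.0, 40.0]
levels = [0.0, 15.0, 35.0, 50.0]
dvh = cumulative_dvh(doses, levels)  # [1.0, 0.75, 0.25, 0.0]
```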
Nanocubes for real-time exploration of spatiotemporal datasets.

PubMed

Lins, Lauro; Klosowski, James T; Scheidegger, Carlos

2013-12-01

Consider real-time exploration of large multidimensional spatiotemporal datasets with billions of entries, each defined by a location, a time, and other attributes. Are certain attributes correlated spatially or temporally? Are there trends or outliers in the data? Answering these questions requires aggregation over arbitrary regions of the domain and attributes of the data. Many relational databases implement the well-known data cube aggregation operation, which in a sense precomputes every possible aggregate query over the database. Data cubes are sometimes assumed to take a prohibitively large amount of space, and to consequently require disk storage. In contrast, we show how to construct a data cube that fits in a modern laptop's main memory, even for billions of entries; we call this data structure a nanocube. We present algorithms to compute and query a nanocube, and show how it can be used to generate well-known visual encodings such as heatmaps, histograms, and parallel coordinate plots. When compared to exact visualizations created by scanning an entire dataset, nanocube plots have bounded screen error across a variety of scales, thanks to a hierarchical structure in space and time. We demonstrate the effectiveness of our technique on a variety of real-world datasets, and present memory, timing, and network bandwidth measurements. We find that the timings for the queries in our examples are dominated by network and user-interaction latencies.
Query optimization for graph analytics on linked data using SPARQL

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hong, Seokyong; Lee, Sangkeun; Lim, Seung-Hwan

2015-07-01

Triplestores that support query languages such as SPARQL are emerging as the preferred and scalable solution to represent data and meta-data as massive heterogeneous graphs using Semantic Web standards. With increasing adoption, the desire to conduct graph-theoretic mining and exploratory analysis has also increased. Addressing that desire, this paper presents a solution that is the marriage of Graph Theory and the Semantic Web. We present software that can analyze Linked Data using graph operations such as counting triangles, finding eccentricity, testing connectedness, and computing PageRank directly on triple stores via the SPARQL interface. We describe the process of optimizing performance of the SPARQL-based implementation of such popular graph algorithms by reducing the space-overhead, simplifying iterative complexity and removing redundant computations by understanding query plans. Our optimized approach shows significant performance gains on triplestores hosted on stand-alone workstations as well as hardware-optimized scalable supercomputers such as the Cray XMT.
Optimizing Interactive Development of Data-Intensive Applications

PubMed Central

Interlandi, Matteo; Tetali, Sai Deep; Gulzar, Muhammad Ali; Noor, Joseph; Condie, Tyson; Kim, Miryung; Millstein, Todd

2017-01-01

Modern Data-Intensive Scalable Computing (DISC) systems are designed to process data through batch jobs that execute programs (e.g., queries) compiled from a high-level language. These programs are often developed interactively by posing ad-hoc queries over the base data until a desired result is generated. We observe that there can be significant overlap in the structure of these queries used to derive the final program. Yet, each successive execution of a slightly modified query is performed anew, which can significantly increase the development cycle. Vega is an Apache Spark framework that we have implemented for optimizing a series of similar Spark programs, likely originating from a development or exploratory data analysis session. Spark developers (e.g., data scientists) can leverage Vega to significantly reduce the amount of time it takes to re-execute a modified Spark program, reducing the overall time to market for their Big Data applications. PMID:28405637
Numerically accurate computational techniques for optimal estimator analyses of multi-parameter models

NASA Astrophysics Data System (ADS)

Berger, Lukas; Kleinheinz, Konstantin; Attili, Antonio; Bisetti, Fabrizio; Pitsch, Heinz; Mueller, Michael E.

2018-05-01

Modelling unclosed terms in partial differential equations typically involves two steps: First, a set of known quantities needs to be specified as input parameters for a model, and second, a specific functional form needs to be defined to model the unclosed terms by the input parameters. Both steps involve a certain modelling error, with the former known as the irreducible error and the latter referred to as the functional error. Typically, only the total modelling error, which is the sum of functional and irreducible error, is assessed, but the concept of the optimal estimator enables the separate analysis of the total and the irreducible errors, yielding a systematic modelling error decomposition. In this work, attention is paid to the techniques themselves required for the practical computation of irreducible errors. Typically, histograms are used for optimal estimator analyses, but this technique is found to add a non-negligible spurious contribution to the irreducible error if models with multiple input parameters are assessed. Thus, the error decomposition of an optimal estimator analysis becomes inaccurate, and misleading conclusions concerning modelling errors may be drawn. In this work, numerically accurate techniques for optimal estimator analyses are identified and a suitable evaluation of irreducible errors is presented. Four different computational techniques are considered: a histogram technique, artificial neural networks, multivariate adaptive regression splines, and an additive model based on a kernel method. For multiple input parameter models, only artificial neural networks and multivariate adaptive regression splines are found to yield satisfactorily accurate results. Beyond a certain number of input parameters, the assessment of models in an optimal estimator analysis even becomes practically infeasible if histograms are used. 
The optimal estimator analysis in this paper is applied to modelling the filtered soot intermittency in large eddy simulations using a dataset of a direct numerical simulation of a non-premixed sooting turbulent flame.
An incremental database access method for autonomous interoperable databases

NASA Technical Reports Server (NTRS)

Roussopoulos, Nicholas; Sellis, Timos

1994-01-01

We investigated a number of design and performance issues of interoperable database management systems (DBMS's). The major results of our investigation were obtained in the areas of client-server database architectures for heterogeneous DBMS's, incremental computation models, buffer management techniques, and query optimization. We finished a prototype of an advanced client-server workstation-based DBMS which allows access to multiple heterogeneous commercial DBMS's. Experiments and simulations were then run to compare its performance with the standard client-server architectures. The focus of this research was on adaptive optimization methods of heterogeneous database systems. Adaptive buffer management accounts for the random and object-oriented access methods for which no known characterization of the access patterns exists. Adaptive query optimization means that value distributions and selectivities, which play the most significant role in query plan evaluation, are continuously refined to reflect the actual values as opposed to static ones that are computed off-line. Query feedback is a concept that was first introduced to the literature by our group. We employed query feedback for both adaptive buffer management and for computing value distributions and selectivities. For adaptive buffer management, we use the page faults of prior executions to achieve more 'informed' management decisions. For the estimation of the distributions of the selectivities, we use curve-fitting techniques, such as least squares and splines, for regressing on these values.
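The idea of refining selectivity estimates from query feedback by least-squares curve fitting can be sketched as follows. This is not the authors' method: it assumes the simplest possible model, a straight line fitted to the observed fraction of rows returned by predicates of the form `col <= v`, and every name and number in it is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_selectivity(slope, intercept, low, high):
    """Estimated fraction of rows with low < col <= high, clamped to [0, 1]."""
    fraction = (slope * high + intercept) - (slope * low + intercept)
    return min(1.0, max(0.0, fraction))


# Hypothetical feedback from past queries: (bound v of `col <= v`,
# observed fraction of rows actually returned).
feedback = [(10, 0.1), (50, 0.5), (90, 0.9)]
bounds, fractions = zip(*feedback)
slope, intercept = fit_line(bounds, fractions)
selectivity = estimate_selectivity(slope, intercept, 20, 40)
```

Refitting as each new observation arrives keeps the estimate tied to actual query results rather than to statistics computed off-line; a spline or higher-order fit would replace `fit_line` for skewed distributions.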
Query Optimization by Semantic Reasoning.

DTIC Science & Technology

1981-05-01

condition holds, then formulas X and Y are said to be merge-compatible. Let xi be the variable in X that corresponds to variable yj in Y (x is not...Davidson, Ramez El-Masri, Sheldon Finkelstein, Hector Garcia, Mohammed Olumi, Tom Rogers, Neil Rowe, David Shaw, and Kyu-Young Whang. Special credit...for the simple queries, along with cost formulas and applicability conditions for the methods. Most recently has come the development of optimizers for

Parallel multi-join query optimization algorithm for distributed sensor network in the internet of things

NASA Astrophysics Data System (ADS)

Zheng, Yan

2015-03-01

The Internet of Things (IoT), which focuses on providing users with information exchange and intelligent control, has attracted a great deal of research attention worldwide since the beginning of this century. An IoT consists of a large number of sensor nodes and data processing units, and its most important features are constrained energy, efficient communication, and high redundancy. As the number of sensor nodes grows, communication efficiency and available communication bandwidth become bottlenecks. Much existing work assumes queries with only a small number of joins, which does not suit the growing number of multi-join queries across the IoT. To improve communication efficiency between parallel units in a distributed sensor network, this paper proposes a parallel query optimization algorithm based on a distribution attributes cost graph. The algorithm considers the storage information relations and the network communication cost, and establishes an optimized information changing rule. Experimental results show that the algorithm performs well and makes effective use of the resources of each node in the distributed sensor network, so the execution efficiency of multi-join queries between different nodes can be improved.
A knowledge-based approach to improving and homogenizing intensity modulated radiation therapy planning quality among treatment centers: an example application to prostate cancer planning.

PubMed

Good, David; Lo, Joseph; Lee, W Robert; Wu, Q Jackie; Yin, Fang-Fang; Das, Shiva K

2013-09-01

Intensity modulated radiation therapy (IMRT) treatment planning can have wide variation among different treatment centers. We propose a system to leverage the IMRT planning experience of larger institutions to automatically create high-quality plans for outside clinics. We explore feasibility by generating plans for patient datasets from an outside institution by adapting plans from our institution. A knowledge database was created from 132 IMRT treatment plans for prostate cancer at our institution. The outside institution, a community hospital, provided the datasets for 55 prostate cancer cases, including their original treatment plans. For each "query" case from the outside institution, a similar "match" case was identified in the knowledge database, and the match case's plan parameters were then adapted and optimized to the query case by use of a semiautomated approach that required no expert planning knowledge. The plans generated with this knowledge-based approach were compared with the original treatment plans at several dose cutpoints. Compared with the original plan, the knowledge-based plan had a significantly more homogeneous dose to the planning target volume and a significantly lower maximum dose. The volumes of the rectum, bladder, and femoral heads above all cutpoints were nominally lower for the knowledge-based plan; the reductions were significantly lower for the rectum. In 40% of cases, the knowledge-based plan had overall superior (lower) dose-volume histograms for rectum and bladder; in 54% of cases, the comparison was equivocal; in 6% of cases, the knowledge-based plan was inferior for both bladder and rectum. Knowledge-based planning was superior or equivalent to the original plan in 95% of cases. The knowledge-based approach shows promise for homogenizing plan quality by transferring planning expertise from more experienced to less experienced institutions. Copyright © 2013 Elsevier Inc. All rights reserved.
Improving the convergence rate in affine registration of PET and SPECT brain images using histogram equalization.

PubMed

Salas-Gonzalez, D; Górriz, J M; Ramírez, J; Padilla, P; Illán, I A

2013-01-01

A procedure to improve the convergence rate for affine registration methods of medical brain images when the images differ greatly from the template is presented. The methodology is based on a histogram matching of the source images with respect to the reference brain template before proceeding with the affine registration. The preprocessed source brain images are spatially normalized to a template using a general affine model with 12 parameters. A sum of squared differences between the source images and the template is considered as objective function, and a Gauss-Newton optimization algorithm is used to find the minimum of the cost function. Using histogram equalization as a preprocessing step improves the convergence rate in the affine registration algorithm of brain images as we show in this work using SPECT and PET brain images.
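The preprocessing step described above, matching the intensity histogram of a source image to that of a reference template, can be sketched with cumulative distribution functions. This is a generic textbook form of histogram matching on integer intensity levels, not the authors' implementation; the function names and the flat pixel lists are assumptions for illustration.

```python
from bisect import bisect_left


def cumulative_distribution(pixels, levels):
    """Empirical CDF of integer intensities in the range 0..levels-1."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running / len(pixels))
    return cdf


def match_histogram(source, reference, levels=256):
    """Remap source intensities so their CDF follows the reference CDF."""
    src_cdf = cumulative_distribution(source, levels)
    ref_cdf = cumulative_distribution(reference, levels)
    # For each source level, take the first reference level whose CDF
    # reaches the source level's CDF value.
    mapping = [min(bisect_left(ref_cdf, src_cdf[s]), levels - 1)
               for s in range(levels)]
    return [mapping[p] for p in source]


# Hypothetical flattened images with four intensity levels.
matched = match_histogram([0, 0, 1, 1], [2, 2, 3, 3], levels=4)
```

After this step the source and template share a similar intensity distribution, which is the condition the abstract credits for faster convergence of the sum-of-squared-differences registration.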
Parasol: An Architecture for Cross-Cloud Federated Graph Querying

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lieberman, Michael; Choudhury, Sutanay; Hughes, Marisa

2014-06-22

Large scale data fusion of multiple datasets can often provide insights that examining datasets individually cannot. However, when these datasets reside in different data centers and cannot be collocated due to technical, administrative, or policy barriers, a unique set of problems arise that hamper querying and data fusion. To address these problems, a system and architecture named Parasol is presented that enables federated queries over graph databases residing in multiple clouds. Parasol’s design is flexible and requires only minimal assumptions for participant clouds. Query optimization techniques are also described that are compatible with Parasol’s lightweight architecture. Experiments on a prototype implementation of Parasol indicate its suitability for cross-cloud federated graph queries.
Demonstration of Hadoop-GIS: A Spatial Data Warehousing System Over MapReduce.

PubMed

Aji, Ablimit; Sun, Xiling; Vo, Hoang; Liu, Qioaling; Lee, Rubao; Zhang, Xiaodong; Saltz, Joel; Wang, Fusheng

2013-11-01

The proliferation of GPS-enabled devices and the rapid improvement of scientific instruments have resulted in massive amounts of spatial data in the last decade. Support of high performance spatial queries on large volumes of data has become increasingly important in numerous fields, which requires a scalable and efficient spatial data warehousing solution as existing approaches exhibit scalability limitations and efficiency bottlenecks for large scale spatial applications. In this demonstration, we present Hadoop-GIS - a scalable and high performance spatial query system over MapReduce. Hadoop-GIS provides an efficient spatial query engine to process spatial queries, data and space based partitioning, and query pipelines that parallelize queries implicitly on MapReduce. Hadoop-GIS also provides an expressive, SQL-like spatial query language for workload specification. We will demonstrate how spatial queries are expressed in spatially extended SQL queries, and submitted through a command line/web interface for execution. Parallel to our system demonstration, we explain the system architecture and details on how queries are translated to MapReduce operators, optimized, and executed on Hadoop. In addition, we will showcase how the system can be used to support two representative real world use cases: large scale pathology analytical imaging, and geo-spatial data warehousing.
Usage of the Jess Engine, Rules and Ontology to Query a Relational Database

NASA Astrophysics Data System (ADS)

Bak, Jaroslaw; Jedrzejek, Czeslaw; Falkowski, Maciej

We present a prototypical implementation of a library tool, the Semantic Data Library (SDL), which integrates the Jess (Java Expert System Shell) engine, rules and ontology to query a relational database. The tool extends the functionality of the previous OWL2Jess with SWRL implementations and takes full advantage of the Jess engine by separating forward and backward reasoning. The optimized integration of all these technologies is an advancement over previous tools. We discuss the complexity of the query algorithm. As a demonstration of the capability of the SDL library, we execute queries using a crime ontology which is being developed in the Polish PPBW project.
A survey and evaluations of histogram-based statistics in alignment-free sequence comparison.

PubMed

Luczak, Brian B; James, Benjamin T; Girgis, Hani Z

2017-12-06

Since the dawn of the bioinformatics field, sequence alignment scores have been the main method for comparing sequences. However, alignment algorithms are quadratic, requiring long execution time. As alternatives, scientists have developed tens of alignment-free statistics for measuring the similarity between two sequences. We surveyed tens of alignment-free k-mer statistics. Additionally, we evaluated 33 statistics and multiplicative combinations between the statistics and/or their squares. These statistics are calculated on two k-mer histograms representing two sequences. Our evaluations using global alignment scores revealed that the majority of the statistics are sensitive and capable of finding similar sequences to a query sequence. Therefore, any of these statistics can filter out dissimilar sequences quickly. Further, we observed that multiplicative combinations of the statistics are highly correlated with the identity score. Furthermore, combinations involving sequence length difference or Earth Mover's distance, which takes the length difference into account, are always among the highest correlated paired statistics with identity scores. Similarly, paired statistics including length difference or Earth Mover's distance are among the best performers in finding the K-closest sequences. Interestingly, similar performance can be obtained using histograms of shorter words, resulting in reducing the memory requirement and increasing the speed remarkably. Moreover, we found that simple single statistics are sufficient for processing next-generation sequencing reads and for applications relying on local alignment. Finally, we measured the time requirement of each statistic. The survey and the evaluations will help scientists with identifying efficient alternatives to the costly alignment algorithm, saving thousands of computational hours. The source code of the benchmarking tool is available as Supplementary Materials. © The Author 2017. 
Published by Oxford University Press.
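The k-mer histograms on which these statistics are computed can be sketched in a few lines. Manhattan distance is used below only as a simple stand-in for a histogram statistic; the abstract does not list the 33 statistics evaluated, and the function names and sequences are hypothetical.

```python
from collections import Counter


def kmer_histogram(sequence, k):
    """Count every overlapping word of length k in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def manhattan_distance(hist_a, hist_b):
    """Sum of absolute count differences over all words seen in either histogram."""
    words = set(hist_a) | set(hist_b)
    # Counter returns 0 for words that are absent from a histogram.
    return sum(abs(hist_a[w] - hist_b[w]) for w in words)


# Hypothetical query and candidate sequences.
query = kmer_histogram("ACGT", 2)
candidate = kmer_histogram("ACGA", 2)
distance = manhattan_distance(query, candidate)  # 2
```

Building the histogram is linear in sequence length, which is why such statistics can filter dissimilar sequences far faster than quadratic alignment; a smaller `k` shrinks the histogram and the memory it needs.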
True progression versus pseudoprogression in the treatment of glioblastomas: a comparison study of normalized cerebral blood volume and apparent diffusion coefficient by histogram analysis.

PubMed

Song, Yong Sub; Choi, Seung Hong; Park, Chul-Kee; Yi, Kyung Sik; Lee, Woong Jae; Yun, Tae Jin; Kim, Tae Min; Lee, Se-Hoon; Kim, Ji-Hoon; Sohn, Chul-Ho; Park, Sung-Hye; Kim, Il Han; Jahng, Geon-Ho; Chang, Kee-Hyun

2013-01-01

The purpose of this study was to differentiate true progression from pseudoprogression of glioblastomas treated with concurrent chemoradiotherapy (CCRT) with temozolomide (TMZ) by using histogram analysis of apparent diffusion coefficient (ADC) and normalized cerebral blood volume (nCBV) maps. Twenty patients with histopathologically proven glioblastoma who had received CCRT with TMZ underwent perfusion-weighted imaging and diffusion-weighted imaging (b = 0, 1000 sec/mm²). The corresponding nCBV and ADC maps for the newly visible, entirely enhancing lesions were calculated after the completion of CCRT with TMZ. Two observers independently measured the histogram parameters of the nCBV and ADC maps. The histogram parameters between the true progression group (n = 10) and the pseudoprogression group (n = 10) were compared by use of an unpaired Student's t test and subsequent multivariable stepwise logistic regression analysis to determine the best predictors for the differential diagnosis between the two groups. Receiver operating characteristic analysis was employed to determine the best cutoff values for the histogram parameters that proved to be significant predictors for differentiating true progression from pseudoprogression. Intraclass correlation coefficient was used to determine the level of inter-observer reliability for the histogram parameters. The 5th percentile value (C5) of the cumulative ADC histograms was a significant predictor for the differential diagnosis between true progression and pseudoprogression (p = 0.044 for observer 1; p = 0.011 for observer 2). Optimal cutoff values of 892 × 10⁻⁶ mm²/sec for observer 1 and 907 × 10⁻⁶ mm²/sec for observer 2 could help differentiate between the two groups with a sensitivity of 90% and 80%, respectively, a specificity of 90% and 80%, respectively, and an area under the curve of 0.880 and 0.840, respectively. There was no other significant differentiating parameter on the nCBV histograms. 
Inter-observer reliability was excellent or good for all histogram parameters (intraclass correlation coefficient range: 0.70-0.99). The C5 of the cumulative ADC histogram can be a promising parameter for the differentiation of true progression from pseudoprogression of newly visible, entirely enhancing lesions after CCRT with TMZ for glioblastomas.
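The C5 parameter, the value below which 5% of the lesion's voxels fall in the cumulative histogram, can be sketched with a nearest-rank percentile. The abstract does not state which percentile definition or binning was used, so the nearest-rank rule, the function name, and the sample values are assumptions.

```python
import math


def percentile_nearest_rank(values, percent):
    """Smallest value with at least `percent`% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical ADC values (in units of 10^-6 mm^2/sec) for one lesion.
adc_values = [850, 900, 950, 1000, 1100, 1200, 1300, 1400, 1500, 1600]
c5 = percentile_nearest_rank(adc_values, 5)
```

The resulting C5 would then be compared with a cutoff such as those reported above; which side of the cutoff corresponds to true progression is not stated in the abstract, so it is left out of the sketch.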
True Progression versus Pseudoprogression in the Treatment of Glioblastomas: A Comparison Study of Normalized Cerebral Blood Volume and Apparent Diffusion Coefficient by Histogram Analysis

PubMed Central

Song, Yong Sub; Park, Chul-Kee; Yi, Kyung Sik; Lee, Woong Jae; Yun, Tae Jin; Kim, Tae Min; Lee, Se-Hoon; Kim, Ji-Hoon; Sohn, Chul-Ho; Park, Sung-Hye; Kim, Il Han; Jahng, Geon-Ho; Chang, Kee-Hyun

2013-01-01

Objective The purpose of this study was to differentiate true progression from pseudoprogression of glioblastomas treated with concurrent chemoradiotherapy (CCRT) with temozolomide (TMZ) by using histogram analysis of apparent diffusion coefficient (ADC) and normalized cerebral blood volume (nCBV) maps. Materials and Methods Twenty patients with histopathologically proven glioblastoma who had received CCRT with TMZ underwent perfusion-weighted imaging and diffusion-weighted imaging (b = 0, 1000 sec/mm²). The corresponding nCBV and ADC maps for the newly visible, entirely enhancing lesions were calculated after the completion of CCRT with TMZ. Two observers independently measured the histogram parameters of the nCBV and ADC maps. The histogram parameters between the true progression group (n = 10) and the pseudoprogression group (n = 10) were compared by use of an unpaired Student's t test and subsequent multivariable stepwise logistic regression analysis to determine the best predictors for the differential diagnosis between the two groups. Receiver operating characteristic analysis was employed to determine the best cutoff values for the histogram parameters that proved to be significant predictors for differentiating true progression from pseudoprogression. Intraclass correlation coefficient was used to determine the level of inter-observer reliability for the histogram parameters. Results The 5th percentile value (C5) of the cumulative ADC histograms was a significant predictor for the differential diagnosis between true progression and pseudoprogression (p = 0.044 for observer 1; p = 0.011 for observer 2). Optimal cutoff values of 892 × 10⁻⁶ mm²/sec for observer 1 and 907 × 10⁻⁶ mm²/sec for observer 2 could help differentiate between the two groups with a sensitivity of 90% and 80%, respectively, a specificity of 90% and 80%, respectively, and an area under the curve of 0.880 and 0.840, respectively. 
There was no other significant differentiating parameter on the nCBV histograms. Inter-observer reliability was excellent or good for all histogram parameters (intraclass correlation coefficient range: 0.70-0.99). Conclusion The C5 of the cumulative ADC histogram can be a promising parameter for the differentiation of true progression from pseudoprogression of newly visible, entirely enhancing lesions after CCRT with TMZ for glioblastomas. PMID:23901325
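As a rough illustration of the histogram parameter discussed in this record, the Python sketch below reads a percentile such as C5 off a cumulative histogram of voxel values. The function name, the default bin count, and the equal-width binning are assumptions made for illustration; the abstract does not describe the software or binning actually used in the study.

```python
def cumulative_histogram_percentile(values, percentile, bins=256):
    """Return the upper edge of the first bin at which the cumulative
    histogram reaches `percentile` percent of all values.

    Hypothetical helper: equal-width bins between min and max are assumed.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return float(lo)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp the maximum value into the last bin.
        i = min(int((v - lo) / width), bins - 1)
        counts[i] += 1
    target = percentile / 100.0 * len(values)
    running = 0
    for i, c in enumerate(counts):
        running += c
        if running >= target:
            return lo + (i + 1) * width
    return float(hi)


# Illustrative input only: 1000 synthetic "voxel" values, not patient data.
c5 = cumulative_histogram_percentile(list(range(1000)), 5, bins=100)
```

The returned value is accurate only to within one bin width, so a finer binning gives a closer estimate at the cost of a noisier histogram.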
Sample Training Based Wildfire Segmentation by 2D Histogram θ-Division with Minimum Error

PubMed Central

Dong, Erqian; Sun, Mingui; Jia, Wenyan; Zhang, Dengyi; Yuan, Zhiyong

2013-01-01

A novel wildfire segmentation algorithm is proposed with the help of sample training based 2D histogram θ-division and minimum error. Based on minimum error principle and 2D color histogram, the θ-division methods were presented recently, but application of prior knowledge on them has not been explored. For the specific problem of wildfire segmentation, we collect sample images with manually labeled fire pixels. Then we define the probability function of error division to evaluate θ-division segmentations, and the optimal angle θ is determined by sample training. Performances in different color channels are compared, and the suitable channel is selected. To further improve the accuracy, the combination approach is presented with both θ-division and other segmentation methods such as GMM. Our approach is tested on real images, and the experiments prove its efficiency for wildfire segmentation. PMID:23878526
A method for real-time implementation of HOG feature extraction

NASA Astrophysics Data System (ADS)

Luo, Hai-bo; Yu, Xin-rong; Liu, Hong-mei; Ding, Qing-hai

2011-08-01

Histogram of oriented gradient (HOG) is an efficient feature extraction scheme, and HOG descriptors are feature descriptors that are widely used in computer vision and image processing for biometrics, target tracking, automatic target detection (ATD), automatic target recognition (ATR), etc. However, HOG feature extraction is ill-suited to hardware implementation because it includes complicated operations. In this paper, an optimal design method and theoretical framework for real-time HOG feature extraction based on FPGA are proposed. The main principle is as follows: first, a parallel gradient computing unit circuit based on a parallel pipeline structure was designed. Second, the calculation of the arctangent and square root operations was simplified. Finally, a histogram generator based on a parallel pipeline structure was designed to calculate the histogram of each sub-region. Experimental results showed that the HOG extraction can be implemented in a pixel period by these computing units.
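The three stages named in this record (gradient computation, arctangent and square root, per-cell histogram) can be sketched in software as follows. This is a plain reference version under stated assumptions: it uses `math.atan2` and `math.hypot` directly instead of the simplified hardware operations the paper proposes, and the function name, the 9-bin unsigned orientation layout, and the central-difference gradient are illustrative choices, not details taken from the abstract.

```python
import math


def hog_cell_histogram(cell, n_bins=9):
    """Orientation histogram of one grayscale cell (a list of rows).

    Hypothetical sketch: central-difference gradients on interior pixels,
    unsigned orientation in [0, 180) degrees, magnitude-weighted votes.
    """
    height, width = len(cell), len(cell[0])
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)           # square root stage
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # arctangent stage
            # The final modulo guards against a rounding result of 180.0.
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


# A horizontal intensity ramp: every gradient points along the x axis.
ramp = [[x for x in range(8)] for _ in range(8)]
ramp_hist = hog_cell_histogram(ramp)
```

For the ramp, all of the gradient energy lands in the first orientation bin, which is a convenient sanity check before moving to real image data.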
Insight on AV-45 binding in white and grey matter from histogram analysis: a study on early Alzheimer's disease patients and healthy subjects

PubMed Central

Nemmi, Federico; Saint-Aubert, Laure; Adel, Djilali; Salabert, Anne-Sophie; Pariente, Jérémie; Barbeau, Emmanuel; Payoux, Pierre; Péran, Patrice

2014-01-01

Purpose: AV-45 amyloid biomarker is known to show uptake in white matter in patients with Alzheimer’s disease (AD) but also in the healthy population. This binding, thought to be of a non-specific lipophilic nature, has not yet been investigated. The aim of this study was to determine the differential pattern of AV-45 binding in healthy and pathological populations in white matter. Methods: We recruited 24 patients presenting with AD at an early stage and 17 matched, healthy subjects. We used an optimized PET-MRI registration method and an approach based on intensity histogram using several indexes. We compared the results of the intensity histogram analyses with a more canonical approach based on target-to-cerebellum Standard Uptake Value (SUVr) in white and grey matters using MANOVA and discriminant analyses. A cluster analysis on white and grey matter histograms was also performed. Results: White matter histogram analysis revealed significant differences between AD and healthy subjects, which were not revealed by SUVr analysis. However, white matter histograms were not decisive to discriminate groups, and indexes based on grey matter only showed better discriminative power than SUVr. The cluster analysis divided our sample into two clusters, showing different uptakes in grey but also in white matter. Conclusion: These results demonstrate that AV-45 binding in white matter conveys subtle information not detectable using the SUVr approach. Although it is not better than standard SUVr to discriminate AD patients from healthy subjects, this information could reveal white matter modifications. PMID:24573658
Functional annotation by sequence-weighted structure alignments: statistical analysis and case studies from the Protein 3000 structural genomics project in Japan.

PubMed

Standley, Daron M; Toh, Hiroyuki; Nakamura, Haruki

2008-09-01

A method to functionally annotate structural genomics targets, based on a novel structural alignment scoring function, is proposed. In the proposed score, position-specific scoring matrices are used to weight structurally aligned residue pairs to highlight evolutionarily conserved motifs. The functional form of the score is first optimized for discriminating domains belonging to the same Pfam family from domains belonging to different families but the same CATH or SCOP superfamily. In the optimization stage, we consider four standard weighting functions as well as our own, the "maximum substitution probability," and combinations of these functions. The optimized score achieves an area of 0.87 under the receiver-operating characteristic curve with respect to identifying Pfam families within a sequence-unique benchmark set of domain pairs. Confidence measures are then derived from the benchmark distribution of true-positive scores. The alignment method is next applied to the task of functionally annotating 230 query proteins released to the public as part of the Protein 3000 structural genomics project in Japan. Of these queries, 78 were found to align to templates with the same Pfam family as the query or had sequence identities ≥ 30%. Another 49 queries were found to match more distantly related templates. Within this group, the template predicted by our method to be the closest functional relative was often not the most structurally similar. Several nontrivial cases are discussed in detail. Finally, 103 queries matched templates at the fold level, but not the family or superfamily level, and remain functionally uncharacterized. 2008 Wiley-Liss, Inc.
Dynamic Querying of Mass-Storage RDF Data with Rule-Based Entailment Regimes

NASA Astrophysics Data System (ADS)

Ianni, Giovambattista; Krennwallner, Thomas; Martello, Alessandra; Polleres, Axel

RDF Schema (RDFS) as a lightweight ontology language is gaining popularity and, consequently, tools for scalable RDFS inference and querying are needed. SPARQL has become recently a W3C standard for querying RDF data, but it mostly provides means for querying simple RDF graphs only, whereas querying with respect to RDFS or other entailment regimes is left outside the current specification. In this paper, we show that SPARQL faces certain unwanted ramifications when querying ontologies in conjunction with RDF datasets that comprise multiple named graphs, and we provide an extension for SPARQL that remedies these effects. Moreover, since RDFS inference has a close relationship with logic rules, we generalize our approach to select a custom ruleset for specifying inferences to be taken into account in a SPARQL query. We show that our extensions are technically feasible by providing benchmark results for RDFS querying in our prototype system GiaBATA, which uses Datalog coupled with a persistent Relational Database as a back-end for implementing SPARQL with dynamic rule-based inference. By employing different optimization techniques like magic set rewriting our system remains competitive with state-of-the-art RDFS querying systems.
Demonstration of Hadoop-GIS: A Spatial Data Warehousing System Over MapReduce

PubMed Central

Aji, Ablimit; Sun, Xiling; Vo, Hoang; Liu, Qioaling; Lee, Rubao; Zhang, Xiaodong; Saltz, Joel; Wang, Fusheng

2016-01-01

The proliferation of GPS-enabled devices and the rapid improvement of scientific instruments have resulted in massive amounts of spatial data in the last decade. Support of high performance spatial queries on large volumes of data has become increasingly important in numerous fields, which requires a scalable and efficient spatial data warehousing solution, as existing approaches exhibit scalability limitations and efficiency bottlenecks for large scale spatial applications. In this demonstration, we present Hadoop-GIS – a scalable and high performance spatial query system over MapReduce. Hadoop-GIS provides an efficient spatial query engine to process spatial queries, data and space based partitioning, and query pipelines that parallelize queries implicitly on MapReduce. Hadoop-GIS also provides an expressive, SQL-like spatial query language for workload specification. We will demonstrate how spatial queries are expressed in spatially extended SQL queries, and submitted through a command line/web interface for execution. Parallel to our system demonstration, we explain the system architecture and details on how queries are translated to MapReduce operators, optimized, and executed on Hadoop. In addition, we will showcase how the system can be used to support two representative real world use cases: large scale pathology analytical imaging, and geo-spatial data warehousing. PMID:27617325
Developing A Web-based User Interface for Semantic Information Retrieval

NASA Technical Reports Server (NTRS)

Berrios, Daniel C.; Keller, Richard M.

2003-01-01

While there are now a number of languages and frameworks that enable computer-based systems to search stored data semantically, the optimal design for effective user interfaces for such systems is still unclear. Such interfaces should mask unnecessary query detail from users, yet still allow them to build queries of arbitrary complexity without significant restrictions. We developed a user interface supporting semantic query generation for SemanticOrganizer, a tool used by scientists and engineers at NASA to construct networks of knowledge and data. Through this interface users can select node types, node attributes and node links to build ad-hoc semantic queries for searching the SemanticOrganizer network.
Pathogen metadata platform: software for accessing and analyzing pathogen strain information.

PubMed

Chang, Wenling E; Peterson, Matthew W; Garay, Christopher D; Korves, Tonia

2016-09-15

Pathogen metadata includes information about where and when a pathogen was collected and the type of environment it came from. Along with genomic nucleotide sequence data, this metadata is growing rapidly and becoming a valuable resource not only for research but for biosurveillance and public health. However, current freely available tools for analyzing this data are geared towards bioinformaticians and/or do not provide summaries and visualizations needed to readily interpret results. We designed a platform to easily access and summarize data about pathogen samples. The software includes a PostgreSQL database that captures metadata useful for disease outbreak investigations, and scripts for downloading and parsing data from NCBI BioSample and BioProject into the database. The software provides a user interface to query metadata and obtain standardized results in an exportable, tab-delimited format. To visually summarize results, the user interface provides a 2D histogram for user-selected metadata types and mapping of geolocated entries. The software is built on the LabKey data platform, an open-source data management platform, which enables developers to add functionalities. We demonstrate the use of the software in querying for a pathogen serovar and for genome sequence identifiers. This software enables users to create a local database for pathogen metadata, populate it with data from NCBI, easily query the data, and obtain visual summaries. Some of the components, such as the database, are modular and can be incorporated into other data platforms. The source code is freely available for download at https://github.com/wchangmitre/bioattribution .
The Localized Discovery and Recovery for Query Packet Losses in Wireless Sensor Networks with Distributed Detector Clusters

PubMed Central

Teng, Rui; Leibnitz, Kenji; Miura, Ryu

2013-01-01

An essential application of wireless sensor networks is to successfully respond to user queries. Query packet losses occur in the query dissemination due to wireless communication problems such as interference, multipath fading, packet collisions, etc. The losses of query messages at sensor nodes result in the failure of sensor nodes reporting the requested data. Hence, the reliable and successful dissemination of query messages to sensor nodes is a non-trivial problem. The target of this paper is to enable highly successful query delivery to sensor nodes by localized and energy-efficient discovery, and recovery of query losses. We adopt local and collective cooperation among sensor nodes to increase the success rate of distributed discoveries and recoveries. To enable the scalability in the operations of discoveries and recoveries, we employ a distributed name resolution mechanism at each sensor node to allow sensor nodes to self-detect the correlated queries and query losses, and then efficiently locally respond to the query losses. We prove that the collective discovery of query losses has a high impact on the success of query dissemination and reveal that scalability can be achieved by using the proposed approach. We further study the novel features of the cooperation and competition in the collective recovery at PHY and MAC layers, and show that the appropriate number of detectors can achieve optimal successful recovery rate. We evaluate the proposed approach with both mathematical analyses and computer simulations. The proposed approach enables a high rate of successful delivery of query messages and it results in short route lengths to recover from query losses. The proposed approach is scalable and operates in a fully distributed manner. PMID:23748172
Optimizing Maintenance of Constraint-Based Database Caches

NASA Astrophysics Data System (ADS)

Klein, Joachim; Braun, Susanne

Caching data reduces user-perceived latency and often enhances availability in case of server crashes or network failures. DB caching aims at local processing of declarative queries in a DBMS-managed cache close to the application. Query evaluation must produce the same results as if done at the remote database backend, which implies that all data records needed to process such a query must be present and controlled by the cache, i. e., to achieve “predicate-specific” loading and unloading of such record sets. Hence, cache maintenance must be based on cache constraints such that “predicate completeness” of the caching units currently present can be guaranteed at any point in time. We explore how cache groups can be maintained to provide the data currently needed. Moreover, we design and optimize loading and unloading algorithms for sets of records keeping the caching units complete, before we empirically identify the costs involved in cache maintenance.
Evaluation of hybrid inverse planning and optimization (HIPO) algorithm for optimization in real-time, high-dose-rate (HDR) brachytherapy for prostate.

PubMed

Pokharel, Shyam; Rana, Suresh; Blikenstaff, Joseph; Sadeghi, Amir; Prestidge, Bradley

2013-07-08

The purpose of this study is to investigate the effectiveness of the HIPO planning and optimization algorithm for real-time prostate HDR brachytherapy. This study consists of 20 patients who underwent ultrasound-based real-time HDR brachytherapy of the prostate using the treatment planning system called Oncentra Prostate (SWIFT version 3.0). The treatment plans for all patients were optimized using inverse dose-volume histogram-based optimization followed by graphical optimization (GRO) in real time. The GRO is manual manipulation of isodose lines slice by slice. The quality of the plan heavily depends on planner expertise and experience. The data for all patients were retrieved later, and treatment plans were created and optimized using the HIPO algorithm with the same set of dose constraints, number of catheters, and set of contours as in the real-time optimization algorithm. The HIPO algorithm is a hybrid because it combines both stochastic and deterministic algorithms. The stochastic algorithm, called simulated annealing, searches the optimal catheter distributions for a given set of dose objectives. The deterministic algorithm, called dose-volume histogram-based optimization (DVHO), optimizes three-dimensional dose distribution quickly by moving straight downhill once it is in the advantageous region of the search space given by the stochastic algorithm. The PTV receiving 100% of the prescription dose (V100) was 97.56% and 95.38% with GRO and HIPO, respectively. The mean dose (D(mean)) and minimum dose to 10% volume (D10) for the urethra, rectum, and bladder were all statistically lower with HIPO compared to GRO using the Student's paired t-test at the 5% significance level. HIPO can provide treatment plans with comparable target coverage to that of GRO with a reduction in dose to the critical structures.
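The dose-volume metrics quoted in this record (V100 and D10) can be read off a list of per-voxel doses, as in the following Python sketch. The function names are hypothetical, and the sketch assumes every voxel represents the same volume; it is not the Oncentra Prostate implementation, and the sample doses are synthetic.

```python
import math


def v_percent(doses, prescription, percent=100.0):
    """Percentage of the structure volume receiving at least `percent`
    percent of the prescription dose (V100 when percent=100).

    Assumes equal-volume voxels (an illustrative simplification).
    """
    threshold = prescription * percent / 100.0
    covered = sum(1 for d in doses if d >= threshold)
    return 100.0 * covered / len(doses)


def d_percent(doses, volume_percent):
    """Minimum dose received by the hottest `volume_percent` percent of
    the structure volume (D10 when volume_percent=10)."""
    ranked = sorted(doses, reverse=True)
    k = max(1, math.ceil(len(ranked) * volume_percent / 100.0))
    return ranked[k - 1]


# Synthetic doses of 1..100 in arbitrary units, with a prescription of 50.
sample_doses = [float(d) for d in range(1, 101)]
v100 = v_percent(sample_doses, prescription=50.0)
d10 = d_percent(sample_doses, 10)
```

Both functions are single points on the cumulative dose-volume histogram: V100 fixes the dose and reads the volume, while D10 fixes the volume and reads the dose.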

Contingency Contractor Optimization Phase 3 Sustainment Database Design Document - Contingency Contractor Optimization Tool - Prototype

DOE Office of Scientific and Technical Information (OSTI.GOV)

Frazier, Christopher Rawls; Durfee, Justin David; Bandlow, Alisa

The Contingency Contractor Optimization Tool – Prototype (CCOT-P) database is used to store input and output data for the linear program model described in [1]. The database supports queries that retrieve this data and operations that update existing input data or insert new input data.
High capacity reversible watermarking for audio by histogram shifting and predicted error expansion.

PubMed

Wang, Fei; Xie, Zhaoxin; Chen, Zuo

2014-01-01

Being reversible, the watermarking information embedded in audio signals can be extracted while the original audio data can achieve lossless recovery. Currently, the few reversible audio watermarking algorithms are confronted with following problems: relatively low SNR (signal-to-noise) of embedded audio; a large amount of auxiliary embedded location information; and the absence of accurate capacity control capability. In this paper, we present a novel reversible audio watermarking scheme based on improved prediction error expansion and histogram shifting. First, we use differential evolution algorithm to optimize prediction coefficients and then apply prediction error expansion to output stego data. Second, in order to reduce location map bits length, we introduced histogram shifting scheme. Meanwhile, the prediction error modification threshold according to a given embedding capacity can be computed by our proposed scheme. Experiments show that this algorithm improves the SNR of embedded audio signals and embedding capacity, drastically reduces location map bits length, and enhances capacity control capability.
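To make the histogram shifting idea in this record concrete, here is a minimal Python sketch of the classic peak-bin/zero-bin variant applied directly to integer samples. It is an assumption-laden illustration, not the authors' scheme: the paper shifts the histogram of prediction errors and tunes the predictor with differential evolution, neither of which is reproduced here, and overflow at the limits of the sample range is ignored. The function names `hs_embed` and `hs_extract` are hypothetical.

```python
from collections import Counter


def hs_embed(samples, bits):
    """Embed `bits` (0/1 values) into integer samples by histogram shifting."""
    hist = Counter(samples)
    peak = max(hist, key=hist.get)      # most frequent value carries the payload
    zero = peak + 1
    while hist[zero] != 0:              # first empty bin above the peak
        zero += 1
    if len(bits) > hist[peak]:
        raise ValueError("payload exceeds embedding capacity")
    payload = iter(bits)
    marked = []
    for s in samples:
        if peak < s < zero:
            marked.append(s + 1)        # shift to free the bin next to the peak
        elif s == peak:
            marked.append(s + next(payload, 0))  # pad with 0 once payload ends
        else:
            marked.append(s)
    return marked, peak, zero


def hs_extract(marked, peak, zero, n_bits):
    """Recover the payload and restore the original samples exactly."""
    bits, restored = [], []
    for s in marked:
        if s == peak:
            bits.append(0)
            restored.append(peak)
        elif s == peak + 1:
            bits.append(1)
            restored.append(peak)
        elif peak + 1 < s <= zero:
            restored.append(s - 1)      # undo the shift
        else:
            restored.append(s)
    return bits[:n_bits], restored


original = [3, 3, 4, 5, 3, 7, 3, 2]
marked, peak, zero = hs_embed(original, [1, 0, 1])
recovered_bits, recovered = hs_extract(marked, peak, zero, 3)
```

The embedding capacity equals the count of the peak bin, which is why sharper histograms (such as those of well-predicted errors) carry more payload for the same distortion.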
YAHA: fast and flexible long-read alignment with optimal breakpoint detection.

PubMed

Faust, Gregory G; Hall, Ira M

2012-10-01

With improved short-read assembly algorithms and the recent development of long-read sequencers, split mapping will soon be the preferred method for structural variant (SV) detection. Yet, current alignment tools are not well suited for this. We present YAHA, a fast and flexible hash-based aligner. YAHA is as fast and accurate as BWA-SW at finding the single best alignment per query and is dramatically faster and more sensitive than both SSAHA2 and MegaBLAST at finding all possible alignments. Unlike other aligners that report all, or one, alignment per query, or that use simple heuristics to select alignments, YAHA uses a directed acyclic graph to find the optimal set of alignments that cover a query using a biologically relevant breakpoint penalty. YAHA can also report multiple mappings per defined segment of the query. We show that YAHA detects more breakpoints in less time than BWA-SW across all SV classes, and especially excels at complex SVs comprising multiple breakpoints. YAHA is currently supported on 64-bit Linux systems. Binaries and sample data are freely available for download from http://faculty.virginia.edu/irahall/YAHA. imh4y@virginia.edu.
Applications of Derandomization Theory in Coding

NASA Astrophysics Data System (ADS)

Cheraghchi, Mahdi

2011-07-01

Randomized techniques play a fundamental role in theoretical computer science and discrete mathematics, in particular for the design of efficient algorithms and construction of combinatorial objects. The basic goal in derandomization theory is to eliminate or reduce the need for randomness in such randomized constructions. In this thesis, we explore some applications of the fundamental notions in derandomization theory to problems outside the core of theoretical computer science, and in particular, certain problems related to coding theory. First, we consider the wiretap channel problem which involves a communication system in which an intruder can eavesdrop on a limited portion of the transmissions, and construct efficient and information-theoretically optimal communication protocols for this model. Then we consider the combinatorial group testing problem. In this classical problem, one aims to determine a set of defective items within a large population by asking a number of queries, where each query reveals whether a defective item is present within a specified group of items. We use randomness condensers to explicitly construct optimal, or nearly optimal, group testing schemes for a setting where the query outcomes can be highly unreliable, as well as the threshold model where a query returns positive if the number of defectives passes a certain threshold. Finally, we design ensembles of error-correcting codes that achieve the information-theoretic capacity of a large class of communication channels, and then use the obtained ensembles for construction of explicit capacity achieving codes. [This is a shortened version of the actual abstract in the thesis.]
Dynamic-thresholding level set: a novel computer-aided volumetry method for liver tumors in hepatic CT images

NASA Astrophysics Data System (ADS)

Cai, Wenli; Yoshida, Hiroyuki; Harris, Gordon J.

2007-03-01

Measurement of the volume of focal liver tumors, called liver tumor volumetry, is indispensable for assessing the growth of tumors and for monitoring the response of tumors to oncology treatments. Traditional edge models, such as the maximum gradient and zero-crossing methods, often fail to detect the accurate boundary of a fuzzy object such as a liver tumor. As a result, the computerized volumetry based on these edge models tends to differ from manual segmentation results performed by physicians. In this study, we developed a novel computerized volumetry method for fuzzy objects, called dynamic-thresholding level set (DT level set). An optimal threshold value computed from a histogram tends to shift, relative to the theoretical threshold value obtained from a normal distribution model, toward a smaller region in the histogram. We thus designed a mobile shell structure, called a propagating shell, which is a thick region encompassing the level set front. The optimal threshold calculated from the histogram of the shell drives the level set front toward the boundary of a liver tumor. When the volume ratio between the object and the background in the shell approaches one, the optimal threshold value best fits the theoretical threshold value and the shell stops propagating. Application of the DT level set to 26 hepatic CT cases with 63 biopsy-confirmed hepatocellular carcinomas (HCCs) and metastases showed that the computer measured volumes were highly correlated with those of tumors measured manually by physicians. Our preliminary results showed that DT level set was effective and accurate in estimating the volumes of liver tumors detected in hepatic CT images.
A two-level cache for distributed information retrieval in search engines.

PubMed

Zhang, Weizhe; He, Hui; Ye, Jianwei

2013-01-01

To improve the performance of distributed information retrieval in search engines, we propose a two-level cache structure based on the queries of the users' logs. We extract the highest rank queries of users from the static cache, in which the queries are the most popular. We adopt the dynamic cache as an auxiliary to optimize the distribution of the cache data. We propose a distribution strategy of the cache data. The experiments prove that the hit rate, the efficiency, and the time consumption of the two-level cache have advantages compared with other structures of cache.
A Two-Level Cache for Distributed Information Retrieval in Search Engines

PubMed Central

Zhang, Weizhe; He, Hui; Ye, Jianwei

2013-01-01

To improve the performance of distributed information retrieval in search engines, we propose a two-level cache structure based on the queries of the users' logs. We extract the highest rank queries of users from the static cache, in which the queries are the most popular. We adopt the dynamic cache as an auxiliary to optimize the distribution of the cache data. We propose a distribution strategy of the cache data. The experiments prove that the hit rate, the efficiency, and the time consumption of the two-level cache have advantages compared with other structures of cache. PMID:24363621
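The two-level structure described in this record can be sketched in Python as a fixed static tier seeded from the query log plus a small dynamic tier. Several details are assumptions made for illustration because the abstract does not specify them: the dynamic tier uses LRU eviction, the static tier is filled once and never refreshed, and `backend` stands in for whatever computes search results. The class name `TwoLevelCache` is hypothetical.

```python
from collections import Counter, OrderedDict


class TwoLevelCache:
    """Static cache of the most popular logged queries plus an LRU dynamic cache."""

    def __init__(self, query_log, static_size, dynamic_size, backend):
        self.backend = backend
        # Static tier: the most frequent queries in the historical log.
        top = [q for q, _ in Counter(query_log).most_common(static_size)]
        self.static = {q: backend(q) for q in top}
        # Dynamic tier: recently seen queries, least recently used first.
        self.dynamic = OrderedDict()
        self.dynamic_size = dynamic_size
        self.hits = 0
        self.misses = 0

    def get(self, query):
        if query in self.static:
            self.hits += 1
            return self.static[query]
        if query in self.dynamic:
            self.hits += 1
            self.dynamic.move_to_end(query)    # mark as most recently used
            return self.dynamic[query]
        self.misses += 1
        result = self.backend(query)
        self.dynamic[query] = result
        if len(self.dynamic) > self.dynamic_size:
            self.dynamic.popitem(last=False)   # evict the least recently used
        return result


# str.upper is a stand-in backend; a real one would run the search.
cache = TwoLevelCache(["a", "a", "a", "b", "b", "c"], 1, 2, str.upper)
answers = [cache.get(q) for q in ["a", "b", "b", "c", "d", "b"]]
```

Keeping popular queries in a tier that is never evicted protects them from bursts of one-off queries, which is the usual motivation for pairing a static cache with a dynamic one.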
MO-G-304-01: FEATURED PRESENTATION: Expanding the Knowledge Base for Data-Driven Treatment Planning: Incorporating Patient Outcome Models

DOE Office of Scientific and Technical Information (OSTI.GOV)

Robertson, SP; Quon, H; Cheng, Z

2015-06-15

Purpose: To extend the capabilities of knowledge-based treatment planning beyond simple dose queries by incorporating validated patient outcome models. Methods: From an analytic, relational database of 684 head and neck cancer patients, 372 patients were identified having dose data for both left and right parotid glands as well as baseline and follow-up xerostomia assessments. For each existing patient, knowledge-based treatment planning was simulated by querying the dose-volume histograms and geometric shape relationships (overlap volume histograms) for all other patients. Dose predictions were captured at normalized volume thresholds (NVT) of 0%, 10%, 20%, 30%, 40%, 50%, and 85% and were compared with the actual achieved doses using the Wilcoxon signed-rank test. Next, a logistic regression model was used to predict the maximum severity of xerostomia up to three months following radiotherapy. Baseline xerostomia scores were subtracted from follow-up assessments and were also included in the model. The relative risks from predicted doses and actual doses were computed and compared. Results: The predicted doses for both parotid glands were significantly less than the achieved doses (p < 0.0001), with differences ranging from 830 cGy ± 1270 cGy (0% NVT) to 1673 cGy ± 1197 cGy (30% NVT). The modelled risk of xerostomia ranged from 54% to 64% for achieved doses and from 33% to 51% for the dose predictions. Relative risks varied from 1.24 to 1.87, with maximum relative risk occurring at 85% NVT. Conclusions: Data-driven generation of treatment planning objectives without consideration of the underlying normal tissue complication probability may result in inferior plans, even if quality metrics indicate otherwise. Inclusion of complication models in knowledge-based treatment planning is necessary in order to close the feedback loop between radiotherapy treatments and patient outcomes. Future work includes advancing and validating complication models in the context of knowledge-based treatment planning. This work is supported by Philips Radiation Oncology Systems.
DISPAQ: Distributed Profitable-Area Query from Big Taxi Trip Data.

PubMed

Putri, Fadhilah Kurnia; Song, Giltae; Kwon, Joonho; Rao, Praveen

2017-09-25

One of the crucial problems for taxi drivers is to efficiently locate passengers in order to increase profits. The rapid advancement and ubiquitous penetration of Internet of Things (IoT) technology into transportation industries enables us to provide taxi drivers with locations that have more potential passengers (more profitable areas) by analyzing and querying taxi trip data. In this paper, we propose a query processing system, called Distributed Profitable-Area Query (DISPAQ), which efficiently identifies profitable areas by exploiting the Apache Software Foundation's Spark framework and a MongoDB database. DISPAQ first maintains a profitable-area query index (PQ-index) by extracting area summaries and route summaries from raw taxi trip data. It then identifies candidate profitable areas by searching the PQ-index during query processing. Then, it exploits a Z-Skyline algorithm, which is an extension of skyline processing with a Z-order space filling curve, to quickly refine the candidate profitable areas. To improve the performance of distributed query processing, we also propose local Z-Skyline optimization, which reduces the number of dominant tests by distributing killer profitable areas to each cluster node. Through extensive evaluation with real datasets, we demonstrate that our DISPAQ system provides a scalable and efficient solution for processing profitable-area queries from huge amounts of big taxi trip data.
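The combination of skyline processing with a Z-order curve mentioned in this record can be illustrated with a small single-machine Python sketch. It is only a sketch under stated assumptions: points are two-dimensional with non-negative integer coordinates, smaller is better in both dimensions, and the functions `z_order` and `skyline` are hypothetical names. DISPAQ's actual attributes, index layout, and distributed Spark implementation are not reproduced.

```python
def z_order(x, y, bits=16):
    """Interleave the bits of x and y into a single Morton (Z-order) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # x occupies even bit positions
        code |= ((y >> i) & 1) << (2 * i + 1)    # y occupies odd bit positions
    return code


def skyline(points):
    """Return the points not dominated by any other point (minimization).

    Scanning in Z-order means a point can only be dominated by points that
    were visited earlier, so each candidate is compared against the current
    skyline only.
    """
    result = []
    for p in sorted(points, key=lambda pt: z_order(pt[0], pt[1])):
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p for q in result
        )
        if not dominated:
            result.append(p)
    return result


candidates = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
best = skyline(candidates)
```

The ordering property relied on here is that if one point is no larger than another in every coordinate, its Z-order code is no larger either, which is what lets the scan skip comparisons against later points.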
A Novel Two-Tier Cooperative Caching Mechanism for the Optimization of Multi-Attribute Periodic Queries in Wireless Sensor Networks

PubMed Central

Zhou, ZhangBing; Zhao, Deng; Shu, Lei; Tsang, Kim-Fung

2015-01-01

Wireless sensor networks, serving as an important interface between physical environments and computational systems, have been used extensively for supporting domain applications, where multiple-attribute sensory data are queried from the network continuously and periodically. Usually, certain sensory data may not vary significantly within a certain time duration for certain applications. In this setting, sensory data gathered at a certain time slot can be used for answering concurrent queries and may be reused for answering the forthcoming queries when the variation of these data is within a certain threshold. To address this challenge, a popularity-based cooperative caching mechanism is proposed in this article, where the popularity of sensory data is calculated according to the queries issued in recent time slots. This popularity reflects the likelihood that the sensory data will be requested by forthcoming queries. Generally, sensory data with the highest popularity are cached at the sink node, while sensory data that are less likely to be requested by forthcoming queries are cached in the head nodes of divided grid cells. Leveraging these cooperatively cached sensory data, queries are answered through composing these two-tier cached data. Experimental evaluation shows that this approach can reduce the network communication cost significantly and increase the network capability. PMID:26131665
DISPAQ: Distributed Profitable-Area Query from Big Taxi Trip Data

PubMed Central

Putri, Fadhilah Kurnia; Song, Giltae; Rao, Praveen

2017-01-01

One of the crucial problems for taxi drivers is to efficiently locate passengers in order to increase profits. The rapid advancement and ubiquitous penetration of Internet of Things (IoT) technology into transportation industries enables us to provide taxi drivers with locations that have more potential passengers (more profitable areas) by analyzing and querying taxi trip data. In this paper, we propose a query processing system, called Distributed Profitable-Area Query (DISPAQ) which efficiently identifies profitable areas by exploiting the Apache Software Foundation’s Spark framework and a MongoDB database. DISPAQ first maintains a profitable-area query index (PQ-index) by extracting area summaries and route summaries from raw taxi trip data. It then identifies candidate profitable areas by searching the PQ-index during query processing. Then, it exploits a Z-Skyline algorithm, which is an extension of skyline processing with a Z-order space filling curve, to quickly refine the candidate profitable areas. To improve the performance of distributed query processing, we also propose local Z-Skyline optimization, which reduces the number of dominant tests by distributing killer profitable areas to each cluster node. Through extensive evaluation with real datasets, we demonstrate that our DISPAQ system provides a scalable and efficient solution for processing profitable-area queries from huge amounts of big taxi trip data. PMID:28946679
Efficient Queries of Stand-off Annotations for Natural Language Processing on Electronic Medical Records.

PubMed

Luo, Yuan; Szolovits, Peter

2016-01-01

In natural language processing, stand-off annotation uses the starting and ending positions of an annotation to anchor it to the text and stores the annotation content separately from the text. We address the fundamental problem of efficiently storing stand-off annotations when applying natural language processing on narrative clinical notes in electronic medical records (EMRs) and efficiently retrieving such annotations that satisfy position constraints. Efficient storage and retrieval of stand-off annotations can facilitate tasks such as mapping unstructured text to electronic medical record ontologies. We first formulate this problem as the interval query problem, for which optimal query/update time is in general logarithmic. We next perform a tight time complexity analysis on the basic interval tree query algorithm and show its nonoptimality when being applied to a collection of 13 query types from Allen's interval algebra. We then study two closely related state-of-the-art interval query algorithms, proposed query reformulations, and augmentations to the second algorithm. Our proposed algorithm achieves logarithmic stabbing-max query time complexity and solves the stabbing-interval query tasks on all of Allen's relations in logarithmic time, attaining the theoretical lower bound. Updating time is kept logarithmic and the space requirement is kept linear at the same time. We also discuss interval management in external memory models and higher dimensions.
Efficient Queries of Stand-off Annotations for Natural Language Processing on Electronic Medical Records

PubMed Central

Luo, Yuan; Szolovits, Peter

2016-01-01

In natural language processing, stand-off annotation uses the starting and ending positions of an annotation to anchor it to the text and stores the annotation content separately from the text. We address the fundamental problem of efficiently storing stand-off annotations when applying natural language processing on narrative clinical notes in electronic medical records (EMRs) and efficiently retrieving such annotations that satisfy position constraints. Efficient storage and retrieval of stand-off annotations can facilitate tasks such as mapping unstructured text to electronic medical record ontologies. We first formulate this problem as the interval query problem, for which optimal query/update time is in general logarithmic. We next perform a tight time complexity analysis on the basic interval tree query algorithm and show its nonoptimality when being applied to a collection of 13 query types from Allen’s interval algebra. We then study two closely related state-of-the-art interval query algorithms, proposed query reformulations, and augmentations to the second algorithm. Our proposed algorithm achieves logarithmic stabbing-max query time complexity and solves the stabbing-interval query tasks on all of Allen’s relations in logarithmic time, attaining the theoretical lower bound. Updating time is kept logarithmic and the space requirement is kept linear at the same time. We also discuss interval management in external memory models and higher dimensions. PMID:27478379
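To make the terminology concrete, the sketch below classifies a pair of intervals into one of the 13 relations of Allen's interval algebra and answers a stabbing query by linear scan. It is only a baseline for illustration: the scan takes time linear in the number of annotations, whereas the abstract describes a logarithmic-time algorithm that is not reproduced here. The `(start, end, label)` annotation tuples and the half-open offset convention are assumptions.

```python
def allen_relation(a, b) -> str:
    """Name the Allen relation of interval a to interval b (each with start < end)."""
    (s1, e1), (s2, e2) = a, b
    if e1 < s2:
        return "before"
    if e1 == s2:
        return "meets"
    if s1 > e2:
        return "after"
    if s1 == e2:
        return "met-by"
    if s1 == s2 and e1 == e2:
        return "equals"
    if s1 == s2:
        return "starts" if e1 < e2 else "started-by"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and e1 < e2:
        return "during"
    if s1 < s2 and e2 < e1:
        return "contains"
    # Only the two partial-overlap cases remain at this point.
    return "overlaps" if s1 < s2 else "overlapped-by"


def stabbing_query(annotations, position):
    """Linear-scan baseline: labels of annotations covering a character offset.

    Annotations are hypothetical (start, end, label) tuples over the
    half-open character range [start, end).
    """
    return [label for start, end, label in annotations if start <= position < end]


notes = [(0, 5, "A"), (3, 9, "B"), (10, 12, "C")]
print(allen_relation((0, 6), (3, 9)), stabbing_query(notes, 4))
```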
The role of economics in the QUERI program: QUERI Series

PubMed Central

Smith, Mark W; Barnett, Paul G

2008-01-01

Background The United States (U.S.) Department of Veterans Affairs (VA) Quality Enhancement Research Initiative (QUERI) has implemented economic analyses in single-site and multi-site clinical trials. To date, no one has reviewed whether the QUERI Centers are taking an optimal approach to doing so. Consistent with the continuous learning culture of the QUERI Program, this paper provides such a reflection. Methods We present a case study of QUERI as an example of how economic considerations can and should be integrated into implementation research within both single and multi-site studies. We review theoretical and applied cost research in implementation studies outside and within VA. We also present a critique of the use of economic research within the QUERI program. Results Economic evaluation is a key element of implementation research. QUERI has contributed many developments in the field of implementation but has only recently begun multi-site implementation trials across multiple regions within the national VA healthcare system. These trials are unusual in their emphasis on developing detailed costs of implementation, as well as in the use of business case analyses (budget impact analyses). Conclusion Economics appears to play an important role in QUERI implementation studies, only after implementation has reached the stage of multi-site trials. Economic analysis could better inform the choice of which clinical best practices to implement and the choice of implementation interventions to employ. QUERI economics also would benefit from research on costing methods and development of widely accepted international standards for implementation economics. PMID:18430199
The role of economics in the QUERI program: QUERI Series.

PubMed

Smith, Mark W; Barnett, Paul G

2008-04-22

The United States (U.S.) Department of Veterans Affairs (VA) Quality Enhancement Research Initiative (QUERI) has implemented economic analyses in single-site and multi-site clinical trials. To date, no one has reviewed whether the QUERI Centers are taking an optimal approach to doing so. Consistent with the continuous learning culture of the QUERI Program, this paper provides such a reflection. We present a case study of QUERI as an example of how economic considerations can and should be integrated into implementation research within both single and multi-site studies. We review theoretical and applied cost research in implementation studies outside and within VA. We also present a critique of the use of economic research within the QUERI program. Economic evaluation is a key element of implementation research. QUERI has contributed many developments in the field of implementation but has only recently begun multi-site implementation trials across multiple regions within the national VA healthcare system. These trials are unusual in their emphasis on developing detailed costs of implementation, as well as in the use of business case analyses (budget impact analyses). Economics appears to play an important role in QUERI implementation studies, only after implementation has reached the stage of multi-site trials. Economic analysis could better inform the choice of which clinical best practices to implement and the choice of implementation interventions to employ. QUERI economics also would benefit from research on costing methods and development of widely accepted international standards for implementation economics.
HodDB: Design and Analysis of a Query Processor for Brick.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fierro, Gabriel; Culler, David

Brick is a recently proposed metadata schema and ontology for describing building components and the relationships between them. It represents buildings as directed labeled graphs using the RDF data model. Using the SPARQL query language, building-agnostic applications query a Brick graph to discover the set of resources and relationships they require to operate. Latency-sensitive applications, such as user interfaces, demand response and model-predictive control, require fast queries, conventionally less than 100ms. We benchmark a set of popular open-source and commercial SPARQL databases against three real Brick models using seven application queries and find that none of them meet this performance target. This lack of performance can be attributed to design decisions that optimize for queries over large graphs consisting of billions of triples, but give poor spatial locality and join performance on the small dense graphs typical of Brick. We present the design and evaluation of HodDB, an RDF/SPARQL database for Brick built over a node-based index structure. HodDB performs Brick queries 3-700x faster than leading SPARQL databases and consistently meets the 100ms threshold, enabling the portability of important latency-sensitive building applications.
LETTER TO THE EDITOR: Optimization of partial search

NASA Astrophysics Data System (ADS)

Korepin, Vladimir E.

2005-11-01

A quantum Grover search algorithm can find a target item in a database faster than any classical algorithm. One can trade accuracy for speed and find a part of the database (a block) containing the target item even faster; this is partial search. A partial search algorithm was recently suggested by Grover and Radhakrishnan. Here we optimize it. Efficiency of the search algorithm is measured by the number of queries to the oracle. The author suggests a new version of the Grover-Radhakrishnan algorithm which uses a minimal number of such queries. The algorithm can run on the same hardware that is used for the usual Grover algorithm.
PARLO: PArallel Run-Time Layout Optimization for Scientific Data Explorations with Heterogeneous Access Pattern

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gong, Zhenhuan; Boyuka, David; Zou, X

The size and scope of cutting-edge scientific simulations are growing much faster than the I/O and storage capabilities of their run-time environments. The growing gap is exacerbated by exploratory, data-intensive analytics, such as querying simulation data with multivariate, spatio-temporal constraints, which induces heterogeneous access patterns that stress the performance of the underlying storage system. Previous work addresses data layout and indexing techniques to improve query performance for a single access pattern, which is not sufficient for complex analytics jobs. We present PARLO, a parallel run-time layout optimization framework, to achieve multi-level data layout optimization for scientific applications at run-time before data is written to storage. The layout schemes optimize for heterogeneous access patterns with user-specified priorities. PARLO is integrated with ADIOS, a high-performance parallel I/O middleware for large-scale HPC applications, to achieve user-transparent, light-weight layout optimization for scientific datasets. It offers simple XML-based configuration for users to achieve flexible layout optimization without the need to modify or recompile application codes. Experiments show that PARLO improves performance by 2 to 26 times for queries with heterogeneous access patterns compared to state-of-the-art scientific database management systems. Compared to traditional post-processing approaches, its underlying run-time layout optimization achieves a 56% savings in processing time and a reduction in storage overhead of up to 50%. PARLO also exhibits a low run-time resource requirement, while also limiting the performance impact on running applications to a reasonable level.
Towards Building a High Performance Spatial Query System for Large Scale Medical Imaging Data.

PubMed

Aji, Ablimit; Wang, Fusheng; Saltz, Joel H

2012-11-06

Support of high performance queries on large volumes of scientific spatial data is becoming increasingly important in many applications. This growth is driven by not only geospatial problems in numerous fields, but also emerging scientific applications that are increasingly data- and compute-intensive. For example, digital pathology imaging has become an emerging field during the past decade, where examination of high resolution images of human tissue specimens enables more effective diagnosis, prediction and treatment of diseases. Systematic analysis of large-scale pathology images generates tremendous amounts of spatially derived quantifications of micro-anatomic objects, such as nuclei, blood vessels, and tissue regions. Analytical pathology imaging provides high potential to support image based computer aided diagnosis. One major requirement for this is effective querying of such enormous amount of data with fast response, which is faced with two major challenges: the "big data" challenge and the high computation complexity. In this paper, we present our work towards building a high performance spatial query system for querying massive spatial data on MapReduce. Our framework takes an on demand index building approach for processing spatial queries and a partition-merge approach for building parallel spatial query pipelines, which fits nicely with the computing model of MapReduce. We demonstrate our framework on supporting multi-way spatial joins for algorithm evaluation and nearest neighbor queries for microanatomic objects. To reduce query response time, we propose cost based query optimization to mitigate the effect of data skew. Our experiments show that the framework can efficiently support complex analytical spatial queries on MapReduce.
Towards Building a High Performance Spatial Query System for Large Scale Medical Imaging Data

PubMed Central

Aji, Ablimit; Wang, Fusheng; Saltz, Joel H.

2013-01-01

Support of high performance queries on large volumes of scientific spatial data is becoming increasingly important in many applications. This growth is driven by not only geospatial problems in numerous fields, but also emerging scientific applications that are increasingly data- and compute-intensive. For example, digital pathology imaging has become an emerging field during the past decade, where examination of high resolution images of human tissue specimens enables more effective diagnosis, prediction and treatment of diseases. Systematic analysis of large-scale pathology images generates tremendous amounts of spatially derived quantifications of micro-anatomic objects, such as nuclei, blood vessels, and tissue regions. Analytical pathology imaging provides high potential to support image based computer aided diagnosis. One major requirement for this is effective querying of such enormous amount of data with fast response, which is faced with two major challenges: the “big data” challenge and the high computation complexity. In this paper, we present our work towards building a high performance spatial query system for querying massive spatial data on MapReduce. Our framework takes an on demand index building approach for processing spatial queries and a partition-merge approach for building parallel spatial query pipelines, which fits nicely with the computing model of MapReduce. We demonstrate our framework on supporting multi-way spatial joins for algorithm evaluation and nearest neighbor queries for microanatomic objects. To reduce query response time, we propose cost based query optimization to mitigate the effect of data skew. Our experiments show that the framework can efficiently support complex analytical spatial queries on MapReduce. PMID:24501719

Is there a preference for linearity when viewing natural images?

NASA Astrophysics Data System (ADS)

Kane, David; Bertalmío, Marcelo

2015-01-01

The system gamma of the imaging pipeline, defined as the product of the encoding and decoding gammas, is typically greater than one and is stronger for images viewed with a dark background (e.g. cinema) than those viewed in lighter conditions (e.g. office displays).1-3 However, for high dynamic range (HDR) images reproduced on a low dynamic range (LDR) monitor, subjects often prefer a system gamma of less than one,4 presumably reflecting the greater need for histogram equalization in HDR images. In this study we ask subjects to rate the perceived quality of images presented on a LDR monitor using various levels of system gamma. We reveal that the optimal system gamma is below one for images with a HDR and approaches or exceeds one for images with a LDR. Additionally, the highest quality scores occur for images where a system gamma of one is optimal, suggesting a preference for linearity (where possible). We find that subjective image quality scores can be predicted by computing the degree of histogram equalization of the lightness distribution. Accordingly, an optimal, image dependent system gamma can be computed that maximizes perceived image quality.
Artemis: Integrating Scientific Data on the Grid (Preprint)

DTIC Science & Technology

2004-07-01

Theseus execution engine [Barish and Knoblock 03] to efficiently execute the generated datalog program. The Theseus execution engine has a wide...variety of operations to query databases, web sources, and web services. Theseus also contains a wide variety of relational operations, such as...selection, union, or projection. Furthermore, Theseus optimizes the execution of an integration plan by querying several data sources in parallel and
Evaluation methodology for query-based scene understanding systems

NASA Astrophysics Data System (ADS)

Huster, Todd P.; Ross, Timothy D.; Culbertson, Jared L.

2015-05-01

In this paper, we are proposing a method for the principled evaluation of scene understanding systems in a query-based framework. We can think of a query-based scene understanding system as a generalization of typical sensor exploitation systems where instead of performing a narrowly defined task (e.g., detect, track, classify, etc.), the system can perform general user-defined tasks specified in a query language. Examples of this type of system have been developed as part of DARPA's Mathematics of Sensing, Exploitation, and Execution (MSEE) program. There is a body of literature on the evaluation of typical sensor exploitation systems, but the open-ended nature of the query interface introduces new aspects to the evaluation problem that have not been widely considered before. In this paper, we state the evaluation problem and propose an approach to efficiently learn about the quality of the system under test. We consider the objective of the evaluation to be to build a performance model of the system under test, and we rely on the principles of Bayesian experiment design to help construct and select optimal queries for learning about the parameters of that model.
Motor Oil Classification using Color Histograms and Pattern Recognition Techniques.

PubMed

Ahmadi, Shiva; Mani-Varnosfaderani, Ahmad; Habibi, Biuck

2018-04-20

Motor oil classification is important for quality control and the identification of oil adulteration. In this work, we propose a simple, rapid, inexpensive and nondestructive approach based on image analysis and pattern recognition techniques for the classification of nine different types of motor oils according to their corresponding color histograms. For this, we applied color histograms in different color spaces such as red green blue (RGB), grayscale, and hue saturation intensity (HSI) in order to extract features that can help with the classification procedure. These color histograms and their combinations were used as input for model development and then were statistically evaluated by using linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and support vector machine (SVM) techniques. Here, two common solutions for solving a multiclass classification problem were applied: (1) transformation to binary classification problem using a one-against-all (OAA) approach and (2) extension from binary classifiers to a single globally optimized multilabel classification model. In the OAA strategy, LDA, QDA, and SVM reached up to 97% in terms of accuracy, sensitivity, and specificity for both the training and test sets. In extension from binary case, despite good performances by the SVM classification model, QDA and LDA provided better results up to 92% for RGB-grayscale-HSI color histograms and up to 93% for the HSI color map, respectively. In order to reduce the numbers of independent variables for modeling, a principal component analysis algorithm was used. Our results suggest that the proposed method is promising for the identification and classification of different types of motor oils.
Motivated Proteins: A web application for studying small three-dimensional protein motifs

PubMed Central

Leader, David P; Milner-White, E James

2009-01-01

Background Small loop-shaped motifs are common constituents of the three-dimensional structure of proteins. Typically they comprise between three and seven amino acid residues, and are defined by a combination of dihedral angles and hydrogen bonding partners. The most abundant of these are αβ-motifs, asx-motifs, asx-turns, β-bulges, β-bulge loops, β-turns, nests, niches, Schellmann loops, ST-motifs, ST-staples and ST-turns. We have constructed a database of such motifs from a range of high-quality protein structures and built a web application as a visual interface to this. Description The web application, Motivated Proteins, provides access to these 12 motifs (with 48 sub-categories) in a database of over 400 representative proteins. Queries can be made for specific categories or sub-categories of motif, motifs in the vicinity of ligands, motifs which include part of an enzyme active site, overlapping motifs, or motifs which include a particular amino acid sequence. Individual proteins can be specified, or, where appropriate, motifs for all proteins listed. The results of queries are presented in textual form as an (X)HTML table, and may be saved as parsable plain text or XML. Motifs can be viewed and manipulated either individually or in the context of the protein in the Jmol applet structural viewer. Cartoons of the motifs imposed on a linear representation of protein secondary structure are also provided. Summary information for the motifs is available, as are histograms of amino acid distribution, and graphs of dihedral angles at individual positions in the motifs. Conclusion Motivated Proteins is a publicly and freely accessible web application that enables protein scientists to study small three-dimensional motifs without requiring knowledge of either Structured Query Language or the underlying database schema. PMID:19210785
Enriching text with images and colored light

NASA Astrophysics Data System (ADS)

Sekulovski, Dragan; Geleijnse, Gijs; Kater, Bram; Korst, Jan; Pauws, Steffen; Clout, Ramon

2008-01-01

We present an unsupervised method to enrich textual applications with relevant images and colors. The images are collected by querying large image repositories and subsequently the colors are computed using image processing. A prototype system based on this method is presented where the method is applied to song lyrics. In combination with a lyrics synchronization algorithm the system produces a rich multimedia experience. In order to identify terms within the text that may be associated with images and colors, we select noun phrases using a part of speech tagger. Large image repositories are queried with these terms. Per term representative colors are extracted using the collected images. To this end, we use either a histogram-based or a mean shift-based algorithm. The representative color extraction uses the non-uniform distribution of the colors found in the large repositories. The images that are ranked best by the search engine are displayed on a screen, while the extracted representative colors are rendered on controllable lighting devices in the living room. We evaluate our method by comparing the computed colors to standard color representations of a set of English color terms. A second evaluation focuses on the distance in color between a queried term in English and its translation in a foreign language. Based on results from three sets of terms, a measure of suitability of a term for color extraction based on KL Divergence is proposed. Finally, we compare the performance of the algorithm using either the automatically indexed repository of Google Images or the manually annotated Flickr.com. Based on the results of these experiments, we conclude that using the presented method we can compute the relevant color for a term using a large image repository and image processing.
FastBit Reference Manual

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, Kesheng

2007-08-02

An index in a database system is a data structure that utilizes redundant information about the base data to speed up common searching and retrieval operations. The most commonly used indexes are variants of B-trees, such as B+-tree and B*-tree. FastBit implements a set of alternative indexes called compressed bitmap indexes. Compared with B-tree variants, these indexes provide very efficient searching and retrieval operations by sacrificing the efficiency of updating the indexes after the modification of an individual record. In addition to the well-known strengths of bitmap indexes, FastBit has a special strength stemming from the bitmap compression scheme used. The compression method is called the Word-Aligned Hybrid (WAH) code. It reduces the bitmap indexes to reasonable sizes and at the same time allows very efficient bitwise logical operations directly on the compressed bitmaps. Compared with the well-known compression methods such as LZ77 and Byte-aligned Bitmap code (BBC), WAH sacrifices some space efficiency for a significant improvement in operational efficiency. Since the bitwise logical operations are the most important operations needed to answer queries, using WAH compression has been shown to answer queries significantly faster than using other compression schemes. Theoretical analyses showed that WAH compressed bitmap indexes are optimal for one-dimensional range queries. Only the most efficient indexing schemes such as B+-tree and B*-tree have this optimality property. However, bitmap indexes are superior because they can efficiently answer multi-dimensional range queries by combining the answers to one-dimensional queries.
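The last point, that multi-dimensional range queries reduce to bitwise operations on one-dimensional answers, can be shown with a minimal sketch. It uses plain Python integers as uncompressed bitmaps; it does not implement WAH compression and does not use the FastBit API. The column names and values are hypothetical.

```python
from collections import defaultdict


def build_bitmap_index(values):
    """Map each distinct value to an integer whose set bits mark the matching rows."""
    index = defaultdict(int)
    for row, value in enumerate(values):
        index[value] |= 1 << row
    return dict(index)


def range_bitmap(index, lo, hi):
    """OR together the bitmaps of all distinct values in the closed range [lo, hi]."""
    bits = 0
    for value, bitmap in index.items():
        if lo <= value <= hi:
            bits |= bitmap
    return bits


def rows_of(bits):
    """Decode a bitmap back into a sorted list of row numbers."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


# Two hypothetical columns of a five-row table.
temperature = build_bitmap_index([10, 20, 30, 20, 10])
pressure = build_bitmap_index([1, 2, 3, 1, 2])

# A two-dimensional range query is the AND of two one-dimensional answers.
hits = range_bitmap(temperature, 10, 20) & range_bitmap(pressure, 2, 3)
print(rows_of(hits))
```

A compressed scheme such as WAH keeps the same query structure and changes only how each bitmap is stored and combined.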
Datacube Services in Action, Using Open Source and Open Standards

NASA Astrophysics Data System (ADS)

Baumann, P.; Misev, D.

2016-12-01

Array Databases comprise novel, promising technology for massive spatio-temporal datacubes, extending the SQL paradigm of "any query, anytime" to n-D arrays. On the server side, such queries can be optimized, parallelized, and distributed based on partitioned array storage. The rasdaman ("raster data manager") system, which has pioneered Array Databases, is available in open source on www.rasdaman.org. Its declarative query language extends SQL with array operators which are optimized and parallelized on the server side. The rasdaman engine, which is part of OSGeo Live, is mature and in operational use on databases individually holding dozens of Terabytes. Further, the rasdaman concepts have strongly impacted international Big Data standards in the field, including the forthcoming MDA ("Multi-Dimensional Array") extension to ISO SQL, the OGC Web Coverage Service (WCS) and Web Coverage Processing Service (WCPS) standards, and the forthcoming INSPIRE WCS/WCPS; in both OGC and INSPIRE, rasdaman is WCS Core Reference Implementation. In our talk we present concepts, architecture, operational services, and standardization impact of open-source rasdaman, as well as experiences made.
Bridging the Particle Physics and Big Data Worlds

NASA Astrophysics Data System (ADS)

Pivarski, James

2017-09-01

For decades, particle physicists have developed custom software because the scale and complexity of our problems were unique. In recent years, however, the "big data" industry has begun to tackle similar problems, and has developed some novel solutions. Incorporating scientific Python libraries, Spark, TensorFlow, and machine learning tools into the physics software stack can improve abstraction, reliability, and in some cases performance. Perhaps more importantly, it can free physicists to concentrate on domain-specific problems. Building bridges isn't always easy, however. Physics software and open-source software from industry differ in many incidental ways and a few fundamental ways. I will show work from the DIANA-HEP project to streamline data flow from ROOT to Numpy and Spark, to incorporate ideas of functional programming into histogram aggregation, and to develop real-time, query-style manipulations of particle data.
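One way to read "functional programming in histogram aggregation" is that a histogram is filled by a fold and that two partial histograms merge with an associative operation, so partitions can be processed independently. The sketch below illustrates that idea with the Python standard library only. It is not the DIANA-HEP software, and the bin width, function names, and sample values are assumptions.

```python
from collections import Counter
from functools import reduce


def fill(hist, value, lo=0.0, width=10.0):
    """Return a new histogram with `value` added to its fixed-width bin."""
    bin_index = int((value - lo) // width)
    return hist + Counter({bin_index: 1})


def histogram(values):
    """Fold a sequence of values into a histogram, starting from the empty one."""
    return reduce(fill, values, Counter())


def combine(h1, h2):
    """Merge two partial histograms; the operation is associative and commutative."""
    return h1 + h2


# Two hypothetical data partitions, as might be held by separate workers.
partitions = [[1, 12, 25], [13, 27, 29]]
merged = reduce(combine, map(histogram, partitions), Counter())
print(sorted(merged.items()))
```

Because `combine` is associative and the empty `Counter` is its identity, the merged result does not depend on how the data are split, which is the property that lets such an aggregation run on a distributed engine.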
Utility of whole-lesion ADC histogram metrics for assessing the malignant potential of pancreatic intraductal papillary mucinous neoplasms (IPMNs).

PubMed

Hoffman, David H; Ream, Justin M; Hajdu, Christina H; Rosenkrantz, Andrew B

2017-04-01

To evaluate whole-lesion ADC histogram metrics for assessing the malignant potential of pancreatic intraductal papillary mucinous neoplasms (IPMNs), including in comparison with conventional MRI features. Eighteen branch-duct IPMNs underwent MRI with DWI prior to resection (n = 16) or FNA (n = 2). A blinded radiologist placed 3D volumes-of-interest on the entire IPMN on the ADC map, from which whole-lesion histogram metrics were generated. The reader also assessed IPMN size, mural nodularity, and adjacent main-duct dilation. Benign (low-to-intermediate grade dysplasia; n = 10) and malignant (high-grade dysplasia or invasive adenocarcinoma; n = 8) IPMNs were compared. Whole-lesion ADC histogram metrics demonstrating significant differences between benign and malignant IPMNs were: entropy (5.1 ± 0.2 vs. 5.4 ± 0.2; p = 0.01, AUC = 86%); mean of the bottom 10th percentile (2.2 ± 0.4 vs. 1.6 ± 0.7; p = 0.03; AUC = 81%); and mean of the 10-25th percentile (2.8 ± 0.4 vs. 2.3 ± 0.6; p = 0.04; AUC = 79%). The overall mean ADC, skewness, and kurtosis were not significantly different between groups (p ≥ 0.06; AUC = 50-78%). For entropy (highest performing histogram metric), an optimal threshold of >5.3 achieved a sensitivity of 100%, a specificity of 70%, and an accuracy of 83% for predicting malignancy. No significant difference (p = 0.18-0.64) was observed between benign and malignant IPMNs for cyst size ≥3 cm, adjacent main-duct dilatation, or mural nodule. At multivariable analysis of entropy in combination with all other ADC histogram and conventional MRI features, entropy was the only significant independent predictor of malignancy (p = 0.004). Although requiring larger studies, ADC entropy obtained from 3D whole-lesion histogram analysis may serve as a biomarker for identifying the malignant potential of IPMNs, independent of conventional MRI features.
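The reported diagnostic figures are consistent with each other, as a short calculation shows. The confusion-matrix counts below are inferred from the abstract (8 malignant and 10 benign lesions, 100% sensitivity, 70% specificity) rather than stated in it. The entropy function is a generic Shannon entropy over histogram bin counts; the binning used in the study is not given, so the function only illustrates the definition.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy in bits of a histogram given as a list of bin counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)


def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Inferred counts: all 8 malignant lesions flagged, 7 of 10 benign lesions cleared.
sens, spec, acc = diagnostic_metrics(tp=8, fn=0, tn=7, fp=3)
print(f"sensitivity={sens:.0%} specificity={spec:.0%} accuracy={acc:.0%}")
```

With these counts, accuracy is 15 correct calls out of 18 lesions, which rounds to the 83% quoted in the abstract.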
Histogram analysis of apparent diffusion coefficient maps for the differentiation between lymphoma and metastatic lymph nodes of squamous cell carcinoma in head and neck region.

PubMed

Wang, Yan-Jun; Xu, Xiao-Quan; Hu, Hao; Su, Guo-Yi; Shen, Jie; Shi, Hai-Bin; Wu, Fei-Yun

2018-06-01

Background To clarify the nature of cervical malignant lymphadenopathy is highly important for the diagnosis and differential diagnosis of head and neck tumors. Purpose To investigate the role of first-order apparent diffusion coefficient (ADC) histogram analysis for differentiating lymphoma from metastatic lymph nodes of squamous cell carcinoma (SCC) in the head and neck region. Material and Methods Diffusion-weighted imaging (DWI) data of 67 patients (lymphoma, n = 20; SCC, n = 47) with malignant lymphadenopathy were retrospectively analyzed. The SCC group was divided into nasopharyngeal SCC and non-nasopharyngeal SCC groups. The ADC histogram features (ADC10, ADC25, ADCmean, ADCmedian, ADC75, ADC90, skewness, and kurtosis) were derived and then compared by independent-samples t-test and one-way analysis of variance test, respectively. Receiver operating characteristic curve analyses were employed to investigate diagnostic performance of the significant parameters. Results Lymphoma showed significantly lower ADCmean, ADCmedian, ADC75, and ADC90 than SCC (all P < 0.05). Setting ADC90 = 0.719 × 10⁻³ mm²/s as the threshold value, optimal diagnostic performance was achieved (area under the curve [AUC] = 0.719, sensitivity = 95.7%, specificity = 50.0%). Subgroup analyses showed no significant difference between lymphoma and NPC (all P > 0.05). Lymphoma showed significantly lower ADC25, ADCmean, ADCmedian, ADC75, and ADC90 than non-nasopharyngeal SCC (all P < 0.05). Optimal diagnostic performance (AUC = 0.847, sensitivity = 86.7%, specificity = 80.0%) could be achieved when setting ADC90 = 0.943 × 10⁻³ mm²/s as the threshold value. Conclusion Given its limitations, our study has shown that first-order ADC histogram analysis is capable of differentiating lymphoma from metastatic lymph nodes of SCC, especially those of non-nasopharyngeal SCC.
A similarity measure method combining location feature for mammogram retrieval.

PubMed

Wang, Zhiqiong; Xin, Junchang; Huang, Yukun; Li, Chen; Xu, Ling; Li, Yang; Zhang, Hao; Gu, Huizi; Qian, Wei

2018-05-28

Breast cancer, the most common malignancy among women, has a high mortality rate in clinical practice. Early detection, diagnosis and treatment can reduce the mortalities of breast cancer greatly. The method of mammogram retrieval can help doctors to find the early breast lesions effectively and determine a reasonable feature set for image similarity measure. This will improve the accuracy effectively for mammogram retrieval. This paper proposes a similarity measure method combining location feature for mammogram retrieval. Firstly, the images are pre-processed, the regions of interest are detected and the lesions are segmented in order to get the center point and radius of the lesions. Then, the method, namely Coherent Point Drift, is used for image registration with the pre-defined standard image. The center point and radius of the lesions after registration are obtained and the standard location feature of the image is constructed. This standard location feature can help figure out the location similarity between the image pair from the query image to each dataset image in the database. Next, the content feature of the image is extracted, including the Histogram of Oriented Gradients, the Edge Direction Histogram, the Local Binary Pattern and the Gray Level Histogram, and the image pair content similarity can be calculated using the Earth Mover's Distance. Finally, the location similarity and content similarity are fused to form the image fusion similarity, and the specified number of the most similar images can be returned according to it. In the experiment, 440 mammograms, which are from Chinese women in Northeast China, are used as the database. When fusing 40% lesion location feature similarity and 60% content feature similarity, the results have obvious advantages. At this time, precision is 0.83, recall is 0.76, comprehensive indicator is 0.79, satisfaction is 96.0%, mean is 4.2 and variance is 17.7. 
The results show that the precision and recall of this method have obvious advantage, compared with the content-based image retrieval.
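The two numerical steps in the abstract above, comparing content histograms with the Earth Mover's Distance and fusing location and content similarity with 40%/60% weights, can be sketched as follows. This is a simplified illustration: it handles only one-dimensional histograms with unit distance between adjacent bins, and the function names are hypothetical rather than taken from the paper.

```python
def emd_1d(hist_a, hist_b):
    """Earth Mover's Distance between two 1D histograms over the same bins.

    Both histograms are normalized to sum to 1. With unit distance between
    adjacent bins, the EMD equals the sum of absolute differences between
    the two cumulative distributions.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    total_a, total_b = sum(hist_a), sum(hist_b)
    carried = 0.0   # mass that still has to be moved to later bins
    distance = 0.0
    for a, b in zip(hist_a, hist_b):
        carried += a / total_a - b / total_b
        distance += abs(carried)
    return distance

def fused_similarity(location_sim, content_sim, w_location=0.4):
    """Weighted fusion; the default 0.4/0.6 split follows the abstract."""
    return w_location * location_sim + (1.0 - w_location) * content_sim
```

How a distance is turned into a similarity score before fusion is not stated in the abstract, so that mapping is left out of the sketch.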
Web page sorting algorithm based on query keyword distance relation

NASA Astrophysics Data System (ADS)

Yang, Han; Cui, Hong Gang; Tang, Hao

2017-08-01

To improve web page ranking, this paper proposes a clustering idea for query keywords that is based on the relationships among the search keywords within a web page, and converts it into the degree of aggregation of the search keywords in the page. Building on the PageRank algorithm, a clustering-degree factor for the query keywords is added so that it can participate in the quantitative calculation. The result is an improved PageRank algorithm based on the distance relation between search keywords. The experimental results show the feasibility and effectiveness of the method.
Foundations of RDF Databases

NASA Astrophysics Data System (ADS)

Arenas, Marcelo; Gutierrez, Claudio; Pérez, Jorge

The goal of this paper is to give an overview of the basics of the theory of RDF databases. We provide a formal definition of RDF that includes the features that distinguish this model from other graph data models. We then move into the fundamental issue of querying RDF data. We start by considering the RDF query language SPARQL, which is a W3C Recommendation since January 2008. We provide an algebraic syntax and a compositional semantics for this language, study the complexity of the evaluation problem for different fragments of SPARQL, and consider the problem of optimizing the evaluation of SPARQL queries, showing that a natural fragment of this language has some good properties in this respect. We furthermore study the expressive power of SPARQL, by comparing it with some well-known query languages such as relational algebra. We conclude by considering the issue of querying RDF data in the presence of RDFS vocabulary. In particular, we present a recently proposed extension of SPARQL with navigational capabilities.
A Queueing Approach to Optimal Resource Replication in Wireless Sensor Networks

DTIC Science & Technology

2009-04-29

network (an energy-centric approach) or to ensure the proportion of query failures does not exceed a predetermined threshold (a failure-centric ...replication strategies in wireless sensor networks. The model can be used to minimize either the total transmission rate of the network (an energy-centric ...approach) or to ensure the proportion of query failures does not exceed a predetermined threshold (a failure-centric approach). The model explicitly
Optimization of Extended Relational Database Systems

DTIC Science & Technology

1986-07-23

control functions are integrated into a single system in a homogeneous way. As a first example, consider previous work in supporting various semantic...sizes are reduced and, consequently, the number of materializations that will be needed is also lower. For example, in the above query tuple...retrieve (EMP.name) where EMP hobbies instrument = 'violin' When the various entries in the hobbies field are materialized, only those queries that
Entity Bases: Large-Scale Knowledgebases for Intelligence Data

DTIC Science & Technology

2009-02-01

declaratively expressed as Datalog rules. The EntityBase supports two query scenarios: • Free-Form Querying: A human analyst or a client program can pose...integration, Prometheus follows the Inverse Rules algorithm (Duschka 1997) with additional optimizations (Thakkar et al. 2005). We use the mediator...Discovery and Data Mining (PAKDD), Sydney, Australia. Crammer, K., Dekel, O., Keshet, J., Shalev-Shwartz, S., and Singer, Y. (2006). Online passive
Processing SPARQL queries with regular expressions in RDF databases

PubMed Central

2011-01-01

Background As the Resource Description Framework (RDF) data model is widely used for modeling and sharing a lot of online bioinformatics resources such as Uniprot (dev.isb-sib.ch/projects/uniprot-rdf) or Bio2RDF (bio2rdf.org), SPARQL - a W3C recommendation query for RDF databases - has become an important query language for querying the bioinformatics knowledge bases. Moreover, due to the diversity of users’ requests for extracting information from the RDF data as well as the lack of users’ knowledge about the exact value of each fact in the RDF databases, it is desirable to use the SPARQL query with regular expression patterns for querying the RDF data. To the best of our knowledge, there is currently no work that efficiently supports regular expression processing in SPARQL over RDF databases. Most of the existing techniques for processing regular expressions are designed for querying a text corpus, or only for supporting the matching over the paths in an RDF graph. Results In this paper, we propose a novel framework for supporting regular expression processing in SPARQL query. Our contributions can be summarized as follows. 1) We propose an efficient framework for processing SPARQL queries with regular expression patterns in RDF databases. 2) We propose a cost model in order to adapt the proposed framework in the existing query optimizers. 3) We build a prototype for the proposed framework in C++ and conduct extensive experiments demonstrating the efficiency and effectiveness of our technique. Conclusions Experiments with a full-blown RDF engine show that our framework outperforms the existing ones by up to two orders of magnitude in processing SPARQL queries with regular expression patterns. PMID:21489225
Processing SPARQL queries with regular expressions in RDF databases.

PubMed

Lee, Jinsoo; Pham, Minh-Duc; Lee, Jihwan; Han, Wook-Shin; Cho, Hune; Yu, Hwanjo; Lee, Jeong-Hoon

2011-03-29

As the Resource Description Framework (RDF) data model is widely used for modeling and sharing a lot of online bioinformatics resources such as Uniprot (dev.isb-sib.ch/projects/uniprot-rdf) or Bio2RDF (bio2rdf.org), SPARQL - a W3C recommendation query for RDF databases - has become an important query language for querying the bioinformatics knowledge bases. Moreover, due to the diversity of users' requests for extracting information from the RDF data as well as the lack of users' knowledge about the exact value of each fact in the RDF databases, it is desirable to use the SPARQL query with regular expression patterns for querying the RDF data. To the best of our knowledge, there is currently no work that efficiently supports regular expression processing in SPARQL over RDF databases. Most of the existing techniques for processing regular expressions are designed for querying a text corpus, or only for supporting the matching over the paths in an RDF graph. In this paper, we propose a novel framework for supporting regular expression processing in SPARQL query. Our contributions can be summarized as follows. 1) We propose an efficient framework for processing SPARQL queries with regular expression patterns in RDF databases. 2) We propose a cost model in order to adapt the proposed framework in the existing query optimizers. 3) We build a prototype for the proposed framework in C++ and conduct extensive experiments demonstrating the efficiency and effectiveness of our technique. Experiments with a full-blown RDF engine show that our framework outperforms the existing ones by up to two orders of magnitude in processing SPARQL queries with regular expression patterns.
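To make the query type discussed above concrete, the sketch below evaluates a regular-expression filter over a handful of in-memory triples, which is the naive full-scan baseline that an optimized engine would try to avoid. The triples, prefixes, and function name are invented for illustration; this is not the framework or cost model proposed in the paper.

```python
import re

# Hypothetical sample data: (subject, predicate, object) triples.
TRIPLES = [
    ("ex:P1", "rdfs:label", "Insulin receptor"),
    ("ex:P2", "rdfs:label", "Insulin-like growth factor"),
    ("ex:P3", "rdfs:label", "Hemoglobin subunit alpha"),
]

def match_objects(triples, predicate, pattern, flags=""):
    """Naive scan roughly equivalent to the SPARQL query
        SELECT ?s WHERE { ?s <predicate> ?o . FILTER regex(?o, pattern, flags) }
    Like SPARQL's regex(), the pattern may match anywhere in the literal
    unless it is anchored with ^ or $.
    """
    rx = re.compile(pattern, re.IGNORECASE if "i" in flags else 0)
    return [s for s, p, o in triples if p == predicate and rx.search(o)]
```

For example, `match_objects(TRIPLES, "rdfs:label", "^insulin", "i")` selects the two subjects whose labels start with "Insulin". Every literal is tested against the pattern, so the cost grows linearly with the number of triples.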
Guided Iterative Substructure Search (GI-SSS) - A New Trick for an Old Dog.

PubMed

Weskamp, Nils

2016-07-01

Substructure search (SSS) is a fundamental technique supported by various chemical information systems. Many users apply it in an iterative manner: they modify their queries to shape the composition of the retrieved hit sets according to their needs. We propose and evaluate two heuristic extensions of SSS aimed at simplifying these iterative query modifications by collecting additional information during query processing and visualizing this information in an intuitive way. This gives the user a convenient feedback on how certain changes to the query would affect the retrieved hit set and reduces the number of trial-and-error cycles needed to generate an optimal search result. The proposed heuristics are simple, yet surprisingly effective and can be easily added to existing SSS implementations. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

Modifications to Improve Data Acquisition and Analysis for Camouflage Design

DTIC Science & Technology

1983-01-01

terrains into facsimiles of the original scenes in 3, 4, or 5 colors in CIELAB notation. Tasks that were addressed included optimization of the...a histogram algorithm (HIST) was used as a first step in the clustering of the CIELAB values of the scene pixels. This algorithm is highly efficient...however, an optimal process and the CIELAB coordinates of the final color domains can be influenced by the color coordinate increments used in the
A two-phase copula entropy-based multiobjective optimization approach to hydrometeorological gauge network design

NASA Astrophysics Data System (ADS)

Xu, Pengcheng; Wang, Dong; Singh, Vijay P.; Wang, Yuankun; Wu, Jichun; Wang, Lachun; Zou, Xinqing; Chen, Yuanfang; Chen, Xi; Liu, Jiufu; Zou, Ying; He, Ruimin

2017-12-01

Hydrometeorological data are needed for obtaining point and areal mean, quantifying the spatial variability of hydrometeorological variables, and calibration and verification of hydrometeorological models. Hydrometeorological networks are utilized to collect such data. Since data collection is expensive, it is essential to design an optimal network based on the minimal number of hydrometeorological stations in order to reduce costs. This study proposes a two-phase copula entropy-based multiobjective optimization approach that includes: (1) copula entropy-based directional information transfer (CDIT) for clustering the potential hydrometeorological gauges into several groups, and (2) multiobjective method for selecting the optimal combination of gauges for regionalized groups. Although entropy theory has been employed for network design before, the joint histogram method used for mutual information estimation has several limitations. The copula entropy-based mutual information (MI) estimation method is shown to be more effective for quantifying the uncertainty of redundant information than the joint histogram (JH) method. The effectiveness of this approach is verified by applying it to one type of hydrometeorological gauge network, with the use of three model evaluation measures, including Nash-Sutcliffe Coefficient (NSC), arithmetic mean of the negative copula entropy (MNCE), and MNCE/NSC. Results indicate that the two-phase copula entropy-based multiobjective technique is capable of evaluating the performance of regional hydrometeorological networks and can enable decision makers to develop strategies for water resources management.
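For context, the joint histogram (JH) estimator that the abstract above treats as the baseline can be sketched as follows. This is a plain plug-in estimate with equal-width bins, written here as an assumption about what a typical JH estimator looks like; it is not the copula entropy method of the paper, and the function names are hypothetical.

```python
import math
from collections import Counter

def bin_index(x, lo, hi, bins):
    """Equal-width bin index of x in [lo, hi]; the maximum goes in the last bin."""
    if hi == lo:
        return 0
    return min(int((x - lo) / (hi - lo) * bins), bins - 1)

def mutual_information_jh(xs, ys, bins=4):
    """Plug-in mutual information (in nats) from a 2D joint histogram."""
    n = len(xs)
    bx = [bin_index(x, min(xs), max(xs), bins) for x in xs]
    by = [bin_index(y, min(ys), max(ys), bins) for y in ys]
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    mi = 0.0
    for (i, j), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log(p_xy / ((px[i] / n) * (py[j] / n)))
    return mi
```

The estimate depends directly on the `bins` argument, which illustrates one of the limitations of histogram-based estimation that motivates alternatives.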
SVM based colon polyps classifier in a wireless active stereo endoscope.

PubMed

Ayoub, J; Granado, B; Mhanna, Y; Romain, O

2010-01-01

This work focuses on the recognition of three-dimensional colon polyps captured by an active stereo vision sensor. The detection algorithm consists of an SVM classifier trained on robust feature descriptors. The study is related to Cyclope, a prototype sensor that allows real-time 3D object reconstruction and continues to be optimized technically to improve its classification task by differentiating between hyperplastic and adenomatous polyps. Experimental results were encouraging and show a correct classification rate of approximately 97%. The work contains detailed statistics about the detection rate and the computing complexity. Inspired by the intensity histogram, the work shows a new approach that extracts a set of features based on a depth histogram and combines stereo measurement with SVM classifiers to correctly classify benign and malignant polyps.
Differentiation between malignant and benign thyroid nodules and stratification of papillary thyroid cancer with aggressive histological features: Whole-lesion diffusion-weighted imaging histogram analysis.

PubMed

Hao, Yonghong; Pan, Chu; Chen, WeiWei; Li, Tao; Zhu, WenZhen; Qi, JianPin

2016-12-01

To explore the usefulness of whole-lesion histogram analysis of apparent diffusion coefficient (ADC) derived from reduced field-of-view (r-FOV) diffusion-weighted imaging (DWI) in differentiating malignant and benign thyroid nodules and stratifying papillary thyroid cancer (PTC) with aggressive histological features. This Institutional Review Board-approved, retrospective study included 93 patients with 101 pathologically proven thyroid nodules. All patients underwent preoperative r-FOV DWI at 3T. The whole-lesion ADC assessments were performed for each patient. Histogram-derived ADC parameters between different subgroups (pathologic type, extrathyroidal extension, lymph node metastasis) were compared. Receiver operating characteristic curve analysis was used to determine optimal histogram parameters in differentiating benign and malignant nodules and predicting aggressiveness of PTC. Mean ADC, median ADC, 5th percentile ADC, 25th percentile ADC, 75th percentile ADC, 95th percentile ADC (all P < 0.001), and kurtosis (P = 0.001) were significantly lower in malignant thyroid nodules, and mean ADC achieved the highest AUC (0.919) with a cutoff value of 1842.78 × 10^-6 mm^2/s in differentiating malignant and benign nodules. Compared to the PTCs without extrathyroidal extension, PTCs with extrathyroidal extension showed significantly lower median ADC, 5th percentile ADC, and 25th percentile ADC. The 5th percentile ADC achieved the highest AUC (0.757) with cutoff value of 911.5 × 10^-6 mm^2/s for differentiating between PTCs with and without extrathyroidal extension. Whole-lesion ADC histogram analysis might help to differentiate malignant nodules from benign ones and show the PTCs with extrathyroidal extension. J. Magn. Reson. Imaging 2016;44:1546-1555. © 2016 International Society for Magnetic Resonance in Medicine.
A Whole-Tumor Histogram Analysis of Apparent Diffusion Coefficient Maps for Differentiating Thymic Carcinoma from Lymphoma.

PubMed

Zhang, Wei; Zhou, Yue; Xu, Xiao-Quan; Kong, Ling-Yan; Xu, Hai; Yu, Tong-Fu; Shi, Hai-Bin; Feng, Qing

2018-01-01

To assess the performance of a whole-tumor histogram analysis of apparent diffusion coefficient (ADC) maps in differentiating thymic carcinoma from lymphoma, and compare it with that of a commonly used hot-spot region-of-interest (ROI)-based ADC measurement. Diffusion weighted imaging data of 15 patients with thymic carcinoma and 13 patients with lymphoma were retrospectively collected and processed with a mono-exponential model. ADC measurements were performed by using a histogram-based and hot-spot-ROI-based approach. In the histogram-based approach, the following parameters were generated: mean ADC (ADCmean), median ADC (ADCmedian), 10th and 90th percentile of ADC (ADC10 and ADC90), kurtosis, and skewness. The difference in ADCs between thymic carcinoma and lymphoma was compared using a t test. Receiver operating characteristic analyses were conducted to determine and compare the differentiating performance of ADCs. Lymphoma demonstrated significantly lower ADCmean, ADCmedian, ADC10, ADC90, and hot-spot-ROI-based mean ADC than those found in thymic carcinoma (all p values < 0.05). There were no differences found in the kurtosis (p = 0.412) and skewness (p = 0.273). The ADC10 demonstrated optimal differentiating performance (cut-off value, 0.403 × 10^-3 mm^2/s; area under the receiver operating characteristic curve [AUC], 0.977; sensitivity, 92.3%; specificity, 93.3%), followed by the ADCmean, ADCmedian, ADC90, and hot-spot-ROI-based mean ADC. The AUC of ADC10 was significantly higher than that of the hot-spot-ROI-based ADC (0.977 vs. 0.797, p = 0.036). Compared with the commonly used hot-spot-ROI-based ADC measurement, a histogram analysis of ADC maps can improve the differentiating performance between thymic carcinoma and lymphoma.
Comprehensive Optimal Manpower and Personnel Analytic Simulation System (COMPASS)

DTIC Science & Technology

2009-10-01

4 The EDB consists of 4 major components (some of which are re-usable): 1. Metadata Editor (MDE): Also considered a leaf node, the metadata...end-user queries via the QB. The EDB supports multiple instances of the MDE, although currently, only a single instance is recommended. 2 Query...the MSB is a central collection of web services, responsible for the authentication and authorization of users, maintenance of the EDB metadata
LSD: Large Survey Database framework

NASA Astrophysics Data System (ADS)

Juric, Mario

2012-09-01

The Large Survey Database (LSD) is a Python framework and DBMS for distributed storage, cross-matching and querying of large survey catalogs (>10^9 rows, >1 TB). The primary driver behind its development is the analysis of Pan-STARRS PS1 data. It is specifically optimized for fast queries and parallel sweeps of positionally and temporally indexed datasets. It transparently scales to more than 10^2 nodes, and can be made to function in "shared nothing" architectures.
Analytics-Driven Lossless Data Compression for Rapid In-situ Indexing, Storing, and Querying

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jenkins, John; Arkatkar, Isha; Lakshminarasimhan, Sriram

2013-01-01

The analysis of scientific simulations is highly data-intensive and is becoming an increasingly important challenge. Peta-scale data sets require the use of light-weight query-driven analysis methods, as opposed to heavy-weight schemes that optimize for speed at the expense of size. This paper is an attempt in the direction of query processing over losslessly compressed scientific data. We propose a co-designed double-precision compression and indexing methodology for range queries by performing unique-value-based binning on the most significant bytes of double precision data (sign, exponent, and most significant mantissa bits), and inverting the resulting metadata to produce an inverted index over a reduced data representation. Without the inverted index, our method matches or improves compression ratios over both general-purpose and floating-point compression utilities. The inverted index is light-weight, and the overall storage requirement for both reduced column and index is less than 135%, whereas existing DBMS technologies can require 200-400%. As a proof-of-concept, we evaluate univariate range queries that additionally return column values, a critical component of data analytics, against state-of-the-art bitmap indexing technology, showing multi-fold query performance improvements.
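The binning idea described above (grouping doubles by their most significant bytes and inverting the result into an index) can be sketched in a few lines. This is a heavily simplified illustration that omits compression entirely; the two-byte prefix, the function names, and the pruning rule are assumptions made here, not details from the paper.

```python
import math
import struct
from collections import defaultdict

def high_bytes(x, n=2):
    """First n bytes of the big-endian IEEE 754 encoding of x.

    With n=2 this covers the sign bit, the 11 exponent bits, and the
    4 most significant mantissa bits.
    """
    return struct.pack(">d", x)[:n]

def build_index(column, n=2):
    """Inverted index: byte prefix -> list of row ids with that prefix."""
    index = defaultdict(list)
    for row, x in enumerate(column):
        index[high_bytes(x, n)].append(row)
    return dict(index)

def range_query(column, index, lo, hi, n=2):
    """Rows with lo <= value <= hi, skipping bins that cannot overlap."""
    hits = []
    for key, rows in index.items():
        # Extreme values that share this prefix (padding with 0x00 / 0xff).
        a = struct.unpack(">d", key + b"\x00" * (8 - n))[0]
        b = struct.unpack(">d", key + b"\xff" * (8 - n))[0]
        bounded = not (math.isnan(a) or math.isnan(b))
        if bounded and (max(a, b) < lo or min(a, b) > hi):
            continue  # the whole bin lies outside the query range
        hits.extend(r for r in rows if lo <= column[r] <= hi)
    return sorted(hits)
```

Only bins whose value interval overlaps the query range are scanned, which is the property that makes a prefix-based index useful for range queries.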
LAILAPS-QSM: A RESTful API and JAVA library for semantic query suggestions.

PubMed

Chen, Jinbo; Scholz, Uwe; Zhou, Ruonan; Lange, Matthias

2018-03-01

In order to access and filter content of life-science databases, full text search is a widely applied query interface. But its high flexibility and intuitiveness is paid for with potentially imprecise and incomplete query results. To reduce this drawback, query assistance systems suggest those combinations of keywords with the highest potential to match most of the relevant data records. Widespread approaches are syntactic query corrections that avoid misspelling and support expansion of words by suffixes and prefixes. Synonym expansion approaches apply thesauri, ontologies, and query logs. All need laborious curation and maintenance. Furthermore, access to query logs is in general restricted. Approaches that infer related queries by their query profile like research field, geographic location, co-authorship, affiliation etc. require user's registration and its public accessibility that contradict privacy concerns. To overcome these drawbacks, we implemented LAILAPS-QSM, a machine learning approach that reconstructs possible linguistic contexts of a given keyword query. The context is inferred from the text records that are stored in the databases that are going to be queried or extracted for a general purpose query suggestion from PubMed abstracts and UniProt data. The supplied tool suite enables the pre-processing of these text records and the further computation of customized distributed word vectors. The latter are used to suggest alternative keyword queries. An evaluation of the query suggestion quality was done for plant science use cases. Locally present experts enable a cost-efficient quality assessment in the categories trait, biological entity, taxonomy, affiliation, and metabolic function which has been performed using ontology term similarities. LAILAPS-QSM mean information content similarity for 15 representative queries is 0.70, whereas 34% have a score above 0.80. 
In comparison, the information content similarity for query suggestions made by human experts is 0.90. The software is available either as a tool set to build and train dedicated query suggestion services or as an already trained general purpose RESTful web service. The service uses open interfaces to be seamlessly embeddable into database frontends. The JAVA implementation uses highly optimized data structures and streamlined code to provide fast and scalable response for web service calls. The source code of LAILAPS-QSM is available under GNU General Public License version 2 in Bitbucket GIT repository: https://bitbucket.org/ipk_bit_team/bioescorte-suggestion.
Image-Based Airborne LiDAR Point Cloud Encoding for 3d Building Model Retrieval

NASA Astrophysics Data System (ADS)

Chen, Yi-Chen; Lin, Chao-Hung

2016-06-01

With the development of Web 2.0 and cyber city modeling, an increasing number of 3D models have been available on web-based model-sharing platforms with many applications such as navigation, urban planning, and virtual reality. Based on the concept of data reuse, a 3D model retrieval system is proposed to retrieve building models similar to a user-specified query. The basic idea behind this system is to reuse these existing 3D building models instead of reconstruction from point clouds. To retrieve models efficiently, the models in databases are generally encoded compactly using a shape descriptor. However, most of the geometric descriptors in related works are applied to polygonal models. In this study, the input query of the model retrieval system is a point cloud acquired by Light Detection and Ranging (LiDAR) systems because of the efficient scene scanning and spatial information collection. Using point clouds with sparse, noisy, and incomplete sampling as input queries is more difficult than using 3D models. Because the building roof is more informative than other parts in the airborne LiDAR point cloud, an image-based approach is proposed to encode both point clouds from input queries and 3D models in databases. The main goal of data encoding is that the models in the database and input point clouds can be consistently encoded. Firstly, top-view depth images of buildings are generated to represent the geometry surface of a building roof. Secondly, geometric features are extracted from depth images based on height, edge and plane of building. Finally, descriptors can be extracted by spatial histograms and used in 3D model retrieval system. For data retrieval, the models are retrieved by matching the encoding coefficients of point clouds and building models. In experiments, a database including about 900,000 3D models collected from the Internet is used for evaluation of data retrieval. 
The results of the proposed method show a clear superiority over related methods.
Querying Archetype-Based Electronic Health Records Using Hadoop and Dewey Encoding of openEHR Models.

PubMed

Sundvall, Erik; Wei-Kleiner, Fang; Freire, Sergio M; Lambrix, Patrick

2017-01-01

Archetype-based Electronic Health Record (EHR) systems using generic reference models from e.g. openEHR, ISO 13606 or CIMI should be easy to update and reconfigure with new types (or versions) of data models or entries, ideally with very limited programming or manual database tweaking. Exploratory research (e.g. epidemiology) leading to ad-hoc querying on a population-wide scale can be a challenge in such environments. This publication describes implementation and test of an archetype-aware Dewey encoding optimization that can be used to produce such systems in environments supporting relational operations, e.g. RDBMSs and distributed map-reduce frameworks like Hadoop. Initial testing was done using a nine-node 2.2 GHz quad-core Hadoop cluster querying a dataset consisting of targeted extracts from 4+ million real patient EHRs; query results with sub-minute response time were obtained.
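Dewey encoding labels each node of a tree with the path of child positions leading to it, so that ancestor and descendant tests reduce to prefix comparisons that relational or map-reduce engines can evaluate without traversing the tree. The sketch below shows the generic idea only; the node names are invented placeholders rather than real openEHR archetype paths, and the archetype-aware optimization of the paper is not reproduced.

```python
def dewey_labels(tree, prefix=(1,)):
    """Yield (label, name) pairs for a nested (name, children) tree.

    Labels are tuples of 1-based child positions, e.g. (1, 2, 1).
    """
    name, children = tree
    yield prefix, name
    for position, child in enumerate(children, start=1):
        yield from dewey_labels(child, prefix + (position,))

def is_descendant(label, ancestor):
    """True if `label` lies strictly below `ancestor` in the tree."""
    return len(label) > len(ancestor) and label[:len(ancestor)] == ancestor

# Hypothetical record structure used only for illustration.
RECORD = ("ehr", [
    ("composition", [("observation", []), ("evaluation", [])]),
    ("composition", []),
])
```

Stored as rows of (label, name, value), such labels let a query like "all observations inside a given composition" be expressed as a filter on label prefixes.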
Multifractal diffusion entropy analysis: Optimal bin width of probability histograms

NASA Astrophysics Data System (ADS)

Jizba, Petr; Korbel, Jan

2014-11-01

In the framework of Multifractal Diffusion Entropy Analysis we propose a method for choosing an optimal bin-width in histograms generated from underlying probability distributions of interest. The method presented uses techniques of Rényi’s entropy and the mean squared error analysis to discuss the conditions under which the error in the multifractal spectrum estimation is minimal. We illustrate the utility of our approach by focusing on a scaling behavior of financial time series. In particular, we analyze the S&P500 stock index as sampled at a daily rate in the time period 1950-2013. In order to demonstrate a strength of the method proposed we compare the multifractal δ-spectrum for various bin-widths and show the robustness of the method, especially for large values of q. For such values, other methods in use, e.g., those based on moment estimation, tend to fail for heavy-tailed data or data with long correlations. Connection between the δ-spectrum and Rényi’s q parameter is also discussed and elucidated on a simple example of multiscale time series.
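The two quantities that the abstract above relates, a histogram built with a given bin width and the Rényi entropy of order q of the resulting probabilities, can be computed as in the sketch below. It does not implement the paper's criterion for choosing the optimal bin width; the function names are hypothetical and natural logarithms are an arbitrary choice made here.

```python
import math
from collections import Counter

def histogram_probs(samples, bin_width):
    """Empirical bin probabilities for equal-width bins of size bin_width."""
    counts = Counter(math.floor(x / bin_width) for x in samples)
    n = len(samples)
    return [c / n for c in counts.values()]

def renyi_entropy(probs, q):
    """Rényi entropy of order q (in nats); q = 1 gives the Shannon limit."""
    if q == 1:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return math.log(sum(p ** q for p in probs if p > 0)) / (1.0 - q)
```

Evaluating `renyi_entropy(histogram_probs(data, h), q)` over a range of bin widths `h` shows how strongly the entropy estimate depends on the binning, which is the sensitivity that motivates a principled bin-width choice.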
ProBiS-CHARMMing: Web Interface for Prediction and Optimization of Ligands in Protein Binding Sites.

PubMed

Konc, Janez; Miller, Benjamin T; Štular, Tanja; Lešnik, Samo; Woodcock, H Lee; Brooks, Bernard R; Janežič, Dušanka

2015-11-23

Proteins often exist only as apo structures (unligated) in the Protein Data Bank, with their corresponding holo structures (with ligands) unavailable. However, apoproteins may not represent the amino-acid residue arrangement upon ligand binding well, which is especially problematic for molecular docking. We developed the ProBiS-CHARMMing web interface by connecting the ProBiS ( http://probis.cmm.ki.si ) and CHARMMing ( http://www.charmming.org ) web servers into one functional unit that enables prediction of protein-ligand complexes and allows for their geometry optimization and interaction energy calculation. The ProBiS web server predicts ligands (small compounds, proteins, nucleic acids, and single-atom ligands) that may bind to a query protein. This is achieved by comparing its surface structure against a nonredundant database of protein structures and finding those that have binding sites similar to that of the query protein. Existing ligands found in the similar binding sites are then transposed to the query according to predictions from ProBiS. The CHARMMing web server enables, among other things, minimization and potential energy calculation for a wide variety of biomolecular systems, and it is used here to optimize the geometry of the predicted protein-ligand complex structures using the CHARMM force field and to calculate their interaction energies with the corresponding query proteins. We show how ProBiS-CHARMMing can be used to predict ligands and their poses for a particular binding site, and minimize the predicted protein-ligand complexes to obtain representations of holoproteins. The ProBiS-CHARMMing web interface is freely available for academic users at http://probis.nih.gov.
Efficient visibility-driven medical image visualisation via adaptive binned visibility histogram.

PubMed

Jung, Younhyun; Kim, Jinman; Kumar, Ashnil; Feng, David Dagan; Fulham, Michael

2016-07-01

'Visibility' is a fundamental optical property that represents the observable, by users, proportion of the voxels in a volume during interactive volume rendering. The manipulation of this 'visibility' improves the volume rendering processes; for instance by ensuring the visibility of regions of interest (ROIs) or by guiding the identification of an optimal rendering view-point. The construction of visibility histograms (VHs), which represent the distribution of all the visibility of all voxels in the rendered volume, enables users to explore the volume with real-time feedback about occlusion patterns among spatially related structures during volume rendering manipulations. Volume rendered medical images have been a primary beneficiary of VH given the need to ensure that specific ROIs are visible relative to the surrounding structures, e.g. the visualisation of tumours that may otherwise be occluded by neighbouring structures. VH construction and its subsequent manipulations, however, are computationally expensive due to the histogram binning of the visibilities. This limits the real-time application of VH to medical images that have large intensity ranges and volume dimensions and require a large number of histogram bins. In this study, we introduce an efficient adaptive binned visibility histogram (AB-VH) in which a smaller number of histogram bins are used to represent the visibility distribution of the full VH. We adaptively bin medical images by using a cluster analysis algorithm that groups the voxels according to their intensity similarities into a smaller subset of bins while preserving the distribution of the intensity range of the original images. We increase efficiency by exploiting the parallel computation and multiple render targets (MRT) extension of the modern graphical processing units (GPUs) and this enables efficient computation of the histogram. 
We show the application of our method to single-modality computed tomography (CT), magnetic resonance (MR) imaging and multi-modality positron emission tomography-CT (PET-CT). In our experiments, the AB-VH markedly improved the computational efficiency for the VH construction and thus improved the subsequent VH-driven volume manipulations. This efficiency was achieved without major visual degradation of the VH or major numerical differences between the AB-VH and its full-bin counterpart. We applied several variants of the K-means clustering algorithm with varying Ks (the number of clusters) and found that higher values of K resulted in better performance at a lower computational gain. The AB-VH also had an improved performance when compared to the conventional method of down-sampling of the histogram bins (equal binning) for volume rendering visualisation. Copyright © 2016 Elsevier Ltd. All rights reserved.
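The adaptive binning idea in the abstract above can be illustrated with a minimal sketch, assuming plain Lloyd-style K-means on scalar intensities and a CPU loop in place of the GPU/MRT pipeline the authors describe. The function names `kmeans_1d` and `adaptive_visibility_histogram`, the quantile-based initialisation, and the toy data are all hypothetical and are not taken from the paper.

```python
def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm on scalar intensities; returns sorted cluster centres."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: initialise centres at evenly spaced quantiles of the data.
    centres = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    for _ in range(iters):
        sums = [0.0] * k
        counts = [0] * k
        for v in values:
            j = min(range(k), key=lambda c: abs(v - centres[c]))
            sums[j] += v
            counts[j] += 1
        # Keep the old centre if a cluster happens to be empty.
        new = [sums[j] / counts[j] if counts[j] else centres[j] for j in range(k)]
        if new == centres:
            break
        centres = new
    return sorted(centres)


def adaptive_visibility_histogram(intensities, visibilities, centres):
    """Accumulate each voxel's visibility into the bin of its nearest centre."""
    hist = [0.0] * len(centres)
    for inten, vis in zip(intensities, visibilities):
        j = min(range(len(centres)), key=lambda c: abs(inten - centres[c]))
        hist[j] += vis
    return hist


# Toy volume: eight voxels with three obvious intensity groups.
intensities = [0, 1, 2, 100, 101, 102, 1000, 1001]
visibilities = [0.1, 0.1, 0.1, 0.2, 0.2, 0.2, 0.05, 0.05]
centres = kmeans_1d(intensities, k=3)
hist = adaptive_visibility_histogram(intensities, visibilities, centres)
```

With three bins instead of one bin per intensity value, the total visibility is preserved while the histogram shrinks, which is the trade-off the abstract reports for varying K.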
Shark: SQL and Analytics with Cost-Based Query Optimization on Coarse-Grained Distributed Memory

DTIC Science & Technology

2014-01-13

RDBMS and contains a database (often MySQL or Derby) with a namespace for tables, table metadata and partition information. Table data is stored in an...serialization/deserialization) Java interface implementations with corresponding object inspectors. The Hive driver controls the processing of queries, coordinat...native API, RDD operations are invoked through a functional interface similar to DryadLINQ [32] in Scala, Java or Python. For example, the Scala code for
Optimal Chunking of Large Multidimensional Arrays for Data Warehousing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Otoo, Ekow J.; Rotem, Doron

2008-02-15

Very large multidimensional arrays are commonly used in data intensive scientific computations as well as on-line analytical processing applications referred to as MOLAP. The storage organization of such arrays on disks is done by partitioning the large global array into fixed size sub-arrays called chunks or tiles that form the units of data transfer between disk and memory. Typical queries involve the retrieval of sub-arrays in a manner that accesses all chunks that overlap the query results. An important metric of the storage efficiency is the expected number of chunks retrieved over all such queries. The question that immediately arises is "what shapes of array chunks give the minimum expected number of chunks over a query workload?" The problem of optimal chunking was first introduced by Sarawagi and Stonebraker who gave an approximate solution. In this paper we develop exact mathematical models of the problem and provide exact solutions using steepest descent and geometric programming methods. Experimental results, using synthetic and real life workloads, show that our solutions are consistently within 2.0 percent of the true number of chunks retrieved for any number of dimensions. In contrast, the approximate solution of Sarawagi and Stonebraker can deviate considerably from the true result with increasing number of dimensions and also may lead to suboptimal chunk shapes.
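The "expected number of chunks" metric can be made concrete with a small sketch. It assumes a simplified textbook model rather than the exact models developed in the paper: a query of q cells along one dimension, placed at a uniformly random offset relative to chunks of c cells, touches 1 + (q - 1)/c chunks on average, dimensions are independent, and array boundaries are ignored. The brute-force search over chunk shapes stands in for the steepest descent and geometric programming methods; all function names are hypothetical.

```python
from math import prod


def expected_chunks(chunk_shape, query_shape):
    """Expected chunks overlapped by a randomly placed query (boundary effects ignored)."""
    return prod(1 + (q - 1) / c for c, q in zip(chunk_shape, query_shape))


def factorizations(volume, dims):
    """Yield every integer chunk shape with `dims` sides whose product is `volume`."""
    if dims == 1:
        yield (volume,)
        return
    for c in range(1, volume + 1):
        if volume % c == 0:
            for rest in factorizations(volume // c, dims - 1):
                yield (c,) + rest


def best_chunk_shape(volume, workload):
    """Exhaustively pick the shape minimising the workload-weighted expected chunk count.

    `workload` is a list of (query_shape, probability) pairs.
    """
    dims = len(workload[0][0])

    def cost(shape):
        return sum(p * expected_chunks(shape, q) for q, p in workload)

    return min(factorizations(volume, dims), key=cost)
```

For example, `best_chunk_shape(64, [((1, 64), 1.0)])` prefers long thin chunks for a workload of single-row scans, whereas a workload of square 8 by 8 queries prefers square chunks.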
SU-F-J-94: Development of a Plug-in Based Image Analysis Tool for Integration Into Treatment Planning

DOE Office of Scientific and Technical Information (OSTI.GOV)

Owen, D; Anderson, C; Mayo, C

Purpose: To extend the functionality of a commercial treatment planning system (TPS) to support (i) direct use of quantitative image-based metrics within treatment plan optimization and (ii) evaluation of dose-functional volume relationships to assist in functional image adaptive radiotherapy. Methods: A script was written that interfaces with a commercial TPS via an Application Programming Interface (API). The script executes a program that performs dose-functional volume analyses. Written in C#, the script reads the dose grid and correlates it with image data on a voxel-by-voxel basis through API extensions that can access registration transforms. A user interface was designed through WinForms to input parameters and display results. To test the performance of this program, image- and dose-based metrics computed from perfusion SPECT images aligned to the treatment planning CT were generated, validated, and compared. Results: The integration of image analysis information was successfully implemented as a plug-in to a commercial TPS. Perfusion SPECT images were used to validate the calculation and display of image-based metrics as well as dose-intensity metrics and histograms for defined structures on the treatment planning CT. Various biological dose correction models, custom image-based metrics, dose-intensity computations, and dose-intensity histograms were applied to analyze the image-dose profile. Conclusion: It is possible to add image analysis features to commercial TPSs through custom scripting applications. A tool was developed to enable the evaluation of image-intensity-based metrics in the context of functional targeting and avoidance.
In addition to providing dose-intensity metrics and histograms that can be easily extracted from a plan database and correlated with outcomes, the system can also be extended to a plug-in optimization system, which can directly use the computed metrics for optimization of post-treatment tumor or normal tissue response models. Supported by NIH - P01 - CA059827.
High dimensional biological data retrieval optimization with NoSQL technology.

PubMed

Wang, Shicai; Pandis, Ioannis; Wu, Chao; He, Sijin; Johnson, David; Emam, Ibrahim; Guitton, Florian; Guo, Yike

2014-01-01

High-throughput transcriptomic data generated by microarray experiments is the most abundant and frequently stored kind of data currently used in translational medicine studies. Although microarray data is supported in data warehouses such as tranSMART, when querying relational databases for hundreds of different patient gene expression records queries are slow due to poor performance. Non-relational data models, such as the key-value model implemented in NoSQL databases, hold promise to be more performant solutions. Our motivation is to improve the performance of the tranSMART data warehouse with a view to supporting Next Generation Sequencing data. In this paper we introduce a new data model better suited for high-dimensional data storage and querying, optimized for database scalability and performance. We have designed a key-value pair data model to support faster queries over large-scale microarray data and implemented the model using HBase, an implementation of Google's BigTable storage system. An experimental performance comparison was carried out against the traditional relational data model implemented in both MySQL Cluster and MongoDB, using a large publicly available transcriptomic data set taken from NCBI GEO concerning Multiple Myeloma. Our new key-value data model implemented on HBase exhibits an average 5.24-fold increase in high-dimensional biological data query performance compared to the relational model implemented on MySQL Cluster, and an average 6.47-fold increase on query performance on MongoDB. The performance evaluation found that the new key-value data model, in particular its implementation in HBase, outperforms the relational model currently implemented in tranSMART. We propose that NoSQL technology holds great promise for large-scale data management, in particular for high-dimensional biological data such as that demonstrated in the performance evaluation described in this paper. 
We aim to use this new data model as a basis for migrating tranSMART's implementation to a more scalable solution for Big Data.
High dimensional biological data retrieval optimization with NoSQL technology

PubMed Central

2014-01-01

Background High-throughput transcriptomic data generated by microarray experiments is the most abundant and frequently stored kind of data currently used in translational medicine studies. Although microarray data is supported in data warehouses such as tranSMART, when querying relational databases for hundreds of different patient gene expression records queries are slow due to poor performance. Non-relational data models, such as the key-value model implemented in NoSQL databases, hold promise to be more performant solutions. Our motivation is to improve the performance of the tranSMART data warehouse with a view to supporting Next Generation Sequencing data. Results In this paper we introduce a new data model better suited for high-dimensional data storage and querying, optimized for database scalability and performance. We have designed a key-value pair data model to support faster queries over large-scale microarray data and implemented the model using HBase, an implementation of Google's BigTable storage system. An experimental performance comparison was carried out against the traditional relational data model implemented in both MySQL Cluster and MongoDB, using a large publicly available transcriptomic data set taken from NCBI GEO concerning Multiple Myeloma. Our new key-value data model implemented on HBase exhibits an average 5.24-fold increase in high-dimensional biological data query performance compared to the relational model implemented on MySQL Cluster, and an average 6.47-fold increase on query performance on MongoDB. Conclusions The performance evaluation found that the new key-value data model, in particular its implementation in HBase, outperforms the relational model currently implemented in tranSMART. We propose that NoSQL technology holds great promise for large-scale data management, in particular for high-dimensional biological data such as that demonstrated in the performance evaluation described in this paper. 
We aim to use this new data model as a basis for migrating tranSMART's implementation to a more scalable solution for Big Data. PMID:25435347
Benchmarking distributed data warehouse solutions for storing genomic variant information

PubMed Central

Wiewiórka, Marek S.; Wysakowicz, Dawid P.; Okoniewski, Michał J.

2017-01-01

Abstract Genomic-based personalized medicine encompasses storing, analysing and interpreting genomic variants as its central issues. At a time when thousands of patients' sequenced exomes and genomes are becoming available, there is a growing need for efficient database storage and querying. The answer could be the application of modern distributed storage systems and query engines. However, the application of large genomic variant databases to this problem has not been sufficiently explored so far in the literature. To investigate the effectiveness of modern columnar storage [column-oriented Database Management System (DBMS)] and query engines, we have developed a prototypic genomic variant data warehouse, populated with large generated content of genomic variants and phenotypic data. Next, we have benchmarked performance of a number of combinations of distributed storages and query engines on a set of SQL queries that address biological questions essential for both research and medical applications. In addition, a non-distributed, analytical database (MonetDB) has been used as a baseline. Comparison of query execution times confirms that distributed data warehousing solutions outperform classic relational DBMSs. Moreover, pre-aggregation and further denormalization of data, which reduce the number of distributed join operations, significantly improve query performance by several orders of magnitude. Most of the distributed back-ends offer good performance for complex analytical queries, while the Optimized Row Columnar (ORC) format paired with Presto and Parquet with Spark 2 query engines provide, on average, the lowest execution times. Apache Kudu, on the other hand, is the only solution that guarantees sub-second performance for simple genome range queries returning a small subset of data, where low-latency response is expected, while still offering decent performance for running analytical queries.
In summary, research and clinical applications that require the storage and analysis of variants from thousands of samples can benefit from the scalability and performance of distributed data warehouse solutions. Database URL: https://github.com/ZSI-Bio/variantsdwh PMID:29220442

Prevalence scaling: applications to an intelligent workstation for the diagnosis of breast cancer.

PubMed

Horsch, Karla; Giger, Maryellen L; Metz, Charles E

2008-11-01

Our goal was to investigate the effects that changes in the prevalence of cancer in a population have on the probability of malignancy (PM) output and on the optimal combination of true-positive fraction (TPF) and false-positive fraction (FPF) of a mammographic and sonographic automatic classifier for the diagnosis of breast cancer. We investigate how a prevalence-scaling transformation that is used to change the prevalence inherent in the computer estimates of the PM affects the numerical and histographic output of a previously developed multimodality intelligent workstation. Using Bayes' rule and the binormal model, we study how changes in the prevalence of cancer in the diagnostic breast population affect our computer classifiers' optimal operating points, as defined by maximizing the expected utility. Prevalence scaling affects the threshold at which a particular TPF and FPF pair is achieved. Tables giving the thresholds on the scaled PM estimates that result in particular pairs of TPF and FPF are presented. Histograms of PMs scaled to reflect clinically relevant prevalence values differ greatly from histograms of laboratory-designed PMs. The optimal pair (TPF, FPF) of our lower performing mammographic classifier is more sensitive to changes in clinical prevalence than that of our higher performing sonographic classifier. Prevalence scaling can be used to change computer PM output to reflect clinically more appropriate prevalence. Relatively small changes in clinical prevalence can have large effects on the computer classifier's optimal operating point.
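The abstract does not spell out its prevalence-scaling transformation, so the following is only a sketch of the standard Bayes' rule identity that such a transformation typically relies on: posterior odds equal the likelihood ratio times the prior odds, so moving from one prevalence to another multiplies the odds by the ratio of the two prior odds. The function name `rescale_pm` and its parameters are hypothetical.

```python
def rescale_pm(pm, old_prev, new_prev):
    """Re-express a probability of malignancy under a different cancer prevalence.

    Assumes `pm` was computed under prevalence `old_prev` and that the
    classifier's likelihood ratio is unchanged by the population shift.
    All three arguments must lie strictly between 0 and 1.
    """
    for name, value in (("pm", pm), ("old_prev", old_prev), ("new_prev", new_prev)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be strictly between 0 and 1")
    odds = pm / (1.0 - pm)
    # Ratio of the new prior odds to the old prior odds.
    factor = (new_prev / (1.0 - new_prev)) / (old_prev / (1.0 - old_prev))
    new_odds = odds * factor
    return new_odds / (1.0 + new_odds)
```

A PM of 0.5 estimated on a laboratory set with 50% prevalence maps to roughly 0.1 at a clinical prevalence of 10%, which is consistent with the abstract's remark that scaled histograms differ greatly from laboratory-designed ones.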
A novel method for the evaluation of uncertainty in dose-volume histogram computation.

PubMed

Henríquez, Francisco Cutanda; Castrillón, Silvia Vargas

2008-03-15

Dose-volume histograms (DVHs) are a useful tool in state-of-the-art radiotherapy treatment planning, and it is essential to recognize their limitations. Even after a specific dose-calculation model is optimized, dose distributions computed by using treatment-planning systems are affected by several sources of uncertainty, such as algorithm limitations, measurement uncertainty in the data used to model the beam, and residual differences between measured and computed dose. This report presents a novel method to take them into account. To take into account the effect of associated uncertainties, a probabilistic approach using a new kind of histogram, a dose-expected volume histogram, is introduced. The expected value of the volume in the region of interest receiving an absorbed dose equal to or greater than a certain value is found by using the probability distribution of the dose at each point. A rectangular probability distribution is assumed for this point dose, and a formulation that accounts for uncertainties associated with point dose is presented for practical computations. This method is applied to a set of DVHs for different regions of interest, including 6 brain patients, 8 lung patients, 8 pelvis patients, and 6 prostate patients planned for intensity-modulated radiation therapy. Results show a greater effect on planning target volume coverage than in organs at risk. In cases of steep DVH gradients, such as planning target volumes, this new method shows the largest differences with the corresponding DVH; thus, the effect of the uncertainty is larger.
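As a rough illustration of the dose-expected volume histogram described above, the sketch below assumes equal-volume voxels and a rectangular (uniform) distribution of fixed half-width around each computed point dose. The half-width parameter and the function names are hypothetical; the paper's practical formulation may differ.

```python
def prob_dose_at_least(x, dose, half_width):
    """P(true dose >= x) when the true dose is uniform on [dose - hw, dose + hw]."""
    if half_width == 0:
        return 1.0 if dose >= x else 0.0
    lo, hi = dose - half_width, dose + half_width
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def dvh(doses, x):
    """Conventional cumulative DVH: fraction of voxels with computed dose >= x."""
    return sum(1 for d in doses if d >= x) / len(doses)


def expected_dvh(doses, x, half_width):
    """Dose-expected volume: mean over voxels of P(true dose >= x)."""
    return sum(prob_dose_at_least(x, d, half_width) for d in doses) / len(doses)
```

For two voxels at 50 Gy and 60 Gy with a 2 Gy half-width, the conventional DVH at 60 Gy is 0.5 while the expected volume is 0.25, showing how the difference grows where the DVH is steep.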
Quantum algorithms on Walsh transform and Hamming distance for Boolean functions

NASA Astrophysics Data System (ADS)

Xie, Zhengwei; Qiu, Daowen; Cai, Guangya

2018-06-01

Walsh spectrum or Walsh transform is an alternative description of Boolean functions. In this paper, we explore quantum algorithms to approximate the absolute value of the Walsh transform $W_f$ at a single point $z_0$ (i.e., $|W_f(z_0)|$) for $n$-variable Boolean functions with probability at least $8/\pi^2$ using $O\left(\frac{1}{|W_f(z_0)|\,\varepsilon}\right)$ queries, promised that the accuracy is $\varepsilon$, while the best known classical algorithm requires $O(2^n)$ queries. The Hamming distance between Boolean functions is used to study the linearity testing and other important problems. We take advantage of Walsh transform to calculate the Hamming distance between two $n$-variable Boolean functions $f$ and $g$ using $O(1)$ queries in some cases. Then, we exploit another quantum algorithm which converts computing Hamming distance between two Boolean functions to quantum amplitude estimation (i.e., approximate counting). If $\mathrm{Ham}(f,g)=t\neq 0$, we can approximately compute $\mathrm{Ham}(f,g)$ with probability at least $2/3$ by combining our algorithm and the Approx-Count($f$, $\varepsilon$) algorithm using the expected number of $\Theta\left(\sqrt{\frac{N}{\lfloor \varepsilon t\rfloor +1}}+\frac{\sqrt{t(N-t)}}{\lfloor \varepsilon t\rfloor +1}\right)$ queries, promised that the accuracy is $\varepsilon$. Moreover, our algorithm is optimal, while the exact query complexity for the above problem is $\Theta(N)$ and the query complexity with the accuracy $\varepsilon$ is $O\left(\frac{1}{\varepsilon^{2}}\cdot\frac{N}{t+1}\right)$ in classical algorithm, where $N=2^n$. Finally, we present three exact quantum query algorithms for two promise problems on Hamming distance using $O(1)$ queries, while any classical deterministic algorithm solving the problem uses $\Omega(2^n)$ queries.
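For orientation, the classical baseline that these quantum algorithms are compared against can be sketched by brute force. The sketch assumes the unnormalised convention W_f(z) = sum over x of (-1)^(f(x) XOR z·x); the paper may use a normalised variant. Under that convention the Hamming distance follows from the transform of f XOR g at z = 0. Function names are hypothetical, and the loop makes all 2^n queries.

```python
def parity(v):
    """Parity of the set bits of v, i.e. the GF(2) inner product once v = z & x."""
    return bin(v).count("1") % 2


def walsh(f, z, n):
    """Unnormalised Walsh transform of f: {0..2^n - 1} -> {0, 1} at point z."""
    return sum((-1) ** (f(x) ^ parity(z & x)) for x in range(2 ** n))


def hamming_via_walsh(f, g, n):
    """Ham(f, g) = (2^n - W_{f xor g}(0)) / 2 under the convention above."""
    w0 = walsh(lambda x: f(x) ^ g(x), 0, n)
    return (2 ** n - w0) // 2


def f_and(x):
    """Two-variable AND of bit 0 and bit 1."""
    return (x & 1) & ((x >> 1) & 1)


def f_xor(x):
    """Two-variable XOR of bit 0 and bit 1."""
    return (x & 1) ^ ((x >> 1) & 1)
```

Here `hamming_via_walsh(f_and, f_xor, 2)` counts the three inputs on which AND and XOR disagree, at the cost of evaluating both functions on every input.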
An Active RBSE Framework to Generate Optimal Stimulus Sequences in a BCI for Spelling

NASA Astrophysics Data System (ADS)

Moghadamfalahi, Mohammad; Akcakaya, Murat; Nezamfar, Hooman; Sourati, Jamshid; Erdogmus, Deniz

2017-10-01

A class of brain computer interfaces (BCIs) employs noninvasive recordings of electroencephalography (EEG) signals to enable users with severe speech and motor impairments to interact with their environment and social network. For example, EEG based BCIs for typing popularly utilize event related potentials (ERPs) for inference. Presentation paradigms in current ERP-based letter-by-letter typing BCIs typically query the user with an arbitrary subset of characters. However, the typing accuracy and also typing speed can potentially be enhanced with more informed subset selection and flash assignment. In this manuscript, we introduce the active recursive Bayesian state estimation (active-RBSE) framework for inference and sequence optimization. Prior to presentation in each iteration, rather than showing a subset of randomly selected characters, the developed framework optimally selects a subset based on a query function. Selected queries are made adaptively specialized for users during each intent detection. Through a simulation-based study, we assess the effect of active-RBSE on the performance of a language-model assisted typing BCI in terms of typing speed and accuracy. To provide a baseline for comparison, we also utilize standard presentation paradigms, namely the row and column matrix presentation paradigm and random rapid serial visual presentation paradigms. The results show that utilization of active-RBSE can enhance the online performance of the system, both in terms of typing accuracy and speed.
Optimal Weight Assignment for a Chinese Signature File.

ERIC Educational Resources Information Center

Liang, Tyne; And Others

1996-01-01

Investigates the performance of a character-based Chinese text retrieval scheme in which monogram keys and bigram keys are encoded into document signatures. Tests and verifies the theoretical predictions of the optimal weight assignments and the minimal false hit rate in experiments using a real Chinese corpus for disyllabic queries of different…
Semiautomated head-and-neck IMRT planning using dose warping and scaling to robustly adapt plans in a knowledge database containing potentially suboptimal plans

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schmidt, Matthew, E-mail: matthew.schmidt@varian.com; Grzetic, Shelby; Lo, Joseph Y.

Purpose: Prior work by the authors and other groups has studied the creation of automated intensity modulated radiotherapy (IMRT) plans of equivalent quality to those in a patient database of manually created clinical plans; those database plans provided guidance on the achievable sparing to organs-at-risk (OARs). However, in certain sites, such as head-and-neck, the clinical plans may not be sufficiently optimized because of anatomical complexity and clinical time constraints. This could lead to automated plans that suboptimally exploit OAR sparing. This work investigates a novel dose warping and scaling scheme that attempts to reduce effects of suboptimal sparing in clinical database plans, thus improving the quality of semiautomated head-and-neck cancer (HNC) plans. Methods: Knowledge-based radiotherapy (KBRT) plans for each of ten “query” patients were semiautomatically generated by identifying the most similar “match” patient in a database of 103 clinical manually created patient plans. The match patient’s plans were adapted to the query case by: (1) deforming the match beam fluences to suit the query target volume and (2) warping the match primary/boost dose distribution to suit the query geometry and using the warped distribution to generate query primary/boost optimization dose-volume constraints. Item (2) included a distance scaling factor to improve query OAR dose sparing with respect to the possibly suboptimal clinical match plan. To further compensate for a component plan of the match case (primary/boost) not optimally sparing OARs, the query dose volume constraints were reduced using a dose scaling factor to be the minimum from either (a) the warped component plan (primary or boost) dose distribution or (b) the warped total plan dose distribution (primary + boost) scaled in proportion to the ratio of component prescription dose to total prescription dose.
The dose-volume constraints were used to plan the query case with no human intervention to adjust constraints during plan optimization. Results: KBRT and original clinical plans were dosimetrically equivalent for parotid glands (mean/median doses), spinal cord, and brainstem (maximum doses). KBRT plans significantly reduced larynx median doses (21.5 ± 6.6 Gy to 17.9 ± 3.9 Gy), and oral cavity mean (32.3 ± 6.2 Gy to 28.9 ± 5.4 Gy) and median (28.7 ± 5.7 Gy to 23.2 ± 5.3 Gy) doses. Doses to ipsilateral parotid gland, larynx, oral cavity, and brainstem were lower or equivalent in the KBRT plans for the majority of cases. By contrast, KBRT plans generated without the dose warping and dose scaling steps were not significantly different from the clinical plans. Conclusions: Fast, semiautomatically generated HNC IMRT plans adapted from existing plans in a clinical database can be of equivalent or better quality than manually created plans. The reductions in OAR doses in the semiautomated plans, compared to the clinical plans, indicate that the proposed dose warping and scaling method shows promise in mitigating the impact of suboptimal clinical plans.
A threshold selection method based on edge preserving

NASA Astrophysics Data System (ADS)

Lou, Liantang; Dan, Wei; Chen, Jiaqi

2015-12-01

A method of automatic threshold selection for image segmentation is presented. An optimal threshold is selected in order to preserve the edges of the image perfectly during segmentation. The shortcomings of Otsu's method, which is based on gray-level histograms, are analyzed. The edge energy function of a bivariate continuous function is expressed as a line integral, while the edge energy function of an image is approximated by discretizing the integral. An optimal threshold method that maximizes the edge energy function is given. Several experimental results are also presented for comparison with Otsu's method.
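The histogram-based baseline that this abstract compares against, Otsu's method, can be sketched in a few lines. The sketch shows only the classic between-class variance criterion on a gray-level histogram, not the authors' edge energy method, whose details are not given here. The function name is hypothetical; the returned threshold t places levels 0..t in the first class.

```python
def otsu_threshold(hist):
    """Return the gray level t maximising between-class variance (class 0 = levels <= t)."""
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_score = 0, -1.0
    w0 = 0      # number of pixels in class 0 so far
    sum0 = 0    # sum of gray levels in class 0 so far
    for t, count in enumerate(hist):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance up to the constant factor 1 / total^2.
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t
```

Because the criterion depends only on the histogram, two images with identical gray-level counts but different edge layouts receive the same threshold, which is the kind of limitation an edge-based criterion aims to address.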
muBLASTP: database-indexed protein sequence search on multicore CPUs.

PubMed

Zhang, Jing; Misra, Sanchit; Wang, Hao; Feng, Wu-Chun

2016-11-04

The Basic Local Alignment Search Tool (BLAST) is a fundamental program in the life sciences that searches databases for sequences that are most similar to a query sequence. Currently, the BLAST algorithm utilizes a query-indexed approach. Although many approaches suggest that sequence search with a database index can achieve much higher throughput (e.g., BLAT, SSAHA, and CAFE), they cannot deliver the same level of sensitivity as the query-indexed BLAST, i.e., NCBI BLAST, or they can only support nucleotide sequence search, e.g., MegaBLAST. Because query indexing and database indexing have different challenges and characteristics, the existing techniques for query-indexed search cannot be applied to database-indexed search. muBLASTP, a novel database-indexed BLAST for protein sequence search, delivers hits identical to those returned by NCBI BLAST. On Intel Haswell multicore CPUs, for a single query, the single-threaded muBLASTP achieves up to a 4.41-fold speedup for alignment stages, and up to a 1.75-fold end-to-end speedup over single-threaded NCBI BLAST. For a batch of queries, the multithreaded muBLASTP achieves up to a 5.7-fold speedup for alignment stages, and up to a 4.56-fold end-to-end speedup over multithreaded NCBI BLAST. With a newly designed index structure for the protein database and associated optimizations in the BLASTP algorithm, we re-factored the BLASTP algorithm for modern multicore processors to achieve much higher throughput with an acceptable memory footprint for the database index.
Dose to mass for evaluation and optimization of lung cancer radiation therapy.

PubMed

Tyler Watkins, William; Moore, Joseph A; Hugo, Geoffrey D; Siebers, Jeffrey V

2017-11-01

To evaluate potential organ at risk dose-sparing by using dose-mass-histogram (DMH) objective functions compared with dose-volume-histogram (DVH) objective functions. Treatment plans were retrospectively optimized for 10 locally advanced non-small cell lung cancer patients based on DVH and DMH objectives. DMH-objectives were the same as DVH objectives, but with mass replacing volume. Plans were normalized to dose to 95% of the PTV volume (PTV-D95v) or mass (PTV-D95m). For a given optimized dose, DVH and DMH were intercompared to ascertain dose-to-volume vs. dose-to-mass differences. Additionally, the optimized doses were intercompared using DVH and DMH metrics to ascertain differences in optimized plans. Mean dose to volume, $\overline{D}_v$, mean dose to mass, $\overline{D}_M$, and fluence maps were intercompared. For a given dose distribution, DVH and DMH differ by >5% in heterogeneous structures. In homogeneous structures including heart and spinal cord, DVH and DMH are nearly equivalent. At fixed PTV-D95v, DMH-optimization did not significantly reduce dose to OARs but reduced PTV-$\overline{D}_v$ by 0.20±0.2Gy (p=0.02) and PTV-$\overline{D}_M$ by 0.23±0.3Gy (p=0.02). Plans normalized to PTV-D95m also result in minor PTV dose reductions and esophageal dose sparing ($\overline{D}_v$ reduced 0.45±0.5Gy, p=0.02 and $\overline{D}_M$ reduced 0.44±0.5Gy, p=0.02) compared to DVH-optimized plans. Optimized fluence map comparisons indicate that DMH optimization reduces dose in the periphery of lung PTVs. DVH- and DMH-dose indices differ by >5% in lung and lung target volumes for fixed dose distributions, but optimizing DMH did not reduce dose to OARs. The primary difference observed between DVH- and DMH-optimized plans was variation in fluence to the periphery of lung target PTVs, where low density lung surrounds tumor. Copyright © 2017 Elsevier B.V. All rights reserved.
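The difference between dose-to-volume and dose-to-mass indices can be seen in a minimal sketch, assuming per-voxel dose, volume and density arrays. The only change between the DVH and the DMH is the weight given to each voxel (volume versus density times volume). Function names and the toy numbers are hypothetical.

```python
def cumulative_fraction(doses, weights, x):
    """Weighted fraction of the structure receiving at least dose x."""
    total = sum(weights)
    return sum(w for d, w in zip(doses, weights) if d >= x) / total


def weighted_mean_dose(doses, weights):
    """Weighted mean dose: volume weights give D_v, mass weights give D_M."""
    return sum(d * w for d, w in zip(doses, weights)) / sum(weights)


# Toy structure: one low-density (lung-like) voxel and one unit-density voxel.
doses = [10.0, 20.0]        # Gy
volumes = [1.0, 1.0]        # equal voxel volumes
densities = [0.25, 1.0]     # g/cm^3, illustrative values only
masses = [rho * v for rho, v in zip(densities, volumes)]

dvh_at_15 = cumulative_fraction(doses, volumes, 15.0)
dmh_at_15 = cumulative_fraction(doses, masses, 15.0)
mean_dose_volume = weighted_mean_dose(doses, volumes)
mean_dose_mass = weighted_mean_dose(doses, masses)
```

In this heterogeneous toy case the two histograms diverge sharply, whereas with uniform density the mass weights are proportional to the volume weights and the two indices coincide, matching the abstract's observation for heart and spinal cord.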
Heterogeneous database integration in biomedicine.

PubMed

Sujansky, W

2001-08-01

The rapid expansion of biomedical knowledge, reduction in computing costs, and spread of internet access have created an ocean of electronic data. The decentralized nature of our scientific community and healthcare system, however, has resulted in a patchwork of diverse, or heterogeneous, database implementations, making access to and aggregation of data across databases very difficult. The database heterogeneity problem applies equally to clinical data describing individual patients and biological data characterizing our genome. Specifically, databases are highly heterogeneous with respect to the data models they employ, the data schemas they specify, the query languages they support, and the terminologies they recognize. Heterogeneous database systems attempt to unify disparate databases by providing uniform conceptual schemas that resolve representational heterogeneities, and by providing querying capabilities that aggregate and integrate distributed data. Research in this area has applied a variety of database and knowledge-based techniques, including semantic data modeling, ontology definition, query translation, query optimization, and terminology mapping. Existing systems have addressed heterogeneous database integration in the realms of molecular biology, hospital information systems, and application portability.
EmptyHeaded: A Relational Engine for Graph Processing

PubMed Central

Aberger, Christopher R.; Tu, Susan; Olukotun, Kunle; Ré, Christopher

2016-01-01

There are two types of high-performance graph processing engines: low- and high-level engines. Low-level engines (Galois, PowerGraph, Snap) provide optimized data structures and computation models but require users to write low-level imperative code, hence ensuring that efficiency is the burden of the user. In high-level engines, users write in query languages like datalog (SociaLite) or SQL (Grail). High-level engines are easier to use but are orders of magnitude slower than the low-level graph engines. We present EmptyHeaded, a high-level engine that supports a rich datalog-like query language and achieves performance comparable to that of low-level engines. At the core of EmptyHeaded’s design is a new class of join algorithms that satisfy strong theoretical guarantees but have thus far not achieved performance comparable to that of specialized graph processing engines. To achieve high performance, EmptyHeaded introduces a new join engine architecture, including a novel query optimizer and data layouts that leverage single-instruction multiple data (SIMD) parallelism. With this architecture, EmptyHeaded outperforms high-level approaches by up to three orders of magnitude on graph pattern queries, PageRank, and Single-Source Shortest Paths (SSSP) and is an order of magnitude faster than many low-level baselines. We validate that EmptyHeaded competes with the best-of-breed low-level engine (Galois), achieving comparable performance on PageRank and at most 3× worse performance on SSSP. PMID:28077912
IN VITRO QUANTIFICATION OF THE SIZE DISTRIBUTION OF INTRASACCULAR VOIDS LEFT AFTER ENDOVASCULAR COILING OF CEREBRAL ANEURYSMS.

PubMed

Sadasivan, Chander; Brownstein, Jeremy; Patel, Bhumika; Dholakia, Ronak; Santore, Joseph; Al-Mufti, Fawaz; Puig, Enrique; Rakian, Audrey; Fernandez-Prada, Kenneth D; Elhammady, Mohamed S; Farhat, Hamad; Fiorella, David J; Woo, Henry H; Aziz-Sultan, Mohammad A; Lieber, Baruch B

2013-03-01

Endovascular coiling of cerebral aneurysms remains limited by coil compaction and associated recanalization. Recent coil designs which effect higher packing densities may be far from optimal because hemodynamic forces causing compaction are not well understood since detailed data regarding the location and distribution of coil masses are unavailable. We present an in vitro methodology to characterize coil masses deployed within aneurysms by quantifying intra-aneurysmal void spaces. Eight identical aneurysms were packed with coils by both balloon- and stent-assist techniques. The samples were embedded, sequentially sectioned and imaged. Empty spaces between the coils were numerically filled with circles (2D) in the planar images and with spheres (3D) in the three-dimensional composite images. The 2D and 3D void size histograms were analyzed for local variations and by fitting theoretical probability distribution functions. Balloon-assist packing densities (31±2%) were lower (p = 0.04) than the stent-assist group (40±7%). The maximum and average 2D and 3D void sizes were higher (p = 0.03 to 0.05) in the balloon-assist group as compared to the stent-assist group. None of the void size histograms were normally distributed; theoretical probability distribution fits suggest that the histograms are most probably exponentially distributed with decay constants of 6-10 mm. Significant (p ≤ 0.001 to p = 0.03) spatial trends were noted with the void sizes but correlation coefficients were generally low (absolute r ≤ 0.35). The methodology we present can provide valuable input data for numerical calculations of hemodynamic forces impinging on intra-aneurysmal coil masses and be used to compare and optimize coil configurations as well as coiling techniques.
Reconstructing the Dwarf Galaxy Progenitor from Tidal Streams Using MilkyWay@home

NASA Astrophysics Data System (ADS)

Newberg, Heidi; Shelton, Siddhartha

2018-04-01

We attempt to reconstruct the mass and radial profile of stars and dark matter in the dwarf galaxy progenitor of the Orphan Stream, using only information from the stars in the Orphan Stream. We show that given perfect data and perfect knowledge of the dwarf galaxy profile and Milky Way potential, we are able to reconstruct the mass and radial profiles of both the stars and dark matter in the progenitor to high accuracy using only the density of stars along the stream and either the velocity dispersion or width of the stream in the sky. To perform this test, we simulated the tidal disruption of a two component (stars and dark matter) dwarf galaxy along the orbit of the Orphan Stream. We then created a histogram of the density of stars along the stream and a histogram of either the velocity dispersion or width of the stream in the sky as a function of position along the stream. The volunteer supercomputer MilkyWay@home was given these two histograms, the Milky Way potential model, and the orbital parameters for the progenitor. N-body simulations were run, varying dwarf galaxy parameters and the time of disruption. The goodness-of-fit of the model to the data was determined using an Earth-Mover Distance algorithm. The parameters were optimized using Differential Evolution. Future work will explore whether currently available information on the Orphan Stream stars is sufficient to constrain its progenitor, and how sensitive the optimization is to our knowledge of the Milky Way potential and the density model of the dwarf galaxy progenitor, as well as a host of other real-life unknowns.
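The abstract scores each simulated stream histogram against the data histogram with an Earth-Mover Distance. The sketch below shows one common way to compute that distance for two 1D histograms that share the same bins; it is an illustration only, the function name `emd_1d` is hypothetical, and the MilkyWay@home implementation may differ (for example in normalization or in how empty bins are handled).

```python
from itertools import accumulate


def emd_1d(hist_a, hist_b):
    """Earth-Mover Distance between two 1D histograms with identical bins.

    Both histograms are normalized to unit mass, so the result is the
    minimum "work" needed to turn one into the other, in units of bin widths.
    For 1D data this equals the summed absolute difference of the two CDFs.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    total_a, total_b = sum(hist_a), sum(hist_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("histograms must have positive total mass")
    cdf_a = accumulate(h / total_a for h in hist_a)
    cdf_b = accumulate(h / total_b for h in hist_b)
    return sum(abs(a - b) for a, b in zip(cdf_a, cdf_b))


# All mass moved from the first bin to the third bin: two bin widths of work.
print(emd_1d([1, 0, 0], [0, 0, 1]))
```

Unlike a bin-by-bin comparison, this distance grows with how far mass has to move along the stream, which is why it suits an optimizer that needs a smooth goodness-of-fit signal.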
Database technology and the management of multimedia data in the Mirror project

NASA Astrophysics Data System (ADS)

de Vries, Arjen P.; Blanken, H. M.

1998-10-01

Multimedia digital libraries require an open distributed architecture instead of a monolithic database system. In the Mirror project, we use the Monet extensible database kernel to manage different representations of multimedia objects. To maintain independence between content, meta-data, and the creation of meta-data, we allow distribution of data and operations using CORBA. This open architecture introduces new problems for data access. From an end user's perspective, the problem is how to search the available representations to fulfill an actual information need; the conceptual gap between human perceptual processes and the meta-data is too large. From a system's perspective, several representations of the data may semantically overlap or be irrelevant. We address these problems with an iterative query process and active user participation through relevance feedback. A retrieval model based on inference networks assists the user with query formulation. The integration of this model into the database design has two advantages. First, the user can query both the logical and the content structure of multimedia objects. Second, the use of different data models in the logical and the physical database design provides data independence and allows algebraic query optimization. We illustrate query processing with a music retrieval application.
Accelerating the weighted histogram analysis method by direct inversion in the iterative subspace.

PubMed

Zhang, Cheng; Lai, Chun-Liang; Pettitt, B Montgomery

The weighted histogram analysis method (WHAM) for free energy calculations is a valuable tool to produce free energy differences with minimal errors. Given multiple simulations, WHAM obtains from the distribution overlaps the optimal statistical estimator of the density of states, from which the free energy differences can be computed. The WHAM equations are often solved by an iterative procedure. In this work, we use a well-known linear algebra algorithm which allows for more rapid convergence to the solution. We find that the computational complexity of the iterative solution to WHAM and the closely related multiple Bennett acceptance ratio (MBAR) method can be improved by using the method of direct inversion in the iterative subspace. We give examples from a lattice model, a simple liquid and an aqueous protein solution.
Choosing the best image processing method for masticatory performance assessment when using two-coloured specimens.

PubMed

Vaccaro, G; Pelaez, J I; Gil, J A

2016-07-01

Objective masticatory performance assessment using two-coloured specimens relies on image processing techniques; however, just a few approaches have been tested and no comparative studies are reported. The aim of this study was to present a selection procedure of the optimal image analysis method for masticatory performance assessment with a given two-coloured chewing gum. Dentate participants (n = 250; 25 ± 6·3 years) chewed red-white chewing gums for 3, 6, 9, 12, 15, 18, 21 and 25 cycles (2000 samples). Digitalised images of retrieved specimens were analysed using 122 image processing methods (IPMs) based on feature extraction algorithms (pixel values and histogram analysis). All IPMs were tested following the criteria of: normality of measurements (Kolmogorov-Smirnov), ability to detect differences among mixing states (ANOVA corrected with post hoc Bonferroni) and moderate-to-high correlation with the number of cycles (Spearman's Rho). The optimal IPM was chosen using multiple criteria decision analysis (MCDA). Measurements provided by all IPMs proved to be normally distributed (P < 0·05), 116 proved sensitive to mixing states (P < 0·05), and 35 showed moderate-to-high correlation with the number of cycles (|ρ| > 0·5; P < 0·05). The variance of the histogram of the Hue showed the highest correlation with the number of cycles (ρ = 0·792; P < 0·0001) and the highest MCDA score (optimal). The proposed procedure proved to be reliable and able to select the optimal approach among multiple IPMs. This experiment may be reproduced to identify the optimal approach for each case of locally available test foods. © 2016 John Wiley & Sons Ltd.
A fully automatic end-to-end method for content-based image retrieval of CT scans with similar liver lesion annotations.

PubMed

Spanier, A B; Caplan, N; Sosna, J; Acar, B; Joskowicz, L

2018-01-01

The goal of medical content-based image retrieval (M-CBIR) is to assist radiologists in the decision-making process by retrieving medical cases similar to a given image. One of the key interests of radiologists is lesions and their annotations, since the patient treatment depends on the lesion diagnosis. Therefore, a key feature of M-CBIR systems is the retrieval of scans with the most similar lesion annotations. To be of value, M-CBIR systems should be fully automatic to handle large case databases. We present a fully automatic end-to-end method for the retrieval of CT scans with similar liver lesion annotations. The input is a database of abdominal CT scans labeled with liver lesions, a query CT scan, and optionally one radiologist-specified lesion annotation of interest. The output is an ordered list of the database CT scans with the most similar liver lesion annotations. The method starts by automatically segmenting the liver in the scan. It then extracts a histogram-based feature vector from the segmented region, learns the features' relative importance, and ranks the database scans according to the relative importance measure. The main advantages of our method are that it fully automates the end-to-end querying process, that it uses simple and efficient techniques that are scalable to large datasets, and that it produces quality retrieval results using an unannotated CT scan. Our experimental results on 9 CT queries on a dataset of 41 volumetric CT scans from the 2014 Image CLEF Liver Annotation Task yield an average retrieval accuracy (Normalized Discounted Cumulative Gain index) of 0.77 and 0.84 without/with annotation, respectively. Fully automatic end-to-end retrieval of similar cases based on image information alone, rather than on disease diagnosis, may help radiologists to better diagnose liver lesions.
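The retrieval accuracy above is reported as a Normalized Discounted Cumulative Gain (NDCG) index. As a reminder of what that number measures, here is a minimal sketch of one standard NDCG definition (linear gain, log2 rank discount). The abstract does not say which NDCG variant or relevance scale the authors used, so both are assumptions here, and the function names are illustrative.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of relevance scores listed in ranked order."""
    # rank 0 is the top result; its discount is log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering, in [0, 1]."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


# A ranking that places the most relevant scan second scores below 1.
print(ndcg([1, 3, 2]))
```

A value of 1.0 means the returned list is already in the best possible order; lower values penalize relevant scans that appear further down the list.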
Element distinctness revisited

NASA Astrophysics Data System (ADS)

Portugal, Renato

2018-07-01

The element distinctness problem is the problem of determining whether the elements of a list are distinct, that is, if x=(x_1,\ldots,x_N) is a list with N elements, we ask whether the elements of x are distinct or not. The solution in a classical computer requires N queries because it uses sorting to check whether there are equal elements. In the quantum case, it is possible to solve the problem in O(N^{2/3}) queries. There is an extension which asks whether there are k colliding elements, known as element k-distinctness problem. This work obtains optimal values of two critical parameters of Ambainis' seminal quantum algorithm (SIAM J Comput 37(1):210-239, 2007). The first critical parameter is the number of repetitions of the algorithm's main block, which inverts the phase of the marked elements and calls a subroutine. The second parameter is the number of quantum walk steps interlaced by oracle queries. We show that, when the optimal values of the parameters are used, the algorithm's success probability is 1-O(N^{1/(k+1)}), quickly approaching 1. The specification of the exact running time and success probability is important in practical applications of this algorithm.
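For orientation, the classical versions of both problems are easy to state in code. The sketch below is only a classical baseline that reads every element once; it uses hashing rather than the sorting mentioned in the abstract, and it is not related to the quantum-walk algorithm the paper analyzes. The function names are illustrative.

```python
from collections import Counter


def all_distinct(items):
    """Classical element distinctness: True if no value occurs twice."""
    seen = set()
    for item in items:
        if item in seen:
            return False  # found a collision, stop early
        seen.add(item)
    return True


def has_k_collision(items, k):
    """Element k-distinctness: True if some value occurs at least k times."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return any(count >= k for count in Counter(items).values())


print(all_distinct([4, 8, 15, 16]), has_k_collision([1, 2, 1, 3, 1], 3))
```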
Content based Image Retrieval based on Different Global and Local Color Histogram Methods: A Survey

NASA Astrophysics Data System (ADS)

Suhasini, Pallikonda Sarah; Sri Rama Krishna, K.; Murali Krishna, I. V.

2017-02-01

This paper investigates different global and local color histogram methods for content-based image retrieval (CBIR). The color histogram is a widely used descriptor for CBIR. The conventional method of extracting a color histogram is global: it misses spatial content, is less invariant to deformation and viewpoint changes, and results in a very large three-dimensional histogram corresponding to the color space used. To address these deficiencies, recent research has proposed different global and local histogram methods. Recent papers describe ways of extracting local histograms to obtain spatial correspondence, invariant color histograms to add deformation and viewpoint invariance, and a fuzzy linking method to reduce the size of the histogram. The color space and the distance metric used are vital in obtaining the color histogram. This paper surveys the performance of CBIR based on different global and local color histograms in three color spaces (RGB, HSV, and L*a*b*) and with three distance measures (Euclidean, quadratic, and histogram intersection), in order to choose an appropriate method for future research.
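Two of the three distance measures named in the survey can be written in a few lines. The sketch below assumes flattened color histograms with the same bin layout and normalizes them to unit mass first; the function names are illustrative and not taken from the paper. The quadratic-form distance is omitted because it needs a bin-similarity matrix that the abstract does not specify.

```python
import math


def normalize(hist):
    """Scale a histogram so that its bins sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive total mass")
    return [h / total for h in hist]


def euclidean_distance(h1, h2):
    """Bin-by-bin Euclidean distance; 0 means identical histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))


def histogram_intersection(h1, h2):
    """Overlap of two normalized histograms; 1 means identical, 0 disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))


query = normalize([8, 2, 0, 0])
candidate = normalize([4, 4, 2, 0])
print(euclidean_distance(query, candidate), histogram_intersection(query, candidate))
```

Note the opposite orientation of the two measures: retrieval ranks by ascending Euclidean distance but by descending intersection.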
Value Driven Information Processing and Fusion

DTIC Science & Technology

2016-03-01

consensus approach allows a decentralized approach to achieve the optimal error exponent of the centralized counterpart, a conclusion that is signifi... The objective of the project is to develop a general framework for value driven decentralized information processing...including: optimal data reduction in a network setting for decentralized inference with quantization constraint; interactive fusion that allows queries and

PIRIA: a general tool for indexing, search, and retrieval of multimedia content

NASA Astrophysics Data System (ADS)

Joint, Magali; Moellic, Pierre-Alain; Hede, P.; Adam, P.

2004-05-01

The Internet is a continuously expanding source of multimedia content and information. There are many products in development to search, retrieve, and understand multimedia content. But most of the current image search/retrieval engines rely on an image database manually pre-indexed with keywords. Computers are still powerless to understand the semantic meaning of still or animated image content. Piria (Program for the Indexing and Research of Images by Affinity), the search engine we have developed, brings this possibility closer to reality. Piria is a novel search engine that uses the query by example method. A user query is submitted to the system, which then returns a list of images ranked by similarity, obtained by a metric distance that operates on every indexed image signature. These indexed images are compared according to several different classifiers, not only Keywords, but also Form, Color and Texture, taking into account geometric transformations and variance like rotation, symmetry, mirroring, etc. Form: edges extracted by an efficient segmentation algorithm. Color: histogram, semantic color segmentation and spatial color relationship. Texture: texture wavelets and local edge patterns. If required, Piria is also able to fuse results from multiple classifiers with a new classification of index categories: Single Indexer Single Call (SISC), Single Indexer Multiple Call (SIMC), Multiple Indexers Single Call (MISC) or Multiple Indexers Multiple Call (MIMC). Commercial and industrial applications will be explored and discussed as well as current and future development.
LCC: Light Curves Classifier

NASA Astrophysics Data System (ADS)

Vo, Martin

2017-08-01

Light Curves Classifier uses data mining and machine learning to obtain and classify desired objects. This task can be accomplished by attributes of light curves or any time series, including shapes, histograms, or variograms, or by other available information about the inspected objects, such as color indices, temperatures, and abundances. After specifying features which describe the objects to be searched, the software trains on a given training sample, and can then be used for unsupervised clustering for visualizing the natural separation of the sample. The package can also be used for automatically tuning the parameters of the methods used (for example, number of hidden neurons or binning ratio). Trained classifiers can be used for filtering outputs from astronomical databases or data stored locally. The Light Curve Classifier can also be used for simple downloading of light curves and all available information of queried stars. It natively can connect to OgleII, OgleIII, ASAS, CoRoT, Kepler, Catalina and MACHO, and new connectors or descriptors can be implemented. In addition to direct usage of the package and command line UI, the program can be used through a web interface. Users can create jobs for "training" methods on given objects, querying databases and filtering outputs by trained filters. Preimplemented descriptors, classifiers and connectors can be picked by simple clicks and their parameters can be tuned by giving ranges of these values. All combinations are then calculated and the best one is used for creating the filter. Natural separation of the data can be visualized by unsupervised clustering.
Common Data Model for Neuroscience Data and Data Model Exchange

PubMed Central

Gardner, Daniel; Knuth, Kevin H.; Abato, Michael; Erde, Steven M.; White, Thomas; DeBellis, Robert; Gardner, Esther P.

2001-01-01

Objective: Generalizing the data models underlying two prototype neurophysiology databases, the authors describe and propose the Common Data Model (CDM) as a framework for federating a broad spectrum of disparate neuroscience information resources. Design: Each component of the CDM derives from one of five superclasses—data, site, method, model, and reference—or from relations defined between them. A hierarchic attribute-value scheme for metadata enables interoperability with variable tree depth to serve specific intra- or broad inter-domain queries. To mediate data exchange between disparate systems, the authors propose a set of XML-derived schema for describing not only data sets but data models. These include biophysical description markup language (BDML), which mediates interoperability between data resources by providing a meta-description for the CDM. Results: The set of superclasses potentially spans data needs of contemporary neuroscience. Data elements abstracted from neurophysiology time series and histogram data represent data sets that differ in dimension and concordance. Site elements transcend neurons to describe subcellular compartments, circuits, regions, or slices; non-neuroanatomic sites include sequences to patients. Methods and models are highly domain-dependent. Conclusions: True federation of data resources requires explicit public description, in a metalanguage, of the contents, query methods, data formats, and data models of each data resource. Any data model that can be derived from the defined superclasses is potentially conformant and interoperability can be enabled by recognition of BDML-described compatibilities. Such metadescriptions can buffer technologic changes. PMID:11141510
Exploratory Study of 4D Versus 3D Robust Optimization in Intensity-Modulated Proton Therapy for Lung Cancer

PubMed Central

Liu, Wei; Schild, Steven E.; Chang, Joe Y.; Liao, Zhongxing; Chang, Yu-Hui; Wen, Zhifei; Shen, Jiajian; Stoker, Joshua B.; Ding, Xiaoning; Hu, Yanle; Sahoo, Narayan; Herman, Michael G.; Vargas, Carlos; Keole, Sameer; Wong, William; Bues, Martin

2015-01-01

Background: To compare the impact of uncertainties and interplay effect on 3D and 4D robustly optimized intensity-modulated proton therapy (IMPT) plans for lung cancer in an exploratory methodology study. Methods: IMPT plans were created for 11 non-randomly selected non-small-cell lung cancer (NSCLC) cases: 3D robustly optimized plans on average CTs with internal gross tumor volume density overridden to irradiate internal target volume, and 4D robustly optimized plans on 4D CTs to irradiate clinical target volume (CTV). Regular fractionation (66 Gy[RBE] in 33 fractions) was considered. In 4D optimization, the CTV of individual phases received non-uniform doses to achieve a uniform cumulative dose. The root-mean-square-dose volume histograms (RVH) measured the sensitivity of the dose to uncertainties, and the areas under the RVH curve (AUCs) were used to evaluate plan robustness. Dose evaluation software modeled time-dependent spot delivery to incorporate interplay effect with randomized starting phases of each field per fraction. Dose-volume histogram indices comparing CTV coverage, homogeneity, and normal tissue sparing were evaluated using Wilcoxon signed-rank test. Results: 4D robust optimization plans led to smaller AUC for CTV [14.26 vs. 18.61 (p=0.001)], better CTV coverage (Gy[RBE]) [D95% CTV: 60.6 vs 55.2 (p=0.001)], and better CTV homogeneity [D5%–D95% CTV: 10.3 vs 17.7 (p=0.002)] in the face of uncertainties. With interplay effect considered, 4D robust optimization produced plans with better target coverage [D95% CTV: 64.5 vs 63.8 (p=0.0068)], comparable target homogeneity, and comparable normal tissue protection. The benefits from 4D robust optimization were most obvious for the 2 typical stage III lung cancer patients. Conclusions: Our exploratory methodology study showed that, compared to 3D robust optimization, 4D robust optimization produced significantly more robust and interplay-effect-resistant plans for targets with comparable dose distributions for normal tissues. A further study with a larger and more realistic patient population is warranted to generalize the conclusions. PMID:26725727
The Analysis of RDF Semantic Data Storage Optimization in Large Data Era

NASA Astrophysics Data System (ADS)

He, Dandan; Wang, Lijuan; Wang, Can

2018-03-01

With the continuous development of information technology and network technology in China, the Internet has entered the era of large data. To acquire information effectively in this era, it is necessary to optimize existing RDF semantic data storage and to support efficient querying of the data. This paper discusses the storage optimization of RDF semantic data under large data.
SHARE: system design and case studies for statistical health information release

PubMed Central

Gardner, James; Xiong, Li; Xiao, Yonghui; Gao, Jingjing; Post, Andrew R; Jiang, Xiaoqian; Ohno-Machado, Lucila

2013-01-01

Objectives: We present SHARE, a new system for statistical health information release with differential privacy. We present two case studies that evaluate the software on real medical datasets and demonstrate the feasibility and utility of applying the differential privacy framework on biomedical data. Materials and Methods: SHARE releases statistical information in electronic health records with differential privacy, a strong privacy framework for statistical data release. It includes a number of state-of-the-art methods for releasing multidimensional histograms and longitudinal patterns. We performed a variety of experiments on two real datasets, the surveillance, epidemiology and end results (SEER) breast cancer dataset and the Emory electronic medical record (EeMR) dataset, to demonstrate the feasibility and utility of SHARE. Results: Experimental results indicate that SHARE can deal with heterogeneous data present in medical data, and that the released statistics are useful. The Kullback–Leibler divergence between the released multidimensional histograms and the original data distribution is below 0.5 and 0.01 for seven-dimensional and three-dimensional data cubes generated from the SEER dataset, respectively. The relative error for longitudinal pattern queries on the EeMR dataset varies between 0 and 0.3. While the results are promising, they also suggest that challenges remain in applying statistical data release using the differential privacy framework for higher dimensional data. Conclusions: SHARE is one of the first systems to provide a mechanism for custodians to release differentially private aggregate statistics for a variety of use cases in the medical domain. This proof-of-concept system is intended to be applied to large-scale medical data warehouses. PMID:23059729
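To make the two quantities in this abstract concrete, the sketch below releases a histogram with the basic Laplace mechanism and then measures the Kullback–Leibler divergence between the original and released histograms. This is a textbook baseline, not the SHARE system: the abstract says SHARE uses more advanced multidimensional methods that are not described here. The function names, the sensitivity of 1 (each record falls in exactly one bin, neighbors differ by adding or removing one record) and the smoothing constant are all assumptions.

```python
import math
import random


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample, drawn as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_histogram(counts, epsilon, rng=None):
    """Add Laplace(1/epsilon) noise to each bin count (assumed sensitivity 1)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    scale = 1.0 / epsilon
    # Clamping to zero is post-processing and does not weaken the guarantee.
    return [max(0.0, c + laplace_noise(scale, rng)) for c in counts]


def kl_divergence(p_counts, q_counts, smoothing=1e-9):
    """KL(P || Q) in nats between two histograms, after normalization."""
    p_total = sum(p_counts)
    q_smoothed = [q + smoothing for q in q_counts]  # avoid division by zero
    q_total = sum(q_smoothed)
    kl = 0.0
    for p, q in zip(p_counts, q_smoothed):
        if p > 0:
            kl += (p / p_total) * math.log((p / p_total) / (q / q_total))
    return kl


original = [120, 45, 30, 5]
released = private_histogram(original, epsilon=0.5, rng=random.Random(7))
print(kl_divergence(original, released))
```

Smaller values of `epsilon` give stronger privacy but noisier counts, so the divergence from the original distribution tends to grow, especially when many bins hold small counts, as in high-dimensional data cubes.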
Ontology-Driven Provenance Management in eScience: An Application in Parasite Research

NASA Astrophysics Data System (ADS)

Sahoo, Satya S.; Weatherly, D. Brent; Mutharaju, Raghava; Anantharam, Pramod; Sheth, Amit; Tarleton, Rick L.

Provenance, from the French word "provenir", describes the lineage or history of a data entity. Provenance is critical information in scientific applications to verify experiment process, validate data quality and associate trust values with scientific results. Current industrial scale eScience projects require an end-to-end provenance management infrastructure. This infrastructure needs to be underpinned by formal semantics to enable analysis of large scale provenance information by software applications. Further, effective analysis of provenance information requires well-defined query mechanisms to support complex queries over large datasets. This paper introduces an ontology-driven provenance management infrastructure for biology experiment data, as part of the Semantic Problem Solving Environment (SPSE) for Trypanosoma cruzi (T.cruzi). This provenance infrastructure, called T.cruzi Provenance Management System (PMS), is underpinned by (a) a domain-specific provenance ontology called Parasite Experiment ontology, (b) specialized query operators for provenance analysis, and (c) a provenance query engine. The query engine uses a novel optimization technique based on materialized views called materialized provenance views (MPV) to scale with increasing data size and query complexity. This comprehensive ontology-driven provenance infrastructure not only allows effective tracking and management of ongoing experiments in the Tarleton Research Group at the Center for Tropical and Emerging Global Diseases (CTEGD), but also enables researchers to retrieve the complete provenance information of scientific results for publication in literature.
Spatial Query for Planetary Data

NASA Technical Reports Server (NTRS)

Shams, Khawaja S.; Crockett, Thomas M.; Powell, Mark W.; Joswig, Joseph C.; Fox, Jason M.

2011-01-01

Science investigators need to quickly and effectively assess past observations of specific locations on a planetary surface. This innovation involves a location-based search technology that was adapted and applied to planetary science data to support a spatial query capability for mission operations software. High-performance location-based searching requires the use of spatial data structures for database organization. Spatial data structures are designed to organize datasets based on their coordinates in a way that is optimized for location-based retrieval. The particular spatial data structure that was adapted for planetary data search is the R+ tree.
Calculation and application of activity discriminants in lead optimization.

PubMed

Luo, Xincai; Krumrine, Jennifer R; Shenvi, Ashok B; Pierson, M Edward; Bernstein, Peter R

2010-11-01

We present a technique for computing activity discriminants of in vitro (pharmacological, DMPK, and safety) assays and the application to the prediction of in vitro activities of proposed synthetic targets during the lead optimization phase of drug discovery projects. This technique emulates how medicinal chemists perform SAR analysis and activity prediction. The activity discriminants that are functions of 6 commonly used medicinal chemistry descriptors can be interpreted easily by medicinal chemists. Further, visualization with Spotfire allows medicinal chemists to analyze how the query molecule is related to compounds tested previously, and to evaluate easily the relevance of the activity discriminants to the activities of the query molecule. Validation with all compounds synthesized and tested in AstraZeneca Wilmington since 2006 demonstrates that this approach is useful for prioritizing new synthetic targets for synthesis. Copyright © 2010 Elsevier Inc. All rights reserved.
Agile Datacube Analytics (not just) for the Earth Sciences

NASA Astrophysics Data System (ADS)

Misev, Dimitar; Merticariu, Vlad; Baumann, Peter

2017-04-01

Metadata are considered small, smart, and queryable; data, on the other hand, are known as big, clumsy, hard to analyze. Consequently, gridded data - such as images, image timeseries, and climate datacubes - are managed separately from the metadata, and with different, restricted retrieval capabilities. One reason for this silo approach is that databases, while good at tables, XML hierarchies, RDF graphs, etc., traditionally do not support multi-dimensional arrays well. This gap is being closed by Array Databases which extend the SQL paradigm of "any query, anytime" to NoSQL arrays. They introduce semantically rich modelling combined with declarative, high-level query languages on n-D arrays. On the server side, such queries can be optimized, parallelized, and distributed based on partitioned array storage. This way, they offer new vistas in flexibility, scalability, performance, and data integration. In this respect, the forthcoming ISO SQL extension MDA ("Multi-dimensional Arrays") will be a game changer in Big Data Analytics. We introduce concepts and opportunities through the example of rasdaman ("raster data manager") which in fact has pioneered the field of Array Databases and forms the blueprint for ISO SQL/MDA and further Big Data standards, such as OGC WCPS for querying spatio-temporal Earth datacubes. With operational installations exceeding 140 TB, queries have been split across more than one thousand cloud nodes, using CPUs as well as GPUs. Installations can easily be mashed up securely, enabling large-scale location-transparent query processing in federations. Federation queries have been demonstrated live at EGU 2016 spanning Europe and Australia in the context of the intercontinental EarthServer initiative, visualized through NASA WorldWind.
Agile Datacube Analytics (not just) for the Earth Sciences

NASA Astrophysics Data System (ADS)

Baumann, P.

2016-12-01

Metadata are considered small, smart, and queryable; data, on the other hand, are known as big, clumsy, hard to analyze. Consequently, gridded data - such as images, image timeseries, and climate datacubes - are managed separately from the metadata, and with different, restricted retrieval capabilities. One reason for this silo approach is that databases, while good at tables, XML hierarchies, RDF graphs, etc., traditionally do not support multi-dimensional arrays well. This gap is being closed by Array Databases which extend the SQL paradigm of "any query, anytime" to NoSQL arrays. They introduce semantically rich modelling combined with declarative, high-level query languages on n-D arrays. On the server side, such queries can be optimized, parallelized, and distributed based on partitioned array storage. This way, they offer new vistas in flexibility, scalability, performance, and data integration. In this respect, the forthcoming ISO SQL extension MDA ("Multi-dimensional Arrays") will be a game changer in Big Data Analytics. We introduce concepts and opportunities through the example of rasdaman ("raster data manager") which in fact has pioneered the field of Array Databases and forms the blueprint for ISO SQL/MDA and further Big Data standards, such as OGC WCPS for querying spatio-temporal Earth datacubes. With operational installations exceeding 140 TB, queries have been split across more than one thousand cloud nodes, using CPUs as well as GPUs. Installations can easily be mashed up securely, enabling large-scale location-transparent query processing in federations. Federation queries have been demonstrated live at EGU 2016 spanning Europe and Australia in the context of the intercontinental EarthServer initiative, visualized through NASA WorldWind.
Radial polar histogram: obstacle avoidance and path planning for robotic cognition and motion control

NASA Astrophysics Data System (ADS)

Wang, Po-Jen; Keyawa, Nicholas R.; Euler, Craig

2012-01-01

In order to achieve highly accurate motion control and path planning for a mobile robot, an obstacle avoidance algorithm that provided a desired instantaneous turning radius and velocity was generated. This type of obstacle avoidance algorithm, which has been implemented in California State University Northridge's Intelligent Ground Vehicle (IGV), is known as Radial Polar Histogram (RPH). The RPH algorithm utilizes raw data in the form of a polar histogram that is read from a Laser Range Finder (LRF) and a camera. A desired open block is determined from the raw data utilizing a navigational heading and an elliptical approximation. The leftmost and rightmost radii are determined from the calculated edges of the open block and provide the range of possible radial paths the IGV can travel through. In addition, the calculated obstacle edge positions allow the IGV to recognize complex obstacle arrangements and to slow down accordingly. A radial path optimization function calculates the best radial path between the leftmost and rightmost radii, and this path is sent to motion control for speed determination. Overall, the RPH algorithm allows the IGV to autonomously travel at average speeds of 3 mph while avoiding all obstacles, with a processing time of approximately 10 ms.
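The idea of picking an open block from a polar histogram can be illustrated with a much simpler rule than the one in the paper. The sketch below treats the histogram as one range reading per angular sector and returns the widest run of sectors whose reading exceeds a clearance threshold. The function name, the threshold rule and the "widest run" criterion are assumptions for illustration; the RPH algorithm additionally uses the navigational heading and an elliptical approximation, which are not modeled here.

```python
def widest_open_block(ranges, clearance):
    """Return (first, last) sector indices of the widest contiguous open run.

    `ranges` holds one obstacle distance per angular sector, in scan order.
    A sector counts as open when its distance exceeds `clearance`.
    Returns None when every sector is blocked.
    """
    best = None
    start = None
    # A trailing sentinel that is never open closes a run that reaches the end.
    for i, r in enumerate(list(ranges) + [float("-inf")]):
        if r > clearance:
            if start is None:
                start = i  # a new open run begins here
        elif start is not None:
            if best is None or (i - start) > (best[1] - best[0] + 1):
                best = (start, i - 1)
            start = None
    return best


# Sectors 4 to 6 form the widest gap beyond a 2 m clearance.
print(widest_open_block([1.0, 5.0, 5.0, 1.0, 5.0, 5.0, 5.0, 1.0], clearance=2.0))
```

The two indices returned correspond to the edges of the open block, from which leftmost and rightmost candidate paths could then be derived.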
Secure image retrieval with multiple keys

NASA Astrophysics Data System (ADS)

Liang, Haihua; Zhang, Xinpeng; Wei, Qiuhan; Cheng, Hang

2018-03-01

This article proposes a secure image retrieval scheme under a multiuser scenario. In this scheme, the owner first encrypts and uploads images and their corresponding features to the cloud; then, the user submits the encrypted feature of the query image to the cloud; next, the cloud compares the encrypted features and returns encrypted images with similar content to the user. To find the nearest neighbor in the encrypted features, an encryption with multiple keys is proposed, in which the query feature of each user is encrypted by his/her own key. To improve the key security and space utilization, global optimization and Gaussian distribution are, respectively, employed to generate multiple keys. The experiments show that the proposed encryption can provide effective and secure image retrieval for each user and ensure confidentiality of the query feature of each user.
Enhanced Approximate Nearest Neighbor via Local Area Focused Search.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gonzales, Antonio; Blazier, Nicholas Paul

Approximate Nearest Neighbor (ANN) algorithms are increasingly important in machine learning, data mining, and image processing applications. There is a large family of space-partitioning ANN algorithms, such as randomized KD-Trees, that work well in practice but are limited by an exponential increase in similarity comparisons required to optimize recall. Additionally, they only support a small set of similarity metrics. We present Local Area Focused Search (LAFS), a method that enhances the way queries are performed using an existing ANN index. Instead of a single query, LAFS performs a number of smaller (fewer similarity comparisons) queries and focuses on a local neighborhood which is refined as candidates are identified. We show that our technique improves performance on several well known datasets and is easily extended to general similarity metrics using kernel projection techniques.
SU-E-J-16: Automatic Image Contrast Enhancement Based On Automatic Parameter Optimization for Radiation Therapy Setup Verification

DOE Office of Scientific and Technical Information (OSTI.GOV)

Qiu, J; Washington University in St Louis, St Louis, MO; Li, H. Harlod

Purpose: In RT patient setup 2D images, tissues often cannot be seen well due to the lack of image contrast. Contrast enhancement features provided by image reviewing software, e.g. Mosaiq and ARIA, require manual selection of the image processing filters and parameters, and are thus inefficient and cannot be automated. In this work, we developed a novel method to automatically enhance the 2D RT image contrast to allow automatic verification of patient daily setups as a prerequisite step of automatic patient safety assurance. Methods: The new method is based on contrast limited adaptive histogram equalization (CLAHE) and high-pass filtering algorithms. The most important innovation is to automatically select the optimal parameters by optimizing the image contrast. The image processing procedure includes the following steps: 1) background and noise removal, 2) high-pass filtering by subtracting the Gaussian smoothed result, and 3) histogram equalization using the CLAHE algorithm. Three parameters were determined through an iterative optimization based on the interior-point constrained optimization algorithm: the Gaussian smoothing weighting factor, the CLAHE algorithm block size and clip limiting parameters. The goal of the optimization is to maximize the entropy of the processed result. Results: A total of 42 RT images were processed. The results were visually evaluated by RT physicians and physicists. About 48% of the images processed by the new method were ranked as excellent. In comparison, only 29% and 18% of the images processed by the basic CLAHE algorithm and by the basic window level adjustment process, respectively, were ranked as excellent. Conclusion: This new image contrast enhancement method is robust and automatic, and is able to significantly outperform the basic CLAHE algorithm and the manual window-level adjustment process that are currently used in clinical 2D image review software tools.
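The optimization goal named in this abstract, the entropy of the processed image, can be sketched as follows. This is a minimal illustration only, assuming integer gray levels supplied as a flat list; the function name image_entropy is hypothetical, and the CLAHE, high-pass filtering, and interior-point steps of the authors' method are not reproduced here.

```python
import math
from collections import Counter

def image_entropy(pixels):
    """Shannon entropy (in bits) of the gray-level histogram of `pixels`."""
    counts = Counter(pixels)          # histogram: gray level -> pixel count
    total = len(pixels)
    entropy = 0.0
    for count in counts.values():
        p = count / total             # relative frequency of this gray level
        entropy -= p * math.log2(p)
    return entropy

# A flat image has zero entropy; an image that uses four gray levels
# equally often has an entropy of 2 bits.
flat = [128] * 16
spread = [0, 85, 170, 255] * 4
print(image_entropy(flat), image_entropy(spread))
```

An optimizer would evaluate a score of this kind on the output of each candidate parameter setting and keep the setting with the highest value.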
Petaminer: Using ROOT for efficient data storage in MySQL database

NASA Astrophysics Data System (ADS)

Cranshaw, J.; Malon, D.; Vaniachine, A.; Fine, V.; Lauret, J.; Hamill, P.

2010-04-01

High Energy and Nuclear Physics (HENP) experiments store Petabytes of event data and Terabytes of calibration data in ROOT files. The Petaminer project is developing a custom MySQL storage engine to enable the MySQL query processor to directly access experimental data stored in ROOT files. Our project is addressing the problem of efficient navigation to PetaBytes of HENP experimental data described with event-level TAG metadata, which is required by data intensive physics communities such as the LHC and RHIC experiments. Physicists need to be able to compose a metadata query and rapidly retrieve the set of matching events, where improved efficiency will facilitate the discovery process by permitting rapid iterations of data evaluation and retrieval. Our custom MySQL storage engine enables the MySQL query processor to directly access TAG data stored in ROOT TTrees. As ROOT TTrees are column-oriented, reading them directly provides improved performance over traditional row-oriented TAG databases. Leveraging the flexible and powerful SQL query language to access data stored in ROOT TTrees, the Petaminer approach enables rich MySQL index-building capabilities for further performance optimization.
Searching for cancer information on the internet: analyzing natural language search queries.

PubMed

Bader, Judith L; Theofanos, Mary Frances

2003-12-11

Searching for health information is one of the most-common tasks performed by Internet users. Many users begin searching on popular search engines rather than on prominent health information sites. We know that many visitors to our (National Cancer Institute) Web site, cancer.gov, arrive via links in search engine results. To learn more about the specific needs of our general-public users, we wanted to understand what lay users really wanted to know about cancer, how they phrased their questions, and how much detail they used. The National Cancer Institute partnered with AskJeeves, Inc to develop a methodology to capture, sample, and analyze 3 months of cancer-related queries on the Ask.com Web site, a prominent United States consumer search engine, which receives over 35 million queries per week. Using a benchmark set of 500 terms and word roots supplied by the National Cancer Institute, AskJeeves identified a test sample of cancer queries for 1 week in August 2001. From these 500 terms only 37 appeared ≥ 5 times/day over the trial test week in 17208 queries. Using these 37 terms, 204165 instances of cancer queries were found in the Ask.com query logs for the actual test period of June-August 2001. Of these, 7500 individual user questions were randomly selected for detailed analysis and assigned to appropriate categories. The exact language of sample queries is presented. Considering multiples of the same questions, the sample of 7500 individual user queries represented 76077 queries (37% of the total 3-month pool). Overall 78.37% of sampled Cancer queries asked about 14 specific cancer types. Within each cancer type, queries were sorted into appropriate subcategories including at least the following: General Information, Symptoms, Diagnosis and Testing, Treatment, Statistics, Definition, and Cause/Risk/Link.
The most-common specific cancer types mentioned in queries were Digestive/Gastrointestinal/Bowel (15.0%), Breast (11.7%), Skin (11.3%), and Genitourinary (10.5%). Additional subcategories of queries about specific cancer types varied, depending on user input. Queries that were not specific to a cancer type were also tracked and categorized. Natural-language searching affords users the opportunity to fully express their information needs and can aid users naïve to the content and vocabulary. The specific queries analyzed for this study reflect news and research studies reported during the study dates and would surely change with different study dates. Analyzing queries from search engines represents one way of knowing what kinds of content to provide to users of a given Web site. Users ask questions using whole sentences and keywords, often misspelling words. Providing the option for natural-language searching does not obviate the need for good information architecture, usability engineering, and user testing in order to optimize user experience.
Searching for Cancer Information on the Internet: Analyzing Natural Language Search Queries

PubMed Central

Theofanos, Mary Frances

2003-01-01

Background: Searching for health information is one of the most-common tasks performed by Internet users. Many users begin searching on popular search engines rather than on prominent health information sites. We know that many visitors to our (National Cancer Institute) Web site, cancer.gov, arrive via links in search engine results. Objective: To learn more about the specific needs of our general-public users, we wanted to understand what lay users really wanted to know about cancer, how they phrased their questions, and how much detail they used. Methods: The National Cancer Institute partnered with AskJeeves, Inc to develop a methodology to capture, sample, and analyze 3 months of cancer-related queries on the Ask.com Web site, a prominent United States consumer search engine, which receives over 35 million queries per week. Using a benchmark set of 500 terms and word roots supplied by the National Cancer Institute, AskJeeves identified a test sample of cancer queries for 1 week in August 2001. From these 500 terms only 37 appeared ≥ 5 times/day over the trial test week in 17208 queries. Using these 37 terms, 204165 instances of cancer queries were found in the Ask.com query logs for the actual test period of June-August 2001. Of these, 7500 individual user questions were randomly selected for detailed analysis and assigned to appropriate categories. The exact language of sample queries is presented. Results: Considering multiples of the same questions, the sample of 7500 individual user queries represented 76077 queries (37% of the total 3-month pool). Overall 78.37% of sampled Cancer queries asked about 14 specific cancer types. Within each cancer type, queries were sorted into appropriate subcategories including at least the following: General Information, Symptoms, Diagnosis and Testing, Treatment, Statistics, Definition, and Cause/Risk/Link.
The most-common specific cancer types mentioned in queries were Digestive/Gastrointestinal/Bowel (15.0%), Breast (11.7%), Skin (11.3%), and Genitourinary (10.5%). Additional subcategories of queries about specific cancer types varied, depending on user input. Queries that were not specific to a cancer type were also tracked and categorized. Conclusions: Natural-language searching affords users the opportunity to fully express their information needs and can aid users naïve to the content and vocabulary. The specific queries analyzed for this study reflect news and research studies reported during the study dates and would surely change with different study dates. Analyzing queries from search engines represents one way of knowing what kinds of content to provide to users of a given Web site. Users ask questions using whole sentences and keywords, often misspelling words. Providing the option for natural-language searching does not obviate the need for good information architecture, usability engineering, and user testing in order to optimize user experience. PMID:14713659
Enabling the extended compact genetic algorithm for real-parameter optimization by using adaptive discretization.

PubMed

Chen, Ying-ping; Chen, Chao-Hong

2010-01-01

An adaptive discretization method, called split-on-demand (SoD), enables estimation of distribution algorithms (EDAs) for discrete variables to solve continuous optimization problems. SoD randomly splits a continuous interval if the number of search points within the interval exceeds a threshold, which is decreased at every iteration. After the split operation, the nonempty intervals are assigned integer codes, and the search points are discretized accordingly. As an example of using SoD with EDAs, the integration of SoD and the extended compact genetic algorithm (ECGA) is presented and numerically examined. In this integration, we adopt a local search mechanism as an optional component of our back end optimization engine. As a result, the proposed framework can be considered as a memetic algorithm, and SoD can potentially be applied to other memetic algorithms. The numerical experiments consist of two parts: (1) a set of benchmark functions on which ECGA with SoD and ECGA with two well-known discretization methods: the fixed-height histogram (FHH) and the fixed-width histogram (FWH) are compared; (2) a real-world application, the economic dispatch problem, on which ECGA with SoD is compared to other methods. The experimental results indicate that SoD is a better discretization method to work with ECGA. Moreover, ECGA with SoD works quite well on the economic dispatch problem and delivers solutions better than the best known results obtained by other methods in existence.
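The two baseline discretizations named in this abstract can be contrasted with a short sketch. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, a fixed-width histogram (FWH) is taken to mean equal-width bins over a known interval, and a fixed-height histogram (FHH) is taken to mean bins holding roughly equal numbers of points. Split-on-demand itself is not implemented here.

```python
def fixed_width_codes(points, lower, upper, bins):
    """Assign each point the index of its equal-width bin on [lower, upper]."""
    width = (upper - lower) / bins
    codes = []
    for x in points:
        index = int((x - lower) / width)
        codes.append(min(index, bins - 1))   # x == upper falls in the last bin
    return codes

def fixed_height_codes(points, bins):
    """Assign codes so that each bin holds (nearly) the same number of points."""
    order = sorted(range(len(points)), key=lambda i: points[i])
    codes = [0] * len(points)
    for rank, i in enumerate(order):
        codes[i] = rank * bins // len(points)
    return codes

# Six points clustered near 0 and two near 10.
points = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 9.0, 10.0]
fwh = fixed_width_codes(points, 0.0, 10.0, 4)   # [0, 0, 0, 0, 0, 0, 3, 3]
fhh = fixed_height_codes(points, 4)             # [0, 0, 1, 1, 2, 2, 3, 3]
print(fwh, fhh)
```

On clustered search points the equal-width scheme leaves two of its four codes unused, while the equal-frequency scheme spends all of its codes where the points actually lie.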
Local classifier weighting by quadratic programming.

PubMed

Cevikalp, Hakan; Polikar, Robi

2008-10-01

It has been widely accepted that the classification accuracy can be improved by combining outputs of multiple classifiers. However, how to combine multiple classifiers with various (potentially conflicting) decisions is still an open problem. A rich collection of classifier combination procedures -- many of which are heuristic in nature -- have been developed for this goal. In this brief, we describe a dynamic approach to combine classifiers that have expertise in different regions of the input space. To this end, we use local classifier accuracy estimates to weight classifier outputs. Specifically, we estimate local recognition accuracies of classifiers near a query sample by utilizing its nearest neighbors, and then use these estimates to find the best weights of classifiers to label the query. The problem is formulated as a convex quadratic optimization problem, which returns optimal nonnegative classifier weights with respect to the chosen objective function, and the weights ensure that locally most accurate classifiers are weighted more heavily for labeling the query sample. Experimental results on several data sets indicate that the proposed weighting scheme outperforms other popular classifier combination schemes, particularly on problems with complex decision boundaries. Hence, the results indicate that local classification-accuracy-based combination techniques are well suited for decision making when the classifiers are trained by focusing on different regions of the input space.

Investigation of depth-of-interaction (DOI) effects in single- and dual-layer block detectors by the use of light sharing in scintillators.

PubMed

Yamamoto, Seiichi

2012-01-01

In block detectors for PET scanners that use different lengths of slits in scintillators to share light among photomultiplier tubes (PMTs), a position histogram is distorted when the depth of interaction (DOI) of the gamma photons is near the PMTs (DOI effect). However, it remains unclear whether a DOI effect is observed for block detectors that use light sharing in scintillators. To investigate the effect, I tested the effect for single- and dual-layer block detectors. In the single-layer block detector, Ce-doped Gd₂SiO₅ (GSO) crystals of 1.9 × 1.9 × 15 mm³ (0.5 mol% Ce) were used. In the dual-layer block detector, GSO crystals of 1.9 × 1.9 × 6 mm³ (1.5 mol% Ce) were used for the front layer and GSO crystals of 1.9 × 1.9 × 9 mm³ (0.5 mol% Ce) for the back layer. These scintillators were arranged to form an 8 × 8 matrix with multi-layer optical film inserted partly between the scintillators for obtaining an optimized position response with use of two dual-PMTs. Position histograms and energy responses were measured for these block detectors at three different DOI positions, and the flood histograms were obtained. The results indicated that DOI effects are observed in both block detectors, but the dual-layer block showed more severe distortion in the position histogram as well as larger energy variations. We conclude that, in the block detectors that use light sharing in the scintillators, the DOI effect is an important factor for the performance of the detectors, especially for DOI block detectors.
Evaluation of an ontological resource for pharmacovigilance.

PubMed

Jaulent, Marie-Christine; Alecu, Iulian

2009-01-01

In this work, we present a methodology for evaluating an ontology designed in a previous study to describe adverse drug reactions. We evaluate it in terms of its fitness for grouping cases in pharmacovigilance. We define as the gold standard the Standardized MedDRA Queries (SMQs) developed manually to group terms representing similar medical conditions. We perform an automatic search in the ontology in order to retrieve concepts related to the medical conditions. An optimal query is built for each medical condition. The evaluation relies on the comparison between the terms in the SMQ and the terms subsumed by the query. The result is quantified by sensitivity and specificity. We applied this methodology to 24 SMQs and obtained a mean sensitivity of 0.82. This work allows validating the semantic resource and provides, in perspective, tools to maintain the ontology while the knowledge is evolving.
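The comparison between the terms in a gold-standard set and the terms subsumed by a query can be sketched as a set computation. This is a minimal illustration: the function name and the term identifiers are hypothetical, and the universe of candidate terms is an assumption introduced so that true negatives can be counted.

```python
def sensitivity_specificity(retrieved, gold, universe):
    """Compare the terms subsumed by a query against a gold-standard term set."""
    retrieved, gold, universe = set(retrieved), set(gold), set(universe)
    tp = len(retrieved & gold)                 # gold terms the query found
    fn = len(gold - retrieved)                 # gold terms the query missed
    fp = len(retrieved - gold)                 # retrieved terms outside the gold set
    tn = len(universe - retrieved - gold)      # terms correctly left out
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

# Hypothetical term identifiers, used only for illustration.
universe = {"t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9", "t10"}
gold = {"t1", "t2", "t3", "t4"}
retrieved = {"t1", "t2", "t3", "t9"}
sens, spec = sensitivity_specificity(retrieved, gold, universe)
print(sens, spec)
```

Here the query recovers three of the four gold-standard terms (sensitivity 0.75) and wrongly includes one of the six remaining terms (specificity 5/6).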
ASSET Queries: A Set-Oriented and Column-Wise Approach to Modern OLAP

NASA Astrophysics Data System (ADS)

Chatziantoniou, Damianos; Sotiropoulos, Yannis

Modern data analysis has given birth to numerous grouping constructs and programming paradigms, way beyond the traditional group by. Applications such as data warehousing, web log analysis, streams monitoring and social networks understanding necessitated the use of data cubes, grouping variables, windows and MapReduce. In this paper we review the associated set (ASSET) concept and discuss its applicability in both continuous and traditional data settings. Given a set of values B, an associated set over B is just a collection of annotated data multisets, one for each b ∈ B. The goal is to efficiently compute aggregates over these data sets. An ASSET query consists of repeated definitions of associated sets and aggregates of these, possibly correlated, resembling a spreadsheet document. We review systems implementing ASSET queries both in continuous and persistent contexts and argue for associated sets' analytical abilities and optimization opportunities.
ADF/ADC Web Tools for Browsing and Visualizing Astronomical Catalogs and NASA Astrophysics Mission Metadata

NASA Astrophysics Data System (ADS)

Shaya, E.; Kargatis, V.; Blackwell, J.; Borne, K.; White, R. A.; Cheung, C.

1998-05-01

Several new web based services have been introduced this year by the Astrophysics Data Facility (ADF) at the NASA Goddard Space Flight Center. IMPReSS is a graphical interface to astrophysics databases that presents the user with the footprints of observations of space-based missions. It also aids astronomers in retrieving these data by sending requests to distributed data archives. The VIEWER is a reader of ADC astronomical catalogs and journal tables that allows subsetting of catalogs by column choices and range selection and provides database-like search capability within each table. With it, the user can easily find the table data most appropriate for their purposes and then download either the subset table or the original table. CATSEYE is a tool that plots output tables from the VIEWER (and soon AMASE), making exploring the datasets fast and easy. Having completed the basic functionality of these systems, we are enhancing the site to provide advanced functionality. These will include: market basket storage of tables and records of VIEWER output for IMPReSS and AstroBrowse queries, non-HTML table responses to AstroBrowse type queries, general column arithmetic, modularity to allow entrance into the sequence of web pages at any point, histogram plots, navigable maps, and overplotting of catalog objects on mission footprint maps. When completed, the ADF/ADC web facilities will provide astronomical tabled data and mission retrieval information in several hyperlinked environments geared for users at any level, from the school student to the typical astronomer to the expert datamining tools at state-of-the-art data centers.
MRI histogram analysis enables objective and continuous classification of intervertebral disc degeneration.

PubMed

Waldenberg, Christian; Hebelka, Hanna; Brisby, Helena; Lagerstrand, Kerstin Magdalena

2018-05-01

Magnetic resonance imaging (MRI) is the best diagnostic imaging method for low back pain. However, the technique is currently not utilized in its full capacity, often failing to depict painful intervertebral discs (IVDs), potentially due to the rough degeneration classification system used clinically today. MR image histograms, which reflect the IVD heterogeneity, may offer sensitive imaging biomarkers for IVD degeneration classification. This study investigates the feasibility of using histogram analysis as a means of objective and continuous grading of IVD degeneration. Forty-nine IVDs in ten low back pain patients (six males, 25-69 years) were examined with MRI (T2-weighted images and T2-maps). Each IVD was semi-automatically segmented on three mid-sagittal slices. Histogram features of the IVD were extracted from the defined regions of interest and correlated to Pfirrmann grade. Both T2-weighted images and T2-maps displayed similar histogram features. Histograms of well-hydrated IVDs displayed two separate peaks, representing annulus fibrosus and nucleus pulposus. Degenerated IVDs displayed decreased peak separation, where the separation was shown to correlate strongly with Pfirrmann grade (P < 0.05). In addition, some degenerated IVDs within the same Pfirrmann grade displayed diametrically different histogram appearances. Histogram features correlated well with IVD degeneration, suggesting that IVD histogram analysis is a suitable tool for objective and continuous IVD degeneration classification. As histogram analysis revealed IVD heterogeneity, it may be a clinical tool for characterization of regional IVD degeneration effects. To elucidate the usefulness of histogram analysis in patient management, IVD histogram features of asymptomatic and symptomatic individuals need to be compared.
Whole-lesion histogram analysis of the apparent diffusion coefficient: Evaluation of the correlation with subtypes of mucinous breast carcinoma.

PubMed

Guo, Yuan; Kong, Qing-Cong; Zhu, Ye-Qing; Liu, Zhen-Zhen; Peng, Ling-Rong; Tang, Wen-Jie; Yang, Rui-Meng; Xie, Jia-Jun; Liu, Chun-Ling

2018-02-01

To evaluate the utility of the whole-lesion histogram apparent diffusion coefficient (ADC) for characterizing the heterogeneity of mucinous breast carcinoma (MBC) and to determine which ADC metrics may help to best differentiate subtypes of MBC. This retrospective study involved 52 MBC patients, including 37 pure MBC (PMBC) and 15 mixed MBC (MMBC). The PMBC patients were subtyped into PMBC-A (20 cases) and PMBC-B (17 cases) groups. All patients underwent preoperative diffusion-weighted imaging (DWI) at 1.5T and the whole-lesion ADC assessments were generated. Histogram-derived ADC parameters were compared between PMBC vs. MMBC and PMBC-A vs. PMBC-B, and receiver operating characteristic (ROC) curve analysis was used to determine optimal histogram parameters for differentiating these groups. The PMBC group exhibited significantly higher ADC values for the mean (P = 0.004), 25th (P = 0.004), 50th (P = 0.004), 75th (P = 0.006), and 90th percentiles (P = 0.013) and skewness (P = 0.021) than did the MMBC group. The 25th percentile of ADC values achieved the highest area under the curve (AUC) (0.792), with a cutoff value of 1.345 × 10⁻³ mm²/s, in distinguishing PMBC and MMBC. The PMBC-A group showed significantly higher ADC values for the mean (P = 0.049), 25th (P = 0.015), and 50th (P = 0.026) percentiles and skewness (P = 0.004) than did the PMBC-B group. The 25th percentile of the ADC cutoff value (1.476 × 10⁻³ mm²/s) demonstrated the best AUC (0.837) among the ADC values for distinguishing PMBC-A and PMBC-B. Whole-lesion ADC histogram analysis enables comprehensive evaluation of an MBC in its entirety and differentiating subtypes of MBC. Thus, it may be a helpful and supportive tool for conventional MRI. 4 Technical Efficacy: Stage 2 J. Magn. Reson. Imaging 2018;47:391-400. © 2017 International Society for Magnetic Resonance in Medicine.
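The histogram-derived parameters listed in this abstract (mean, percentiles, skewness) can be computed from a list of voxel values as sketched below. This is an illustration only: the function names and the sample values are hypothetical, percentiles use linear interpolation between closest ranks, and skewness is the population form; the study may have used different conventions.

```python
import math

def percentile(values, q):
    """Percentile by linear interpolation between closest ranks (q in 0..100)."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    low = math.floor(position)
    high = math.ceil(position)
    fraction = position - low
    return ordered[low] + (ordered[high] - ordered[low]) * fraction

def skewness(values):
    """Population skewness: third central moment over the cubed std deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical voxel ADC values (x 10^-3 mm^2/s), for illustration only.
adc = [1.0, 1.2, 1.4, 1.6, 1.8]
summary = {
    "mean": sum(adc) / len(adc),
    "p25": percentile(adc, 25),
    "p50": percentile(adc, 50),
    "p75": percentile(adc, 75),
    "skewness": skewness(adc),
}
print(summary)
```

A cutoff such as the 25th-percentile thresholds quoted above would then be applied to the "p25" entry of each lesion's summary.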
Exploratory Study of 4D versus 3D Robust Optimization in Intensity Modulated Proton Therapy for Lung Cancer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liu, Wei, E-mail: Liu.Wei@mayo.edu; Schild, Steven E.; Chang, Joe Y.

Purpose: The purpose of this study was to compare the impact of uncertainties and interplay on 3-dimensional (3D) and 4D robustly optimized intensity modulated proton therapy (IMPT) plans for lung cancer in an exploratory methodology study. Methods and Materials: IMPT plans were created for 11 nonrandomly selected non-small cell lung cancer (NSCLC) cases: 3D robustly optimized plans on average CTs with internal gross tumor volume density overridden to irradiate internal target volume, and 4D robustly optimized plans on 4D computed tomography (CT) to irradiate clinical target volume (CTV). Regular fractionation (66 Gy [relative biological effectiveness; RBE] in 33 fractions) was considered. In 4D optimization, the CTV of individual phases received nonuniform doses to achieve a uniform cumulative dose. The root-mean-square dose-volume histograms (RVH) measured the sensitivity of the dose to uncertainties, and the areas under the RVH curve (AUCs) were used to evaluate plan robustness. Dose evaluation software modeled time-dependent spot delivery to incorporate interplay effect with randomized starting phases of each field per fraction. Dose-volume histogram (DVH) indices comparing CTV coverage, homogeneity, and normal tissue sparing were evaluated using Wilcoxon signed rank test. Results: 4D robust optimization plans led to smaller AUC for CTV (14.26 vs 18.61, respectively; P=.001), better CTV coverage (Gy [RBE]) (D95% CTV: 60.6 vs 55.2, respectively; P=.001), and better CTV homogeneity (D5%-D95% CTV: 10.3 vs 17.7, respectively; P=.002) in the face of uncertainties. With interplay effect considered, 4D robust optimization produced plans with better target coverage (D95% CTV: 64.5 vs 63.8, respectively; P=.0068), comparable target homogeneity, and comparable normal tissue protection. The benefits from 4D robust optimization were most obvious for the 2 typical stage III lung cancer patients.
Conclusions: Our exploratory methodology study showed that, compared to 3D robust optimization, 4D robust optimization produced significantly more robust and interplay-effect-resistant plans for targets with comparable dose distributions for normal tissues. A further study with a larger and more realistic patient population is warranted to generalize the conclusions.
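The DVH indices quoted in this abstract (D95% for coverage, D5%-D95% for homogeneity) can be read off a list of voxel doses as sketched below. This is a minimal illustration, assuming equal-volume voxels; the function names and dose values are hypothetical, and the robust optimization and interplay modeling are not reproduced.

```python
import math

def dose_at_volume(doses, volume_percent):
    """D_x%: the minimum dose received by the hottest x% of equal-volume voxels."""
    ordered = sorted(doses, reverse=True)            # hottest voxels first
    count = math.ceil(len(ordered) * volume_percent / 100)
    return ordered[max(count, 1) - 1]

def homogeneity(doses):
    """D5% - D95%, the homogeneity measure quoted in the abstract."""
    return dose_at_volume(doses, 5) - dose_at_volume(doses, 95)

# Hypothetical CTV voxel doses in Gy[RBE]: 20 voxels, two of them underdosed.
ctv = [66.0] * 18 + [64.0, 58.0]
d95 = dose_at_volume(ctv, 95)
print(d95, homogeneity(ctv))
```

With 20 voxels, D95% is the dose of the 19th hottest voxel (64.0 here), so a single cold voxel does not lower it, while a second one does.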
OASIS: A Data Fusion System Optimized for Access to Distributed Archives

NASA Astrophysics Data System (ADS)

Berriman, G. B.; Kong, M.; Good, J. C.

2002-05-01

The On-Line Archive Science Information Services (OASIS) is accessible as a Java applet through the NASA/IPAC Infrared Science Archive home page. It uses Geographical Information System (GIS) technology to provide data fusion and interaction services for astronomers. These services include the ability to process and display arbitrarily large image files, and user-controlled contouring, overlay regeneration and multi-table/image interactions. OASIS has been optimized for access to distributed archives and data sets. Its second release (June 2002) provides a mechanism that enables access to OASIS from "third-party" services and data providers. That is, any data provider who creates a query form to an archive containing a collection of data (images, catalogs, spectra) can direct the result files from the query into OASIS. Similarly, data providers who serve links to datasets or remote services on a web page can access all of these data with one instance of OASIS. In this way, any data or service provider is given access to the full suite of capabilities of OASIS. We illustrate the "third-party" access feature with two examples: queries to the high-energy image datasets accessible from GSFC SkyView, and links to data that are returned from a target-based query to the NASA Extragalactic Database (NED). The second release of OASIS also includes a file-transfer manager that reports the status of multiple data downloads from remote sources to the client machine. It is a prototype for a request management system that will ultimately control and manage compute-intensive jobs submitted through OASIS to computing grids, such as requests for large scale image mosaics and bulk statistical analysis.
Array Databases: Agile Analytics (not just) for the Earth Sciences

NASA Astrophysics Data System (ADS)

Baumann, P.; Misev, D.

2015-12-01

Gridded data, such as images, image timeseries, and climate datacubes, today are managed separately from the metadata, and with different, restricted retrieval capabilities. While databases are good at metadata modelled in tables, XML hierarchies, or RDF graphs, they traditionally do not support multi-dimensional arrays. This gap is being closed by Array Databases, pioneered by the scalable rasdaman ("raster data manager") array engine. Its declarative query language, rasql, extends SQL with array operators which are optimized and parallelized on server side. Installations can easily be mashed up securely, thereby enabling large-scale location-transparent query processing in federations. Domain experts value the integration with their commonly used tools leading to a quick learning curve. Earth, Space, and Life sciences, but also Social sciences as well as business have massive amounts of data and complex analysis challenges that are answered by rasdaman. As of today, rasdaman is mature and in operational use on hundreds of Terabytes of timeseries datacubes, with transparent query distribution across more than 1,000 nodes. Additionally, its concepts have shaped international Big Data standards in the field, including the forthcoming array extension to ISO SQL, many of which are supported by both open-source and commercial systems in the meantime. In the geo field, rasdaman is the reference implementation for the Open Geospatial Consortium (OGC) Big Data standard, WCS, now also under adoption by ISO. Further, rasdaman is in the final stage of OSGeo incubation. In this contribution we present array queries a la rasdaman, describe the architecture and novel optimization and parallelization techniques introduced in 2015, and put this in context of the intercontinental EarthServer initiative which utilizes rasdaman for enabling agile analytics on Petascale datacubes.
Do not hesitate to use Tversky - and other hints for successful active analogue searches with feature count descriptors.

PubMed

Horvath, Dragos; Marcou, Gilles; Varnek, Alexandre

2013-07-22

This study is an exhaustive analysis of the neighborhood behavior over a large coherent data set (ChEMBL target/ligand pairs of known Ki, for 165 targets with >50 associated ligands each). It focuses on similarity-based virtual screening (SVS) success defined by the ascertained optimality index. This is a weighted compromise between purity and retrieval rate of active hits in the neighborhood of an active query. One key issue addressed here is the impact of Tversky asymmetric weighing of query vs candidate features (represented as integer-value ISIDA colored fragment/pharmacophore triplet count descriptor vectors). The nearly 3/4 million independent SVS runs showed that Tversky scores with a strong bias in favor of query-specific features are, by far, the most successful and the least failure-prone out of a set of nine other dissimilarity scores. These include classical Tanimoto, which failed to defend its privileged status in practical SVS applications. Tversky performance is not significantly conditioned by tuning of its bias parameter α. Both initial "guesses" of α = 0.9 and 0.7 were more successful than Tanimoto (which, in turn, was better than Euclid). Tversky was eventually tested in exhaustive similarity searching within the library of 1.6 M commercial + bioactive molecules at http://infochim.u-strasbg.fr/webserv/VSEngine.html , comparing favorably to Tanimoto in terms of "scaffold hopping" propensity. Therefore, it should be used at least as often as, and perhaps in parallel with, Tanimoto in SVS. Analysis with respect to query subclasses highlighted relationships of query complexity (simply expressed in terms of pharmacophore pattern counts) and/or target nature vs SVS success likelihood. SVS using more complex queries are more robust with respect to the choice of their operational premises (descriptors, metric). Yet, they are best handled by "pro-query" Tversky scores at α > 0.5.
Among simpler queries, one may distinguish between "growable" (allowing for active analogs with additional features), and a few "conservative" queries not allowing any growth. These (typically bioactive amine transporter ligands) form the specific application domain of "pro-candidate" biased Tversky scores at α < 0.5.
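The asymmetric weighting described in this abstract can be illustrated with a minimal sketch. It assumes the common count-vector form of the Tversky index, with α weighting features found only in the query and (1 − α) weighting features found only in the candidate; this pairing, the function names, and the example vectors are assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def tversky(query, candidate, alpha=0.9):
    """Tversky similarity for non-negative count vectors.

    Assumption: alpha penalizes counts present only in the query and
    (1 - alpha) penalizes counts present only in the candidate.
    """
    q = np.asarray(query, dtype=float)
    c = np.asarray(candidate, dtype=float)
    common = np.minimum(q, c).sum()            # counts shared by both
    query_only = np.maximum(q - c, 0).sum()    # counts only the query has
    cand_only = np.maximum(c - q, 0).sum()     # counts only the candidate has
    denom = common + alpha * query_only + (1.0 - alpha) * cand_only
    # The score is undefined when the denominator vanishes; return 0.0 here.
    return common / denom if denom > 0 else 0.0


def tanimoto(query, candidate):
    """Classical Tanimoto coefficient generalized to count vectors."""
    q = np.asarray(query, dtype=float)
    c = np.asarray(candidate, dtype=float)
    union = np.maximum(q, c).sum()
    return np.minimum(q, c).sum() / union if union > 0 else 0.0


# Hypothetical descriptors: the candidate keeps every query feature and
# adds four extra counts (a "grown" analog).
query = [2, 1, 0, 3]
grown = [2, 1, 4, 3]
print(tversky(query, grown, alpha=0.9))   # extra candidate features cost little
print(tversky(grown, query, alpha=0.9))   # missing query features cost a lot
print(tanimoto(query, grown))             # symmetric: same in both directions
```

With α = 0.9 the score barely drops when the candidate carries extra features but drops sharply when query features are missing, which is the "pro-query" behavior the abstract associates with α > 0.5.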
Optimizability of OGC Standards Implementations - a Case Study

NASA Astrophysics Data System (ADS)

Misev, D.; Baumann, P.

2012-04-01

Why do we shop at Amazon? Because they have a unique offering that is nowhere else available? Certainly not. Rather, Amazon offers (i) simple, yet effective search; (ii) very simple payment; (iii) extremely rapid delivery. This is how scientific services will be distinguished in future: not for their data holding (there will be manifold choice), but for their service quality. We are facing the transition from data stewardship to service stewardship. One of the OGC standards which particularly enables flexible retrieval is the Web Coverage Processing Service (WCPS). It defines a high-level query language on large, multi-dimensional raster data, such as 1D timeseries, 2D EO imagery, 3D x/y/t image time series and x/y/z geophysical data, 4D x/y/z/t climate and ocean data. We have implemented WCPS based on an Array Database Management System, rasdaman, which is available in open source. In this demonstration, we study WCPS queries on 2D, 3D, and 4D data sets. Particular emphasis is placed on the computational load queries generate in such on-demand processing and filtering. We look at different techniques and their impact on performance, such as adaptive storage partitioning, query rewriting, and just-in-time compilation. Results show that there is significant potential for effective server-side optimization once a query language is sufficiently high-level and declarative.
An end user evaluation of query formulation and results review tools in three medical meta-search engines.

PubMed

Leroy, Gondy; Xu, Jennifer; Chung, Wingyan; Eggers, Shauna; Chen, Hsinchun

2007-01-01

Retrieving sufficient relevant information online is difficult for many people because they use too few keywords to search and search engines do not provide many support tools. To further complicate the search, users often ignore support tools when available. Our goal is to evaluate in a realistic setting when users use support tools and how they perceive these tools. We compared three medical search engines with support tools that require more or less effort from users to form a query and evaluate results. We carried out an end user study with 23 users who were asked to find information, i.e., subtopics and supporting abstracts, for a given theme. We used a balanced within-subjects design and report on the effectiveness, efficiency and usability of the support tools from the end user perspective. We found significant differences in efficiency but did not find significant differences in effectiveness between the three search engines. Dynamic user support tools requiring less effort led to higher efficiency. Fewer searches were needed and more documents were found per search when both query reformulation and result review tools dynamically adjust to the user query. The query reformulation tool that provided a long list of keywords, dynamically adjusted to the user query, was used most often and led to more subtopics. As hypothesized, the dynamic result review tools were used more often and led to more subtopics than static ones. These results were corroborated by the usability questionnaires, which showed that support tools that dynamically optimize output were preferred.
Bin Ratio-Based Histogram Distances and Their Application to Image Classification.

PubMed

Hu, Weiming; Xie, Nianhua; Hu, Ruiguang; Ling, Haibin; Chen, Qiang; Yan, Shuicheng; Maybank, Stephen

2014-12-01

Large variations in image background may cause partial matching and normalization problems for histogram-based representations, i.e., the histograms of the same category may have bins which are significantly different, and normalization may produce large changes in the differences between corresponding bins. In this paper, we deal with this problem by using the ratios between bin values of histograms, rather than bin values' differences which are used in the traditional histogram distances. We propose a bin ratio-based histogram distance (BRD), which is an intra-cross-bin distance, in contrast with previous bin-to-bin distances and cross-bin distances. The BRD is robust to partial matching and histogram normalization, and captures correlations between bins with only a linear computational complexity. We combine the BRD with the ℓ1 histogram distance and the χ² histogram distance to generate the ℓ1 BRD and the χ² BRD, respectively. These combinations exploit and benefit from the robustness of the BRD under partial matching and the robustness of the ℓ1 and χ² distances to small noise. We propose a method for assessing the robustness of histogram distances to partial matching. The BRDs and logistic regression-based histogram fusion are applied to image classification. The experimental results on synthetic data sets show the robustness of the BRDs to partial matching, and the experiments on seven benchmark data sets demonstrate promising results of the BRDs for image classification.
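The normalization problem that motivates bin ratios can be seen with the two traditional bin-to-bin distances named above. The sketch below is illustrative only: it implements the ℓ1 distance and one common convention of the χ² distance (with a factor of 1/2), not the BRD itself, whose formula is not given in the abstract. The example histograms are made up.

```python
import numpy as np


def normalize(h):
    """Scale a histogram so that its bins sum to 1."""
    h = np.asarray(h, dtype=float)
    return h / h.sum()


def l1_distance(h, k):
    """Sum of absolute differences between corresponding bins."""
    h, k = np.asarray(h, dtype=float), np.asarray(k, dtype=float)
    return np.abs(h - k).sum()


def chi2_distance(h, k):
    """Chi-square histogram distance: 0.5 * sum((h - k)^2 / (h + k))."""
    h, k = np.asarray(h, dtype=float), np.asarray(k, dtype=float)
    total = h + k
    mask = total > 0                      # skip bins that are empty in both
    return 0.5 * ((h[mask] - k[mask]) ** 2 / total[mask]).sum()


# Same object in both images; the second image adds background clutter
# that falls entirely into the third bin (a partial match).
obj = np.array([4, 4, 0])
obj_with_clutter = np.array([4, 4, 8])

a, b = normalize(obj), normalize(obj_with_clutter)
print(a, b)                    # the first two bins now differ after normalization
print(l1_distance(a, b))       # bin-to-bin distance is large
print(chi2_distance(a, b))
print(a[0] / a[1], b[0] / b[1])  # yet the ratio of bin 0 to bin 1 is unchanged
```

The first two bins are identical before normalization and differ by 0.25 each afterwards, while their ratio stays at 1 in both histograms.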
ADC Histogram Analysis of Cervical Cancer Aids Detecting Lymphatic Metastases-a Preliminary Study.

PubMed

Schob, Stefan; Meyer, Hans Jonas; Pazaitis, Nikolaos; Schramm, Dominik; Bremicker, Kristina; Exner, Marc; Höhn, Anne Kathrin; Garnov, Nikita; Surov, Alexey

2017-12-01

Apparent diffusion coefficient (ADC) histogram analysis has been used to some extent in cervical cancer (CC) to distinguish between low-grade and high-grade tumors. Although this differentiation is undoubtedly helpful, it would be even more crucial in the presurgical setting to determine whether a tumor has already gained the potential to metastasize via the lymphatic system. So far, no studies have investigated the potential of 3T ADC histogram analysis in CC to differentiate between nodal-positive and nodal-negative entities. Therefore, the principal aim of our study was to investigate the potential of 3T ADC histogram analysis to differentiate between CC with and without lymph node metastasis. The second aim was to elucidate possible differences in ADC histogram parameters between CC with limited vs. advanced tumor stages and well-differentiated vs. undifferentiated lesions. Finally, correlations of p53 expression and Ki-67 index with ADC parameters were analyzed. Eighteen female patients (mean age 55.4 years, range 32-79 years) with histopathologically confirmed squamous cell carcinoma of the uterine cervix were prospectively enrolled. Tumor stages, tumor grading, status of metastatic dissemination, Ki-67 index, and p53 expression were assessed in these patients. Diffusion weighted imaging (DWI) was obtained in a 3T scanner using the following b values: b0 and b1000 s/mm². Group comparisons using the Mann-Whitney U test revealed the following findings: nodal-positive CC had statistically significantly lower ADC parameters (ADCmin, ADCmean, median ADC, Mode, p10, p25, p75, and p90) in comparison to nodal-negative CC (all p < 0.05). ADCentropy was significantly elevated (p = 0.046) in tumors with advanced T stages (T3/4) compared to tumors with limited T stage (T2). ADCmin values differed significantly between G1/G2 and G3 tumors (40.45 ± 18.63 vs. 65.0 ± 23.63 × 10⁻⁵ mm² s⁻¹, p = 0.035). Furthermore, Spearman Rho calculation identified an inverse correlation between ADCentropy and p53 expression (r = -0.472, p = 0.048). The main finding of our study is the discriminability of nodal-positive from nodal-negative CC using ADC histogram analysis in 3T DWI. This information is crucial for the gynecological surgeon to identify the optimal treatment strategy for patients suffering from CC. Furthermore, ADCentropy was identified as a potential imaging biomarker for tumor heterogeneity and might be able to indicate further molecular changes like loss of p53 expression, which is associated with EMT and consequentially indicates a poor prognosis in CC. Finally, our study confirmed the findings of previous works, which indicated that histogram analysis of ADC maps can distinguish between low-grade and high-grade CC. In conclusion, it can be stated that ADC histogram analysis provides additional, prognostically important information on tumor biology in CC.
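The histogram parameters listed in this abstract (minimum, mean, median, mode, percentiles, entropy) can be computed from the ADC values of a tumor region roughly as follows. This is a minimal sketch: the function name, the bin count, the use of the bin center as the mode, and the base-2 Shannon entropy are assumptions, and the study's own software may define these quantities differently.

```python
import numpy as np


def adc_histogram_parameters(adc_values, n_bins=64):
    """First-order histogram statistics of the ADC values in one region."""
    v = np.asarray(adc_values, dtype=float).ravel()
    counts, edges = np.histogram(v, bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    p = counts[counts > 0] / counts.sum()      # non-empty bin probabilities
    return {
        "ADCmin": v.min(),
        "ADCmean": v.mean(),
        "ADCmedian": np.median(v),
        "mode": centers[np.argmax(counts)],    # center of the fullest bin
        "p10": np.percentile(v, 10),
        "p25": np.percentile(v, 25),
        "p75": np.percentile(v, 75),
        "p90": np.percentile(v, 90),
        "entropy": -(p * np.log2(p)).sum(),    # Shannon entropy in bits
    }


# Made-up values standing in for the voxels of a region of interest.
params = adc_histogram_parameters(np.arange(1, 101), n_bins=4)
for name, value in params.items():
    print(f"{name}: {value:.3f}")
```

A flat histogram maximizes the entropy for a given bin count, which is why entropy is often read as a measure of heterogeneity.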
Theory and Application of DNA Histogram Analysis.

ERIC Educational Resources Information Center

Bagwell, Charles Bruce

The underlying principles and assumptions associated with DNA histograms are discussed along with the characteristics of fluorescent probes. Information theory was described and used to calculate the information content of a DNA histogram. Two major types of DNA histogram analyses are proposed: parametric and nonparametric analysis. Three levels…
Optimal decoding and information transmission in Hodgkin-Huxley neurons under metabolic cost constraints.

PubMed

Kostal, Lubomir; Kobayashi, Ryota

2015-10-01

Information theory quantifies the ultimate limits on reliable information transfer by means of the channel capacity. However, the channel capacity is known to be an asymptotic quantity, assuming unlimited metabolic cost and computational power. We investigate a single-compartment Hodgkin-Huxley type neuronal model under the spike-rate coding scheme and address how the metabolic cost and the decoding complexity affect the optimal information transmission. We find that the sub-threshold stimulation regime, although attaining the smallest capacity, allows for the most efficient balance between the information transmission and the metabolic cost. Furthermore, we determine post-synaptic firing rate histograms that are optimal from the information-theoretic point of view, which enables the comparison of our results with experimental data. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Using OPeNDAP's Data-Services Framework to Lift Mash-Ups above Blind Dates

NASA Astrophysics Data System (ADS)

Gallagher, J. H. R.; Fulker, D. W.

2015-12-01

OPeNDAP's data-as-service framework (Hyrax) matches diverse sources with many end-user tools and contexts. Keys to its flexibility include: A data model embracing tabular data alongside n-dim arrays and other structures useful in geoinformatics. A REST-like protocol that supports, via suffix notation, a growing set of output forms (netCDF, XML, etc.) plus a query syntax for subsetting. Subsetting applies (via constraints on column values) to tabular data or (via constraints on indices or coordinates) to array-style data. A handler-style architecture that admits a growing set of input types. Community members may contribute handlers, making Hyrax effective as middleware, where N sources are mapped to M outputs with order N+M effort (not NxM). Hyrax offers virtual aggregations of source data, enabling granularity aimed at users, not data-collectors. OPeNDAP-access libraries exist in multiple languages, including Python, Java, and C++. Recent enhancements are increasing this framework's interoperability (i.e., its mash-up) potential. Extensions implemented as servlets, running adjacent to Hyrax, are enriching the forms of aggregation and enabling new protocols: User-specified aggregations, namely, applying a query to (huge) lists of source granules, and receiving one (large) table or zipped netCDF file. OGC (Open Geospatial Consortium) protocols, WMS and WCS. A Webification (W10n) protocol that returns JavaScript Object Notation (JSON). Extensions to OPeNDAP's query language are reducing transfer volumes and enabling new forms of inspection. Advances underway include: Functions that, for triangular-mesh sources, return sub-meshes specified via geospatial bounding boxes. Functions that, for data from multiple, satellite-borne sensors (with differing orbits), select observations based on coincidence. Calculations of means, histograms, etc. that greatly reduce output volumes. Paths for communities to contribute new server functions (in Python, e.g.) that data providers may incorporate into Hyrax via installation parameters. One could say Hyrax itself is a mash-up, but we suggest it as an instrument for a mash-up artist's toolbox. This instrument can support mash-ups built on netCDF files, OGC protocols, JavaScript Web pages, and/or programs written in Python, Java, C or C++.
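The suffix notation and subsetting syntax mentioned in this abstract can be sketched as plain string construction. The sketch assumes DAP2-style constraint expressions (a `[start:stride:stop]` index range per array dimension, comma-separated projections, and `&`-joined selection clauses for tabular data); the host, dataset paths, and variable names are hypothetical, and a real client would also percent-encode the query string.

```python
def hyperslab(variable, *ranges):
    """Index constraint for array data: var[start:stride:stop] per dimension."""
    return variable + "".join(f"[{start}:{stride}:{stop}]"
                              for start, stride, stop in ranges)


def dap_url(dataset_url, suffix, projections=(), selections=()):
    """Build a request URL: dataset + output-form suffix + optional constraint."""
    url = f"{dataset_url}.{suffix}"
    constraint = ",".join(projections)
    for clause in selections:
        constraint += "&" + clause
    return f"{url}?{constraint}" if constraint else url


# Hypothetical array dataset: one time step, a 11-cell latitude window.
array_request = dap_url(
    "http://example.org/opendap/sst.nc", "ascii",
    projections=[hyperslab("sst", (0, 1, 0), (10, 1, 20))],
)

# Hypothetical tabular dataset: two columns, rows filtered by column value.
table_request = dap_url(
    "http://example.org/opendap/buoys", "ascii",
    projections=["buoy.time", "buoy.temp"],
    selections=["buoy.temp>20"],
)

print(array_request)
print(table_request)
```

Only the requested slab or rows cross the network, which is the sense in which server-side subsetting reduces transfer volumes.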
[Development of a Compared Software for Automatically Generated DVH in Eclipse TPS].

PubMed

Xie, Zhao; Luo, Kelin; Zou, Lian; Hu, Jinyou

2016-03-01

This study aims to automatically calculate the dose-volume histogram (DVH) for a treatment plan and then compare it with the requirements of the doctor's prescription. The scripting language AutoHotkey and the programming language C# were used to develop comparison software for automatically generated DVHs in Eclipse TPS. The software, named Show Dose Volume Histogram (ShowDVH), comprises prescription document generation, DVH operation functions, software visualization, and DVH comparison report generation. Ten cases of different cancers were selected separately. In Eclipse TPS 11.0, ShowDVH not only generated DVH reports automatically but also accurately determined whether treatment plans met the requirements of the doctor's prescriptions; the reports then gave direction for setting the optimization parameters of intensity-modulated radiation therapy. ShowDVH is user-friendly and powerful software that can quickly and automatically generate DVH comparison reports in Eclipse TPS 11.0. ShowDVH greatly reduces plan design time and improves the working efficiency of radiation therapy physicists.
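The comparison of a DVH against a prescription can be illustrated with a short sketch. It is not ShowDVH and does not use any Eclipse interface: it assumes a structure is given as an array of per-voxel doses with equal voxel volumes, and the function names and the form of the constraint (a maximum percentage of volume above a dose threshold) are hypothetical.

```python
import numpy as np


def cumulative_dvh(dose, bin_width=0.1):
    """Cumulative DVH: percent of volume receiving at least each dose level."""
    d = np.asarray(dose, dtype=float).ravel()
    levels = np.arange(0.0, d.max() + bin_width, bin_width)
    volume = np.array([(d >= level).mean() * 100.0 for level in levels])
    return levels, volume


def volume_at_dose(dose, threshold_gy):
    """V_x: percent of the structure receiving at least threshold_gy."""
    d = np.asarray(dose, dtype=float).ravel()
    return (d >= threshold_gy).mean() * 100.0


def meets_volume_limit(dose, threshold_gy, max_percent):
    """True if at most max_percent of the volume receives >= threshold_gy."""
    return volume_at_dose(dose, threshold_gy) <= max_percent


# Made-up per-voxel doses (Gy) for an organ at risk.
organ_dose = [10.0, 20.0, 30.0, 40.0]
levels, volume = cumulative_dvh(organ_dose, bin_width=10.0)
print(list(zip(levels, volume)))
print(volume_at_dose(organ_dose, 20.0))            # V20 in percent
print(meets_volume_limit(organ_dose, 35.0, 30.0))  # is V35 <= 30 %?
```

A report generator would run a list of such checks per structure and flag the ones that fail, which is the kind of feedback the abstract describes for setting optimization parameters.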
Bilastine in allergic rhinoconjunctivitis and urticaria: a practical approach to treatment decisions based on queries received by the medical information department

PubMed Central

Leceta, Amalia; Sologuren, Ander; Valiente, Román; Campo, Cristina; Labeaga, Luis

2017-01-01

Background. Bilastine is a safe and effective commonly prescribed non-sedating H1-antihistamine approved for symptomatic treatment in patients with allergic disorders such as rhinoconjunctivitis and urticaria. It was evaluated in many patients throughout the clinical development required for its approval, but clinical trials generally exclude many patients who will benefit in everyday clinical practice (especially those with coexisting diseases and/or being treated with concomitant drugs). Following its introduction into clinical practice, the Medical Information Specialists at Faes Farma have received many practical queries regarding the optimal use of bilastine in different circumstances. Data sources and methods. Queries received by the Medical Information Department and the responses provided to senders of these queries. Results. The most frequent questions received by the Medical Information Department included the potential for drug-drug interactions with bilastine and commonly used agents such as anticoagulants (including the novel oral anticoagulants), antiretrovirals, antituberculosis regimens, corticosteroids, digoxin, oral contraceptives, and proton pump inhibitors. Most of these medicines are not usually allowed in clinical trials, and so advice needs to be based upon the pharmacological profiles of the drugs involved and expert opinion. The pharmacokinetic profile of bilastine appears favourable since it undergoes negligible metabolism and is almost exclusively eliminated via renal excretion, and it neither induces nor inhibits the activity of several isoenzymes from the CYP 450 system. Consequently, bilastine does not interact with cytochrome metabolic pathways. Other queries involved specific patient groups such as subjects with renal impairment, women who are breastfeeding or who are trying to become pregnant, and patients with other concomitant diseases. Interestingly, several questions related to topics that are well covered in the Summary of Product Characteristics (SmPC), which suggests that this resource is not being well used. Conclusions. Overall, this analysis highlights gaps in our knowledge regarding the optimal use of bilastine. Expert opinion based upon an understanding of the science can help in the decision-making, but more research is needed to provide evidence-based answers in certain circumstances. PMID:28210286
Microscopy as a statistical, Rényi-Ulam, half-lie game: a new heuristic search strategy to accelerate imaging.

PubMed

Drumm, Daniel W; Greentree, Andrew D

2017-11-07

Finding a fluorescent target in a biological environment is a common and pressing microscopy problem. This task is formally analogous to the canonical search problem. In ideal (noise-free, truthful) search problems, the well-known binary search is optimal. The case of half-lies, where one of two responses to a search query may be deceptive, introduces a richer, Rényi-Ulam problem and is particularly relevant to practical microscopy. We analyse microscopy in the contexts of Rényi-Ulam games and half-lies, developing a new family of heuristics. We show the cost of insisting on verification by positive result in search algorithms; for the zero-half-lie case bisectioning with verification incurs a 50% penalty in the average number of queries required. The optimal partitioning of search spaces directly following verification in the presence of random half-lies is determined. Trisectioning with verification is shown to be the most efficient heuristic of the family in a majority of cases.
Histogram deconvolution - An aid to automated classifiers

NASA Technical Reports Server (NTRS)

Lorre, J. J.

1983-01-01

It is shown that N-dimensional histograms are convolved by the addition of noise in the picture domain. Three methods are described which provide the ability to deconvolve such noise-affected histograms. The purpose of the deconvolution is to provide automated classifiers with a higher quality N-dimensional histogram from which to obtain classification statistics.
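The statement that additive noise convolves the histogram can be checked numerically in one dimension. The sketch below assumes noise that is independent of the signal and uses made-up discrete distributions; it shows only the forward (convolution) relationship and none of the three deconvolution methods, which the abstract does not specify.

```python
import numpy as np

# Normalized histogram of noise-free pixel values 0, 1, 2 (made up).
signal_hist = np.array([0.5, 0.0, 0.5])
# Normalized histogram of the additive noise, values 0, 1, 2 (made up).
noise_hist = np.array([0.25, 0.5, 0.25])


def noisy_histogram(signal_hist, noise_hist):
    """Histogram of (signal + noise) for independent noise: a convolution."""
    return np.convolve(signal_hist, noise_hist)


def noisy_histogram_by_enumeration(signal_hist, noise_hist):
    """Same quantity, built by enumerating every (signal, noise) pair."""
    out = np.zeros(len(signal_hist) + len(noise_hist) - 1)
    for s, p_s in enumerate(signal_hist):
        for n, p_n in enumerate(noise_hist):
            out[s + n] += p_s * p_n     # value s + n occurs with prob p_s * p_n
    return out


print(noisy_histogram(signal_hist, noise_hist))
print(noisy_histogram_by_enumeration(signal_hist, noise_hist))
```

The two sharp peaks of the clean histogram are smeared into a broad plateau, which is the degradation that deconvolution aims to undo before classification statistics are extracted.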
Parameterization of the Age-Dependent Whole Brain Apparent Diffusion Coefficient Histogram

PubMed Central

Batra, Marion; Nägele, Thomas

2015-01-01

Purpose. The distribution of apparent diffusion coefficient (ADC) values in the brain can be used to characterize age effects and pathological changes of the brain tissue. The aim of this study was the parameterization of the whole brain ADC histogram by an advanced model with influence of age considered. Methods. Whole brain ADC histograms were calculated for all data and for seven age groups between 10 and 80 years. Modeling of the histograms was performed for two parts of the histogram separately: the brain tissue part was modeled by two Gaussian curves, while the remaining part was fitted by the sum of a Gaussian curve, a biexponential decay, and a straight line. Results. A consistent fitting of the histograms of all age groups was possible with the proposed model. Conclusions. This study confirms the strong dependence of the whole brain ADC histograms on the age of the examined subjects. The proposed model can be used to characterize changes of the whole brain ADC histogram in certain diseases under consideration of age effects. PMID:26609526
Impact of Spot Size and Spacing on the Quality of Robustly Optimized Intensity Modulated Proton Therapy Plans for Lung Cancer.

PubMed

Liu, Chenbin; Schild, Steven E; Chang, Joe Y; Liao, Zhongxing; Korte, Shawn; Shen, Jiajian; Ding, Xiaoning; Hu, Yanle; Kang, Yixiu; Keole, Sameer R; Sio, Terence T; Wong, William W; Sahoo, Narayan; Bues, Martin; Liu, Wei

2018-06-01

To investigate how spot size and spacing affect plan quality, robustness, and interplay effects of robustly optimized intensity modulated proton therapy (IMPT) for lung cancer. Two robustly optimized IMPT plans were created for 10 lung cancer patients: first by a large-spot machine with in-air energy-dependent large spot size at isocenter (σ: 6-15 mm) and spacing (1.3 σ), and second by a small-spot machine with in-air energy-dependent small spot size (σ: 2-6 mm) and spacing (5 mm). Both plans were generated by optimizing radiation dose to internal target volume on averaged 4-dimensional computed tomography scans using an in-house-developed IMPT planning system. The dose-volume histograms band method was used to evaluate plan robustness. Dose evaluation software was developed to model time-dependent spot delivery to incorporate interplay effects with randomized starting phases for each field per fraction. Patient anatomy voxels were mapped phase-to-phase via deformable image registration, and doses were scored using in-house-developed software. Dose-volume histogram indices, including internal target volume dose coverage, homogeneity, and organs at risk (OARs) sparing, were compared using the Wilcoxon signed-rank test. Compared with the large-spot machine, the small-spot machine resulted in significantly lower heart and esophagus mean doses, with comparable target dose coverage, homogeneity, and protection of other OARs. Plan robustness was comparable for targets and most OARs. With interplay effects considered, significantly lower heart and esophagus mean doses with comparable target dose coverage and homogeneity were observed using smaller spots. Robust optimization with a small-spot machine significantly improves heart and esophagus sparing, with comparable plan robustness and interplay effects compared with robust optimization with a large-spot machine. 
A small-spot machine uses a larger number of spots to cover the same tumors compared with a large-spot machine, which gives the planning system more freedom to compensate for the higher sensitivity to uncertainties and interplay effects for lung cancer treatments. Copyright © 2018 Elsevier Inc. All rights reserved.
START: a system for flexible analysis of hundreds of genomic signal tracks in few lines of SQL-like queries.

PubMed

Zhu, Xinjie; Zhang, Qiang; Ho, Eric Dun; Yu, Ken Hung-On; Liu, Chris; Huang, Tim H; Cheng, Alfred Sze-Lok; Kao, Ben; Lo, Eric; Yip, Kevin Y

2017-09-22

A genomic signal track is a set of genomic intervals associated with values of various types, such as measurements from high-throughput experiments. Analysis of signal tracks requires complex computational methods, which often make the analysts focus too much on the detailed computational steps rather than on their biological questions. Here we propose Signal Track Query Language (STQL) for simple analysis of signal tracks. It is a Structured Query Language (SQL)-like declarative language, which means one only specifies what computations need to be done but not how these computations are to be carried out. STQL provides a rich set of constructs for manipulating genomic intervals and their values. To run STQL queries, we have developed the Signal Track Analytical Research Tool (START, http://yiplab.cse.cuhk.edu.hk/start/ ), a system that includes a Web-based user interface and a back-end execution system. The user interface helps users select data from our database of around 10,000 commonly-used public signal tracks, manage their own tracks, and construct, store and share STQL queries. The back-end system automatically translates STQL queries into optimized low-level programs and runs them on a computer cluster in parallel. We use STQL to perform 14 representative analytical tasks. By repeating these analyses using bedtools, Galaxy and custom Python scripts, we show that the STQL solution is usually the simplest, and the parallel execution achieves significant speed-up with large data files. Finally, we describe how a biologist with minimal formal training in computer programming self-learned STQL to analyze DNA methylation data we produced from 60 pairs of hepatocellular carcinoma (HCC) samples. Overall, STQL and START provide a generic way for analyzing a large number of genomic signal tracks in parallel easily.
Selecting materialized views using random algorithm

NASA Astrophysics Data System (ADS)

Zhou, Lijuan; Hao, Zhongxiao; Liu, Chi

2007-04-01

The data warehouse is a repository of information collected from multiple, possibly heterogeneous, autonomous distributed databases. The information stored at the data warehouse is in the form of views referred to as materialized views. The selection of the materialized views is one of the most important decisions in designing a data warehouse. Materialized views are stored in the data warehouse for the purpose of efficiently implementing on-line analytical processing queries. The first issue for the user to consider is query response time. In this paper, we therefore develop algorithms to select a set of views to materialize in a data warehouse in order to minimize the total view maintenance cost under the constraint of a given query response time. We call it the query_cost view_selection problem. First, the cost graph and cost model of the query_cost view_selection problem are presented. Second, methods for selecting materialized views by using random algorithms are presented. The genetic algorithm is applied to the materialized view selection problem. However, as the genetic process advances, legal solutions become more and more difficult to produce, so many solutions are eliminated and the time needed to produce solutions lengthens. Therefore, an improved algorithm is presented in this paper, which combines the simulated annealing algorithm and the genetic algorithm to solve the query_cost view_selection problem. Finally, simulation experiments are used to test the function and efficiency of our algorithms. The experiments show that the given methods can provide near-optimal solutions in limited time and work better in practical cases. Randomized algorithms will become invaluable tools for data warehouse evolution.
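The optimization problem stated in this abstract (minimize total maintenance cost subject to a bound on query response time) can be sketched with plain simulated annealing. This is a toy under loud assumptions: it is not the paper's hybrid of simulated annealing and a genetic algorithm, the additive cost model (each materialized view subtracts a fixed saving from a base query time) is invented for illustration, and all names and numbers are hypothetical.

```python
import math
import random


def select_views(maintenance, saving, base_time, max_time, steps=2000, seed=0):
    """Pick views to materialize: minimize maintenance cost while keeping
    the (toy, additive) query response time at or below max_time."""
    rng = random.Random(seed)
    n = len(maintenance)

    def query_time(selected):
        return base_time - sum(s for s, on in zip(saving, selected) if on)

    def cost(selected):
        return sum(m for m, on in zip(maintenance, selected) if on)

    current = [True] * n                 # start from "materialize everything"
    if query_time(current) > max_time:
        raise ValueError("response-time target unreachable with these views")
    best = current[:]
    temperature = float(sum(maintenance)) or 1.0

    for _ in range(steps):
        candidate = current[:]
        i = rng.randrange(n)
        candidate[i] = not candidate[i]  # flip one view in or out
        if query_time(candidate) <= max_time:   # discard illegal solutions
            delta = cost(candidate) - cost(current)
            # Always accept improvements; accept worse moves with a
            # probability that shrinks as the temperature cools.
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                current = candidate
                if cost(current) < cost(best):
                    best = current[:]
        temperature = max(temperature * 0.995, 1e-9)
    return best


maintenance = [5.0, 3.0, 8.0, 2.0]   # hypothetical maintenance cost per view
saving = [4.0, 1.0, 6.0, 2.0]        # hypothetical query-time saving per view
chosen = select_views(maintenance, saving, base_time=20.0, max_time=12.0)
print(chosen)
```

Because the search starts from a legal solution and only ever moves to legal ones, the returned selection always satisfies the response-time bound, though it is not guaranteed to be optimal.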
Real-time computed tomography dosimetry during ultrasound-guided brachytherapy for prostate cancer.

PubMed

Kaplan, Irving D; Meskell, Paul; Oldenburg, Nicklas E; Saltzman, Brian; Kearney, Gary P; Holupka, Edward J

2006-01-01

Ultrasound-guided implantation of permanent radioactive seeds is a treatment option for localized prostate cancer. Several techniques have been described for the optimal placement of the seeds in the prostate during this procedure. Postimplantation dosimetric calculations are performed after the implant. Areas of underdosing can only be corrected with either an external beam boost or by performing a second implant. We demonstrate the feasibility of performing computed tomography (CT)-based postplanning during the ultrasound-guided implant and subsequently correcting for underdosed areas. Ultrasound-guided brachytherapy is performed on a modified CT table with general anesthesia. The postplanning CT scan is performed after the implant, while the patient is still under anesthesia. Additional seeds are implanted into "cold spots," and the resultant dosimetry confirmed with CT. Intraoperative postplanning was successfully performed. Dose-volume histograms demonstrated adequate dose coverage during the initial implant, but on detailed analysis, for some patients, areas of underdosing were observed either at the apex or the peripheral zone. Additional seeds were implanted to bring these areas to prescription dose. Intraoperative postplanning is feasible during ultrasound-guided brachytherapy for prostate cancer. Although the postimplant dose-volume histograms for all patients, before the implantation of additional seeds, were adequate according to the American Brachytherapy Society criteria, specific critical areas can be underdosed. Additional seeds can then be implanted to optimize the dosimetry and reduce the risk of underdosing areas of cancer.
Comparison of optimized single and multifield irradiation plans of antiproton, proton and carbon ion beams.

PubMed

Bassler, Niels; Kantemiris, Ioannis; Karaiskos, Pantelis; Engelke, Julia; Holzscheiter, Michael H; Petersen, Jørgen B

2010-04-01

Antiprotons have been suggested as a possibly superior modality for radiotherapy, due to the energy released when antiprotons annihilate, which enhances the Bragg peak and introduces a high-LET component to the dose. However, concerns are expressed about the inferior lateral dose distribution caused by the annihilation products. We use the Monte Carlo code FLUKA to generate depth-dose kernels for protons, antiprotons, and carbon ions. Using these we then build virtual treatment plans optimized according to ICRU recommendations for the different beam modalities, which then are recalculated with FLUKA. Dose-volume histograms generated from these plans can be used to compare the different irradiations. The enhancement in physical and possibly biological dose from annihilating antiprotons can significantly lower the dose in the entrance channel; but only at the expense of a diffuse low dose background from long-range secondary particles. Lateral dose distributions are improved using active beam delivery methods, instead of flat fields. Dose-volume histograms for different treatment scenarios show that antiprotons have the potential to reduce the volume of normal tissue receiving medium to high dose, however, in the low dose region antiprotons are inferior to both protons and carbon ions. This limits the potential usage to situations where dose to normal tissue must be reduced as much as possible. Copyright 2010 Elsevier Ireland Ltd. All rights reserved.
Introducing parallelism to histogramming functions for GEM systems

NASA Astrophysics Data System (ADS)

Krawczyk, Rafał D.; Czarski, Tomasz; Kolasinski, Piotr; Pozniak, Krzysztof T.; Linczuk, Maciej; Byszuk, Adrian; Chernyshova, Maryna; Juszczyk, Bartlomiej; Kasprowicz, Grzegorz; Wojenski, Andrzej; Zabolotny, Wojciech

2015-09-01

This article assesses the potential for parallelizing histogramming algorithms in a GEM detector system. Histogramming and preprocessing algorithms in MATLAB were analyzed with regard to adding parallelism. A preliminary implementation of parallel strip histogramming resulted in a speedup. An analysis of the algorithms' parallelizability is presented, and potential hardware and software support for implementing the parallel algorithm is discussed.
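The idea of parallel strip histogramming can be illustrated with a minimal sketch. The record's implementation is in MATLAB and is not reproduced here; the Python code below is only an assumed illustration of the usual pattern, in which each worker builds a private partial histogram over a chunk of hits and the partial histograms are then summed. The function names, the integer strip indices and the worker count are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def strip_histogram(strip_ids, n_strips):
    """Serial reference: count hits per strip index."""
    counts = [0] * n_strips
    for s in strip_ids:
        counts[s] += 1
    return counts


def parallel_strip_histogram(strip_ids, n_strips, n_workers=4):
    """Build one private histogram per chunk, then merge by summation."""
    # Ceiling division so that every hit lands in exactly one chunk.
    chunk = max(1, -(-len(strip_ids) // n_workers))
    chunks = [strip_ids[i:i + chunk] for i in range(0, len(strip_ids), chunk)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda c: strip_histogram(c, n_strips), chunks))
    # Private histograms need no locking; the merge is a plain bin-wise sum.
    merged = [0] * n_strips
    for partial in partials:
        for i, c in enumerate(partial):
            merged[i] += c
    return merged
```

In CPython, threads will not speed up this pure-Python loop; the sketch shows only the decomposition, which carries over to process pools, MATLAB workers or hardware pipelines.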
Comparison of Histograms for Use in Cloud Observation and Modeling

NASA Technical Reports Server (NTRS)

Green, Lisa; Xu, Kuan-Man

2005-01-01

Cloud observation and cloud modeling data can be presented in histograms for each characteristic to be measured. Combining information from single-cloud histograms yields a summary histogram. Summary histograms can be compared to each other to reach conclusions about the behavior of an ensemble of clouds in different places at different times or about the accuracy of a particular cloud model. As in any scientific comparison, it is necessary to decide whether any apparent differences are statistically significant. The usual methods of deciding statistical significance when comparing histograms do not apply in this case because they assume independent data. Thus, a new method is necessary. The proposed method uses the Euclidean distance metric and bootstrapping to calculate the significance level.
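A hedged sketch of how a Euclidean distance and bootstrapping could be combined to obtain a significance level is given below. It is not the authors' procedure, whose details the abstract does not give. The assumption here is that whole single-cloud histograms are resampled with replacement from the pooled ensemble, so that dependence within a cloud is preserved. All function names and the resample count are hypothetical.

```python
import math
import random


def summary_histogram(cloud_histograms):
    """Sum per-cloud bin counts and normalize the result to unit total."""
    n_bins = len(cloud_histograms[0])
    totals = [sum(h[i] for h in cloud_histograms) for i in range(n_bins)]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def euclidean(p, q):
    """Euclidean distance between two histograms with the same bins."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def bootstrap_p_value(group_a, group_b, n_resamples=2000, seed=0):
    """Return (observed distance, bootstrap significance level)."""
    rng = random.Random(seed)
    observed = euclidean(summary_histogram(group_a), summary_histogram(group_b))
    pooled = group_a + group_b  # null hypothesis: both ensembles share one source
    hits = 0
    for _ in range(n_resamples):
        # Resample whole clouds, not individual pixels.
        resampled_a = [rng.choice(pooled) for _ in group_a]
        resampled_b = [rng.choice(pooled) for _ in group_b]
        d = euclidean(summary_histogram(resampled_a), summary_histogram(resampled_b))
        if d >= observed:
            hits += 1
    return observed, hits / n_resamples
```

A small returned fraction suggests that the observed difference between the two summary histograms is unlikely under the pooled null hypothesis.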
Scalable and responsive event processing in the cloud

PubMed Central

Suresh, Visalakshmi; Ezhilchelvan, Paul; Watson, Paul

2013-01-01

Event processing involves continuous evaluation of queries over streams of events. Response-time optimization is traditionally done over a fixed set of nodes and/or by using metrics measured at query-operator levels. Cloud computing makes it easy to acquire and release computing nodes as required. Leveraging this flexibility, we propose a novel, queueing-theory-based approach for meeting specified response-time targets against fluctuating event arrival rates by drawing only the necessary amount of computing resources from a cloud platform. In the proposed approach, the entire processing engine of a distinct query is modelled as an atomic unit for predicting response times. Several such units hosted on a single node are modelled as a multiple class M/G/1 system. These aspects eliminate intrusive, low-level performance measurements at run-time, and also offer portability and scalability. Using model-based predictions, cloud resources are efficiently used to meet response-time targets. The efficacy of the approach is demonstrated through cloud-based experiments. PMID:23230164
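The multiple class M/G/1 model can be made concrete with the Pollaczek-Khinchine mean-value formula. The sketch below assumes first-come-first-served service and Poisson arrivals for every query class on a node; the abstract does not state the scheduling discipline, so this is an illustration and not the authors' model. The function names and the tuple layout are hypothetical.

```python
def mg1_response_times(classes):
    """Mean response time per class on one node.

    classes: list of (arrival_rate, mean_service_time, second_moment_of_service_time),
    one tuple per query engine hosted on the node. Assumes FCFS and Poisson arrivals.
    """
    utilization = sum(lam * es for lam, es, _ in classes)
    if utilization >= 1.0:
        raise ValueError("node is overloaded: utilization must be below 1")
    # Pollaczek-Khinchine: the mean wait in queue is shared by all classes under FCFS.
    mean_wait = sum(lam * es2 for lam, _, es2 in classes) / (2.0 * (1.0 - utilization))
    return [mean_wait + es for _, es, _ in classes]


def meets_targets(classes, targets):
    """True if every class's predicted mean response time is within its target."""
    return all(t <= target for t, target in zip(mg1_response_times(classes), targets))
```

With such a prediction, a provisioning loop could add a node whenever `meets_targets` fails for the current placement and release one when the targets are met with room to spare.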
Data Sharing in DHT Based P2P Systems

NASA Astrophysics Data System (ADS)

Roncancio, Claudia; Del Pilar Villamil, María; Labbé, Cyril; Serrano-Alvarado, Patricia

The evolution of peer-to-peer (P2P) systems triggered the building of large scale distributed applications. The main application domain is data sharing across a very large number of highly autonomous participants. Building such data sharing systems is particularly challenging because of the “extreme” characteristics of P2P infrastructures: massive distribution, high churn rate, no global control, potentially untrusted participants... This article focuses on declarative querying support, query optimization and data privacy on a major class of P2P systems, those based on a Distributed Hash Table (P2P DHT). The usual approaches and the algorithms used by classic distributed systems and databases for providing data privacy and querying services are not well suited to P2P DHT systems. A considerable amount of work was required to adapt them to the new challenges such systems present. This paper describes the most important solutions found. It also identifies important future research trends in data management in P2P DHT systems.
A Bayesian Approach to Interactive Retrieval

ERIC Educational Resources Information Center

Tague, Jean M.

1973-01-01

A probabilistic model for interactive retrieval is presented. Bayesian statistical decision theory principles are applied: use of prior and sample information about the relationship of document descriptions to query relevance; maximization of expected value of a utility function, to the problem of optimally restructuring search strategies in an…
A Test of Genetic Algorithms in Relevance Feedback.

ERIC Educational Resources Information Center

Lopez-Pujalte, Cristina; Guerrero Bote, Vicente P.; Moya Anegon, Felix de

2002-01-01

Discussion of information retrieval, query optimization techniques, and relevance feedback focuses on genetic algorithms, which are derived from artificial intelligence techniques. Describes an evaluation of different genetic algorithms using a residual collection method and compares results with the Ide dec-hi method (Salton and Buckley, 1990…
An analytical optimization model for infrared image enhancement via local context

NASA Astrophysics Data System (ADS)

Xu, Yongjian; Liang, Kun; Xiong, Yiru; Wang, Hui

2017-12-01

The requirement for high-quality infrared images is constantly increasing in both military and civilian areas, and it is always associated with little distortion and appropriate contrast, while infrared images commonly have some shortcomings such as low contrast. In this paper, we propose a novel infrared image histogram enhancement algorithm based on local context. By constraining the enhanced image to have high local contrast, a regularized analytical optimization model is proposed to enhance infrared images. The local contrast is determined by evaluating whether two intensities are neighbors and calculating their differences. The comparison on 8-bit images shows that the proposed method can enhance the infrared images with more details and lower noise.
An efficient compression scheme for bitmap indices

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, Kesheng; Otoo, Ekow J.; Shoshani, Arie

2004-04-13

When using an out-of-core indexing method to answer a query, it is generally assumed that the I/O cost dominates the overall query response time. Because of this, most research on indexing methods concentrates on reducing the sizes of indices. For bitmap indices, compression has been used for this purpose. However, in most cases, operations on these compressed bitmaps, mostly bitwise logical operations such as AND, OR, and NOT, spend more time in CPU than in I/O. To speed up these operations, a number of specialized bitmap compression schemes have been developed, the best known of which is the byte-aligned bitmap code (BBC). They are usually faster in performing logical operations than the general-purpose compression schemes, but the time spent in CPU still dominates the total query response time. To reduce the query response time, we designed a CPU-friendly scheme named the word-aligned hybrid (WAH) code. In this paper, we prove that the sizes of WAH compressed bitmap indices are about two words per row for a large range of attributes. This size is smaller than typical sizes of commonly used indices, such as a B-tree. Therefore, WAH compressed indices are appropriate not only for low cardinality attributes but also for high cardinality attributes. In the worst case, the time to operate on compressed bitmaps is proportional to the total size of the bitmaps involved. The total size of the bitmaps required to answer a query on one attribute is proportional to the number of hits. These indicate that WAH compressed bitmap indices are optimal. To verify their effectiveness, we generated bitmap indices for four different datasets and measured the response time of many range queries. Tests confirm that sizes of compressed bitmap indices are indeed smaller than B-tree indices, and query processing with WAH compressed indices is much faster than with BBC compressed indices, projection indices and B-tree indices. In addition, we also verified that the average query response time is proportional to the index size. This indicates that the compressed bitmap indices are efficient for very large datasets.
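A simplified sketch of word-aligned hybrid encoding with 32-bit words may help to show why logical operations on such bitmaps are CPU-friendly. It assumes the commonly described layout: a literal word has its top bit clear and carries 31 bitmap bits, while a fill word has its top bit set, the next bit giving the fill value and the low 30 bits counting 31-bit groups. Zero-padding the tail, instead of keeping a separate partially filled word, is a simplification made here and is not taken from the record.

```python
GROUP_BITS = 31             # bitmap bits carried by one 32-bit word
MAX_RUN = (1 << 30) - 1     # largest group count a fill word can hold
ALL_ONES = (1 << GROUP_BITS) - 1


def wah_encode(bits):
    """Encode a sequence of 0/1 values into a list of 32-bit WAH-style words."""
    padded = list(bits) + [0] * (-len(bits) % GROUP_BITS)
    words = []
    for start in range(0, len(padded), GROUP_BITS):
        group = padded[start:start + GROUP_BITS]
        value = int("".join(str(b) for b in group), 2)
        if value in (0, ALL_ONES):
            fill_bit = 1 if value else 0
            last = words[-1] if words else 0
            same_fill = (last >> 31) == 1 and ((last >> 30) & 1) == fill_bit
            if words and same_fill and (last & MAX_RUN) < MAX_RUN:
                words[-1] = last + 1          # extend the running fill word
            else:
                words.append((1 << 31) | (fill_bit << 30) | 1)
        else:
            words.append(value)               # literal word, top bit is 0
    return words


def wah_decode(words, n_bits):
    """Expand WAH-style words back into the first n_bits bitmap values."""
    bits = []
    for w in words:
        if w >> 31:
            bits.extend([(w >> 30) & 1] * (GROUP_BITS * (w & MAX_RUN)))
        else:
            bits.extend(int(c) for c in format(w, "031b"))
    return bits[:n_bits]
```

Because every word is either a whole literal or a whole run, a bitwise AND or OR can proceed word by word without the byte-level unpacking that a byte-aligned code requires.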
Development of digital interactive processing system for NOAA satellites AVHRR data

NASA Astrophysics Data System (ADS)

Gupta, R. K.; Murthy, N. N.

The paper discusses a digital image processing system for NOAA/AVHRR data, including land applications, configured around a VAX 11/750 host computer supported by an FPS 100 array processor, a Comtal graphic display and HP plotting devices. The following have been accomplished: system software for a relational database with query and editing facilities; a man-machine interface using form, menu and prompt inputs, including validation of user entries for data type and range; and preprocessing software for data calibration, Sun-angle correction, geometric corrections for Earth curvature effect and Earth rotation offsets, and Earth location of the AVHRR image. The implemented image enhancement techniques, such as grey level stretching, histogram equalization and convolution, are discussed. The software implementation details for the computation of the vegetative index and normalized vegetative index using NOAA/AVHRR channels 1 and 2 data are presented together with output; the scientific background for such computations and the obtainability of similar indices from Landsat/MSS data are also included. The paper concludes by specifying the further software developments planned and the progress envisaged in the field of vegetation index studies.
Histogram analysis of diffusion kurtosis imaging of nasopharyngeal carcinoma: Correlation between quantitative parameters and clinical stage.

PubMed

Xu, Xiao-Quan; Ma, Gao; Wang, Yan-Jun; Hu, Hao; Su, Guo-Yi; Shi, Hai-Bin; Wu, Fei-Yun

2017-07-18

To evaluate the correlation between histogram parameters derived from diffusion-kurtosis (DK) imaging and the clinical stage of nasopharyngeal carcinoma (NPC). High T-stage (T3/4) NPC showed significantly higher Kapp-mean (P = 0.018), Kapp-median (P = 0.029) and Kapp-90th (P = 0.003) than low T-stage (T1/2) NPC. High N-stage NPC (N2/3) showed significantly lower Dapp-mean (P = 0.002), Dapp-median (P = 0.002) and Dapp-10th (P < 0.001) than low N-stage NPC (N0/1). High AJCC-stage NPC (III/IV) showed significantly lower Dapp-10th (P = 0.038) than low AJCC-stage NPC (I/II). ROC analyses indicated that Kapp-90th was optimal for predicting high T-stage (AUC, 0.759; sensitivity, 0.842; specificity, 0.607), while Dapp-10th was best for predicting high N- and AJCC-stage (N-stage, AUC, 0.841; sensitivity, 0.875; specificity, 0.807; AJCC-stage, AUC, 0.671; sensitivity, 0.800; specificity, 0.588). DK imaging data of forty-seven consecutive NPC patients were retrospectively analyzed. Apparent diffusion for Gaussian distribution (Dapp) and apparent kurtosis coefficient (Kapp) were generated using diffusion-kurtosis model. Histogram parameters, including mean, median, 10th, 90th percentiles, skewness and kurtosis of Dapp and Kapp were calculated. Patients were divided into low and high T, N and clinical stage based on American Joint Committee on Cancer (AJCC) staging system. Differences of histogram parameters between low and high T, N and AJCC stages were compared using t test. Multiple receiver operating characteristic (ROC) curves were used to determine and compare the value of significant parameters in predicting high T, N and AJCC stage, respectively. DK imaging-derived parameters correlated well with clinical stage of NPC, therefore could serve as an adjunctive imaging technique for evaluating NPC.
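The histogram parameters named in this record (mean, median, 10th and 90th percentiles, skewness and kurtosis) can be computed from the voxel values of a parameter map along the following lines. This is a sketch under stated assumptions: percentiles use linear interpolation between order statistics, and skewness and kurtosis are the moment-based population forms (kurtosis not reduced by 3). The record does not say which definitions the authors used, and the function name is hypothetical.

```python
import statistics


def histogram_parameters(values):
    """Summary statistics of voxel values (e.g. Dapp or Kapp within a tumor ROI)."""
    n = len(values)
    mean = statistics.fmean(values)
    # Nine cut points that split the data into ten equal parts.
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "p10": deciles[0],
        "p90": deciles[8],
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }
```

The input must contain at least two distinct values, since the moment ratios are undefined for a constant map.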
Dosimetric impact in the dose-volume histograms of rectal and vesical wall contouring in prostate cancer IMRT treatments.

PubMed

Gómez, Laura; Andrés, Carlos; Ruiz, Antonio

2017-01-01

The main purpose of this study was to evaluate the differences in dose-volume histograms of IMRT treatments for prostate cancer based on the delineation of the main organs at risk (rectum and bladder) as solid organs or by contouring their wall. Rectum and bladder have typically been delineated as solid organs, including the waste material, which, in practice, can lead to an erroneous assessment of the risk of adverse effects. A retrospective study was made on 25 patients treated with IMRT radiotherapy for prostate adenocarcinoma. 76.32 Gy in 36 fractions was prescribed to the prostate and seminal vesicles. In addition to the delineation of the rectum and bladder as solid organs (including their content), the rectal and bladder wall were also delineated and the resulting dose-volume histograms were analyzed for the two groups of structures. Data analysis shows statistically significant differences in the main parameters used to assess the risk of toxicity of a prostate radiotherapy treatment. Higher doses were received on the rectal and bladder walls compared to doses received on the corresponding solid organs. The observed differences in terms of received doses to the rectum and bladder based on the method of contouring could gain greater importance in inverse planning treatments, where the treatment planning system optimizes the dose in these volumes. So, one should take into account the method of delineating of these structures to make a clinical decision regarding dose limitation and risk assessment of chronic toxicity.
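How the choice of contour changes a dose-volume histogram can be seen from the way a cumulative DVH is computed. The sketch below assumes equal-volume voxels and takes the doses of the voxels inside a structure as a plain list; it is a generic illustration and not the treatment planning system used in the study. The function name is hypothetical.

```python
def cumulative_dvh(voxel_doses, dose_levels):
    """Fraction of the structure volume receiving at least each dose level.

    voxel_doses: dose (Gy) in every voxel of the structure, equal voxel volumes assumed.
    dose_levels: dose values (Gy) at which the DVH is evaluated.
    """
    n = len(voxel_doses)
    return [sum(1 for d in voxel_doses if d >= level) / n for level in dose_levels]
```

If the low-dose voxels of the organ content are removed from `voxel_doses`, so that only wall voxels remain, every fraction at a given dose level can only stay the same or rise, which is consistent with the higher wall doses reported above.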
An Experimental Comparison of Similarity Assessment Measures for 3D Models on Constrained Surface Deformation

NASA Astrophysics Data System (ADS)

Quan, Lulin; Yang, Zhixin

2010-05-01

To address issues in design customization, this paper describes the specification and application of constrained surface deformation and reports an experimental performance comparison of three prevailing similarity assessment algorithms in the constrained surface deformation domain. Constrained surface deformation is a promising method that supports various downstream applications of customized design. Similarity assessment is regarded as the key technology for inspecting the success of a new design by measuring the level of difference between the deformed new design and the initial sample model and indicating whether that difference is within the limit. According to our theoretical analysis and pre-experiments, three similarity assessment algorithms are suitable for this domain: the shape histogram based method, the skeleton based method, and the U system moment based method. We analyze their basic functions and implementation methodologies in detail and run a series of experiments in various situations to test their accuracy and efficiency using precision-recall diagrams. A shoe model is chosen as an industrial example for the experiments. The results show that the shape histogram based method achieved the best performance in the comparison. Based on this result, we propose a novel approach that integrates surface constraints and shape histogram description with an adaptive weighting method, which emphasizes the role of constraints during the assessment. The limited initial experimental results indicate that our algorithm outperforms the other three algorithms. A clear direction for future development is also drawn at the end of the paper.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Beltran, C; Kamal, H

Purpose: To provide a multicriteria optimization algorithm for intensity modulated radiation therapy using pencil proton beam scanning. Methods: Intensity modulated radiation therapy using pencil proton beam scanning requires efficient optimization algorithms to overcome the uncertainties in the Bragg peaks locations. This work is focused on optimization algorithms that are based on Monte Carlo simulation of the treatment planning and use the weights and the dose volume histogram (DVH) control points to steer toward desired plans. The proton beam treatment planning process based on single objective optimization (representing a weighted sum of multiple objectives) usually leads to time-consuming iterations involving treatment planning team members. We proved a time efficient multicriteria optimization algorithm that is developed to run on NVIDIA GPU (Graphical Processing Units) cluster. The multicriteria optimization algorithm running time benefits from up-sampling of the CT voxel size of the calculations without loss of fidelity. Results: We will present preliminary results of Multicriteria optimization for intensity modulated proton therapy based on DVH control points. The results will show optimization results of a phantom case and a brain tumor case. Conclusion: The multicriteria optimization of the intensity modulated radiation therapy using pencil proton beam scanning provides a novel tool for treatment planning. Work supported by a grant from Varian Inc.
A rank-based Prediction Algorithm of Learning User's Intention

NASA Astrophysics Data System (ADS)

Shen, Jie; Gao, Ying; Chen, Cang; Gong, HaiPing

Internet search has become an important part in people's daily life. People can find many types of information to meet different needs through search engines on the Internet. There are two issues for the current search engines: first, the users should predetermine the types of information they want and then change to the appropriate types of search engine interfaces. Second, most search engines can support multiple kinds of search functions, each function has its own separate search interface. While users need different types of information, they must switch between different interfaces. In practice, most queries are corresponding to various types of information results. These queries can search the relevant results in various search engines, such as query "Palace" contains the websites about the introduction of the National Palace Museum, blog, Wikipedia, some pictures and video information. This paper presents a new aggregative algorithm for all kinds of search results. It can filter and sort the search results by learning three aspects about the query words, search results and search history logs to achieve the purpose of detecting user's intention. Experiments demonstrate that this rank-based method for multi-types of search results is effective. It can meet the user's search needs well, enhance user's satisfaction, provide an effective and rational model for optimizing search engines and improve user's search experience.
Perspectives on Peace.

ERIC Educational Resources Information Center

Bents, Richard; Trygestad, JoAnn

Students assessed as having different personality types were queried concerning their perspectives on peace. Two hundred seventy-five students (ages 14-18) from Poland, West Germany, and the United States defined peace and indicated the degree of influence they felt they have on the future. Differences in definitions of peace, optimism, and degree…
Projections onto the Pareto surface in multicriteria radiation therapy optimization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bokrantz, Rasmus, E-mail: bokrantz@kth.se, E-mail: rasmus.bokrantz@raysearchlabs.com; Miettinen, Kaisa

2015-10-15

Purpose: To eliminate or reduce the error to Pareto optimality that arises in Pareto surface navigation when the Pareto surface is approximated by a small number of plans. Methods: The authors propose to project the navigated plan onto the Pareto surface as a postprocessing step to the navigation. The projection attempts to find a Pareto optimal plan that is at least as good as or better than the initial navigated plan with respect to all objective functions. An augmented form of projection is also suggested where dose–volume histogram constraints are used to prevent the projection from causing a violation of some clinical goal. The projections were evaluated with respect to planning for intensity modulated radiation therapy delivered by step-and-shoot and sliding window and spot-scanned intensity modulated proton therapy. Retrospective plans were generated for a prostate and a head and neck case. Results: The projections led to improved dose conformity and better sparing of organs at risk (OARs) for all three delivery techniques and both patient cases. The mean dose to OARs decreased by 3.1 Gy on average for the unconstrained form of the projection and by 2.0 Gy on average when dose–volume histogram constraints were used. No consistent improvements in target homogeneity were observed. Conclusions: There are situations when Pareto navigation leaves room for improvement in OAR sparing and dose conformity, for example, if the approximation of the Pareto surface is coarse or the problem formulation has too permissive constraints. A projection onto the Pareto surface can identify an inaccurate Pareto surface representation and, if necessary, improve the quality of the navigated plan.
Optimization of a Fast Neutron Scintillator for Real-Time Pulse Shape Discrimination in the Transient Reactor Test Facility (TREAT) Hodoscope

DOE Office of Scientific and Technical Information (OSTI.GOV)

Johnson, James T.; Thompson, Scott J.; Watson, Scott M.

We present a multi-channel, fast neutron/gamma ray detector array system that utilizes ZnS(Ag) scintillator detectors. The system employs field programmable gate arrays (FPGAs) to do real-time all digital neutron/gamma ray discrimination with pulse height and time histograms to allow count rates in excess of 1,000,000 pulses per second per channel. The system detector number is scalable in blocks of 16 channels.
Optimizing the NASA Technical Report Server

NASA Technical Reports Server (NTRS)

Nelson, Michael L.; Maa, Ming-Hokng

1996-01-01

The NASA Technical Report Server (NTRS), a World Wide Web report distribution service for NASA technical publications, is modified for performance enhancement, greater protocol support, and human interface optimization. Results include: parallel database queries, significantly decreasing user access times by an average factor of 2.3; access from clients behind firewalls and/or proxies which truncate excessively long Uniform Resource Locators (URLs); access to non-Wide Area Information Server (WAIS) databases and compatibility with the Z39-50.3 protocol; and a streamlined user interface.
Predicting the Valence of a Scene from Observers’ Eye Movements

PubMed Central

R.-Tavakoli, Hamed; Atyabi, Adham; Rantanen, Antti; Laukka, Seppo J.; Nefti-Meziani, Samia; Heikkilä, Janne

2015-01-01

Multimedia analysis benefits from understanding the emotional content of a scene in a variety of tasks such as video genre classification and content-based image retrieval. Recently, there has been an increasing interest in applying human bio-signals, particularly eye movements, to recognize the emotional gist of a scene such as its valence. In order to determine the emotional category of images using eye movements, the existing methods often learn a classifier using several features that are extracted from eye movements. Although it has been shown that eye movement is potentially useful for recognition of scene valence, the contribution of each feature is not well-studied. To address the issue, we study the contribution of features extracted from eye movements in the classification of images into pleasant, neutral, and unpleasant categories. We assess ten features and their fusion. The features are histogram of saccade orientation, histogram of saccade slope, histogram of saccade length, histogram of saccade duration, histogram of saccade velocity, histogram of fixation duration, fixation histogram, top-ten salient coordinates, and saliency map. We utilize machine learning approach to analyze the performance of features by learning a support vector machine and exploiting various feature fusion schemes. The experiments reveal that ‘saliency map’, ‘fixation histogram’, ‘histogram of fixation duration’, and ‘histogram of saccade slope’ are the most contributing features. The selected features signify the influence of fixation information and angular behavior of eye movements in the recognition of the valence of images. PMID:26407322
Histogram analysis of T2*-based pharmacokinetic imaging in cerebral glioma grading.

PubMed

Liu, Hua-Shan; Chiang, Shih-Wei; Chung, Hsiao-Wen; Tsai, Ping-Huei; Hsu, Fei-Ting; Cho, Nai-Yu; Wang, Chao-Ying; Chou, Ming-Chung; Chen, Cheng-Yu

2018-03-01

To investigate the feasibility of histogram analysis of the T2*-based permeability parameter volume transfer constant (Ktrans) for glioma grading and to explore the diagnostic performance of the histogram analysis of Ktrans and blood plasma volume (vp). We recruited 31 and 11 patients with high- and low-grade gliomas, respectively. The histogram parameters of Ktrans and vp, derived from the first-pass pharmacokinetic modeling based on the T2* dynamic susceptibility-weighted contrast-enhanced perfusion-weighted magnetic resonance imaging (T2* DSC-PW-MRI) from the entire tumor volume, were evaluated for differentiating glioma grades. Histogram parameters of Ktrans and vp showed significant differences between high- and low-grade gliomas and exhibited significant correlations with tumor grades. The mean Ktrans derived from the T2* DSC-PW-MRI had the highest sensitivity and specificity for differentiating high-grade gliomas from low-grade gliomas compared with other histogram parameters of Ktrans and vp. Histogram analysis of T2*-based pharmacokinetic imaging is useful for cerebral glioma grading. The histogram parameters of the entire tumor Ktrans measurement can provide increased accuracy with additional information regarding microvascular permeability changes for identifying high-grade brain tumors. Copyright © 2017 Elsevier B.V. All rights reserved.
Language-concordant automated telephone queries to assess medication adherence in a diverse population: a cross-sectional analysis of convergent validity with pharmacy claims.

PubMed

Ratanawongsa, Neda; Quan, Judy; Handley, Margaret A; Sarkar, Urmimala; Schillinger, Dean

2018-04-06

Clinicians have difficulty accurately assessing medication non-adherence within chronic disease care settings. Health information technology (HIT) could offer novel tools to assess medication adherence in diverse populations outside of usual health care settings. In a multilingual urban safety net population, we examined the validity of assessing adherence using automated telephone self-management (ATSM) queries, when compared with non-adherence using continuous medication gap (CMG) on pharmacy claims. We hypothesized that patients reporting greater days of missed pills to ATSM queries would have higher rates of non-adherence as measured by CMG, and that ATSM adherence assessments would perform as well as structured interview assessments. As part of an ATSM-facilitated diabetes self-management program, low-income health plan members typed numeric responses to rotating weekly ATSM queries: "In the last 7 days, how many days did you MISS taking your …" diabetes, blood pressure, or cholesterol pill. Research assistants asked similar questions in computer-assisted structured telephone interviews. We measured continuous medication gap (CMG) by claims over 12 preceding months. To evaluate convergent validity, we compared rates of optimal adherence (CMG ≤ 20%) across respondents reporting 0, 1, and ≥ 2 missed pill days on ATSM and on structured interview. Among 210 participants, 46% had limited health literacy, 57% spoke Cantonese, and 19% Spanish. ATSM respondents reported ≥1 missed day for diabetes (33%), blood pressure (19%), and cholesterol (36%) pills. Interview respondents reported ≥1 missed day for diabetes (28%), blood pressure (21%), and cholesterol (26%) pills. Optimal adherence rates by CMG were lower among ATSM respondents reporting more missed days for blood pressure (p = 0.02) and cholesterol (p < 0.01); by interview, differences were significant for cholesterol (p = 0.01). Language-concordant ATSM demonstrated modest potential for assessing adherence. 
Studies should evaluate HIT assessments of medication beliefs and concerns in diverse populations. NCT00683020, registered May 21, 2008.
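The continuous medication gap used as the claims-based comparator can be sketched as the fraction of days in the observation window on which no dispensed supply was on hand. The code below is an assumed, simplified definition: early refills are not carried forward, and the tuple layout and function names are hypothetical. It is not the study's exact algorithm. The 20% cut-off for optimal adherence is the one stated in the abstract.

```python
def continuous_medication_gap(fills, period_days):
    """Fraction of days in [0, period_days) not covered by any pharmacy fill.

    fills: list of (fill_day, days_supply), with fill_day counted from the window start.
    Simplification: overlapping supply from early refills is not carried forward.
    """
    covered = set()
    for fill_day, days_supply in fills:
        for day in range(fill_day, fill_day + days_supply):
            if 0 <= day < period_days:
                covered.add(day)
    return (period_days - len(covered)) / period_days


def is_optimally_adherent(fills, period_days, threshold=0.20):
    """Optimal adherence as defined in the abstract: CMG of at most 20%."""
    return continuous_medication_gap(fills, period_days) <= threshold
```

For example, two 30-day fills on days 0 and 40 of a 100-day window leave 40 uncovered days, giving a gap of 0.4, which is above the 20% threshold.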
Information granules in image histogram analysis.

PubMed

Wieclawek, Wojciech

2018-04-01

A concept of granular computing employed in intensity-based image enhancement is discussed. First, a weighted granular computing idea is introduced. Then, the implementation of this term in the image processing area is presented. Finally, multidimensional granular histogram analysis is introduced. The proposed approach is dedicated to digital images, especially to medical images acquired by Computed Tomography (CT). As the histogram equalization approach, this method is based on image histogram analysis. Yet, unlike the histogram equalization technique, it works on a selected range of the pixel intensity and is controlled by two parameters. Performance is tested on anonymous clinical CT series. Copyright © 2017 Elsevier Ltd. All rights reserved.
Infrared image segmentation method based on spatial coherence histogram and maximum entropy

NASA Astrophysics Data System (ADS)

Liu, Songtao; Shen, Tongsheng; Dai, Yao

2014-11-01

In order to segment the target well and suppress background noises effectively, an infrared image segmentation method based on spatial coherence histogram and maximum entropy is proposed. First, spatial coherence histogram is presented by weighting the importance of the different position of these pixels with the same gray-level, which is obtained by computing their local density. Then, after enhancing the image by spatial coherence histogram, 1D maximum entropy method is used to segment the image. The novel method can not only get better segmentation results, but also have a faster computation time than traditional 2D histogram-based segmentation methods.
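The 1D maximum entropy step can be sketched as follows. The sketch assumes the classic formulation in which the threshold maximizes the sum of the entropies of the background and target gray-level distributions. It takes an ordinary gray-level histogram as input and does not implement the spatial coherence weighting described in the record. The function name is hypothetical.

```python
import math


def max_entropy_threshold(hist):
    """Return the gray level t that maximizes H(levels <= t) + H(levels > t)."""
    total = sum(hist)
    p = [h / total for h in hist]
    best_t, best_entropy = None, -math.inf
    for t in range(len(p) - 1):
        p_bg = sum(p[:t + 1])
        p_fg = sum(p[t + 1:])
        if p_bg <= 0.0 or p_fg <= 0.0:
            continue  # one class would be empty at this threshold
        # Entropy of each class after renormalizing its probabilities.
        h_bg = -sum((q / p_bg) * math.log(q / p_bg) for q in p[:t + 1] if q > 0.0)
        h_fg = -sum((q / p_fg) * math.log(q / p_fg) for q in p[t + 1:] if q > 0.0)
        if h_bg + h_fg > best_entropy:
            best_t, best_entropy = t, h_bg + h_fg
    return best_t
```

Pixels with a gray level above the returned value would then be labeled as target. Searching a single histogram axis is one reason a 1D criterion is cheaper than a search over a 2D histogram.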
The Amazing Histogram.

ERIC Educational Resources Information Center

Vandermeulen, H.; DeWreede, R. E.

1983-01-01

Presents a histogram drawing program which sorts real numbers in up to 30 categories. Entered data are sorted and saved in a text file which is then used to generate the histogram. Complete Applesoft program listings are included. (JN)
Bin recycling strategy for improving the histogram precision on GPU

NASA Astrophysics Data System (ADS)

Cárdenas-Montes, Miguel; Rodríguez-Vázquez, Juan José; Vega-Rodríguez, Miguel A.

2016-07-01

A histogram is an easily comprehensible way to present data and analyses. In the current scientific context, with access to large volumes of data, the processing time for building histograms has increased dramatically. For this reason, parallel construction is necessary to alleviate the impact of processing time on analysis activities. In this scenario, GPU computing is becoming widely used to reduce the processing time of histogram construction to affordable levels. Along with the increase in processing time, implementations are stressed on bin-count accuracy. Accuracy aspects arising from the particularities of the implementations are not usually taken into consideration when building histograms with very large data sets. In this work, a bin recycling strategy for creating an accuracy-aware implementation of histogram construction on GPU is presented. To evaluate the approach, the strategy was applied to the computation of the three-point angular correlation function, which is a relevant function in cosmology for the study of the large-scale structure of the Universe. As a result of the study, a high-accuracy implementation for histogram construction on GPU is proposed.
AHIMSA - Ad hoc histogram information measure sensing algorithm for feature selection in the context of histogram inspired clustering techniques

NASA Technical Reports Server (NTRS)

Dasarathy, B. V.

1976-01-01

An algorithm is proposed for dimensionality reduction in the context of clustering techniques based on histogram analysis. The approach is based on an evaluation of the hills and valleys in the unidimensional histograms along the different features and provides an economical means of assessing the significance of the features in a nonparametric unsupervised data environment. The method has relevance to remote sensing applications.
Clinical Utility of Blood Cell Histogram Interpretation

PubMed Central

Bhagya, S.; Majeed, Abdul

2017-01-01

An automated haematology analyser provides blood cell histograms by plotting the sizes of different blood cells on X-axis and their relative number on Y-axis. Histogram interpretation needs careful analysis of Red Blood Cell (RBC), White Blood Cell (WBC) and platelet distribution curves. Histogram analysis is often a neglected part of the automated haemogram which if interpreted well, has significant potential to provide diagnostically relevant information even before higher level investigations are ordered. PMID:29207767
Clinical Utility of Blood Cell Histogram Interpretation.

PubMed

Thomas, E T Arun; Bhagya, S; Majeed, Abdul

2017-09-01

An automated haematology analyser provides blood cell histograms by plotting the sizes of different blood cells on X-axis and their relative number on Y-axis. Histogram interpretation needs careful analysis of Red Blood Cell (RBC), White Blood Cell (WBC) and platelet distribution curves. Histogram analysis is often a neglected part of the automated haemogram which if interpreted well, has significant potential to provide diagnostically relevant information even before higher level investigations are ordered.
Quality assurance for high dose rate brachytherapy treatment planning optimization: using a simple optimization to verify a complex optimization

NASA Astrophysics Data System (ADS)

Deufel, Christopher L.; Furutani, Keith M.

2014-02-01

As dose optimization for high dose rate brachytherapy becomes more complex, it becomes increasingly important to have a means of verifying that optimization results are reasonable. A method is presented for using a simple optimization as quality assurance for the more complex optimization algorithms typically found in commercial brachytherapy treatment planning systems. Quality assurance tests may be performed during commissioning, at regular intervals, and/or on a patient specific basis. A simple optimization method is provided that optimizes conformal target coverage using an exact, variance-based, algebraic approach. Metrics such as dose volume histogram, conformality index, and total reference air kerma agree closely between simple and complex optimizations for breast, cervix, prostate, and planar applicators. The simple optimization is shown to be a sensitive measure for identifying failures in a commercial treatment planning system that are possibly due to operator error or weaknesses in planning system optimization algorithms. Results from the simple optimization are surprisingly similar to the results from a more complex, commercial optimization for several clinical applications. This suggests that there are only modest gains to be made from making brachytherapy optimization more complex. The improvements expected from sophisticated linear optimizations, such as PARETO methods, will largely be in making systems more user friendly and efficient, rather than in finding dramatically better source strength distributions.
MO-G-BRE-03: Automated Continuous Monitoring of Patient Setup with Second-Check Independent Image Registration

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jiang, X; Fox, T; Schreibmann, E

2014-06-15

Purpose: To create a non-supervised quality assurance program to monitor image-based patient setup. The system acts as a secondary check by independently computing shifts and rotations and interfaces with Varian's database to verify the therapist's work and warn against sub-optimal setups. Methods: Temporary digitally-reconstructed radiographs (DRRs) and OBI radiographic image files created by Varian's treatment console during patient setup are intercepted and used as input in an independent registration module customized for accuracy that determines the optimal rotations and shifts. To deal with the poor quality of OBI images, a histogram equalization of the live images to the DRR counterparts is performed as a pre-processing step. A search for the most sensitive metric was performed by plotting search spaces subject to various translations, and convergence analysis was applied to ensure the optimizer finds the global minima. The final system configuration uses the NCC metric with 150 histogram bins and a one plus one optimizer running for 2000 iterations with customized scales for translations and rotations in a multi-stage optimization setup that first corrects translations and subsequently rotations. Results: The system was installed clinically to monitor and provide almost real-time feedback on patient positioning. On a 2-month basis, uncorrected pitch values had a mean of 0.016° with a standard deviation of 1.692°, and couch rotations were −0.090° ± 1.547°. The couch shifts were −0.157°±0.466° cm for the vertical, 0.045°±0.286 laterally and 0.084°± 0.501° longitudinally. Uncorrected pitch angles were the most common source of discrepancies. Large variations in the pitch angles were correlated with patient motion inside the mask. Conclusion: A system for automated quality assurance of therapist's registration was designed and tested in clinical practice. The approach complements the clinical software's automated registration in terms of algorithm configuration and performance and constitutes a practical approach to implement safe and cost-effective radiotherapy.
Potential of MR histogram analyses for prediction of response to chemotherapy in patients with colorectal hepatic metastases.

PubMed

Liang, He-Yue; Huang, Ya-Qin; Yang, Zhao-Xia; Ying-Ding; Zeng, Meng-Su; Rao, Sheng-Xiang

2016-07-01

To determine if magnetic resonance imaging (MRI) histogram analyses can help predict response to chemotherapy in patients with colorectal hepatic metastases by using response evaluation criteria in solid tumours (RECIST1.1) as the reference standard. Standard MRI including diffusion-weighted imaging (b=0, 500 s/mm²) was performed before chemotherapy in 53 patients with colorectal hepatic metastases. Histograms were generated for apparent diffusion coefficient (ADC) maps, arterial, and portal venous phase images; thereafter, mean, percentiles (1st, 10th, 50th, 90th, 99th), skewness, kurtosis, and variance were calculated. Quantitative histogram parameters were compared between responders (partial and complete response, n=15) and non-responders (progressive and stable disease, n=38). Receiver operator characteristics (ROC) analyses were then performed for the significant parameters. The mean, 1st percentile, 10th percentile, 50th percentile, 90th percentile, and 99th percentile of the ADC maps were significantly lower in the responding group than in the non-responding group (p=0.000-0.002), with areas under the ROC curve (AUCs) of 0.76-0.82. The histogram parameters of arterial and portal venous phase showed no significant difference (p>0.05) between the two groups. Histogram-derived parameters for ADC maps seem to be a promising tool for predicting response to chemotherapy in patients with colorectal hepatic metastases. • ADC histogram analyses can potentially predict chemotherapy response in colorectal liver metastases. • Lower histogram-derived parameters (mean, percentiles) for ADC tend to have good response. • MR enhancement histogram analyses are not reliable to predict response.
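The histogram parameters listed in this abstract can be computed directly from the voxel values of a region of interest. The following is a minimal sketch, not the authors' software: the function name `histogram_parameters` is hypothetical, and the population variance and non-excess kurtosis used here are assumptions, since the abstract does not state which definitions were applied.

```python
import numpy as np

def histogram_parameters(values):
    """Descriptive statistics of the value distribution inside a region."""
    v = np.asarray(values, dtype=float).ravel()
    mean = v.mean()
    var = v.var()                         # population variance (ddof=0)
    z = (v - mean) / np.sqrt(var)         # standardized values
    out = {
        "mean": mean,
        "variance": var,
        "skewness": np.mean(z ** 3),      # third standardized moment
        "kurtosis": np.mean(z ** 4),      # fourth standardized moment (3 for a Gaussian)
    }
    for q in (1, 10, 50, 90, 99):         # percentiles named in the abstract
        out[f"p{q}"] = np.percentile(v, q)
    return out
```

Applied to an ADC map, `values` would be the ADC of every voxel inside the lesion mask; the resulting dictionary holds one value per parameter for group comparison.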
Fixing Dataset Search

NASA Technical Reports Server (NTRS)

Lynnes, Chris

2014-01-01

Three current search engines are queried for ozone data at the GES DISC. The results range from sub-optimal to counter-intuitive. We propose a method to fix dataset search by implementing a robust relevancy ranking scheme. The relevancy ranking scheme is based on several heuristics culled from more than 20 years of helping users select datasets.

Using histograms to introduce randomization in the generation of ensembles of decision trees

DOEpatents

Kamath, Chandrika; Cantu-Paz, Erick; Littau, David

2005-02-22

A system for decision tree ensembles that includes a module to read the data, a module to create a histogram, a module to evaluate a potential split according to some criterion using the histogram, a module to select a split point randomly in an interval around the best split, a module to split the data, and a module to combine multiple decision trees in ensembles. The decision tree method includes the steps of reading the data; creating a histogram; evaluating a potential split according to some criterion using the histogram, selecting a split point randomly in an interval around the best split, splitting the data, and combining multiple decision trees in ensembles.
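The split-selection steps described above can be sketched for a single numeric feature. This is an illustration under stated assumptions, not the patented method: the Gini impurity stands in for "some criterion", the random interval is taken as half a bin width on either side of the best bin edge, and the name `randomized_histogram_split` is hypothetical.

```python
import numpy as np

def gini(counts):
    """Gini impurity of a vector of per-class counts."""
    n = counts.sum()
    if n == 0:
        return 0.0
    p = counts / n
    return 1.0 - np.sum(p * p)

def randomized_histogram_split(x, y, n_bins=16, rng=None):
    """Return (random split point near the best edge, best edge)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    # One histogram per class: rows are classes, columns are bins.
    hist = np.stack([np.histogram(x[y == c], bins=edges)[0]
                     for c in np.unique(y)]).astype(float)
    cum = np.cumsum(hist, axis=1)
    total = hist.sum(axis=1)
    best_k, best_score = 1, np.inf
    for k in range(1, n_bins):            # candidate split at interior edge k
        left_counts = cum[:, k - 1]
        right_counts = total - left_counts
        score = (left_counts.sum() * gini(left_counts)
                 + right_counts.sum() * gini(right_counts)) / total.sum()
        if score < best_score:
            best_k, best_score = k, score
    half = (edges[1] - edges[0]) / 2      # assumed interval: half a bin each side
    split = rng.uniform(edges[best_k] - half, edges[best_k] + half)
    return split, edges[best_k]
```

Because the split is drawn at random around the best edge, repeated calls on the same data yield different trees, which is the source of diversity in the ensemble.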
Color Histogram Diffusion for Image Enhancement

NASA Technical Reports Server (NTRS)

Kim, Taemin

2011-01-01

Various color histogram equalization (CHE) methods have been proposed to extend grayscale histogram equalization (GHE) for color images. In this paper a new method called histogram diffusion that extends the GHE method to arbitrary dimensions is proposed. Ranges in a histogram are specified as overlapping bars of uniform heights and variable widths which are proportional to their frequencies. This diagram is called the vistogram. As an alternative approach to GHE, the squared error of the vistogram from the uniform distribution is minimized. Each bar in the vistogram is approximated by a Gaussian function. Gaussian particles in the vistogram diffuse as a nonlinear autonomous system of ordinary differential equations. CHE results of color images showed that the approach is effective.
Distributed Efficient Similarity Search Mechanism in Wireless Sensor Networks

PubMed Central

Ahmed, Khandakar; Gregory, Mark A.

2015-01-01

The Wireless Sensor Network similarity search problem has received considerable research attention due to sensor hardware imprecision and environmental parameter variations. Most of the state-of-the-art distributed data centric storage (DCS) schemes lack optimization for similarity queries of events. In this paper, a DCS scheme with metric based similarity searching (DCSMSS) is proposed. DCSMSS is motivated by a vector distance index called iDistance, which transforms the issue of similarity searching into the problem of an interval search in one dimension. In addition, a sector based distance routing algorithm is used to efficiently route messages. Extensive simulation results reveal that DCSMSS is highly efficient and significantly outperforms previous approaches in processing similarity search queries. PMID:25751081
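The idea of turning a similarity search into a one-dimensional interval search can be sketched with a centralized, in-memory version of iDistance-style keys. This is a simplified illustration, not DCSMSS itself: the distributed storage and sector based routing are omitted, the function names are hypothetical, and the constant `c` is assumed to exceed every point-to-reference distance so that partitions do not overlap on the key axis.

```python
import numpy as np

def build_index(points, refs, c):
    """Key of a point = i * c + distance to its nearest reference point i."""
    points = np.asarray(points, dtype=float)
    refs = np.asarray(refs, dtype=float)
    d = np.linalg.norm(points[:, None, :] - refs[None, :, :], axis=2)
    part = d.argmin(axis=1)
    keys = part * c + d[np.arange(len(points)), part]
    order = np.argsort(keys)
    return keys[order], points[order]

def range_query(keys, points, refs, c, q, r):
    """Return all indexed points within distance r of q."""
    q = np.asarray(q, dtype=float)
    hits = []
    for i, ref in enumerate(np.asarray(refs, dtype=float)):
        dq = np.linalg.norm(q - ref)
        # Triangle inequality: a match p satisfies |d(p, ref) - dq| <= r,
        # so only one key interval per partition has to be scanned.
        lo = np.searchsorted(keys, i * c + max(dq - r, 0.0), side="left")
        hi = min(np.searchsorted(keys, i * c + dq + r, side="right"),
                 np.searchsorted(keys, (i + 1) * c, side="left"))
        hits.extend(p for p in points[lo:hi] if np.linalg.norm(p - q) <= r)
    return hits
```

The final distance check removes false positives: the key interval is a necessary condition for a match, not a sufficient one.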
Histogram based analysis of lung perfusion of children after congenital diaphragmatic hernia repair.

PubMed

Kassner, Nora; Weis, Meike; Zahn, Katrin; Schaible, Thomas; Schoenberg, Stefan O; Schad, Lothar R; Zöllner, Frank G

2018-05-01

To investigate a histogram based approach to characterize the distribution of perfusion in the whole left and right lung by descriptive statistics and to show how histograms could be used to visually explore perfusion defects in two year old children after Congenital Diaphragmatic Hernia (CDH) repair. 28 children (age of 24.2±1.7 months; all left sided hernia; 9 after extracorporeal membrane oxygenation therapy) underwent quantitative DCE-MRI of the lung. Segmentations of left and right lung were manually drawn to mask the calculated pulmonary blood flow maps and then to derive histograms for each lung side. Individual and group wise analysis of histograms of left and right lung was performed. Ipsilateral and contralateral lung show significant differences in shape and descriptive statistics derived from the histogram (Wilcoxon signed-rank test, p<0.05) on group wise and individual level. Subgroup analysis (patients with vs without ECMO therapy) showed no significant differences using histogram derived parameters. Histogram analysis can be a valuable tool to characterize and visualize whole lung perfusion of children after CDH repair. It offers several ways to analyze the data: describing the perfusion differences between the right and left lung, and exploring and visualizing localized perfusion patterns in the 3D lung volume. Subgroup analysis will be possible given sufficient sample sizes.
Using the Bootstrap Method for a Statistical Significance Test of Differences between Summary Histograms

NASA Technical Reports Server (NTRS)

Xu, Kuan-Man

2006-01-01

A new method is proposed to compare statistical differences between summary histograms, which are the histograms summed over a large ensemble of individual histograms. It consists of choosing a distance statistic for measuring the difference between summary histograms and using a bootstrap procedure to calculate the statistical significance level. Bootstrapping is an approach to statistical inference that makes few assumptions about the underlying probability distribution that describes the data. Three distance statistics are compared in this study. They are the Euclidean distance, the Jeffries-Matusita distance and the Kuiper distance. The data used in testing the bootstrap method are satellite measurements of cloud systems called cloud objects. Each cloud object is defined as a contiguous region/patch composed of individual footprints or fields of view. A histogram of measured values over footprints is generated for each parameter of each cloud object and then summary histograms are accumulated over all individual histograms in a given cloud-object size category. The results of statistical hypothesis tests using all three distances as test statistics are generally similar, indicating the validity of the proposed method. The Euclidean distance is determined to be most suitable after comparing the statistical tests of several parameters with distinct probability distributions among three cloud-object size categories. Impacts on the statistical significance levels resulting from differences in the total lengths of satellite footprint data between two size categories are also discussed.
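The general shape of such a test can be sketched as follows, using the Euclidean distance as the statistic. The resampling scheme is an assumption: individual histograms from both ensembles are pooled under the null hypothesis and redrawn with replacement, which may differ from the procedure in the paper. All function names are hypothetical.

```python
import numpy as np

def summary_histogram(hists):
    """Sum individual histograms and normalize to unit total."""
    s = np.sum(hists, axis=0).astype(float)
    return s / s.sum()

def euclidean_distance(h1, h2):
    return np.sqrt(np.sum((h1 - h2) ** 2))

def bootstrap_p_value(hists_a, hists_b, n_boot=1000, rng=None):
    """Fraction of resampled distances at least as large as the observed one."""
    rng = np.random.default_rng() if rng is None else rng
    hists_a, hists_b = np.asarray(hists_a), np.asarray(hists_b)
    observed = euclidean_distance(summary_histogram(hists_a),
                                  summary_histogram(hists_b))
    pooled = np.concatenate([hists_a, hists_b])   # null: one common population
    count = 0
    for _ in range(n_boot):
        ia = rng.integers(0, len(pooled), size=len(hists_a))
        ib = rng.integers(0, len(pooled), size=len(hists_b))
        d = euclidean_distance(summary_histogram(pooled[ia]),
                               summary_histogram(pooled[ib]))
        count += int(d >= observed)
    return count / n_boot
```

A small p-value indicates that a difference as large as the observed one rarely arises when both ensembles are drawn from the same pool; swapping in another distance statistic only requires replacing `euclidean_distance`.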
Optimal nonlinear codes for the perception of natural colours.

PubMed

von der Twer, T; MacLeod, D I

2001-08-01

We discuss how visual nonlinearity can be optimized for the precise representation of environmental inputs. Such optimization leads to neural signals with a compressively nonlinear input-output function the gradient of which is matched to the cube root of the probability density function (PDF) of the environmental input values (and not to the PDF directly as in histogram equalization). Comparisons between theory and psychophysical and electrophysiological data are roughly consistent with the idea that parvocellular (P) cells are optimized for precision representation of colour: their contrast-response functions span a range appropriately matched to the environmental distribution of natural colours along each dimension of colour space. Thus P cell codes for colour may have been selected to minimize error in the perceptual estimation of stimulus parameters for natural colours. But magnocellular (M) cells have a much stronger than expected saturating nonlinearity; this supports the view that the function of M cells is mainly to detect boundaries rather than to specify contrast or lightness.
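The contrast between the two coding rules can be made concrete with a short sketch: a response function whose gradient follows the PDF raised to a chosen exponent, so that an exponent of 1 gives histogram equalization and an exponent of 1/3 gives the cube-root rule described above. The discretization into bins and the name `response_curve` are assumptions made for illustration.

```python
import numpy as np

def response_curve(pdf, exponent):
    """Normalized input-output function whose slope in each bin is
    proportional to pdf**exponent."""
    slope = np.asarray(pdf, dtype=float) ** exponent
    g = np.cumsum(slope)
    return g / g[-1]

pdf = [1.0, 8.0, 1.0]                      # a sharply peaked toy distribution
equalized = response_curve(pdf, 1.0)       # histogram equalization
cube_root = response_curve(pdf, 1.0 / 3)   # gradient matched to pdf**(1/3)
```

For the peaked toy distribution, the cube-root curve devotes less of the output range to the most probable inputs than equalization does, which leaves more range for rare inputs.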
A Framework for Reproducible Latent Fingerprint Enhancements.

PubMed

Carasso, Alfred S

2014-01-01

Photoshop processing of latent fingerprints is the preferred methodology among law enforcement forensic experts, but that approach is not fully reproducible and may lead to questionable enhancements. Alternative, independent, fully reproducible enhancements, using IDL Histogram Equalization and IDL Adaptive Histogram Equalization, can produce better-defined ridge structures, along with considerable background information. Applying a systematic slow motion smoothing procedure to such IDL enhancements, based on the rapid FFT solution of a Lévy stable fractional diffusion equation, can attenuate background detail while preserving ridge information. The resulting smoothed latent print enhancements are comparable to, but distinct from, forensic Photoshop images suitable for input into automated fingerprint identification systems (AFIS). In addition, this progressive smoothing procedure can be reexamined by displaying the suite of progressively smoother IDL images. That suite can be stored, providing an audit trail that allows monitoring for possible loss of useful information, in transit to the user-selected optimal image. Such independent and fully reproducible enhancements provide a valuable frame of reference that may be helpful in informing, complementing, and possibly validating the forensic Photoshop methodology.
Pedestrian detection in crowded scenes with the histogram of gradients principle

NASA Astrophysics Data System (ADS)

Sidla, O.; Rosner, M.; Lypetskyy, Y.

2006-10-01

This paper describes a close to real-time, scale-invariant implementation of a pedestrian detector system which is based on the Histogram of Oriented Gradients (HOG) principle. Salient HOG features are first selected from a manually created very large database of samples with an evolutionary optimization procedure that directly trains a polynomial Support Vector Machine (SVM). Real-time operation is achieved by a cascaded 2-step classifier which first uses a very fast linear SVM (with the same features as the polynomial SVM) to reject most of the irrelevant detections and then computes the decision function with a polynomial SVM on the remaining set of candidate detections. Scale invariance is achieved by running the detector of constant size on scaled versions of the original input images and by clustering the results over all resolutions. The pedestrian detection system has been implemented in two versions: i) full body detection, and ii) upper body only detection. The latter is especially suited for very busy and crowded scenarios. On a state-of-the-art PC it is able to run at a frequency of 8 - 20 frames/sec.
Shot-Noise Limited Single-Molecule FRET Histograms: Comparison between Theory and Experiments

PubMed Central

Nir, Eyal; Michalet, Xavier; Hamadani, Kambiz M.; Laurence, Ted A.; Neuhauser, Daniel; Kovchegov, Yevgeniy; Weiss, Shimon

2011-01-01

We describe a simple approach and present a straightforward numerical algorithm to compute the best fit shot-noise limited proximity ratio histogram (PRH) in single-molecule fluorescence resonant energy transfer diffusion experiments. The key ingredient is the use of the experimental burst size distribution, as obtained after burst search through the photon data streams. We show how the use of an alternated laser excitation scheme and a correspondingly optimized burst search algorithm eliminates several potential artifacts affecting the calculation of the best fit shot-noise limited PRH. This algorithm is tested extensively on simulations and simple experimental systems. We find that dsDNA data exhibit a wider PRH than expected from shot noise only and hypothetically account for it by assuming a small Gaussian distribution of distances with an average standard deviation of 1.6 Å. Finally, we briefly mention the results of a future publication and illustrate them with a simple two-state model system (DNA hairpin), for which the kinetic transition rates between the open and closed conformations are extracted. PMID:17078646
A Framework for Reproducible Latent Fingerprint Enhancements

PubMed Central

Carasso, Alfred S.

2014-01-01

Photoshop processing of latent fingerprints is the preferred methodology among law enforcement forensic experts, but that approach is not fully reproducible and may lead to questionable enhancements. Alternative, independent, fully reproducible enhancements, using IDL Histogram Equalization and IDL Adaptive Histogram Equalization, can produce better-defined ridge structures, along with considerable background information. Applying a systematic slow motion smoothing procedure to such IDL enhancements, based on the rapid FFT solution of a Lévy stable fractional diffusion equation, can attenuate background detail while preserving ridge information. The resulting smoothed latent print enhancements are comparable to, but distinct from, forensic Photoshop images suitable for input into automated fingerprint identification systems (AFIS). In addition, this progressive smoothing procedure can be reexamined by displaying the suite of progressively smoother IDL images. That suite can be stored, providing an audit trail that allows monitoring for possible loss of useful information, in transit to the user-selected optimal image. Such independent and fully reproducible enhancements provide a valuable frame of reference that may be helpful in informing, complementing, and possibly validating the forensic Photoshop methodology. PMID:26601028
Image contrast enhancement with brightness preservation using an optimal gamma correction and weighted sum approach

NASA Astrophysics Data System (ADS)

Jiang, G.; Wong, C. Y.; Lin, S. C. F.; Rahman, M. A.; Ren, T. R.; Kwok, Ngaiming; Shi, Haiyan; Yu, Ying-Hao; Wu, Tonghai

2015-04-01

The enhancement of image contrast and preservation of image brightness are two important but conflicting objectives in image restoration. Previous attempts based on linear histogram equalization had achieved contrast enhancement, but exact preservation of brightness was not accomplished. A new perspective is taken here to provide balanced performance of contrast enhancement and brightness preservation simultaneously by casting the search for such a solution as an optimization problem. Specifically, the non-linear gamma correction method is adopted to enhance the contrast, while a weighted sum approach is employed for brightness preservation. In addition, the efficient golden search algorithm is exploited to determine the required optimal parameters to produce the enhanced images. Experiments are conducted on natural colour images captured under various indoor, outdoor and illumination conditions. Results have shown that the proposed method outperforms currently available methods in contrast enhancement and brightness preservation.
Seamless image stitching by homography refinement and structure deformation using optimal seam pair detection

NASA Astrophysics Data System (ADS)

Lee, Daeho; Lee, Seohyung

2017-11-01

We propose an image stitching method that can remove ghost effects and realign the structure misalignments that occur in common image stitching methods. To reduce the artifacts caused by different parallaxes, an optimal seam pair is selected by comparing the cross correlations from multiple seams detected by variable cost weights. Along the optimal seam pair, a histogram of oriented gradients is calculated, and feature points for matching are detected. The homography is refined using the matching points, and the remaining misalignment is eliminated using the propagation of deformation vectors calculated from matching points. In multiband blending, the overlapping regions are determined from a distance between the matching points to remove overlapping artifacts. The experimental results show that the proposed method more robustly eliminates misalignments and overlapping artifacts than the existing method that uses single seam detection and gradient features.
Structural optimization procedure of a composite wind turbine blade for reducing both material cost and blade weight

NASA Astrophysics Data System (ADS)

Hu, Weifei; Park, Dohyun; Choi, DongHoon

2013-12-01

A composite blade structure for a 2 MW horizontal axis wind turbine is optimally designed. Design requirements are simultaneously minimizing material cost and blade weight while satisfying the constraints on stress ratio, tip deflection, fatigue life and laminate layup requirements. The stress ratio and tip deflection under extreme gust loads and the fatigue life under a stochastic normal wind load are evaluated. A blade element wind load model is proposed to explain the wind pressure difference due to blade height change during rotor rotation. For fatigue life evaluation, the stress result of an implicit nonlinear dynamic analysis under a time-varying fluctuating wind is converted to the histograms of mean and amplitude of maximum stress ratio using the rainflow counting algorithm. Miner's rule is employed to predict the fatigue life. After integrating and automating the whole analysis procedure, an evolutionary algorithm is used to solve the discrete optimization problem.
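The fatigue step can be illustrated with a minimal sketch of Miner's rule applied to a cycle histogram, where the accumulated damage is the sum of n_i / N_i over the amplitude bins. Several things here are assumptions for illustration only: the histogram is taken as already produced by rainflow counting, only the amplitude (not the mean) is used, and the power-law S-N curve with its constants `c` and `m` is hypothetical and not taken from the paper.

```python
def sn_cycles_to_failure(amplitude, c=1e12, m=3.0):
    """Hypothetical power-law S-N curve: N = c * amplitude**(-m)."""
    return c * amplitude ** (-m)

def miner_damage(cycle_histogram, cycles_to_failure=sn_cycles_to_failure):
    """Miner's rule: sum of n_i / N_i over all amplitude bins.

    cycle_histogram maps a stress amplitude to the number of counted
    cycles n_i at that amplitude; failure is predicted at damage >= 1."""
    return sum(n / cycles_to_failure(s) for s, n in cycle_histogram.items())

# One load block: 1000 cycles at amplitude 100 and 500 cycles at amplitude 200.
block = {100.0: 1000, 200.0: 500}
damage_per_block = miner_damage(block)
blocks_to_failure = 1.0 / damage_per_block
```

In this toy case the 500 high-amplitude cycles contribute four times the damage of the 1000 low-amplitude ones, which is why the shape of the histogram matters more than the total cycle count.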
Optimal camera exposure for video surveillance systems by predictive control of shutter speed, aperture, and gain

NASA Astrophysics Data System (ADS)

Torres, Juan; Menéndez, José Manuel

2015-02-01

This paper establishes a real-time auto-exposure method to guarantee that surveillance cameras in uncontrolled light conditions take advantage of their whole dynamic range while providing neither under- nor overexposed images. State-of-the-art auto-exposure methods base their control on the brightness of the image measured in a limited region where the foreground objects are mostly located. Unlike these methods, the proposed algorithm establishes a set of indicators based on the image histogram that define its shape and position. Furthermore, the location of the objects to be inspected is likely unknown in surveillance applications. Thus, the whole image is monitored in this approach. To control the camera settings, we defined a parameter function (Ef) that depends linearly on the shutter speed and the electronic gain, and is inversely proportional to the square of the lens aperture diameter. When the current acquired image is not overexposed, our algorithm computes the value of Ef that would move the histogram to the maximum value that does not overexpose the capture. When the current acquired image is overexposed, it computes the value of Ef that would move the histogram to a value that does not underexpose the capture and remains close to the overexposed region. If the image is both under- and overexposed, the whole dynamic range of the camera is therefore used, and a default value of Ef that does not overexpose the capture is selected. This decision follows the idea that underexposed images are preferable to overexposed ones, because the noise produced in the lower regions of the histogram can be removed in a post-processing step while the saturated pixels of the higher regions cannot be recovered. The proposed algorithm was tested in a video surveillance camera placed at an outdoor parking lot surrounded by buildings and trees which produce moving shadows on the ground. During the daytime of seven days, the algorithm ran alternately with a representative auto-exposure algorithm from the recent literature. Besides the sunrises and the nightfalls, multiple weather conditions occurred which produced light changes in the scene: sunny hours that produced sharp shadows and highlights; cloud coverages that softened the shadows; and cloudy and rainy hours that dimmed the scene. Several indicators were used to measure the performance of the algorithms. They provided the objective quality as regards: the time the algorithms take to recover from an under- or overexposure, the brightness stability, and the change related to the optimal exposure. The results demonstrated that our algorithm reacts faster to all the light changes than the selected state-of-the-art algorithm. It is also capable of acquiring well exposed images and keeping the brightness stable for longer. Summing up the results, we concluded that the proposed algorithm provides a fast and stable auto-exposure method that maintains an optimal exposure for video surveillance applications. Future work will involve the evaluation of this algorithm in robotics.
Optimal updating magnitude in adaptive flat-distribution sampling

NASA Astrophysics Data System (ADS)

Zhang, Cheng; Drake, Justin A.; Ma, Jianpeng; Pettitt, B. Montgomery

2017-11-01

We present a study on the optimization of the updating magnitude for a class of free energy methods based on flat-distribution sampling, including the Wang-Landau (WL) algorithm and metadynamics. These methods rely on adaptive construction of a bias potential that offsets the potential of mean force by histogram-based updates. The convergence of the bias potential can be improved by decreasing the updating magnitude with an optimal schedule. We show that while the asymptotically optimal schedule for the single-bin updating scheme (commonly used in the WL algorithm) is given by the known inverse-time formula, that for the Gaussian updating scheme (commonly used in metadynamics) is often more complex. We further show that the single-bin updating scheme is optimal for very long simulations, and it can be generalized to a class of bandpass updating schemes that are similarly optimal. These bandpass updating schemes target only a few long-range distribution modes and their optimal schedule is also given by the inverse-time formula. Constructed from orthogonal polynomials, the bandpass updating schemes generalize the WL and Langfeld-Lucini-Rago algorithms as an automatic parameter tuning scheme for umbrella sampling.
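The single-bin updating scheme with an inverse-time schedule can be sketched on a toy system. Everything specific here is an assumption for illustration: the system is a periodic ring of discrete bins with a known free energy, the updating magnitude is taken as min(1, n/t) for n bins at step t, and the name `flat_distribution_bias` is hypothetical. It is not the authors' implementation.

```python
import math
import numpy as np

def flat_distribution_bias(free_energy, n_steps=100_000, rng=None):
    """Adaptively build a bias that flattens sampling over the bins.

    Sampling follows exp(-(free_energy + bias)); at convergence
    free_energy + bias is approximately constant across bins."""
    rng = np.random.default_rng() if rng is None else rng
    f = np.asarray(free_energy, dtype=float)
    n = len(f)
    bias = np.zeros(n)
    i = 0
    for t in range(1, n_steps + 1):
        # Propose a move to a neighbouring bin on the periodic ring.
        j = (i + (1 if rng.random() < 0.5 else -1)) % n
        delta = (f[i] + bias[i]) - (f[j] + bias[j])
        # Metropolis acceptance in the biased ensemble.
        if rng.random() < math.exp(min(0.0, delta)):
            i = j
        # Single-bin update with an inverse-time magnitude.
        bias[i] += min(1.0, n / t)
    return bias - bias.mean()
```

Raising the bias at the bin just visited makes that bin less likely to be revisited, so the histogram of visits flattens while the shrinking updating magnitude lets the bias settle.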
Optimal updating magnitude in adaptive flat-distribution sampling.

PubMed

Zhang, Cheng; Drake, Justin A; Ma, Jianpeng; Pettitt, B Montgomery

2017-11-07

We present a study on the optimization of the updating magnitude for a class of free energy methods based on flat-distribution sampling, including the Wang-Landau (WL) algorithm and metadynamics. These methods rely on adaptive construction of a bias potential that offsets the potential of mean force by histogram-based updates. The convergence of the bias potential can be improved by decreasing the updating magnitude with an optimal schedule. We show that while the asymptotically optimal schedule for the single-bin updating scheme (commonly used in the WL algorithm) is given by the known inverse-time formula, that for the Gaussian updating scheme (commonly used in metadynamics) is often more complex. We further show that the single-bin updating scheme is optimal for very long simulations, and it can be generalized to a class of bandpass updating schemes that are similarly optimal. These bandpass updating schemes target only a few long-range distribution modes and their optimal schedule is also given by the inverse-time formula. Constructed from orthogonal polynomials, the bandpass updating schemes generalize the WL and Langfeld-Lucini-Rago algorithms as an automatic parameter tuning scheme for umbrella sampling.
A new automatic synthetic aperture radar-based flood mapping application hosted on the European Space Agency's Grid Processing of Demand Fast Access to Imagery environment

NASA Astrophysics Data System (ADS)

Matgen, Patrick; Giustarini, Laura; Hostache, Renaud

2012-10-01

This paper introduces an automatic flood mapping application that is hosted on the Grid Processing on Demand (GPOD) Fast Access to Imagery (Faire) environment of the European Space Agency. The main objective of the online application is to operationally deliver maps of flooded areas using both recent and historical acquisitions of SAR data. Having as a short-term target the flooding-related exploitation of data generated by the upcoming ESA SENTINEL-1 SAR mission, the flood mapping application consists of two building blocks: i) a set of query tools for selecting the "crisis image" and the optimal corresponding "reference image" from the G-POD archive and ii) an algorithm for extracting flooded areas via change detection using the previously selected "crisis image" and "reference image". Stakeholders in flood management and service providers are able to log onto the flood mapping application to get support for the retrieval, from the rolling archive, of the most appropriate reference image. Potential users will also be able to apply the implemented flood delineation algorithm. The latter combines histogram thresholding, region growing and change detection as an approach enabling the automatic, objective and reliable flood extent extraction from SAR images. Both algorithms are computationally efficient and operate with minimum data requirements. The case study of the high magnitude flooding event that occurred in July 2007 on the Severn River, UK, and that was observed with a moderate-resolution SAR sensor as well as airborne photography highlights the performance of the proposed online application. The flood mapping application on G-POD can be used sporadically, i.e. whenever a major flood event occurs and there is a demand for SAR-based flood extent maps. In the long term, a potential extension of the application could consist in systematically extracting flooded areas from all SAR images acquired on a daily, weekly or monthly basis.
PIBAS FedSPARQL: a web-based platform for integration and exploration of bioinformatics datasets.

PubMed

Djokic-Petrovic, Marija; Cvjetkovic, Vladimir; Yang, Jeremy; Zivanovic, Marko; Wild, David J

2017-09-20

There is a huge variety of data sources relevant to chemical, biological and pharmacological research, but these data sources are highly siloed and cannot be queried together in a straightforward way. Semantic technologies offer the ability to create links and mappings across datasets and manage them as a single, linked network so that searching can be carried out across datasets, independently of the source. We have developed an application called PIBAS FedSPARQL that uses semantic technologies to allow researchers to carry out such searching across a vast array of data sources. PIBAS FedSPARQL is a web-based query builder and result set visualizer of bioinformatics data. As an advanced feature, our system can detect similar data items identified by different Uniform Resource Identifiers (URIs), using a text-mining algorithm based on the processing of named entities to be used in Vector Space Model and Cosine Similarity Measures. To our knowledge, PIBAS FedSPARQL was unique among the systems that we found in that it allows detection of similar data items. As a query builder, our system allows researchers to intuitively construct and run Federated SPARQL queries across multiple data sources, including global initiatives, such as Bio2RDF, Chem2Bio2RDF, EMBL-EBI, and one local initiative called CPCTAS, as well as additional user-specified data sources. From the input topic, subtopic, template and keyword, a corresponding initial Federated SPARQL query is created and executed. Based on the data obtained, end users have the ability to choose the most appropriate data sources in their area of interest and exploit their Resource Description Framework (RDF) structure, which allows users to select certain properties of data to enhance query results. The developed system is flexible and allows intuitive creation and execution of queries for an extensive range of bioinformatics topics. Also, the novel "similar data items detection" algorithm can be particularly useful for suggesting new data sources and cost optimization for new experiments. PIBAS FedSPARQL can be expanded with new topics, subtopics and templates on demand, rendering information retrieval more robust.
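As a rough illustration of the Vector Space Model with cosine similarity mentioned in the abstract above, the sketch below compares two bags of tokens. The whitespace tokenization and the example labels are assumptions made for illustration; the abstract does not describe how PIBAS FedSPARQL actually processes named entities.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two token lists treated as term-count vectors."""
    vec_a, vec_b = Counter(tokens_a), Counter(tokens_b)
    # Dot product over the terms the two vectors share.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty description is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical labels for one entity published under two different URIs.
label_1 = "tumor protein p53".split()
label_2 = "cellular tumor antigen p53".split()
score = cosine_similarity(label_1, label_2)
```

A score near 1 would suggest that two URIs describe the same item; any decision threshold would have to be tuned on real data.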
DOGMA: A Disk-Oriented Graph Matching Algorithm for RDF Databases

NASA Astrophysics Data System (ADS)

Bröcheler, Matthias; Pugliese, Andrea; Subrahmanian, V. S.

RDF is an increasingly important paradigm for the representation of information on the Web. As RDF databases increase in size to approach tens of millions of triples, and as sophisticated graph matching queries expressible in languages like SPARQL become increasingly important, scalability becomes an issue. To date, there is no graph-based indexing method for RDF data where the index was designed in a way that makes it disk-resident. There is therefore a growing need for indexes that can operate efficiently when the index itself resides on disk. In this paper, we first propose the DOGMA index for fast subgraph matching on disk and then develop a basic algorithm to answer queries over this index. This algorithm is then significantly sped up via an optimized algorithm that uses efficient (but correct) pruning strategies when combined with two different extensions of the index. We have implemented a preliminary system and tested it against four existing RDF database systems developed by others. Our experiments show that our algorithm performs very well compared to these systems, with orders of magnitude improvements for complex graph queries.
Querying clinical data in HL7 RIM based relational model with morph-RDB.

PubMed

Priyatna, Freddy; Alonso-Calvo, Raul; Paraiso-Medina, Sergio; Corcho, Oscar

2017-10-05

Semantic interoperability is essential when carrying out post-genomic clinical trials where several institutions collaborate, since researchers and developers need to have an integrated view and access to heterogeneous data sources. One possible approach to accommodate this need is to use RDB2RDF systems that provide RDF datasets as the unified view. These RDF datasets may be materialized and stored in a triple store, or transformed into RDF in real time, as virtual RDF data sources. Our previous efforts involved materialized RDF datasets, hence losing data freshness. In this paper we present a solution that uses an ontology based on the HL7 v3 Reference Information Model and a set of R2RML mappings that relate this ontology to an underlying relational database implementation, and where morph-RDB is used to expose a virtual, non-materialized SPARQL endpoint over the data. By applying a set of optimization techniques on the SPARQL-to-SQL query translation algorithm, we can now issue SPARQL queries to the underlying relational data with generally acceptable performance.

Query engine optimization for the EHR4CR protocol feasibility scenario.

PubMed

Soto-Rey, Iñaki; Bache, Richard; Dugas, Martin; Fritz, Fleur

2013-01-01

An essential step when recruiting patients for a Clinical Trial (CT) is to determine the number of patients that satisfy the Eligibility Criteria (ECs) for that trial. An innovative feature of the Electronic Health Records for Clinical Research (EHR4CR) platform is that when automatically determining patient counts, it also allows the user to view counts for subsets of the ECs. This is helpful because some combinations of ECs may be so restrictive that they yield very few or zero patients. If we wanted to show all possible combinations of ECs, the number of queries we would have to execute would be 2ⁿ, where n is the total number of ECs. Assuming that an average study has between 20 and 30 ECs, the program would have to execute between 2²⁰ (1,048,576) and 2³⁰ (1,073,741,824) queries. This is not only computationally expensive but also impractical to visualise. The purpose of our research is to reduce possible combinations to a manageable number.
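The 2ⁿ growth quoted in the abstract above can be checked with a few lines of Python. The criteria names are hypothetical, and this only counts and enumerates subsets; it is not the reduction strategy the authors propose.

```python
from itertools import combinations


def count_ec_subsets(n):
    """Number of subsets of n eligibility criteria, including the empty set."""
    return 2 ** n


def enumerate_ec_subsets(criteria):
    """Yield every subset of the criteria; only feasible for small n."""
    for size in range(len(criteria) + 1):
        yield from combinations(criteria, size)


# Hypothetical criteria names, used only to show the growth on a tiny case.
criteria = ["age >= 18", "diabetes", "no insulin"]
subsets = list(enumerate_ec_subsets(criteria))
```

With three criteria the enumeration yields 8 subsets; at 20 to 30 criteria the same enumeration would need the roughly one million to one billion queries cited above.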
Toward An Unstructured Mesh Database

NASA Astrophysics Data System (ADS)

Rezaei Mahdiraji, Alireza; Baumann, Peter

2014-05-01

Unstructured meshes are used in several application domains such as earth sciences (e.g., seismology), medicine, oceanography, climate modeling, and GIS as approximate representations of physical objects. Meshes subdivide a domain into smaller geometric elements (called cells) which are glued together by incidence relationships. The subdivision of a domain allows computational manipulation of complicated physical structures. For instance, seismologists model earthquakes using elastic wave propagation solvers on hexahedral meshes. Such a hexahedral mesh contains several hundred million grid points and millions of hexahedral cells. Each vertex node in the hexahedral mesh stores a multitude of data fields. To run simulations on such meshes, one needs to iterate over all the cells, iterate over incident cells to a given cell, retrieve coordinates of cells, assign data values to cells, etc. Although meshes are used in many application domains, to the best of our knowledge there is no database vendor that supports unstructured mesh features. Currently, the main tools for querying and manipulating unstructured meshes are mesh libraries, e.g., CGAL and GrAL. Mesh libraries are dedicated libraries that include mesh algorithms and can be run on mesh representations. The libraries do not scale with dataset size, do not have a declarative query language, and need deep C++ knowledge for query implementations. Furthermore, due to high coupling between the implementations and input file structure, the implementations are less reusable and costly to maintain. A dedicated mesh database offers the following advantages: 1) declarative querying, 2) ease of maintenance, 3) hiding mesh storage structure from applications, and 4) transparent query optimization. To design a mesh database, the first challenge is to define a suitable generic data model for unstructured meshes. We proposed the ImG-Complexes data model as a generic topological mesh data model which extends the incidence graph model to multi-incidence relationships. We instrument the ImG model with sets of optional and application-specific constraints which can be used to check validity of meshes for a specific class of object such as manifold, pseudo-manifold, and simplicial manifold. We conducted experiments to measure the performance of the graph database solution in processing mesh queries and compare it with the GrAL mesh library and the PostgreSQL database on synthetic and real mesh datasets. The experiments show that each system performs well on specific types of mesh queries, e.g., graph databases perform well on global path-intensive queries. In the future, we will investigate database operations for the ImG model and design a mesh query language.
FPGA based charge fast histogramming for GEM detector

NASA Astrophysics Data System (ADS)

Poźniak, Krzysztof T.; Byszuk, A.; Chernyshova, M.; Cieszewski, R.; Czarski, T.; Dominik, W.; Jakubowska, K.; Kasprowicz, G.; Rzadkiewicz, J.; Scholz, M.; Zabolotny, W.

2013-10-01

This article presents a fast charge histogramming method for the position sensitive X-ray GEM detector. The energy resolved measurements are carried out simultaneously for 256 channels of the GEM detector. The whole process of histogramming is performed in 21 FPGA chips (Spartan-6 series from Xilinx). The results of the histogramming process are stored in an external DDR3 memory. The structure of the electronic measuring equipment and the firmware functionality implemented in the FPGAs are described. Examples of test measurements are presented.
Local dynamic range compensation for scanning electron microscope imaging system.

PubMed

Sim, K S; Huang, Y H

2015-01-01

This paper presents an extension of an earlier project, introducing the modified dynamic range histogram modification (MDRHM) technique, which is used to enhance the scanning electron microscope (SEM) imaging system. Compared with conventional histogram modification compensators, the technique uses histogram profiling to extend the dynamic range of each tile of an image to the full 0-255 range while retaining its histogram shape. The proposed technique yields better image compensation than conventional methods. © Wiley Periodicals, Inc.
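To make the idea of extending each tile's dynamic range to 0-255 while keeping its histogram shape more concrete, here is a minimal sketch that applies a plain linear stretch per tile. It is an assumption-laden stand-in, not the MDRHM algorithm itself: the abstract does not specify the tile size, the mapping, or how tile borders are handled, and the function names are hypothetical.

```python
def stretch_tile(tile):
    """Linearly map a tile's gray levels onto the full 0-255 range."""
    lo = min(min(row) for row in tile)
    hi = max(max(row) for row in tile)
    if hi == lo:
        return [row[:] for row in tile]  # flat tile: nothing to stretch
    scale = 255 / (hi - lo)
    return [[round((v - lo) * scale) for v in row] for row in tile]


def stretch_image(image, tile_size):
    """Apply stretch_tile independently to each tile_size x tile_size block."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            tile = [row[left:left + tile_size] for row in image[top:top + tile_size]]
            for dy, new_row in enumerate(stretch_tile(tile)):
                out[top + dy][left:left + len(new_row)] = new_row
    return out


# A 2x4 image made of two 2x2 tiles with narrow gray-level ranges.
image = [[100, 110, 40, 50],
         [120, 130, 60, 70]]
result = stretch_image(image, tile_size=2)
```

Because the mapping is linear within a tile, the relative spacing of gray levels (and hence the histogram shape) is preserved while the range is widened.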
Whole-lesion apparent diffusion coefficient histogram analysis: significance in T and N staging of gastric cancers.

PubMed

Liu, Song; Zhang, Yujuan; Chen, Ling; Guan, Wenxian; Guan, Yue; Ge, Yun; He, Jian; Zhou, Zhengyang

2017-10-02

Whole-lesion apparent diffusion coefficient (ADC) histogram analysis has been introduced and proved effective in assessment of multiple tumors. However, the application of whole-volume ADC histogram analysis in gastrointestinal tumors has only just started and has never been reported in T and N staging of gastric cancers. Eighty patients with pathologically confirmed gastric carcinomas prospectively underwent diffusion weighted (DW) magnetic resonance imaging before surgery. Whole-lesion ADC histogram analysis was performed by two radiologists independently. The differences of ADC histogram parameters among different T and N stages were compared with independent-samples Kruskal-Wallis test. Receiver operating characteristic (ROC) analysis was performed to evaluate the performance of ADC histogram parameters in differentiating particular T or N stages of gastric cancers. There were significant differences of all the ADC histogram parameters for gastric cancers at different T (except ADCmin and ADCmax) and N (except ADCmax) stages. Most ADC histogram parameters differed significantly between T1 vs T3, T1 vs T4, T2 vs T4, N0 vs N1, N0 vs N3, and some parameters (ADC5%, ADC10%, ADCmin) differed significantly between N0 vs N2, N2 vs N3 (all P < 0.05). Most parameters except ADCmax performed well in differentiating different T and N stages of gastric cancers. Especially for identifying patients with and without lymph node metastasis, the ADC10% yielded the largest area under the ROC curve of 0.794 (95% confidence interval, 0.677-0.911). All the parameters except ADCmax showed excellent inter-observer agreement with intra-class correlation coefficients higher than 0.800. Whole-volume ADC histogram parameters held great potential in differentiating different T and N stages of gastric cancers preoperatively.
Histogram Profiling of Postcontrast T1-Weighted MRI Gives Valuable Insights into Tumor Biology and Enables Prediction of Growth Kinetics and Prognosis in Meningiomas.

PubMed

Gihr, Georg Alexander; Horvath-Rizea, Diana; Kohlhof-Meinecke, Patricia; Ganslandt, Oliver; Henkes, Hans; Richter, Cindy; Hoffmann, Karl-Titus; Surov, Alexey; Schob, Stefan

2018-06-14

Meningiomas are the most frequently diagnosed intracranial masses, oftentimes requiring surgery. Procedure-related morbidity can be substantial, particularly in elderly patients. Hence, reliable imaging modalities enabling pretherapeutic prediction of tumor grade, growth kinetics, realistic prognosis, and, as a consequence, the necessity of surgery are of great value. In this context, a promising diagnostic approach is advanced analysis of magnetic resonance imaging data. Therefore, our study investigated whether histogram profiling of routinely acquired postcontrast T1-weighted images is capable of separating low-grade from high-grade lesions and whether histogram parameters reflect Ki-67 expression in meningiomas. Pretreatment T1-weighted postcontrast volumes of 44 meningioma patients were used for signal intensity histogram profiling. WHO grade, tumor volume, and Ki-67 expression were evaluated. Comparative and correlative statistics investigating the association between histogram profile parameters and neuropathology were performed. None of the investigated histogram parameters revealed significant differences between low-grade and high-grade meningiomas. However, significant correlations were identified between Ki-67 and the histogram parameters skewness and entropy as well as between entropy and tumor volume. Contrary to previously reported findings, pretherapeutic postcontrast T1-weighted images can be used to predict growth kinetics in meningiomas if whole tumor histogram analysis is employed. However, no differences between distinct WHO grades were identifiable in our cohort. As a consequence, histogram analysis of postcontrast T1-weighted images is a promising approach to obtain quantitative in vivo biomarkers reflecting the proliferative potential in meningiomas. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Comparison of image enhancement methods for the effective diagnosis in successive whole-body bone scans.

PubMed

Jeong, Chang Bu; Kim, Kwang Gi; Kim, Tae Sung; Kim, Seok Ki

2011-06-01

Whole-body bone scan is one of the most frequent diagnostic procedures in nuclear medicine. In particular, it plays a significant role in important procedures such as the diagnosis of osseous metastasis and evaluation of osseous tumor response to chemotherapy and radiation therapy. It can also be used to monitor the possibility of any recurrence of the tumor. However, it is a very time-consuming effort for radiologists to quantify subtle interval changes between successive whole-body bone scans because of many variations such as intensity, geometry, and morphology. In this paper, we present the most effective method of image enhancement based on histograms, which may assist radiologists in interpreting successive whole-body bone scans effectively. Forty-eight successive whole-body bone scans from 10 patients were obtained and evaluated using six methods of image enhancement based on histograms: histogram equalization, brightness-preserving bi-histogram equalization, contrast-limited adaptive histogram equalization, end-in search, histogram matching, and exact histogram matching (EHM). Comparison of the results of the different methods was made using three similarity measures: peak signal-to-noise ratio, histogram intersection, and structural similarity. Image enhancement of successive bone scans using EHM showed the best results out of the six methods measured for all similarity measures. EHM is the best method of image enhancement based on histograms for diagnosing successive whole-body bone scans. The method for successive whole-body bone scans has the potential to greatly assist radiologists in quantifying interval changes more accurately and quickly by compensating for the variable nature of intensity information. Consequently, it can improve radiologists' diagnostic accuracy as well as reduce reading time for detecting interval changes.
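Two of the similarity measures named in the abstract above, peak signal-to-noise ratio and histogram intersection, are simple enough to sketch on flattened 8-bit images. The normalization used for histogram intersection (dividing by the pixel count of the second image) is one common convention and an assumption here; the abstract does not say which variant the authors used, and the pixel values are made up.

```python
import math


def histogram(pixels, bins=256):
    """Count how many pixels fall on each gray level in 0..bins-1."""
    counts = [0] * bins
    for v in pixels:
        counts[v] += 1
    return counts


def histogram_intersection(pixels_a, pixels_b, bins=256):
    """Normalized histogram intersection: 1.0 means identical histograms."""
    hist_a, hist_b = histogram(pixels_a, bins), histogram(pixels_b, bins)
    overlap = sum(min(a, b) for a, b in zip(hist_a, hist_b))
    return overlap / len(pixels_b)


def psnr(pixels_a, pixels_b, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    mse = sum((a - b) ** 2 for a, b in zip(pixels_a, pixels_b)) / len(pixels_a)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


# Two tiny flattened 8-bit "scans" (made-up values).
scan_1 = [10, 10, 20, 30]
scan_2 = [10, 20, 20, 40]
hi_score = histogram_intersection(scan_1, scan_2)
psnr_db = psnr(scan_1, scan_2)
```

Structural similarity, the third measure, needs local windowed statistics and is left out of this sketch.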
Dose-volume histogram prediction using density estimation.

PubMed

Skarpman Munter, Johanna; Sjölund, Jens

2015-09-07

Knowledge of what dose-volume histograms can be expected for a previously unseen patient could increase consistency and quality in radiotherapy treatment planning. We propose a machine learning method that uses previous treatment plans to predict such dose-volume histograms. The key to the approach is the framing of dose-volume histograms in a probabilistic setting. The training consists of estimating, from the patients in the training set, the joint probability distribution of some predictive features and the dose. The joint distribution immediately provides an estimate of the conditional probability of the dose given the values of the predictive features. The prediction consists of estimating, from the new patient, the distribution of the predictive features and marginalizing the conditional probability from the training over this. Integrating the resulting probability distribution for the dose yields an estimate of the dose-volume histogram. To illustrate how the proposed method relates to previously proposed methods, we use the signed distance to the target boundary as a single predictive feature. As a proof-of-concept, we predicted dose-volume histograms for the brainstems of 22 acoustic schwannoma patients treated with stereotactic radiosurgery, and for the lungs of 9 lung cancer patients treated with stereotactic body radiation therapy. Comparing with two previous attempts at dose-volume histogram prediction we find that, given the same input data, the predictions are similar. In summary, we propose a method for dose-volume histogram prediction that exploits the intrinsic probabilistic properties of dose-volume histograms. We argue that the proposed method makes up for some deficiencies in previously proposed methods, thereby potentially increasing ease of use, flexibility and ability to perform well with small amounts of training data.
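The three steps described in the abstract above (conditioning, marginalizing, integrating) can be written compactly as follows. The notation is an assumption introduced here for illustration: d is the dose, x the predictive feature(s), p_train the joint density estimated from the training patients, and p_new the feature density of the new patient. The last line assumes the usual cumulative convention, in which the dose-volume histogram gives the fraction of the volume receiving at least dose d.

```latex
\begin{align}
  p(d \mid x) &= \frac{p_{\mathrm{train}}(d, x)}{p_{\mathrm{train}}(x)}
    && \text{(conditional from the training joint)} \\
  \hat{p}(d) &= \int p(d \mid x)\, p_{\mathrm{new}}(x)\, \mathrm{d}x
    && \text{(marginalize over the new patient's features)} \\
  \widehat{\mathrm{DVH}}(d) &= \int_{d}^{\infty} \hat{p}(d')\, \mathrm{d}d'
    && \text{(cumulative dose-volume histogram)}
\end{align}
```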
Probabilistic and machine learning-based retrieval approaches for biomedical dataset retrieval

PubMed Central

Karisani, Payam; Qin, Zhaohui S; Agichtein, Eugene

2018-01-01

The bioCADDIE dataset retrieval challenge brought together different approaches to retrieval of biomedical datasets relevant to a user’s query, expressed as a text description of a needed dataset. We describe experiments in applying a data-driven, machine learning-based approach to biomedical dataset retrieval as part of this challenge. We report on a series of experiments carried out to evaluate the performance of both probabilistic and machine learning-driven techniques from information retrieval, as applied to this challenge. Our experiments with probabilistic information retrieval methods, such as query term weight optimization, automatic query expansion and simulated user relevance feedback, demonstrate that automatically boosting the weights of important keywords in a verbose query is more effective than other methods. We also show that although there is a rich space of potential representations and features available in this domain, machine learning-based re-ranking models are not able to improve on probabilistic information retrieval techniques with the currently available training data. The models and algorithms presented in this paper can serve as a viable implementation of a search engine to provide access to biomedical datasets. The retrieval performance is expected to be further improved by using additional training data that is created by expert annotation, or gathered through usage logs, clicks and other processes during natural operation of the system. Database URL: https://github.com/emory-irlab/biocaddie PMID:29688379
Structure Size Enhanced Histogram

NASA Astrophysics Data System (ADS)

Wesarg, Stefan; Kirschner, Matthias

Direct volume visualization requires the definition of transfer functions (TFs) for the assignment of opacity and color. Multi-dimensional TFs are based on at least two image properties, and are specified by means of 2D histograms. In this work we propose a new type of a 2D histogram which combines gray value with information about the size of the structures. This structure size enhanced (SSE) histogram is an intuitive approach for representing anatomical features. Clinicians — the users we are focusing on — are much more familiar with selecting features by their size than by their gradient magnitude value. As a proof of concept, we employ the SSE histogram for the definition of two-dimensional TFs for the visualization of 3D MRI and CT image data.
Face recognition algorithm using extended vector quantization histogram features.

PubMed

Yan, Yan; Lee, Feifei; Wu, Xueqian; Chen, Qiu

2018-01-01

In this paper, we propose a face recognition algorithm based on a combination of vector quantization (VQ) and Markov stationary features (MSF). The VQ algorithm has been shown to be an effective method for generating features; it extracts a codevector histogram as a facial feature representation for face recognition. Still, the VQ histogram features are unable to convey spatial structural information, which to some extent limits their usefulness in discrimination. To alleviate this limitation of VQ histograms, we utilize Markov stationary features (MSF) to extend the VQ histogram-based features so as to add spatial structural information. We demonstrate the effectiveness of our proposed algorithm by achieving recognition results superior to those of several state-of-the-art methods on publicly available face databases.
Ultrasonic histogram assessment of early response to concurrent chemo-radiotherapy in patients with locally advanced cervical cancer: a feasibility study.

PubMed

Xu, Yan; Ru, Tong; Zhu, Lijing; Liu, Baorui; Wang, Huanhuan; Zhu, Li; He, Jian; Liu, Song; Zhou, Zhengyang; Yang, Xiaofeng

To monitor early response for locally advanced cervical cancers undergoing concurrent chemo-radiotherapy (CCRT) by ultrasonic histogram. B-mode ultrasound examinations were performed at 4 time points in thirty-four patients during CCRT. Six ultrasonic histogram parameters were used to assess the echogenicity, homogeneity and heterogeneity of tumors. Ipeak increased rapidly from the first week after therapy initiation, whereas Wlow, Whigh and Ahigh changed significantly at the second week. The average ultrasonic histogram progressively moved toward the right and converted into a more symmetrical shape. Ultrasonic histogram could serve as a potential marker to monitor early response during CCRT. Copyright © 2018 Elsevier Inc. All rights reserved.
Face verification system for Android mobile devices using histogram based features

NASA Astrophysics Data System (ADS)

Sato, Sho; Kobayashi, Kazuhiro; Chen, Qiu

2016-07-01

This paper proposes a face verification system that runs on Android mobile devices. In this system, a facial image is first captured by a built-in camera on the Android device, and then face detection is implemented using Haar-like features and the AdaBoost learning algorithm. The proposed system verifies the detected face using histogram based features, which are generated by binary Vector Quantization (VQ) histogram using DCT coefficients in low frequency domains, as well as Improved Local Binary Pattern (Improved LBP) histogram in spatial domain. Verification results with different types of histogram based features are first obtained separately and then combined by weighted averaging. We evaluate our proposed algorithm by using the publicly available ORL database and facial images captured by an Android tablet.
A Deep Learning Method to Automatically Identify Reports of Scientifically Rigorous Clinical Research from the Biomedical Literature: Comparative Analytic Study.

PubMed

Del Fiol, Guilherme; Michelson, Matthew; Iorio, Alfonso; Cotoi, Chris; Haynes, R Brian

2018-06-25

A major barrier to the practice of evidence-based medicine is efficiently finding scientifically sound studies on a given clinical topic. To investigate a deep learning approach to retrieve scientifically sound treatment studies from the biomedical literature. We trained a Convolutional Neural Network using a noisy dataset of 403,216 PubMed citations with title and abstract as features. The deep learning model was compared with state-of-the-art search filters, such as PubMed's Clinical Query Broad treatment filter, McMaster's textword search strategy (no Medical Subject Heading, MeSH, terms), and Clinical Query Balanced treatment filter. A previously annotated dataset (Clinical Hedges) was used as the gold standard. The deep learning model obtained significantly lower recall than the Clinical Queries Broad treatment filter (96.9% vs 98.4%; P<.001); and equivalent recall to McMaster's textword search (96.9% vs 97.1%; P=.57) and Clinical Queries Balanced filter (96.9% vs 97.0%; P=.63). Deep learning obtained significantly higher precision than the Clinical Queries Broad filter (34.6% vs 22.4%; P<.001) and McMaster's textword search (34.6% vs 11.8%; P<.001), but was significantly lower than the Clinical Queries Balanced filter (34.6% vs 40.9%; P<.001). Deep learning performed well compared to state-of-the-art search filters, especially when citations were not indexed. Unlike previous machine learning approaches, the proposed deep learning model does not require feature engineering, or time-sensitive or proprietary features, such as MeSH terms and bibliometrics. Deep learning is a promising approach to identifying reports of scientifically rigorous clinical research. Further work is needed to optimize the deep learning model and to assess generalizability to other areas, such as diagnosis, etiology, and prognosis. ©Guilherme Del Fiol, Matthew Michelson, Alfonso Iorio, Chris Cotoi, R Brian Haynes. 
Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 25.06.2018.
Combining Vector Quantization and Histogram Equalization.

ERIC Educational Resources Information Center

Cosman, Pamela C.; And Others

1992-01-01

Discussion of contrast enhancement techniques focuses on the use of histogram equalization with a data compression technique, i.e., tree-structured vector quantization. The enhancement technique of intensity windowing is described, and the use of enhancement techniques for medical images is explained, including adaptive histogram equalization.…
Histogram Curve Matching Approaches for Object-based Image Classification of Land Cover and Land Use

PubMed Central

Toure, Sory I.; Stow, Douglas A.; Weeks, John R.; Kumar, Sunil

2013-01-01

The classification of image-objects is usually done using parametric statistical measures of central tendency and/or dispersion (e.g., mean or standard deviation). The objectives of this study were to analyze digital number histograms of image objects and evaluate classification measures exploiting characteristic signatures of such histograms. Two histogram matching classifiers were evaluated and compared to the standard nearest neighbor to mean classifier. An ADS40 airborne multispectral image of San Diego, California was used for assessing the utility of curve matching classifiers in a geographic object-based image analysis (GEOBIA) approach. The classifications were performed with data sets having 0.5 m, 2.5 m, and 5 m spatial resolutions. Results show that histograms are reliable features for characterizing classes. Also, both histogram matching classifiers consistently performed better than the one based on the standard nearest neighbor to mean rule. The highest classification accuracies were produced with images having 2.5 m spatial resolution. PMID:24403648
Histogram and gray level co-occurrence matrix on gray-scale ultrasound images for diagnosing lymphocytic thyroiditis.

PubMed

Shin, Young Gyung; Yoo, Jaeheung; Kwon, Hyeong Ju; Hong, Jung Hwa; Lee, Hye Sun; Yoon, Jung Hyun; Kim, Eun-Kyung; Moon, Hee Jung; Han, Kyunghwa; Kwak, Jin Young

2016-08-01

The objective of the study was to evaluate whether texture analysis using histogram and gray level co-occurrence matrix (GLCM) parameters can help clinicians diagnose lymphocytic thyroiditis (LT) and differentiate LT according to pathologic grade. The background thyroid pathology of 441 patients was classified into no evidence of LT, chronic LT (CLT), and Hashimoto's thyroiditis (HT). Histogram and GLCM parameters were extracted from the regions of interest on ultrasound. The diagnostic performances of the parameters for diagnosing and differentiating LT were calculated. Of the histogram and GLCM parameters, the mean on histogram had the highest Az (0.63) and VUS (0.303). As the degrees of LT increased, the mean decreased and the standard deviation and entropy increased. The mean on histogram from gray-scale ultrasound showed the best diagnostic performance as a single parameter in differentiating LT according to pathologic grade as well as in diagnosing LT. Copyright © 2016 Elsevier Ltd. All rights reserved.
Time-cumulated visible and infrared radiance histograms used as descriptors of surface and cloud variations

NASA Technical Reports Server (NTRS)

Seze, Genevieve; Rossow, William B.

1991-01-01

The spatial and temporal stability of the distributions of satellite-measured visible and infrared radiances, caused by variations in clouds and surfaces, are investigated using bidimensional and monodimensional histograms and time-composite images. Similar analysis of the histograms of the original and time-composite images provides separation of the contributions of the space and time variations to the total variations. The variability of both the surfaces and clouds is found to be larger at scales much larger than the minimum resolved by satellite imagery. This study shows that the shapes of these histograms are distinctive characteristics of the different climate regimes and that particular attributes of these histograms can be related to several general, though not universal, properties of clouds and surface variations at regional and synoptic scales. There are also significant exceptions to these relationships in particular climate regimes. The characteristics of these radiance histograms provide a stable well defined descriptor of the cloud and surface properties.
Apparent diffusion coefficient histogram shape analysis for monitoring early response in patients with advanced cervical cancers undergoing concurrent chemo-radiotherapy.

PubMed

Meng, Jie; Zhu, Lijing; Zhu, Li; Wang, Huanhuan; Liu, Song; Yan, Jing; Liu, Baorui; Guan, Yue; Ge, Yun; He, Jian; Zhou, Zhengyang; Yang, Xiaofeng

2016-10-22

To explore the role of apparent diffusion coefficient (ADC) histogram shape related parameters in early assessment of treatment response during the concurrent chemo-radiotherapy (CCRT) course of advanced cervical cancers. This prospective study was approved by the local ethics committee and informed consent was obtained from all patients. Thirty-two patients with advanced cervical squamous cell carcinomas underwent diffusion weighted magnetic resonance imaging (b values, 0 and 800 s/mm²) before CCRT, at the end of 2nd and 4th week during CCRT and immediately after CCRT completion. Whole lesion ADC histogram analysis generated several histogram shape related parameters including skewness, kurtosis, s-sD av , width, standard deviation, as well as first-order entropy and second-order entropies. The averaged ADC histograms of 32 patients were generated to visually observe dynamic changes of the histogram shape following CCRT. All parameters except width and standard deviation showed significant changes during CCRT (all P < 0.05), and their variation trends fell into four different patterns. Skewness and kurtosis both showed high early decline rates (43.10%, 48.29%) at the end of 2nd week of CCRT. All entropies kept decreasing significantly from 2 weeks after CCRT initiation. The shape of averaged ADC histogram also changed obviously following CCRT. ADC histogram shape analysis held potential for monitoring early tumor response in patients with advanced cervical cancers undergoing CCRT.
[Clinical application of MRI histogram in evaluation of muscle fatty infiltration].

PubMed

Zheng, Y M; Du, J; Li, W Z; Wang, Z X; Zhang, W; Xiao, J X; Yuan, Y

2016-10-18

To describe a method based on analysis of the histogram of intensity values produced from magnetic resonance imaging (MRI) for quantifying the degree of fatty infiltration. The study included 25 patients with dystrophinopathy. All the subjects underwent a muscle MRI test at thigh level. The histogram M values of 250 muscles adjusted for subcutaneous fat, representing the degree of fatty infiltration, were compared with the expert visual reading using the modified Mercuri scale. There was a significant positive correlation between the histogram M values and the scores of visual reading (r=0.854, P<0.001). The distinct pattern of muscle involvement detected in the patients with dystrophinopathy in our study of histogram M values was similar to that of visual reading and results in the literature. The histogram M values had stronger correlations with the clinical data than the scores of visual reading, as follows: the correlations with age were (r=0.730, P<0.001) and (r=0.753, P<0.001); with strength of knee extensor, (r=-0.468, P=0.024) and (r=-0.460, P=0.027), respectively. Meanwhile, histogram M value analysis had better repeatability than visual reading, with interclass correlation coefficients of 0.998 (95% CI: 0.997-0.998, P<0.001) and 0.958 (95% CI: 0.946-0.967, P<0.001), respectively. Histogram M value analysis of MRI, with the advantages of repeatability and objectivity, can be used to evaluate the degree of muscle fatty infiltration.

Dissimilarity representations in lung parenchyma classification

NASA Astrophysics Data System (ADS)

Sørensen, Lauge; de Bruijne, Marleen

2009-02-01

A good problem representation is important for a pattern recognition system to be successful. The traditional approach to statistical pattern recognition is feature representation. More specifically, objects are represented by a number of features in a feature vector space, and classifiers are built in this representation. This is also the general trend in lung parenchyma classification in computed tomography (CT) images, where the features often are measures on feature histograms. Instead, we propose to build normal density based classifiers in dissimilarity representations for lung parenchyma classification. This allows for the classifiers to work on dissimilarities between objects, which might be a more natural way of representing lung parenchyma. In this context, dissimilarity is defined between CT regions of interest (ROIs). ROIs are represented by their CT attenuation histogram and ROI dissimilarity is defined as a histogram dissimilarity measure between the attenuation histograms. In this setting, the full histograms are utilized according to the chosen histogram dissimilarity measure. We apply this idea to classification of different emphysema patterns as well as normal, healthy tissue. Two dissimilarity representation approaches as well as different histogram dissimilarity measures are considered. The approaches are evaluated on a set of 168 CT ROIs using normal density based classifiers all showing good performance. Compared to using histogram dissimilarity directly as distance in a k nearest neighbor classifier, which achieves a classification accuracy of 92.9%, the best dissimilarity representation based classifier is significantly better with a classification accuracy of 97.0% (p = 0.046).
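The baseline mentioned in this abstract, using a histogram dissimilarity directly as the distance in a k nearest neighbor classifier, can be sketched as follows. This is a minimal illustration only: the L1 distance between normalized histograms is an assumed choice of dissimilarity measure, and the function names, labels, and toy histograms are hypothetical rather than taken from the paper.

```python
from collections import Counter


def normalize(hist):
    """Scale bin counts so that they sum to 1."""
    total = sum(hist)
    return [h / total for h in hist]


def l1_dissimilarity(h1, h2):
    """L1 distance between two normalized histograms (an assumed measure)."""
    return sum(abs(a - b) for a, b in zip(normalize(h1), normalize(h2)))


def knn_classify(query, labelled, k=1):
    """Majority label among the k labelled histograms closest to `query`."""
    ranked = sorted(labelled, key=lambda item: l1_dissimilarity(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 4-bin attenuation histograms with toy labels.
training = [
    ([8, 2, 0, 0], "emphysema"),
    ([7, 3, 0, 0], "emphysema"),
    ([0, 1, 5, 4], "normal"),
    ([0, 2, 4, 4], "normal"),
]
print(knn_classify([6, 3, 1, 0], training, k=3))
```

The dissimilarity representation approach described in the abstract goes further than this baseline: it treats the vector of dissimilarities to a set of prototype ROIs as the feature vector and trains a classifier on it.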
LETTER TO THE EDITOR: Constant-time solution to the global optimization problem using Brüschweiler's ensemble search algorithm

NASA Astrophysics Data System (ADS)

Protopopescu, V.; D'Helon, C.; Barhen, J.

2003-06-01

A constant-time solution of the continuous global optimization problem (GOP) is obtained by using an ensemble algorithm. We show that under certain assumptions, the solution can be guaranteed by mapping the GOP onto a discrete unsorted search problem, whereupon Brüschweiler's ensemble search algorithm is applied. For adequate sensitivities of the measurement technique, the query complexity of the ensemble search algorithm depends linearly on the size of the function's domain. Advantages and limitations of an eventual NMR implementation are discussed.
The Histogram-Area Connection

ERIC Educational Resources Information Center

Gratzer, William; Carpenter, James E.

2008-01-01

This article demonstrates an alternative approach to the construction of histograms--one based on the notion of using area to represent relative density in intervals of unequal length. The resulting histograms illustrate the connection between the area of the rectangles associated with particular outcomes and the relative frequency (probability)…
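The area-based construction can be illustrated with a short sketch: for bins of unequal width, the rectangle height is the relative frequency divided by the bin width, so that each rectangle's area equals the relative frequency and the areas sum to 1. The function name and the sample data below are hypothetical.

```python
def density_histogram(values, edges):
    """Return (relative_frequencies, heights) for bins given by `edges`.

    Bins are half-open [lo, hi), except the last, which also includes its
    right edge. Height = relative frequency / bin width, so area = frequency.
    """
    n = len(values)
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[i + 1]):
                counts[i] += 1
                break
    rel = [c / n for c in counts]
    heights = [r / (edges[i + 1] - edges[i]) for i, r in enumerate(rel)]
    return rel, heights


values = [1, 2, 2, 3, 4, 6, 7, 12, 15, 19]
edges = [0, 5, 10, 20]  # the last bin is twice as wide as the others
rel, heights = density_histogram(values, edges)
areas = [h * (edges[i + 1] - edges[i]) for i, h in enumerate(heights)]
print(rel, heights, sum(areas))
```

Plotting raw counts as heights would overstate the wide last bin; dividing by the width removes that distortion.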
Investigating Student Understanding of Histograms

ERIC Educational Resources Information Center

Kaplan, Jennifer J.; Gabrosek, John G.; Curtiss, Phyllis; Malone, Chris

2014-01-01

Histograms are adept at revealing the distribution of data values, especially the shape of the distribution and any outlier values. They are included in introductory statistics texts, research methods texts, and in the popular press, yet students often have difficulty interpreting the information conveyed by a histogram. This research identifies…
Comparative Analysis of Rank Aggregation Techniques for Metasearch Using Genetic Algorithm

ERIC Educational Resources Information Center

Kaur, Parneet; Singh, Manpreet; Singh Josan, Gurpreet

2017-01-01

Rank Aggregation techniques have found wide applications for metasearch along with other streams such as Sports, Voting System, Stock Markets, and Reduction in Spam. This paper presents the optimization of rank lists for web queries put by the user on different MetaSearch engines. A metaheuristic approach such as Genetic algorithm based rank…
Thresholding histogram equalization.

PubMed

Chuang, K S; Chen, S; Hwang, I M

2001-12-01

The drawbacks of adaptive histogram equalization techniques are the loss of definition on the edges of the object and overenhancement of noise in the images. These drawbacks can be avoided if the noise is excluded in the equalization transformation function computation. A method has been developed to separate the histogram into zones, each with its own equalization transformation. This method can be used to suppress the nonanatomic noise and enhance only certain parts of the object. This method can be combined with other adaptive histogram equalization techniques. Preliminary results indicate that this method can produce images with superior contrast.
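For orientation, the sketch below shows plain global histogram equalization, the basic transformation that adaptive and zoned variants build on. It is not the thresholding method described in this abstract; the function name and the toy pixel values are hypothetical, and the input is assumed to be a non-empty flat list of integer gray levels.

```python
def equalize(pixels, levels=256):
    """Globally equalize a non-empty flat list of gray levels in [0, levels)."""
    # Histogram of gray levels.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of the histogram.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    n = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if n == cdf_min:
        # Only one gray level is present, so there is nothing to stretch.
        return list(pixels)

    # Map each level through the rescaled CDF, using integer round-half-up
    # and clamping unused low levels to 0.
    span = n - cdf_min
    lut = [max(0, ((c - cdf_min) * (levels - 1) + span // 2) // span) for c in cdf]
    return [lut[p] for p in pixels]


print(equalize([52, 52, 60, 60, 60, 70, 70, 90]))
```

A zoned scheme such as the one the abstract describes would compute a separate transformation of this kind for each part of the histogram, so that levels attributed to noise do not influence the mapping applied to the object.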
Variations of attractors and wavelet spectra of the immunofluorescence distributions for women in the pregnant period

NASA Astrophysics Data System (ADS)

Galich, Nikolay E.

2008-07-01

This communication describes the treatment of immunology data. New nonlinear methods for the statistical analysis of immunofluorescence of peripheral blood neutrophils have been developed. We used the respiratory burst reaction of DNA fluorescence in neutrophil cell nuclei due to oxidative activity. Histograms of the photon count statistics of radiant neutrophil populations in flow cytometry experiments are considered. Distributions of the fluorescence flash frequency as functions of the fluorescence intensity are analyzed. Statistical peculiarities of the histogram set for women in the pregnant period allow all histograms to be divided into three classes. The classification is based on three different types of smoothing and long-range scale-averaged immunofluorescence distributions, their bifurcation and wavelet spectra. Heterogeneity peculiarities of the long-range scale immunofluorescence distributions and peculiarities of the wavelet spectra allow all histograms to be divided into three groups. The first group of histograms belongs to healthy donors. The two other groups belong to donors with autoimmune and inflammatory diseases. Some of the illnesses are not diagnosed by standard biochemical methods. Medical standards and statistical data of the immunofluorescence histograms for identifying health and illness are interconnected. Peculiarities of immunofluorescence for women in the pregnant period are classified. Health or illness criteria are connected with statistical features of the immunofluorescence histograms. The fluorescence of neutrophil populations is a sensitive, clear indicator of health status.
Complexity of possibly gapped histogram and analysis of histogram.

PubMed

Fushing, Hsieh; Roy, Tania

2018-02-01

We demonstrate that gaps and distributional patterns embedded within real-valued measurements are inseparable biological and mechanistic information contents of the system. Such patterns are discovered through data-driven possibly gapped histogram, which further leads to the geometry-based analysis of histogram (ANOHT). Constructing a possibly gapped histogram is a complex problem of statistical mechanics due to the ensemble of candidate histograms being captured by a two-layer Ising model. This construction is also a distinctive problem of Information Theory from the perspective of data compression via uniformity. By defining a Hamiltonian (or energy) as a sum of total coding lengths of boundaries and total decoding errors within bins, this issue of computing the minimum energy macroscopic states is surprisingly resolved by applying the hierarchical clustering algorithm. Thus, a possibly gapped histogram corresponds to a macro-state. And then the first phase of ANOHT is developed for simultaneous comparison of multiple treatments, while the second phase of ANOHT is developed based on classical empirical process theory for a tree-geometry that can check the authenticity of branches of the treatment tree. The well-known Iris data are used to illustrate our technical developments. Also, a large baseball pitching dataset and a heavily right-censored divorce data are analysed to showcase the existential gaps and utilities of ANOHT.
Histogram-based quantitative evaluation of endobronchial ultrasonography images of peripheral pulmonary lesion.

PubMed

Morikawa, Kei; Kurimoto, Noriaki; Inoue, Takeo; Mineshita, Masamichi; Miyazawa, Teruomi

2015-01-01

Endobronchial ultrasonography using a guide sheath (EBUS-GS) is an increasingly common bronchoscopic technique, but currently, no methods have been established to quantitatively evaluate EBUS images of peripheral pulmonary lesions. The purpose of this study was to evaluate whether histogram data collected from EBUS-GS images can contribute to the diagnosis of lung cancer. Histogram-based analyses focusing on the brightness of EBUS images were retrospectively conducted: 60 patients (38 lung cancer; 22 inflammatory diseases), with clear EBUS images were included. For each patient, a 400-pixel region of interest was selected, typically located at a 3- to 5-mm radius from the probe, from recorded EBUS images during bronchoscopy. Histogram height, width, height/width ratio, standard deviation, kurtosis and skewness were investigated as diagnostic indicators. Median histogram height, width, height/width ratio and standard deviation were significantly different between lung cancer and benign lesions (all p < 0.01). With a cutoff value for standard deviation of 10.5, lung cancer could be diagnosed with an accuracy of 81.7%. Other characteristics investigated were inferior when compared to histogram standard deviation. Histogram standard deviation appears to be the most useful characteristic for diagnosing lung cancer using EBUS images. © 2015 S. Karger AG, Basel.
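The shape indicators listed in this abstract (standard deviation, skewness, kurtosis) can be computed from the brightness values in a region of interest. The sketch below assumes population moments and reports excess kurtosis; these conventions, the function name, and the toy values are assumptions, and the study's exact definitions of histogram height and width are not reproduced here.

```python
import math


def shape_stats(values):
    """Standard deviation, skewness, and excess kurtosis of `values`.

    Assumption: population (biased) central moments are used.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("constant input: skewness and kurtosis are undefined")
    return {
        "sd": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3,
    }


# Hypothetical brightness values from a small region of interest.
roi = [1, 2, 3, 4, 5]
print(shape_stats(roi))
```

The resulting standard deviation could then be compared against a cutoff such as the 10.5 reported above; the abstract does not say which side of the cutoff indicates lung cancer, so that decision is left out of the sketch.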
Complexity of possibly gapped histogram and analysis of histogram

PubMed Central

Roy, Tania

2018-01-01

We demonstrate that gaps and distributional patterns embedded within real-valued measurements are inseparable biological and mechanistic information contents of the system. Such patterns are discovered through data-driven possibly gapped histogram, which further leads to the geometry-based analysis of histogram (ANOHT). Constructing a possibly gapped histogram is a complex problem of statistical mechanics due to the ensemble of candidate histograms being captured by a two-layer Ising model. This construction is also a distinctive problem of Information Theory from the perspective of data compression via uniformity. By defining a Hamiltonian (or energy) as a sum of total coding lengths of boundaries and total decoding errors within bins, this issue of computing the minimum energy macroscopic states is surprisingly resolved by applying the hierarchical clustering algorithm. Thus, a possibly gapped histogram corresponds to a macro-state. And then the first phase of ANOHT is developed for simultaneous comparison of multiple treatments, while the second phase of ANOHT is developed based on classical empirical process theory for a tree-geometry that can check the authenticity of branches of the treatment tree. The well-known Iris data are used to illustrate our technical developments. Also, a large baseball pitching dataset and a heavily right-censored divorce data are analysed to showcase the existential gaps and utilities of ANOHT. PMID:29515829
Complexity of possibly gapped histogram and analysis of histogram

NASA Astrophysics Data System (ADS)

Fushing, Hsieh; Roy, Tania

2018-02-01

We demonstrate that gaps and distributional patterns embedded within real-valued measurements are inseparable biological and mechanistic information contents of the system. Such patterns are discovered through data-driven possibly gapped histogram, which further leads to the geometry-based analysis of histogram (ANOHT). Constructing a possibly gapped histogram is a complex problem of statistical mechanics due to the ensemble of candidate histograms being captured by a two-layer Ising model. This construction is also a distinctive problem of Information Theory from the perspective of data compression via uniformity. By defining a Hamiltonian (or energy) as a sum of total coding lengths of boundaries and total decoding errors within bins, this issue of computing the minimum energy macroscopic states is surprisingly resolved by applying the hierarchical clustering algorithm. Thus, a possibly gapped histogram corresponds to a macro-state. And then the first phase of ANOHT is developed for simultaneous comparison of multiple treatments, while the second phase of ANOHT is developed based on classical empirical process theory for a tree-geometry that can check the authenticity of branches of the treatment tree. The well-known Iris data are used to illustrate our technical developments. Also, a large baseball pitching dataset and a heavily right-censored divorce data are analysed to showcase the existential gaps and utilities of ANOHT.
Optimal image alignment with random projections of manifolds: algorithm and geometric analysis.

PubMed

Kokiopoulou, Effrosyni; Kressner, Daniel; Frossard, Pascal

2011-06-01

This paper addresses the problem of image alignment based on random measurements. Image alignment consists of estimating the relative transformation between a query image and a reference image. We consider the specific problem where the query image is provided in compressed form in terms of linear measurements captured by a vision sensor. We cast the alignment problem as a manifold distance minimization problem in the linear subspace defined by the measurements. The transformation manifold that represents synthesis of shift, rotation, and isotropic scaling of the reference image can be given in closed form when the reference pattern is sparsely represented over a parametric dictionary. We show that the objective function can then be decomposed as the difference of two convex functions (DC) in the particular case where the dictionary is built on Gaussian functions. Thus, the optimization problem becomes a DC program, which in turn can be solved globally by a cutting plane method. The quality of the solution is typically affected by the number of random measurements and the condition number of the manifold that describes the transformations of the reference image. We show that the curvature, which is closely related to the condition number, remains bounded in our image alignment problem, which means that the relative transformation between two images can be determined optimally in a reduced subspace.
Clauser-Horne-Shimony-Holt versus three-party pseudo-telepathy: on the optimal number of samples in device-independent quantum private query

NASA Astrophysics Data System (ADS)

Basak, Jyotirmoy; Maitra, Subhamoy

2018-04-01

In the device-independent (DI) paradigm, the trust assumptions about the devices are removed and a CHSH test is performed to check the functionality of the devices, toward certifying the security of the protocol. The existing DI protocols consider an infinite number of samples from a theoretical point of view, though this is not practically implementable. For finite-sample analysis of the existing DI protocols, we may also consider strategies for checking device independence other than the CHSH test. In this direction, we present a comparative analysis between the CHSH and three-party pseudo-telepathy games for the quantum private query protocol in the DI paradigm that appeared recently in Maitra et al. (Phys Rev A 95:042344, 2017).
Big Data Analytics with Datalog Queries on Spark.

PubMed

Shkapsky, Alexander; Yang, Mohan; Interlandi, Matteo; Chiu, Hsuan; Condie, Tyson; Zaniolo, Carlo

2016-01-01

There is great interest in exploiting the opportunity provided by cloud computing platforms for large-scale analytics. Among these platforms, Apache Spark is growing in popularity for machine learning and graph analytics. Developing efficient complex analytics in Spark requires deep understanding of both the algorithm at hand and the Spark API or subsystem APIs (e.g., Spark SQL, GraphX). Our BigDatalog system addresses the problem by providing concise declarative specification of complex queries amenable to efficient evaluation. Towards this goal, we propose compilation and optimization techniques that tackle the important problem of efficiently supporting recursion in Spark. We perform an experimental comparison with other state-of-the-art large-scale Datalog systems and verify the efficacy of our techniques and effectiveness of Spark in supporting Datalog-based analytics.
Big Data Analytics with Datalog Queries on Spark

PubMed Central

Shkapsky, Alexander; Yang, Mohan; Interlandi, Matteo; Chiu, Hsuan; Condie, Tyson; Zaniolo, Carlo

2017-01-01

There is great interest in exploiting the opportunity provided by cloud computing platforms for large-scale analytics. Among these platforms, Apache Spark is growing in popularity for machine learning and graph analytics. Developing efficient complex analytics in Spark requires deep understanding of both the algorithm at hand and the Spark API or subsystem APIs (e.g., Spark SQL, GraphX). Our BigDatalog system addresses the problem by providing concise declarative specification of complex queries amenable to efficient evaluation. Towards this goal, we propose compilation and optimization techniques that tackle the important problem of efficiently supporting recursion in Spark. We perform an experimental comparison with other state-of-the-art large-scale Datalog systems and verify the efficacy of our techniques and effectiveness of Spark in supporting Datalog-based analytics. PMID:28626296
Three-dimensional volumetric gray-scale uterine cervix histogram prediction of days to delivery in full term pregnancy.

PubMed

Kim, Ji Youn; Kim, Hai-Joong; Hahn, Meong Hi; Jeon, Hye Jin; Cho, Geum Joon; Hong, Sun Chul; Oh, Min Jeong

2013-09-01

Our aim was to determine whether the volumetric gray-scale histogram difference between the anterior and posterior cervix can indicate the extent of cervical consistency. We collected data from 95 patients who were appropriate for vaginal delivery at the 36th to 37th weeks of gestational age from September 2010 to October 2011 in the Department of Obstetrics and Gynecology, Korea University Ansan Hospital. Patients were excluded if they had any of the following: Cesarean section, labor induction, or premature rupture of membranes. Thirty-four patients were finally enrolled. The patients underwent evaluation of the cervix through Bishop score, cervical length, cervical volume, and three-dimensional (3D) cervical volumetric gray-scale histogram. The interval in days from the cervix evaluation to the delivery day was counted. We compared the 3D cervical volumetric gray-scale histogram, Bishop score, cervical length, and cervical volume with the interval in days from the evaluation of the cervix to delivery. The gray-scale histogram difference between the anterior and posterior cervix was significantly correlated with days to delivery; its correlation coefficient (R) was 0.500 (P = 0.003). The cervical length was also significantly related to the days to delivery, with a correlation coefficient (R) of 0.421 and a P-value of 0.013. However, anterior lip histogram, posterior lip histogram, total cervical volume, and Bishop score were not associated with days to delivery (P > 0.05). The gray-scale histogram difference between the anterior and posterior cervix and the cervical length were correlated with the days to delivery. These methods can be utilized to help predict cervical consistency.
Construction and Evaluation of Histograms in Teacher Training

ERIC Educational Resources Information Center

Bruno, A.; Espinel, M. C.

2009-01-01

This article details the results of a written test designed to reveal how education majors construct and evaluate histograms and frequency polygons. Included is a description of the mistakes made by the students which shows how they tend to confuse histograms with bar diagrams, incorrectly assign data along the Cartesian axes and experience…
Empirical Histograms in Item Response Theory with Ordinal Data

ERIC Educational Resources Information Center

Woods, Carol M.

2007-01-01

The purpose of this research is to describe, test, and illustrate a new implementation of the empirical histogram (EH) method for ordinal items. The EH method involves the estimation of item response model parameters simultaneously with the approximation of the distribution of the random latent variable (theta) as a histogram. Software for the EH…
Symbol recognition via statistical integration of pixel-level constraint histograms: a new descriptor.

PubMed

Yang, Su

2005-02-01

A new descriptor for symbol recognition is proposed. 1) A histogram is constructed for every pixel to figure out the distribution of the constraints among the other pixels. 2) All the histograms are statistically integrated to form a feature vector with fixed dimension. The robustness and invariance were experimentally confirmed.
Airborne gamma-ray spectrometer and magnetometer survey, Durango D, Colorado. Final report Volume II A. Detail area

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

1983-01-01

This volume contains geology of the Durango D detail area, radioactive mineral occurrences in Colorado, and geophysical data interpretation. Eight appendices provide: stacked profiles, geologic histograms, geochemical histograms, speed and altitude histograms, geologic statistical tables, geochemical statistical tables, magnetic and ancillary profiles, and test line data.

Airborne gamma-ray spectrometer and magnetometer survey, Durango C, Colorado. Final report Volume II A. Detail area

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

1983-01-01

Geology of Durango C detail area, radioactive mineral occurrences in Colorado, and geophysical data interpretation are included in this report. Eight appendices provide: stacked profiles, geologic histograms, geochemical histograms, speed and altitude histograms, geologic statistical tables, magnetic and ancillary profiles, and test line data.
Making Temporal Search More Central in Spatial Data Infrastructures

NASA Astrophysics Data System (ADS)

Corti, P.; Lewis, B.

2017-10-01

A temporally enabled Spatial Data Infrastructure (SDI) is a framework of geospatial data, metadata, users, and tools intended to provide an efficient and flexible way to use spatial information which includes the historical dimension. One of the key software components of an SDI is the catalogue service which is needed to discover, query, and manage the metadata. A search engine is a software system capable of supporting fast and reliable search, which may use any means necessary to get users to the resources they need quickly and efficiently. These techniques may include features such as full text search, natural language processing, weighted results, temporal search based on enrichment, visualization of patterns in distributions of results in time and space using temporal and spatial faceting, and many others. In this paper we will focus on the temporal aspects of search which include temporal enrichment using a time miner - a software engine able to search for date components within a larger block of text, the storage of time ranges in the search engine, handling historical dates, and the use of temporal histograms in the user interface to display the temporal distribution of search results.
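The temporal histogram mentioned at the end of this abstract amounts to counting search results per time bucket. The sketch below is a minimal, search-engine-independent illustration; the function name, the bucket size, and the sample dates are hypothetical, and a real catalogue would typically delegate this to the faceting feature of its search engine.

```python
from collections import Counter
from datetime import date


def temporal_histogram(dates, start_year, end_year, bucket_years=10):
    """Count dates per bucket of `bucket_years` within [start_year, end_year).

    Empty buckets are kept, so the result can be drawn directly as a bar chart.
    """
    counts = Counter()
    for d in dates:
        if start_year <= d.year < end_year:
            offset = (d.year - start_year) // bucket_years
            counts[start_year + offset * bucket_years] += 1
    return [(b, counts[b]) for b in range(start_year, end_year, bucket_years)]


# Hypothetical dates extracted from search results.
results = [date(1905, 1, 1), date(1912, 6, 1), date(1918, 3, 2), date(1931, 1, 1)]
print(temporal_histogram(results, 1900, 1940))
```

Keeping the empty 1920 bucket in the output matters for display: gaps in the temporal distribution are part of what the user interface is meant to show.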
Stationary Wavelet Transform and AdaBoost with SVM Based Pathological Brain Detection in MRI Scanning.

PubMed

Nayak, Deepak Ranjan; Dash, Ratnakar; Majhi, Banshidhar

2017-01-01

This paper presents an automatic classification system for segregating pathological brains from normal brains in magnetic resonance imaging scanning. The proposed system employs a contrast limited adaptive histogram equalization scheme to enhance the diseased region in brain MR images. Two-dimensional stationary wavelet transform is harnessed to extract features from the preprocessed images. The feature vector is constructed using the energy and entropy values, computed from the level-2 SWT coefficients. Then, the relevant and uncorrelated features are selected using a symmetric uncertainty ranking filter. Subsequently, the selected features are given as input to the proposed AdaBoost with support vector machine classifier, where SVM is used as the base classifier of the AdaBoost algorithm. To validate the proposed system, three standard MR image datasets, Dataset-66, Dataset-160, and Dataset-255, have been utilized. The 5 runs of k-fold stratified cross validation results indicate that the suggested scheme offers better performance than other existing schemes in terms of accuracy and number of features. The proposed system earns ideal classification over Dataset-66 and Dataset-160, whereas for Dataset-255 an accuracy of 99.45% is achieved. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Deeply learnt hashing forests for content based image retrieval in prostate MR images

NASA Astrophysics Data System (ADS)

Shah, Amit; Conjeti, Sailesh; Navab, Nassir; Katouzian, Amin

2016-03-01

The deluge in the size and heterogeneity of medical image databases necessitates content based retrieval systems for their efficient organization. In this paper, we propose such a system to retrieve prostate MR images which share similarities in appearance and content with a query image. We introduce deeply learnt hashing forests (DL-HF) for this image retrieval task. DL-HF effectively leverages the semantic descriptiveness of deep learnt Convolutional Neural Networks. This is used in conjunction with hashing forests, which are unsupervised random forests. DL-HF hierarchically parses the deep-learnt feature space to encode subspaces with compact binary code words. We propose a similarity preserving feature descriptor called Parts Histogram, which is derived from DL-HF. Correlation defined on this descriptor is used as a similarity metric for retrieval from the database. Validations on a publicly available multi-center prostate MR image database established the validity of the proposed approach. The proposed method is fully automated, requires no user interaction, and does not depend on any external image standardization such as image normalization and registration. This image retrieval method is generalizable and is well suited for retrieval in heterogeneous databases of other imaging modalities and anatomies.
Large-scale binding ligand prediction by improved patch-based method Patch-Surfer2.0

PubMed Central

Zhu, Xiaolei; Xiong, Yi; Kihara, Daisuke

2015-01-01

Motivation: Ligand binding is a key aspect of the function of many proteins. Thus, binding ligand prediction provides important insight in understanding the biological function of proteins. Binding ligand prediction is also useful for drug design and examining potential drug side effects. Results: We present a computational method named Patch-Surfer2.0, which predicts binding ligands for a protein pocket. By representing and comparing pockets at the level of small local surface patches that characterize physicochemical properties of the local regions, the method can identify binding pockets of the same ligand even if they do not share globally similar shapes. Properties of local patches are represented by an efficient mathematical representation, 3D Zernike Descriptor. Patch-Surfer2.0 has significant technical improvements over our previous prototype, which includes a new feature that captures approximate patch position with a geodesic distance histogram. Moreover, we constructed a large comprehensive database of ligand binding pockets that will be searched against by a query. The benchmark shows better performance of Patch-Surfer2.0 over existing methods. Availability and implementation: http://kiharalab.org/patchsurfer2.0/ Contact: dkihara@purdue.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25359888
Action recognition via cumulative histogram of multiple features

NASA Astrophysics Data System (ADS)

Yan, Xunshi; Luo, Yupin

2011-01-01

Spatial-temporal interest points (STIPs) are popular in human action recognition. However, they suffer from difficulties in determining the size of the codebook and lose much information when forming histograms. In this paper, spatial-temporal interest regions (STIRs) are proposed, which are based on STIPs and are capable of marking the locations of the most "shining" human body parts. In order to represent human actions, the proposed approach takes advantage of multiple features, including STIRs, pyramid histogram of oriented gradients and pyramid histogram of oriented optical flows. To achieve this, a cumulative histogram is used to integrate dynamic information in sequences and to form feature vectors. Furthermore, the widely used nearest neighbor and AdaBoost methods are employed as classification algorithms. Experiments on the public datasets KTH, Weizmann and UCF sports show that the proposed approach achieves effective and robust results.
Adding Data Management Services to Parallel File Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brandt, Scott

2015-03-04

The objective of this project, called DAMASC for “Data Management in Scientific Computing”, is to coalesce data management with parallel file system management to present a declarative interface to scientists for managing, querying, and analyzing extremely large data sets efficiently and predictably. Managing extremely large data sets is a key challenge of exascale computing. The overhead, energy, and cost of moving massive volumes of data demand designs where computation is close to storage. In current architectures, compute/analysis clusters access data in a physically separate parallel file system and largely leave it scientist to reduce data movement. Over the past decadesmore » the high-end computing community has adopted middleware with multiple layers of abstractions and specialized file formats such as NetCDF-4 and HDF5. These abstractions provide a limited set of high-level data processing functions, but have inherent functionality and performance limitations: middleware that provides access to the highly structured contents of scientific data files stored in the (unstructured) file systems can only optimize to the extent that file system interfaces permit; the highly structured formats of these files often impedes native file system performance optimizations. We are developing Damasc, an enhanced high-performance file system with native rich data management services. Damasc will enable efficient queries and updates over files stored in their native byte-stream format while retaining the inherent performance of file system data storage via declarative queries and updates over views of underlying files. 
Damasc has four key benefits for the development of data-intensive scientific code: (1) applications can use important data-management services, such as declarative queries, views, and provenance tracking, that are currently available only within database systems; (2) the use of these services becomes easier, as they are provided within a familiar file-based ecosystem; (3) common optimizations, e.g., indexing and caching, are readily supported across several file formats, avoiding effort duplication; and (4) performance improves significantly, as data processing is integrated more tightly with data storage. Our key contributions are: SciHadoop, which explores changes to MapReduce assumptions by taking advantage of the semantics of structured data while preserving MapReduce’s failure and resource management; DataMods, which extends common abstractions of parallel file systems so they become programmable such that they can be extended to natively support a variety of data models and can be hooked into emerging distributed runtimes such as Stanford’s Legion; and Miso, which combines Hadoop and relational data warehousing to minimize time to insight, taking into account the overhead of ingesting data into data warehousing.
Histogram analysis of apparent diffusion coefficient for monitoring early response in patients with advanced cervical cancers undergoing concurrent chemo-radiotherapy.

PubMed

Meng, Jie; Zhu, Lijing; Zhu, Li; Ge, Yun; He, Jian; Zhou, Zhengyang; Yang, Xiaofeng

2017-11-01

Background Apparent diffusion coefficient (ADC) histogram analysis has been widely used in determining tumor prognosis. Purpose To investigate the dynamic changes of ADC histogram parameters during concurrent chemo-radiotherapy (CCRT) in patients with advanced cervical cancers. Material and Methods This prospective study enrolled 32 patients with advanced cervical cancers undergoing CCRT who received diffusion-weighted (DW) magnetic resonance imaging (MRI) before CCRT, at the end of the second and fourth week during CCRT and one month after CCRT completion. The ADC histogram for the entire tumor volume was generated, and a series of histogram parameters was obtained. Dynamic changes of those parameters in cervical cancers were investigated as early biomarkers for treatment response. Results All histogram parameters except AUC low showed significant changes during CCRT (all P < 0.05). There were three variable trends involving different parameters. The mode, 5th, 10th, and 25th percentiles showed similar early increase rates (33.33%, 33.99%, 34.12%, and 30.49%, respectively) at the end of the second week of CCRT. The pre-CCRT 5th and 25th percentiles of the complete response (CR) group were significantly lower than those of the partial response (PR) group. Conclusion A series of ADC histogram parameters of cervical cancers changed significantly at the early stage of CCRT, indicating their potential in monitoring early tumor response to therapy.
Whole Tumor Histogram-profiling of Diffusion-Weighted Magnetic Resonance Images Reflects Tumorbiological Features of Primary Central Nervous System Lymphoma.

PubMed

Schob, Stefan; Münch, Benno; Dieckow, Julia; Quäschling, Ulf; Hoffmann, Karl-Titus; Richter, Cindy; Garnov, Nikita; Frydrychowicz, Clara; Krause, Matthias; Meyer, Hans-Jonas; Surov, Alexey

2018-04-01

Diffusion weighted imaging (DWI) quantifies motion of hydrogen nuclei in biological tissues and has thereby been used to assess the underlying tissue microarchitecture. Histogram-profiling of DWI provides more detailed information on diffusion characteristics of a lesion than the standardly calculated values of the apparent diffusion coefficient (ADC): minimum, mean and maximum. Hence, the aim of our study was to investigate which parameters of histogram-profiling of DWI in primary central nervous system lymphoma (PCNSL) can be used to specifically predict features like cellular density, chromatin content and proliferative activity. Pre-treatment ADC maps of 21 PCNSL patients (8 female, 13 male, 28-89 years) from a 1.5T system were used for Matlab-based histogram profiling. Results of histopathology (H&E staining) and immunohistochemistry (Ki-67 expression) were quantified. Correlations between histogram-profiling parameters and neuropathologic examination were calculated using SPSS 23.0. The lower percentiles (p10 and p25) showed significant correlations with structural parameters of the neuropathologic examination (cellular density, chromatin content). The highest percentile, p90, correlated significantly with Ki-67 expression, resembling proliferative activity. Kurtosis of the ADC histogram correlated significantly with cellular density. Histogram-profiling of DWI in PCNSL provides a comprehensible set of parameters, which reflect distinct tumor-architectural and tumor-biological features, and hence, are promising biomarkers for treatment response and prognosis. Copyright © 2018. Published by Elsevier Inc.
ADC histogram analysis of muscle lymphoma - Correlation with histopathology in a rare entity.

PubMed

Meyer, Hans-Jonas; Pazaitis, Nikolaos; Surov, Alexey

2018-06-21

Diffusion weighted imaging (DWI) is able to reflect histopathology architecture. A novel imaging approach, namely histogram analysis, is used to further characterize lesions on MRI. The purpose of this study is to correlate histogram parameters derived from apparent diffusion coefficient (ADC) maps with histopathology parameters in muscle lymphoma. Eight patients (mean age 64.8 years, range 45-72 years) with histopathologically confirmed muscle lymphoma were retrospectively identified. Cell count, total nucleic and average nucleic areas were estimated using ImageJ. Additionally, Ki67-index was calculated. DWI was obtained on a 1.5T scanner by using the b values of 0 and 1000 s/mm². Histogram analysis was performed as a whole lesion measurement by using a custom-made Matlab-based application. The correlation analysis revealed statistically significant correlation between cell count and ADCmean (ρ=-0.76, P=0.03) as well as with ADCp75 (ρ=-0.79, P=0.02). Kurtosis and entropy correlated with average nucleic area (ρ=-0.81, P=0.02, and ρ=0.88, P=0.007, respectively). None of the analyzed ADC parameters correlated with total nucleic area and with Ki67-index. This study identified significant correlations between cellularity and histogram parameters derived from ADC maps in muscle lymphoma. Thus, histogram analysis parameters reflect histopathology in muscle tumors. Advances in knowledge: Whole lesion ADC histogram analysis is able to reflect histopathology parameters in muscle lymphomas.
Delay, change and bifurcation of the immunofluorescence distribution attractors in health statuses diagnostics and in medical treatment

NASA Astrophysics Data System (ADS)

Galich, Nikolay E.; Filatov, Michael V.

2008-07-01

This communication describes the immunology experiments and the treatment of the experimental data. New nonlinear methods for the statistical analysis of the immunofluorescence of peripheral blood neutrophils have been developed. We used the technology of the respiratory burst reaction of DNA fluorescence in the nuclei of neutrophil cells, which is due to oxidative activity. Histograms of photon count statistics for radiant neutrophil populations in flow cytometry experiments are considered. Distributions of the frequency of fluorescence flashes as functions of the fluorescence intensity are analyzed. Statistical peculiarities of the histogram sets for healthy and unhealthy donors allow all histograms to be divided into three classes. The classification is based on three different types of smoothing and long-range scale averaging of the immunofluorescence distributions and on their bifurcation. Heterogeneity peculiarities of the long-range scale immunofluorescence distributions allow all histograms to be divided into three groups. The first group of histograms belongs to healthy donors. The two other groups belong to donors with autoimmune and inflammatory diseases. Some of these illnesses are not diagnosed by standard biochemical methods. Medical standards and the statistical data of the immunofluorescence histograms used to identify health and illness are interconnected. The possibilities of, and alterations in, immunofluorescence statistics in the registration, diagnostics and monitoring of different diseases under various medical treatments have been demonstrated. Criteria of health or illness are connected with statistical features of the immunofluorescence histograms. The fluorescence of neutrophil populations is a sensitive, clear indicator of health status.
Informatics in radiology: use of CouchDB for document-based storage of DICOM objects.

PubMed

Rascovsky, Simón J; Delgado, Jorge A; Sanz, Alexander; Calvo, Víctor D; Castrillón, Gabriel

2012-01-01

Picture archiving and communication systems traditionally have depended on schema-based Structured Query Language (SQL) databases for imaging data management. To optimize database size and performance, many such systems store a reduced set of Digital Imaging and Communications in Medicine (DICOM) metadata, discarding informational content that might be needed in the future. As an alternative to traditional database systems, document-based key-value stores recently have gained popularity. These systems store documents containing key-value pairs that facilitate data searches without predefined schemas. Document-based key-value stores are especially suited to archive DICOM objects because DICOM metadata are highly heterogeneous collections of tag-value pairs conveying specific information about imaging modalities, acquisition protocols, and vendor-supported postprocessing options. The authors used an open-source document-based database management system (Apache CouchDB) to create and test two such databases; CouchDB was selected for its overall ease of use, capability for managing attachments, and reliance on HTTP and Representational State Transfer standards for accessing and retrieving data. A large database was created first in which the DICOM metadata from 5880 anonymized magnetic resonance imaging studies (1,949,753 images) were loaded by using a Ruby script. To provide the usual DICOM query functionality, several predefined "views" (standard queries) were created by using JavaScript. For performance comparison, the same queries were executed in both the CouchDB database and a SQL-based DICOM archive. The capabilities of CouchDB for attachment management and database replication were separately assessed in tests of a similar, smaller database. 
Results showed that CouchDB allowed efficient storage and interrogation of all DICOM objects; with the use of information retrieval algorithms such as map-reduce, all the DICOM metadata stored in the large database were searchable with only a minimal increase in retrieval time over that with the traditional database management system. Results also indicated possible uses for document-based databases in data mining applications such as dose monitoring, quality assurance, and protocol optimization. RSNA, 2012
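The predefined "views" described in this abstract follow a map-reduce pattern: a map function emits key-value pairs for each stored document, and a reduce function aggregates the values per key. The study wrote its views in JavaScript inside CouchDB; the sketch below only imitates that pattern in plain Python and does not use the CouchDB API. The documents, the function names, and the choice of counting images per modality are hypothetical illustrations.

```python
from collections import defaultdict

# Hypothetical DICOM metadata documents (made-up values). Documents need not
# share a schema: the third one has no "Modality" key at all.
DOCS = [
    {"_id": "img1", "Modality": "MR", "SeriesDescription": "T1 axial"},
    {"_id": "img2", "Modality": "MR", "SeriesDescription": "T2 axial"},
    {"_id": "img3", "SeriesDescription": "unlabelled secondary capture"},
    {"_id": "img4", "Modality": "CT", "SliceThickness": 1.25},
]


def map_by_modality(doc):
    """Map step: yield (key, value) pairs, analogous to emit() in a view."""
    if "Modality" in doc:
        yield doc["Modality"], 1


def reduce_count(values):
    """Reduce step: collapse all values emitted under one key."""
    return sum(values)


def run_view(documents, map_fn, reduce_fn):
    """Group mapped pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(vals) for key, vals in sorted(grouped.items())}


if __name__ == "__main__":
    print(run_view(DOCS, map_by_modality, reduce_count))
```

Because the map function simply skips documents that lack the tag it needs, no predefined schema is required, which is the property the abstract highlights for heterogeneous DICOM metadata.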
Completing the physical representation of quantum algorithms provides a retrocausal explanation of the speedup

NASA Astrophysics Data System (ADS)

Castagnoli, Giuseppe

2017-05-01

The usual representation of quantum algorithms, limited to the process of solving the problem, is physically incomplete as it lacks the initial measurement. We extend it to the process of setting the problem. An initial measurement selects a problem setting at random, and a unitary transformation sends it into the desired setting. The extended representation must be with respect to Bob, the problem setter, and any external observer. It cannot be with respect to Alice, the problem solver. It would tell her the problem setting and thus the solution of the problem implicit in it. In the representation to Alice, the projection of the quantum state due to the initial measurement should be postponed until the end of the quantum algorithm. In either representation, there is a unitary transformation between the initial and final measurement outcomes. As a consequence, the final measurement of any ℛ-th part of the solution could select back in time a corresponding part of the random outcome of the initial measurement; the associated projection of the quantum state should be advanced by the inverse of that unitary transformation. This, in the representation to Alice, would tell her, before she begins her problem solving action, that part of the solution. The quantum algorithm should be seen as a sum over classical histories in each of which Alice knows in advance one of the possible ℛ-th parts of the solution and performs the oracle queries still needed to find it - this for the value of ℛ that explains the algorithm's speedup. We have a relation between retrocausality ℛ and the number of oracle queries needed to solve an oracle problem quantumly. All the oracle problems examined can be solved with any value of ℛ up to an upper bound attained by the optimal quantum algorithm. This bound is always in the vicinity of 1/2 . Moreover, ℛ =1/2 always provides the order of magnitude of the number of queries needed to solve the problem in an optimal quantum way. 
If this were true for any oracle problem, as plausible, it would solve the quantum query complexity problem.
Optimizing a machine learning based glioma grading system using multi-parametric MRI histogram and texture features

PubMed Central

Hu, Yu-Chuan; Li, Gang; Yang, Yang; Han, Yu; Sun, Ying-Zhi; Liu, Zhi-Cheng; Tian, Qiang; Han, Zi-Yang; Liu, Le-De; Hu, Bin-Quan; Qiu, Zi-Yu; Wang, Wen; Cui, Guang-Bin

2017-01-01

Current machine learning techniques provide the opportunity to develop noninvasive and automated glioma grading tools, by utilizing quantitative parameters derived from multi-modal magnetic resonance imaging (MRI) data. However, the efficacies of different machine learning methods in glioma grading have not been investigated. A comprehensive comparison of varied machine learning methods in differentiating low-grade gliomas (LGGs) and high-grade gliomas (HGGs) as well as WHO grade II, III and IV gliomas based on multi-parametric MRI images was proposed in the current study. The parametric histogram and image texture attributes of 120 glioma patients were extracted from the perfusion, diffusion and permeability parametric maps of preoperative MRI. Then, 25 commonly used machine learning classifiers combined with 8 independent attribute selection methods were applied and evaluated using a leave-one-out cross validation (LOOCV) strategy. Besides, the influences of parameter selection on the classifying performances were investigated. We found that support vector machine (SVM) exhibited superior performance to other classifiers. By combining all tumor attributes with synthetic minority over-sampling technique (SMOTE), the highest classifying accuracy of 0.945 or 0.961 for LGG and HGG or grade II, III and IV gliomas was achieved. Application of the Recursive Feature Elimination (RFE) attribute selection strategy further improved the classifying accuracies. Besides, the performances of the LibSVM, SMO, and IBk classifiers were influenced by some key parameters such as kernel type, c, gamma, K, etc. SVM is a promising tool for developing an automated preoperative glioma grading system, especially when combined with the RFE strategy. Model parameters should be considered in glioma grading model optimization. PMID:28599282
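The leave-one-out cross validation (LOOCV) strategy named in this abstract holds out each sample once, trains on the rest, and reports the fraction of held-out samples classified correctly. The sketch below is a minimal illustration of that loop only. It uses a 1-nearest-neighbour classifier as a simple stand-in rather than the SVM the study favoured, and the feature vectors, labels, and function names are invented for the example.

```python
import math


def predict_1nn(train, query):
    """Return the label of the training sample closest to `query`.

    `train` is a list of (feature_tuple, label) pairs; distance is Euclidean.
    This 1-nearest-neighbour rule is a stand-in classifier (an assumption).
    """
    nearest = min(train, key=lambda sample: math.dist(sample[0], query))
    return nearest[1]


def loocv_accuracy(samples, predict):
    """Leave-one-out cross validation: hold out each sample exactly once."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]  # every sample except the i-th
        if predict(train, features) == label:
            correct += 1
    return correct / len(samples)


# Hypothetical two-feature attribute vectors (e.g. two histogram parameters).
SAMPLES = [
    ((0.0, 0.0), "LGG"), ((0.1, 0.2), "LGG"), ((0.2, 0.1), "LGG"),
    ((1.0, 1.0), "HGG"), ((0.9, 1.1), "HGG"), ((1.1, 0.9), "HGG"),
]

if __name__ == "__main__":
    print(loocv_accuracy(SAMPLES, predict_1nn))
```

With N samples the classifier is refit N times, which is affordable for a cohort of about a hundred patients and uses every case for both training and testing.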
Optimizing a machine learning based glioma grading system using multi-parametric MRI histogram and texture features.

PubMed

Zhang, Xin; Yan, Lin-Feng; Hu, Yu-Chuan; Li, Gang; Yang, Yang; Han, Yu; Sun, Ying-Zhi; Liu, Zhi-Cheng; Tian, Qiang; Han, Zi-Yang; Liu, Le-De; Hu, Bin-Quan; Qiu, Zi-Yu; Wang, Wen; Cui, Guang-Bin

2017-07-18

Current machine learning techniques provide the opportunity to develop noninvasive and automated glioma grading tools, by utilizing quantitative parameters derived from multi-modal magnetic resonance imaging (MRI) data. However, the efficacies of different machine learning methods in glioma grading have not been investigated. A comprehensive comparison of varied machine learning methods in differentiating low-grade gliomas (LGGs) and high-grade gliomas (HGGs) as well as WHO grade II, III and IV gliomas based on multi-parametric MRI images was proposed in the current study. The parametric histogram and image texture attributes of 120 glioma patients were extracted from the perfusion, diffusion and permeability parametric maps of preoperative MRI. Then, 25 commonly used machine learning classifiers combined with 8 independent attribute selection methods were applied and evaluated using a leave-one-out cross validation (LOOCV) strategy. Besides, the influences of parameter selection on the classifying performances were investigated. We found that support vector machine (SVM) exhibited superior performance to other classifiers. By combining all tumor attributes with synthetic minority over-sampling technique (SMOTE), the highest classifying accuracy of 0.945 or 0.961 for LGG and HGG or grade II, III and IV gliomas was achieved. Application of the Recursive Feature Elimination (RFE) attribute selection strategy further improved the classifying accuracies. Besides, the performances of the LibSVM, SMO, and IBk classifiers were influenced by some key parameters such as kernel type, c, gamma, K, etc. SVM is a promising tool for developing an automated preoperative glioma grading system, especially when combined with the RFE strategy. Model parameters should be considered in glioma grading model optimization.
Time-cumulated visible and infrared histograms used as descriptor of cloud cover

NASA Technical Reports Server (NTRS)

Seze, G.; Rossow, W.

1987-01-01

To study the statistical behavior of clouds for different climate regimes, the spatial and temporal stability of VIS-IR bidimensional histograms is tested. Also, the effect of data sampling and averaging on the histogram shapes is considered; in particular the sampling strategy used by the International Satellite Cloud Climatology Project is tested.
Interpreting Histograms. As Easy as It Seems?

ERIC Educational Resources Information Center

Lem, Stephanie; Onghena, Patrick; Verschaffel, Lieven; Van Dooren, Wim

2014-01-01

Histograms are widely used, but recent studies have shown that they are not as easy to interpret as it might seem. In this article, we report on three studies on the interpretation of histograms in which we investigated, namely, (1) whether the misinterpretation by university students can be considered to be the result of heuristic reasoning, (2)…
Improving Real World Performance of Vision Aided Navigation in a Flight Environment

DTIC Science & Technology

2016-09-15

Table of contents excerpt: Introduction; 4.2 Wide Area Search Extent; 4.3 Large-Scale Image Navigation Histogram Filter ... 4.3.1 Location Model; 4.3.2 Measurement Model; 4.3.3 Histogram Filter ... Iteration of Histogram Filter; 4.4 Implementation and Flight Test Campaign; 4.4.1 Software Implementation
Airborne gamma-ray spectrometer and magnetometer survey, Durango A, Colorado. Final report Volume II A. Detail area

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

1983-01-01

This volume contains geology of the Durango A detail area, radioactive mineral occurrences in Colorado, and geophysical data interpretation. Eight appendices provide the following: stacked profiles, geologic histograms, geochemical histograms, speed and altitude histograms, geologic statistical tables, geochemical statistical tables, magnetic and ancillary profiles, and test line data.
Airborne gamma-ray spectrometer and magnetometer survey, Durango B, Colorado. Final report Volume II A. Detail area

DOE Office of Scientific and Technical Information (OSTI.GOV)

Not Available

1983-01-01

The geology of the Durango B detail area, the radioactive mineral occurrences in Colorado and the geophysical data interpretation are included in this report. Seven appendices contain: stacked profiles, geologic histograms, geochemical histograms, speed and altitude histograms, geologic statistical tables, geochemical statistical tables, and test line data.

Students' Understanding of Bar Graphs and Histograms: Results from the LOCUS Assessments

ERIC Educational Resources Information Center

Whitaker, Douglas; Jacobbe, Tim

2017-01-01

Bar graphs and histograms are core statistical tools that are widely used in statistical practice and commonly taught in classrooms. Despite their importance and the instructional time devoted to them, many students demonstrate misunderstandings when asked to read and interpret bar graphs and histograms. Much of the research that has been…
Beyond Relational: A Database Architecture and Federated Query Optimization in a Multi-Modal Healthcare Environment

ERIC Educational Resources Information Center

Hylock, Ray Hales

2013-01-01

Over the past thirty years, clinical research has benefited substantially from the adoption of electronic medical record systems. As deployment has increased, so too has the number of researchers seeking to improve the overall analytical environment by way of tools and models. Although much work has been done, there are still many uninvestigated…
Using structure to explore the sequence alignment space of remote homologs.

PubMed

Kuziemko, Andrew; Honig, Barry; Petrey, Donald

2011-10-01

Protein structure modeling by homology requires an accurate sequence alignment between the query protein and its structural template. However, sequence alignment methods based on dynamic programming (DP) are typically unable to generate accurate alignments for remote sequence homologs, thus limiting the applicability of modeling methods. A central problem is that the alignment that is "optimal" in terms of the DP score does not necessarily correspond to the alignment that produces the most accurate structural model. That is, the correct alignment based on structural superposition will generally have a lower score than the optimal alignment obtained from sequence. Variations of the DP algorithm have been developed that generate alternative alignments that are "suboptimal" in terms of the DP score, but these still encounter difficulties in detecting the correct structural alignment. We present here a new alternative sequence alignment method that relies heavily on the structure of the template. By initially aligning the query sequence to individual fragments in secondary structure elements and combining high-scoring fragments that pass basic tests for "modelability", we can generate accurate alignments within a small ensemble. Our results suggest that the set of sequences that can currently be modeled by homology can be greatly extended.
Modeling Early Postnatal Brain Growth and Development with CT: Changes in the Brain Radiodensity Histogram from Birth to 2 Years.

PubMed

Cauley, K A; Hu, Y; Och, J; Yorks, P J; Fielden, S W

2018-04-01

The majority of brain growth and development occur in the first 2 years of life. This study investigated these changes by analysis of the brain radiodensity histogram of head CT scans from the clinical population, 0-2 years of age. One hundred twenty consecutive head CTs with normal findings meeting the inclusion criteria from children from birth to 2 years were retrospectively identified from 3 different CT scan platforms. Histogram analysis was performed on brain-extracted images, and histogram mean, mode, full width at half maximum, skewness, kurtosis, and SD were correlated with subject age. The effects of scan platform were investigated. Normative curves were fitted by polynomial regression analysis. Average total brain volume was 360 cm³ at birth, 948 cm³ at 1 year, and 1072 cm³ at 2 years. Total brain tissue density showed an 11% increase in mean density at 1 year and 19% at 2 years. Brain radiodensity histogram skewness was positive at birth, declining logarithmically in the first 200 days of life. The histogram kurtosis also decreased in the first 200 days to approach a normal distribution. Direct segmentation of CT images showed that changes in brain radiodensity histogram skewness correlated with, and can be explained by, a relative increase in gray matter volume and an increase in gray and white matter tissue density that occurs during this period of brain maturation. Normative metrics of the brain radiodensity histogram derived from routine clinical head CT images can be used to develop a model of normal brain development. © 2018 by American Journal of Neuroradiology.
Histogram analysis derived from apparent diffusion coefficient (ADC) is more sensitive to reflect serological parameters in myositis than conventional ADC analysis.

PubMed

Meyer, Hans Jonas; Emmer, Alexander; Kornhuber, Malte; Surov, Alexey

2018-05-01

Diffusion-weighted imaging (DWI) has the potential of being able to reflect histopathology architecture. A novel imaging approach, namely histogram analysis, is used to further characterize tissues on MRI. The aim of this study was to correlate histogram parameters derived from apparent diffusion coefficient (ADC) maps with serological parameters in myositis. 16 patients with autoimmune myositis were included in this retrospective study. DWI was obtained on a 1.5 T scanner by using the b-values of 0 and 1000 s/mm². Histogram analysis was performed as a whole muscle measurement by using a custom-made Matlab-based application. The following ADC histogram parameters were estimated: ADCmean, ADCmax, ADCmin, ADCmedian, ADCmode, and the percentiles ADCp10, ADCp25, ADCp75, ADCp90, as well as the histogram parameters kurtosis, skewness, and entropy. In all patients, the blood sample was acquired within 3 days of the MRI. The following serological parameters were estimated: alanine aminotransferase, aspartate aminotransferase, creatine kinase, lactate dehydrogenase, C-reactive protein (CRP) and myoglobin. All patients were screened for Jo1 autoantibodies. Kurtosis correlated inversely with CRP (ρ = -0.55, P = 0.03). Furthermore, ADCp10 and ADCp90 values tended to correlate with creatine kinase (ρ = -0.43, P = 0.11, and ρ = -0.42, P = 0.12, respectively). In addition, ADCmean, p10, p25, median, mode, and entropy were different between Jo1-positive and Jo1-negative patients. ADC histogram parameters are sensitive for detection of muscle alterations in myositis patients. Advances in knowledge: This study identified that kurtosis derived from ADC maps is associated with CRP in myositis patients. Furthermore, several ADC histogram parameters are statistically different between Jo1-positive and Jo1-negative patients.
Can histogram analysis of MR images predict aggressiveness in pancreatic neuroendocrine tumors?

PubMed

De Robertis, Riccardo; Maris, Bogdan; Cardobi, Nicolò; Tinazzi Martini, Paolo; Gobbo, Stefano; Capelli, Paola; Ortolani, Silvia; Cingarlini, Sara; Paiella, Salvatore; Landoni, Luca; Butturini, Giovanni; Regi, Paolo; Scarpa, Aldo; Tortora, Giampaolo; D'Onofrio, Mirko

2018-06-01

To evaluate MRI derived whole-tumour histogram analysis parameters in predicting pancreatic neuroendocrine neoplasm (panNEN) grade and aggressiveness. Pre-operative MR of 42 consecutive patients with panNEN >1 cm were retrospectively analysed. T1-/T2-weighted images and ADC maps were analysed. Histogram-derived parameters were compared to histopathological features using the Mann-Whitney U test. Diagnostic accuracy was assessed by ROC-AUC analysis; sensitivity and specificity were assessed for each histogram parameter. ADC entropy was significantly higher in G2-3 tumours with ROC-AUC 0.757; sensitivity and specificity were 83.3 % (95 % CI: 61.2-94.5) and 61.1 % (95 % CI: 36.1-81.7). ADC kurtosis was higher in panNENs with vascular involvement, nodal and hepatic metastases (p= .008, .021 and .008; ROC-AUC= 0.820, 0.709 and 0.820); sensitivity and specificity were: 85.7/74.3 % (95 % CI: 42-99.2 /56.4-86.9), 36.8/96.5 % (95 % CI: 17.2-61.4 /76-99.8) and 100/62.8 % (95 % CI: 56.1-100/44.9-78.1). No significant differences between groups were found for other histogram-derived parameters (p >.05). Whole-tumour histogram analysis of ADC maps may be helpful in predicting tumour grade, vascular involvement, nodal and liver metastases in panNENs. ADC entropy and ADC kurtosis are the most accurate parameters for identification of panNENs with malignant behaviour. • Whole-tumour ADC histogram analysis can predict aggressiveness in pancreatic neuroendocrine neoplasms. • ADC entropy and kurtosis are higher in aggressive tumours. • ADC histogram analysis can quantify tumour diffusion heterogeneity. • Non-invasive quantification of tumour heterogeneity can provide adjunctive information for prognostication.
Non-small cell lung cancer: Whole-lesion histogram analysis of the apparent diffusion coefficient for assessment of tumor grade, lymphovascular invasion and pleural invasion.

PubMed

Tsuchiya, Naoko; Doai, Mariko; Usuda, Katsuo; Uramoto, Hidetaka; Tonami, Hisao

2017-01-01

Investigating the diagnostic accuracy of histogram analyses of apparent diffusion coefficient (ADC) values for determining non-small cell lung cancer (NSCLC) tumor grades, lymphovascular invasion, and pleural invasion. We studied 60 surgically diagnosed NSCLC patients. Diffusion-weighted imaging (DWI) was performed in the axial plane using a navigator-triggered single-shot, echo-planar imaging sequence with prospective acquisition correction. The ADC maps were generated, and we placed a volume-of-interest on the tumor to construct the whole-lesion histogram. Using the histogram, we calculated the mean, 5th, 10th, 25th, 50th, 75th, 90th, and 95th percentiles of ADC, skewness, and kurtosis. Histogram parameters were correlated with tumor grade, lymphovascular invasion, and pleural invasion. We performed a receiver operating characteristics (ROC) analysis to assess the diagnostic performance of histogram parameters for distinguishing different pathologic features. The ADC mean, 10th, 25th, 50th, 75th, 90th, and 95th percentiles showed significant differences among the tumor grades. The ADC mean, 25th, 50th, 75th, 90th, and 95th percentiles were significant histogram parameters between high- and low-grade tumors. The ROC analysis between high- and low-grade tumors showed that the 95th percentile ADC achieved the highest area under curve (AUC) at 0.74. Lymphovascular invasion was associated with the ADC mean, 50th, 75th, 90th, and 95th percentiles, skewness, and kurtosis. Kurtosis achieved the highest AUC at 0.809. Pleural invasion was only associated with skewness, with the AUC of 0.648. ADC histogram analyses on the basis of the entire tumor volume are able to stratify NSCLCs' tumor grade, lymphovascular invasion and pleural invasion.
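The whole-lesion histogram parameters listed in this abstract (mean, percentiles, skewness, kurtosis) can be computed directly from the voxel values inside the volume of interest. The sketch below is a minimal illustration, not the authors' software: the function names, the linear-interpolation percentile rule, the use of population moments, and the reporting of excess kurtosis (kurtosis minus 3) are all assumptions, since the abstract does not state which conventions were used.

```python
import math


def percentile(sorted_vals, q):
    """Percentile q (0-100) of a pre-sorted list, by linear interpolation."""
    if not sorted_vals:
        raise ValueError("percentile of empty data")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def histogram_parameters(values, percentiles=(5, 10, 25, 50, 75, 90, 95)):
    """Summarise voxel values (e.g. ADC inside a lesion) as a dict."""
    vals = sorted(values)
    n = len(vals)
    mean = sum(vals) / n
    # Central moments of order 2, 3 and 4 (population form, divide by n).
    m2 = sum((v - mean) ** 2 for v in vals) / n
    m3 = sum((v - mean) ** 3 for v in vals) / n
    m4 = sum((v - mean) ** 4 for v in vals) / n
    params = {
        "mean": mean,
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        # Excess kurtosis: 0 for a normal distribution (assumed convention).
        "kurtosis": m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0,
    }
    for q in percentiles:
        params[f"p{q}"] = percentile(vals, q)
    return params


if __name__ == "__main__":
    # Made-up voxel values standing in for ADC measurements.
    for name, value in histogram_parameters([1, 2, 3, 4, 5]).items():
        print(name, round(value, 3))
```

Low percentiles summarise the most diffusion-restricted voxels and high percentiles the least restricted ones, which is why studies of this kind report several percentiles rather than the mean alone.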
SU-E-T-294: Dosimetric Analysis of Planning Phase Using Overlap Volume Histogram for Respiratory Gated Radiotherapy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kang, S; Kim, D; Kim, T

2015-06-15

Purpose: The end-of-exhale (EOE) phase is generally preferred for the gating window because tumor position is more reproducible. However, other gating windows might be more appropriate from a dose distribution perspective. In this pilot study, we proposed using the overlap volume histogram (OVH) to search for an optimized gating window and demonstrated its feasibility. Methods: We acquired 4DCT of 10 phases for 3 lung patients (2 with a target at right middle lobe and 1 at right upper lobe). After structures were defined in every phase, the OVH of each OAR was generated to quantify the three-dimensional spatial relationship between the PTV and OARs (bronchus, esophagus, heart, cord, etc.) at each phase. The OVH gives the overlap volume of an OAR as a function of outward distance from the PTV. Relative overlap volume at 20 mm outward distance from the PTV (ROV-20) was also defined as a metric for measuring overlap volume and obtained. For dose calculation, 3D CRT plans were made for all phases under the same beam angles and objectives (e.g., 95% of the PTV coverage with at least 100% of the prescription dose of 50 Gy). The gating window phases were ranked according to ROV-20, and the relationship between the OVH and dose distribution at each phase was evaluated by comparing the maximum dose, mean dose, and equivalent uniform dose (EUD) of each OAR. Results: OVHs showed noticeable differences from phase to phase, implying it is possible to find optimal phases for the gating window. For 2 out of 3 patients (both with a target at RML), maximum dose, mean dose, and EUD increased as ROV-20 increased. Conclusion: Optimal phases (from a dose distribution perspective) for the gating window could exist, and the OVH can be a useful tool for determining such phases without performing dose optimization calculations in all phases. This work was supported by the Radiation Technology R&D program (No. 2013M2A2A7043498) and the Mid-career Researcher Program (2012-007883) through the National Research Foundation (NRF) funded by the Ministry of Science, ICT & Future Planning (MSIP) of Korea.
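As a rough illustration of the OVH idea described in this abstract, the sketch below computes, for a list of outward distances, the fraction of OAR voxels lying within that distance of the target. It is only a minimal sketch under stated assumptions: structures are given as lists of voxel indices, distance is measured to the nearest PTV voxel by brute force (not to a contoured PTV surface), and the function name `overlap_volume_histogram` and its parameters are hypothetical, not taken from the study.

```python
import math

def overlap_volume_histogram(oar_voxels, ptv_voxels, distances_mm,
                             spacing_mm=(1.0, 1.0, 1.0)):
    """Fraction of OAR voxels within each distance of the nearest PTV voxel.

    Assumption: voxels are (i, j, k) index tuples; spacing_mm converts
    indices to millimetres. Brute force, so only suitable for small inputs.
    """
    def to_mm(voxel):
        return tuple(c * s for c, s in zip(voxel, spacing_mm))

    ptv_mm = [to_mm(v) for v in ptv_voxels]
    # Distance from every OAR voxel to its closest PTV voxel.
    oar_dist = [min(math.dist(to_mm(v), p) for p in ptv_mm) for v in oar_voxels]
    n = len(oar_dist)
    return [sum(d <= r for d in oar_dist) / n for r in distances_mm]

# Toy example: one PTV voxel, two OAR voxels at 10 mm and 30 mm.
ovh = overlap_volume_histogram([(10, 0, 0), (30, 0, 0)], [(0, 0, 0)], [5, 20, 40])
print(ovh)  # the value at 20 mm plays the role of an ROV-20-style metric
```

In this toy setting, ranking phases would amount to computing the 20 mm value once per respiratory phase and sorting the phases by it.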
Whole brain myelin mapping using T1- and T2-weighted MR imaging data

PubMed Central

Ganzetti, Marco; Wenderoth, Nicole; Mantini, Dante

2014-01-01

Despite recent advancements in MR imaging, non-invasive mapping of myelin in the brain still remains an open issue. Here we attempted to provide a potential solution. Specifically, we developed a processing workflow based on T1-w and T2-w MR data to generate an optimized myelin enhanced contrast image. The workflow allows whole brain mapping using the T1-w/T2-w technique, which was originally introduced as a non-invasive method for assessing cortical myelin content. The hallmark of our approach is a retrospective calibration algorithm, applied to bias-corrected T1-w and T2-w images, that relies on image intensities outside the brain. This permits standardizing the intensity histogram of the ratio image, thereby allowing for across-subject statistical analyses. Quantitative comparisons of image histograms within and across different datasets confirmed the effectiveness of our normalization procedure. Not only did the calibrated T1-w/T2-w images exhibit a comparable intensity range, but also the shape of the intensity histograms was largely corresponding. We also assessed the reliability and specificity of the ratio image compared to other MR-based techniques, such as magnetization transfer ratio (MTR), fractional anisotropy (FA), and fluid-attenuated inversion recovery (FLAIR). With respect to these other techniques, T1-w/T2-w had consistently high values, as well as low inter-subject variability, in brain structures where myelin is most abundant. Overall, our results suggested that the T1-w/T2-w technique may be a valid tool supporting the non-invasive mapping of myelin in the brain. Therefore, it might find important applications in the study of brain development, aging and disease. PMID:25228871
Communicating projected survival with treatments for chronic kidney disease: patient comprehension and perspectives on visual aids.

PubMed

Dowen, Frances; Sidhu, Karishma; Broadbent, Elizabeth; Pilmore, Helen

2017-09-21

Mortality in end stage renal disease (ESRD) is higher than many malignancies. There is no data about the optimal way to present information about projected survival to patients with ESRD. In other areas, graphs have been shown to be more easily understood than narrative. We examined patient comprehension and perspectives on graphs in communicating projected survival in chronic kidney disease (CKD). One hundred seventy-seven patients with CKD were shown 4 different graphs presenting post transplantation survival data. Patients were asked to interpret a Kaplan Meier curve, pie chart, histogram and pictograph and answer a multi-choice question to determine understanding. We measured interpretation, usefulness and preference for the graphs. Most patients correctly interpreted the graphs. There was a significant difference in the percentage of correct answers when comparing different graph types (p = 0.0439). The pictograph was correctly interpreted by 81% of participants, the histogram by 79%, pie chart by 77% and Kaplan Meier by 69%. Correct interpretation of the histogram was associated with educational level (p = 0.008) and inversely associated with age > 65 (p = 0.008). Of those who interpreted all four graphs correctly, there was an association with employment (p = 0.001) and New Zealand European ethnicity (p = 0.002). 87% of patients found the graphs useful. The pie chart was the most preferred graph (p 0.002). The readability of the graphs may have been improved with an alternative colour choice, especially in the setting of visual impairment. Visual aids can be beneficial adjuncts to discussing survival in CKD.
Improved automatic adjustment of density and contrast in FCR system using neural network

NASA Astrophysics Data System (ADS)

Takeo, Hideya; Nakajima, Nobuyoshi; Ishida, Masamitsu; Kato, Hisatoyo

1994-05-01

The FCR system automatically adjusts image density and contrast by analyzing the histogram of image data in the radiation field. The advanced image recognition methods proposed in this paper, which use neural network technology, can improve the automatic adjustment performance. There are two methods. Both basically use a 3-layer neural network with back propagation. In one method the image data are input directly to the input layer; in the other, the histogram data are input. The former is effective for imaging menus such as shoulder joint, in which the position of the region of interest on the histogram changes with differences in positioning, and the latter is effective for imaging menus such as chest-pediatrics, in which the histogram shape changes with differences in positioning. We experimentally confirm the validity of these methods (in terms of automatic adjustment performance) as compared with the conventional histogram analysis methods.
Investigation on improved infrared image detail enhancement algorithm based on adaptive histogram statistical stretching and gradient filtering

NASA Astrophysics Data System (ADS)

Zeng, Bangze; Zhu, Youpan; Li, Zemin; Hu, Dechao; Luo, Lin; Zhao, Deli; Huang, Juan

2014-11-01

Because infrared images have low contrast, heavy noise and an unclear visual effect, targets are very difficult to observe and identify. This paper presents an improved infrared image detail enhancement algorithm based on adaptive histogram statistical stretching and gradient filtering (AHSS-GF). Based on the fact that human eyes are very sensitive to edges and lines, the authors propose extracting the details and textures by using gradient filtering. A new histogram is obtained by summing the original histogram over a fixed window. Taking the minimum value as the cut-off point, the authors then carry out histogram statistical stretching. After proper weights are given to the details and background, the detail-enhanced result is obtained. The results indicate that image contrast can be improved and the details and textures can be enhanced effectively as well.
An interactive system for computer-aided diagnosis of breast masses.

PubMed

Wang, Xingwei; Li, Lihua; Liu, Wei; Xu, Weidong; Lederman, Dror; Zheng, Bin

2012-10-01

Although mammography is the only clinically accepted imaging modality for screening the general population to detect breast cancer, interpreting mammograms is difficult with lower sensitivity and specificity. To provide radiologists "a visual aid" in interpreting mammograms, we developed and tested an interactive system for computer-aided detection and diagnosis (CAD) of mass-like cancers. Using this system, an observer can view CAD-cued mass regions depicted on one image and then query any suspicious regions (either cued or not cued by CAD). CAD scheme automatically segments the suspicious region or accepts manually defined region and computes a set of image features. Using content-based image retrieval (CBIR) algorithm, CAD searches for a set of reference images depicting "abnormalities" similar to the queried region. Based on image retrieval results and a decision algorithm, a classification score is assigned to the queried region. In this study, a reference database with 1,800 malignant mass regions and 1,800 benign and CAD-generated false-positive regions was used. A modified CBIR algorithm with a new function of stretching the attributes in the multi-dimensional space and decision scheme was optimized using a genetic algorithm. Using a leave-one-out testing method to classify suspicious mass regions, we compared the classification performance using two CBIR algorithms with either equally weighted or optimally stretched attributes. Using the modified CBIR algorithm, the area under receiver operating characteristic curve was significantly increased from 0.865 ± 0.006 to 0.897 ± 0.005 (p < 0.001). This study demonstrated the feasibility of developing an interactive CAD system with a large reference database and achieving improved performance.
Visual information mining in remote sensing image archives

NASA Astrophysics Data System (ADS)

Pelizzari, Andrea; Descargues, Vincent; Datcu, Mihai P.

2002-01-01

The present article focuses on the development of interactive exploratory tools for visually mining the image content in large remote sensing archives. Two aspects are treated: the iconic visualization of the global information in the archive and the progressive visualization of the image details. The proposed methods are integrated in the Image Information Mining (I2M) system. The images and image structure in the I2M system are indexed based on a probabilistic approach. The resulting links are managed by a relational data base. Both the intrinsic complexity of the observed images and the diversity of user requests result in a great number of associations in the data base. Thus new tools have been designed to visualize, in iconic representation the relationships created during a query or information mining operation: the visualization of the query results positioned on the geographical map, quick-looks gallery, visualization of the measure of goodness of the query, visualization of the image space for statistical evaluation purposes. Additionally the I2M system is enhanced with progressive detail visualization in order to allow better access for operator inspection. I2M is a three-tier Java architecture and is optimized for the Internet.
Automatic Detection of Galaxy Type From Datasets of Galaxies Image Based on Image Retrieval Approach.

PubMed

Abd El Aziz, Mohamed; Selim, I M; Xiong, Shengwu

2017-06-30

This paper presents a new approach for the automatic detection of galaxy morphology from datasets based on an image-retrieval approach. Currently, there are several classification methods proposed to detect galaxy types within an image. However, in some situations, the aim is not only to determine the type of galaxy within the queried image, but also to determine the most similar images for query image. Therefore, this paper proposes an image-retrieval method to detect the type of galaxies within an image and return with the most similar image. The proposed method consists of two stages, in the first stage, a set of features is extracted based on shape, color and texture descriptors, then a binary sine cosine algorithm selects the most relevant features. In the second stage, the similarity between the features of the queried galaxy image and the features of other galaxy images is computed. Our experiments were performed using the EFIGI catalogue, which contains about 5000 galaxies images with different types (edge-on spiral, spiral, elliptical and irregular). We demonstrate that our proposed approach has better performance compared with the particle swarm optimization (PSO) and genetic algorithm (GA) methods.
Expediting Scientific Data Analysis with Reorganization of Data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Byna, Surendra; Wu, Kesheng

2013-08-19

Data producers typically optimize the layout of data files to minimize the write time. In most cases, data analysis tasks read these files in access patterns different from the write patterns, causing poor read performance. In this paper, we introduce Scientific Data Services (SDS), a framework for bridging the performance gap between writing and reading scientific data. SDS reorganizes data to match the read patterns of analysis tasks and enables transparent data reads from the reorganized data. We implemented an HDF5 Virtual Object Layer (VOL) plugin to redirect the HDF5 dataset read calls to the reorganized data. To demonstrate the effectiveness of SDS, we applied two parallel data organization techniques: a sort-based organization on plasma physics data and a transpose-based organization on mass spectrometry imaging data. We also extended the HDF5 data access API to allow selection of data based on their values through a query interface, called SDS Query. We evaluated the execution time in accessing various subsets of data through the existing HDF5 Read API and SDS Query. We showed that reading the reorganized data using SDS is up to 55X faster than reading the original data.
Inverse optimization of objective function weights for treatment planning using clinical dose-volume histograms

NASA Astrophysics Data System (ADS)

Babier, Aaron; Boutilier, Justin J.; Sharpe, Michael B.; McNiven, Andrea L.; Chan, Timothy C. Y.

2018-05-01

We developed and evaluated a novel inverse optimization (IO) model to estimate objective function weights from clinical dose-volume histograms (DVHs). These weights were used to solve a treatment planning problem to generate ‘inverse plans’ that had similar DVHs to the original clinical DVHs. Our methodology was applied to 217 clinical head and neck cancer treatment plans that were previously delivered at Princess Margaret Cancer Centre in Canada. Inverse plan DVHs were compared to the clinical DVHs using objective function values, dose-volume differences, and frequency of clinical planning criteria satisfaction. Median differences between the clinical and inverse DVHs were within 1.1 Gy. For most structures, the difference in clinical planning criteria satisfaction between the clinical and inverse plans was at most 1.4%. For structures where the two plans differed by more than 1.4% in planning criteria satisfaction, the difference in average criterion violation was less than 0.5 Gy. Overall, the inverse plans were very similar to the clinical plans. Compared with a previous inverse optimization method from the literature, our new inverse plans typically satisfied the same or more clinical criteria, and had consistently lower fluence heterogeneity. Overall, this paper demonstrates that DVHs, which are essentially summary statistics, provide sufficient information to estimate objective function weights that result in high quality treatment plans. However, as with any summary statistic that compresses three-dimensional dose information, care must be taken to avoid generating plans with undesirable features such as hotspots; our computational results suggest that such undesirable spatial features were uncommon. Our IO-based approach can be integrated into the current clinical planning paradigm to better initialize the planning process and improve planning efficiency. 
It could also be embedded in a knowledge-based planning or adaptive radiation therapy framework to automatically generate a new plan given a predicted or updated target DVH, respectively.
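Since the abstract treats DVHs as summary statistics of a three-dimensional dose distribution, a short sketch of how a cumulative DVH is commonly computed may help. This is a generic illustration, not the authors' inverse optimization model; the function name `cumulative_dvh` and the assumption of equal-volume voxels are ours.

```python
def cumulative_dvh(voxel_doses_gy, dose_levels_gy):
    """Fraction of a structure's volume receiving at least each dose level.

    Assumption: every voxel represents the same volume, so volume fractions
    reduce to voxel-count fractions.
    """
    n = len(voxel_doses_gy)
    return [sum(d >= level for d in voxel_doses_gy) / n
            for level in dose_levels_gy]

# Toy structure with four voxels.
doses = [10.0, 20.0, 30.0, 40.0]
print(cumulative_dvh(doses, [0.0, 25.0, 50.0]))
```

Because the curve discards voxel positions, two spatially different dose distributions can share the same DVH, which is the reason the abstract cautions about hotspots.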
Inverse optimization of objective function weights for treatment planning using clinical dose-volume histograms.

PubMed

Babier, Aaron; Boutilier, Justin J; Sharpe, Michael B; McNiven, Andrea L; Chan, Timothy C Y

2018-05-10

We developed and evaluated a novel inverse optimization (IO) model to estimate objective function weights from clinical dose-volume histograms (DVHs). These weights were used to solve a treatment planning problem to generate 'inverse plans' that had similar DVHs to the original clinical DVHs. Our methodology was applied to 217 clinical head and neck cancer treatment plans that were previously delivered at Princess Margaret Cancer Centre in Canada. Inverse plan DVHs were compared to the clinical DVHs using objective function values, dose-volume differences, and frequency of clinical planning criteria satisfaction. Median differences between the clinical and inverse DVHs were within 1.1 Gy. For most structures, the difference in clinical planning criteria satisfaction between the clinical and inverse plans was at most 1.4%. For structures where the two plans differed by more than 1.4% in planning criteria satisfaction, the difference in average criterion violation was less than 0.5 Gy. Overall, the inverse plans were very similar to the clinical plans. Compared with a previous inverse optimization method from the literature, our new inverse plans typically satisfied the same or more clinical criteria, and had consistently lower fluence heterogeneity. Overall, this paper demonstrates that DVHs, which are essentially summary statistics, provide sufficient information to estimate objective function weights that result in high quality treatment plans. However, as with any summary statistic that compresses three-dimensional dose information, care must be taken to avoid generating plans with undesirable features such as hotspots; our computational results suggest that such undesirable spatial features were uncommon. Our IO-based approach can be integrated into the current clinical planning paradigm to better initialize the planning process and improve planning efficiency. 
It could also be embedded in a knowledge-based planning or adaptive radiation therapy framework to automatically generate a new plan given a predicted or updated target DVH, respectively.
Evaluation of poison information services provided by a new poison information center.

PubMed

Churi, Shobha; Abraham, Lovin; Ramesh, M; Narahari, M G

2013-01-01

The aim of this study is to assess the nature and quality of services provided by poison information center established at a tertiary-care teaching hospital, Mysore. This was a prospective observational study. The poison information center was officially established in September 2010 and began its functioning thereafter. The center is equipped with required resources and facility (e.g., text books, Poisindex, Drugdex, toll free telephone service, internet and online services) to provide poison information services. The poison information services provided by the center were recorded in documentation forms. The documentation form consists of numerous sections to collect information on: (a) Type of population (children, adult, elderly or pregnant) (b) poisoning agents (c) route of exposure (d) type of poisoning (intentional, accidental or environmental) (e) demographic details of patient (age, gender and bodyweight) (f) enquirer details (background, place of call and mode of request) (g) category and purpose of query and (h) details of provided service (information provided, mode of provision, time taken to provide information and references consulted). The nature and quality of poison information services provided was assessed using a quality assessment checklist developed in accordance with DSE/World Health Organization guidelines. Chi-Square test (χ(2)). A total of 419 queries were received by the center. A majority (n = 333; 79.5%) of the queries were asked by the doctors to provide optimal care (n = 400; 95.5%). Most of the queries were received during ward rounds (n = 201; 48.0%), followed by direct access (n = 147; 35.1%). The poison information services were predominantly provided through verbal communication (n = 352; 84.0%). Upon receipt of queries, the required service was provided immediately (n = 103; 24.6%) or within 10-20 min (n = 296; 70.6%). 
The queries were mainly related to intentional poisoning (n = 258; 64.5%), followed by accidental poisoning (n = 142; 35.5%). The most common poisoning agents were medicines (n = 124; 31.0%). The service provided was graded as "Excellent" for the majority of queries (n = 360; 86%; P < 0.001), followed by "Very Good" (n = 50; 12%) and "Good" (n = 9; 2%). The poison information center provided requested services in a skillful, efficient and evidence-based manner to meet the needs of the requestor. The enquiries and information provided is documented in a clear and systematic manner.
Global-scale analysis of vegetation indices for moderate resolution monitoring of terrestrial vegetation

NASA Astrophysics Data System (ADS)

Huete, Alfredo R.; Didan, Kamel; van Leeuwen, Willem J. D.; Vermote, Eric F.

1999-12-01

Vegetation indices have emerged as important tools in the seasonal and inter-annual monitoring of the Earth's vegetation. They are radiometric measures of the amount and condition of vegetation. In this study, the Sea-viewing Wide Field-of-View sensor (SeaWiFS) is used to investigate coarse resolution monitoring of vegetation with multiple indices. A 30-day series of SeaWiFS data, corrected for molecular scattering and absorption, was composited to cloud-free, single channel reflectance images. The normalized difference vegetation index (NDVI) and an optimized index, the enhanced vegetation index (EVI), were computed over various 'continental' regions. The EVI had a normal distribution of values over the continental set of biomes while the NDVI was skewed toward higher values and saturated over forested regions. The NDVI resembled the skewed distributions found in the red band while the EVI resembled the normal distributions found in the NIR band. The EVI minimized smoke contamination over extensive portions of the tropics. As a result, major biome types with continental regions were discriminable in both the EVI imagery and histograms, whereas smoke and saturation considerably degraded the NDVI histogram structure preventing reliable discrimination of biome types.
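The two indices compared in this abstract can be written as simple band ratios. The sketch below is illustrative only: the NDVI formula is standard, while the EVI coefficients (G = 2.5, C1 = 6, C2 = 7.5, L = 1) are the values commonly quoted for the MODIS EVI and are an assumption here, since the abstract does not state which coefficients were used with SeaWiFS.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from surface reflectances."""
    return (nir - red) / (nir + red)

def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, canopy_l=1.0):
    """Enhanced vegetation index.

    Assumption: default coefficients are the commonly cited MODIS values,
    not necessarily those used in the study above.
    """
    return gain * (nir - red) / (nir + c1 * red - c2 * blue + canopy_l)

# Toy reflectances for a moderately vegetated pixel.
print(ndvi(0.5, 0.1), evi(0.5, 0.1, 0.05))
```

The blue-band term in the denominator is what lets the EVI compensate for aerosol effects such as smoke, and the additive term reduces the saturation that the NDVI shows over dense forest.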

Motion compensation in digital subtraction angiography using graphics hardware.

PubMed

Deuerling-Zheng, Yu; Lell, Michael; Galant, Adam; Hornegger, Joachim

2006-07-01

An inherent disadvantage of digital subtraction angiography (DSA) is its sensitivity to patient motion, which causes artifacts in the subtraction images. These artifacts can often reduce the diagnostic value of this technique. Automated, fast and accurate motion compensation is therefore required. To cope with this requirement, we first examine a method explicitly designed to detect local motions in DSA. Then, we implement a motion compensation algorithm by means of block matching on modern graphics hardware. Both methods search for maximal local similarity by evaluating a histogram-based measure. In this context, we are the first to have mapped an optimizing search strategy onto graphics hardware while parallelizing block matching. Moreover, we provide an innovative method for creating histograms on graphics hardware with vertex texturing and frame buffer blending. It turns out that both methods can effectively correct the artifacts in most cases, while the hardware implementation of block matching performs much faster: the displacements of two 1024 x 1024 images can be calculated at 3 frames/s with integer precision or 2 frames/s with sub-pixel precision. Preliminary clinical evaluation indicates that the computation with integer precision could already be sufficient.
Nonlinear histogram binning for quantitative analysis of lung tissue fibrosis in high-resolution CT data

NASA Astrophysics Data System (ADS)

Zavaletta, Vanessa A.; Bartholmai, Brian J.; Robb, Richard A.

2007-03-01

Diffuse lung diseases, such as idiopathic pulmonary fibrosis (IPF), can be characterized and quantified by analysis of volumetric high resolution CT scans of the lungs. These data sets typically have dimensions of 512 x 512 x 400. It is too subjective and labor intensive for a radiologist to analyze each slice and quantify regional abnormalities manually. Thus, computer aided techniques are necessary, particularly texture analysis techniques which classify various lung tissue types. Second and higher order statistics which relate the spatial variation of the intensity values are good discriminatory features for various textures. The intensity values in lung CT scans lie in the range [-1024, 1024]. Calculation of second order statistics on this range is too computationally intensive, so the data is typically binned into 16 or 32 gray levels. There are more effective ways of binning the gray level range to improve classification. An optimal and very efficient way to nonlinearly bin the histogram is to use a dynamic programming algorithm. The objective of this paper is to show that nonlinear binning using dynamic programming is computationally efficient and improves the discriminatory power of the second and higher order statistics for more accurate quantification of diffuse lung disease.
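One way to picture nonlinear binning by dynamic programming is the classic optimal partition of a histogram into k contiguous bins. The sketch below minimizes the within-bin sum of squared deviations of gray level; this cost function, the function name `optimal_bin_edges`, and the O(k n^2) formulation are assumptions chosen for illustration, since the abstract does not specify the objective the authors optimized.

```python
def optimal_bin_edges(hist, k):
    """Split gray levels 0..len(hist)-1 into k contiguous bins.

    Minimizes the count-weighted squared deviation of gray level inside each
    bin (an assumed cost). Returns k+1 edges; bin j covers
    levels edges[j] .. edges[j+1]-1.
    """
    n = len(hist)
    # Prefix sums of count, count*level and count*level^2 for O(1) bin cost.
    w = [0.0] * (n + 1)
    s = [0.0] * (n + 1)
    q = [0.0] * (n + 1)
    for level, count in enumerate(hist):
        w[level + 1] = w[level] + count
        s[level + 1] = s[level] + count * level
        q[level + 1] = q[level] + count * level * level

    def bin_cost(a, b):  # cost of one bin holding levels a .. b-1
        cnt = w[b] - w[a]
        if cnt == 0:
            return 0.0
        tot = s[b] - s[a]
        return (q[b] - q[a]) - tot * tot / cnt

    inf = float("inf")
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for j in range(1, k + 1):          # number of bins used so far
        for b in range(j, n + 1):      # first b levels are covered
            for a in range(j - 1, b):  # last bin starts at level a
                c = cost[j - 1][a] + bin_cost(a, b)
                if c < cost[j][b]:
                    cost[j][b] = c
                    cut[j][b] = a

    # Walk the stored cut points backwards to recover the bin edges.
    edges = [n]
    b = n
    for j in range(k, 0, -1):
        b = cut[j][b]
        edges.append(b)
    return edges[::-1]

print(optimal_bin_edges([4, 4, 4, 4], 2))
```

For a CT range of about 2,000 gray levels and 16 or 32 bins this cubic-looking loop is still only tens of millions of constant-time steps, which is consistent with the abstract's claim that the approach is computationally practical.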
Spline smoothing of histograms by linear programming

NASA Technical Reports Server (NTRS)

Bennett, J. O.

1972-01-01

An algorithm for an approximating function to the frequency distribution is obtained from a sample of size n. To obtain the approximating function a histogram is made from the data. Next, Euclidean space approximations to the graph of the histogram using central B-splines as basis elements are obtained by linear programming. The approximating function has area one and is nonnegative.
Histogram analysis of greyscale sonograms to differentiate between the subtypes of follicular variant of papillary thyroid cancer.

PubMed

Kwon, M-R; Shin, J H; Hahn, S Y; Oh, Y L; Kwak, J Y; Lee, E; Lim, Y

2018-06-01

To evaluate the diagnostic value of histogram analysis using ultrasound (US) to differentiate between the subtypes of follicular variant of papillary thyroid carcinoma (FVPTC). The present study included 151 patients with surgically confirmed FVPTC diagnosed between January 2014 and May 2016. Their preoperative US features were reviewed retrospectively. Histogram parameters (mean, maximum, minimum, range, root mean square, skewness, kurtosis, energy, entropy, and correlation) were obtained for each nodule. The 152 nodules in 151 patients comprised 48 non-invasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTPs; 31.6%), 60 invasive encapsulated FVPTCs (EFVPTCs; 39.5%), and 44 infiltrative FVPTCs (28.9%). The US features differed significantly between the subtypes of FVPTC. Discrimination was achieved between NIFTPs and infiltrative FVPTC, and between invasive EFVPTC and infiltrative FVPTC using histogram parameters; however, the parameters were not significantly different between NIFTP and invasive EFVPTC. It is feasible to use greyscale histogram analysis to differentiate between NIFTP and infiltrative FVPTC, but not between NIFTP and invasive EFVPTC. Histograms can be used as a supplementary tool to differentiate the subtypes of FVPTC. Copyright © 2017 The Royal College of Radiologists. Published by Elsevier Ltd. All rights reserved.
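Several of the first-order histogram parameters listed in this abstract can be computed directly from the gray values inside a nodule region. The sketch below is a hedged illustration: the function name `histogram_features` is hypothetical, and the definitions of energy (sum of squared bin probabilities) and entropy (Shannon, base 2) are common conventions that may differ from those of the software used in the study.

```python
import math
from collections import Counter

def histogram_features(pixels):
    """First-order statistics of a list of gray values (assumed definitions)."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(var)
    # Standardized third and fourth moments; defined as 0 for flat regions.
    skew = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3) if std else 0.0
    kurt = sum((p - mean) ** 4 for p in pixels) / (n * std ** 4) if std else 0.0
    probs = [count / n for count in Counter(pixels).values()]
    return {
        "mean": mean,
        "minimum": min(pixels),
        "maximum": max(pixels),
        "range": max(pixels) - min(pixels),
        "root_mean_square": math.sqrt(sum(p * p for p in pixels) / n),
        "skewness": skew,
        "kurtosis": kurt,
        "energy": sum(p * p for p in probs),
        "entropy": -sum(p * math.log2(p) for p in probs),
    }

features = histogram_features([0, 0, 255, 255])
print(features["mean"], features["entropy"])
```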
DSP+FPGA-based real-time histogram equalization system of infrared image

NASA Astrophysics Data System (ADS)

Gu, Dongsheng; Yang, Nansheng; Pi, Defu; Hua, Min; Shen, Xiaoyan; Zhang, Ruolan

2001-10-01

Histogram modification is a simple but effective method to enhance an infrared image. Because different infrared images have different characteristics, there are several methods to equalize an infrared image's histogram, such as the traditional HE (Histogram Equalization) method and the improved HP (Histogram Projection) and PE (Plateau Equalization) methods. To realize these methods in a single system, the system must have a large amount of memory and extremely high speed. In our system, we introduce a DSP + FPGA based real-time processing technology to do these things together. The FPGA realizes the part common to these methods, while the DSP handles the parts that differ. The choice of method and the parameters can be input by a keyboard or a computer. In this way, the system is powerful yet easy to operate and maintain. In this article, we give the diagram of the system and the software flow chart of the methods. At the end, we give an infrared image and its histogram before and after processing with the HE method.
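The relationship between HE and PE mentioned in this abstract can be sketched in software as a single lookup-table routine, where plateau equalization simply clips the histogram before the cumulative sum. This is a minimal software sketch under our own assumptions (the function name `equalization_lut`, integer floor mapping, and a single global plateau value); it says nothing about how the authors divided the work between the FPGA and the DSP.

```python
def equalization_lut(hist, plateau=None):
    """Build a gray-level lookup table from a histogram.

    plateau=None gives plain histogram equalization (HE); a positive integer
    clips every bin at that count first, which is the plateau equalization
    (PE) idea. Output levels use the same range as the input histogram.
    """
    clipped = [c if plateau is None else min(c, plateau) for c in hist]
    total = sum(clipped)
    top = len(hist) - 1
    lut = []
    cumulative = 0
    for count in clipped:
        cumulative += count
        # Floor of the scaled cumulative distribution.
        lut.append(top * cumulative // total)
    return lut

# A dominant background bin swallows most of the output range under HE ...
print(equalization_lut([100, 1, 1, 1]))
# ... while clipping it at a plateau leaves room for the sparse detail levels.
print(equalization_lut([100, 1, 1, 1], plateau=1))
```

The shared cumulative-sum step is a plausible candidate for the "common part" the abstract assigns to the FPGA, but that mapping is our reading rather than something the abstract states.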
TH-CD-209-05: Impact of Spot Size and Spacing On the Quality of Robustly-Optimized Intensity-Modulated Proton Therapy Plans for Lung Cancer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liu, W; Ding, X; Hu, Y

Purpose: To investigate how spot size and spacing affect plan quality, especially plan robustness and the impact of interplay effect, of robustly-optimized intensity-modulated proton therapy (IMPT) plans for lung cancer. Methods: Two robustly-optimized IMPT plans were created for 10 lung cancer patients: (1) one for a proton beam with in-air energy dependent large spot size at isocenter (σ: 5–15 mm) and spacing (1.53σ); (2) the other for a proton beam with small spot size (σ: 2–6 mm) and spacing (5 mm). Both plans were generated on the average CTs with internal-gross-tumor-volume density overridden to irradiate internal target volume (ITV). The root-mean-square-dose volume histograms (RVH) measured the sensitivity of the dose to uncertainties, and the areas under RVH curves were used to evaluate plan robustness. Dose evaluation software was developed to model time-dependent spot delivery to incorporate interplay effect with randomized starting phases of each field per fraction. Patient anatomy voxels were mapped from phase to phase via deformable image registration to score doses. Dose-volume-histogram indices including ITV coverage, homogeneity, and organs-at-risk (OAR) sparing were compared using Student-t test. Results: Compared to large spots, small spots resulted in significantly better OAR sparing with comparable ITV coverage and homogeneity in the nominal plan. Plan robustness was comparable for ITV and most OARs. With interplay effect considered, significantly better OAR sparing with comparable ITV coverage and homogeneity was observed using smaller spots. Conclusion: Robust optimization with smaller spots significantly improves OAR sparing with comparable plan robustness and similar impact of interplay effect compared to larger spots. Small spot size requires the use of a larger number of spots, which gives the optimizer more freedom to render a plan more robust. The ratio between spot size and spacing was found to be more relevant to determine plan robustness and the impact of interplay effect than spot size alone. This research was supported by the National Cancer Institute Career Developmental Award K25CA168984, by the Fraternal Order of Eagles Cancer Research Fund Career Development Award, by The Lawrence W. and Marilyn W. Matteson Fund for Cancer Research, by Mayo Arizona State University Seed Grant, and by The Kemper Marley Foundation.
RDFBuilder: a tool to automatically build RDF-based interfaces for MAGE-OM microarray data sources.

PubMed

Anguita, Alberto; Martin, Luis; Garcia-Remesal, Miguel; Maojo, Victor

2013-07-01

This paper presents RDFBuilder, a tool that enables RDF-based access to MAGE-ML-compliant microarray databases. We have developed a system that automatically transforms the MAGE-OM model and microarray data stored in the ArrayExpress database into RDF format. Additionally, the system automatically enables a SPARQL endpoint. This allows users to execute SPARQL queries for retrieving microarray data, either from specific experiments or from more than one experiment at a time. Our system optimizes response times by caching and reusing information from previous queries. In this paper, we describe our methods for achieving this transformation. We show that our approach is complementary to other existing initiatives, such as Bio2RDF, for accessing and retrieving data from the ArrayExpress database. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Mirador: A Simple, Fast Search Interface for Remote Sensing Data

NASA Technical Reports Server (NTRS)

Lynnes, Christopher; Strub, Richard; Seiler, Edward; Joshi, Talak; MacHarrie, Peter

2008-01-01

A major challenge for remote sensing science researchers is searching and acquiring relevant data files for their research projects based on content, space and time constraints. Several structured query (SQ) and hierarchical navigation (HN) search interfaces have been developed to satisfy this requirement, yet the dominant search engines in the general domain are based on free-text search. The Goddard Earth Sciences Data and Information Services Center has developed a free-text search interface named Mirador that supports space-time queries, including a gazetteer and geophysical event gazetteer. In order to compensate for a slightly reduced search precision relative to SQ and HN techniques, Mirador uses several search optimizations to return results quickly. The quick response enables a more iterative search strategy than is available with many SQ and HN techniques.
An evaluation of multi-probe locality sensitive hashing for computing similarities over web-scale query logs.

PubMed

Cormode, Graham; Dasgupta, Anirban; Goyal, Amit; Lee, Chi Hoon

2018-01-01

Many modern applications of AI such as web search, mobile browsing, image processing, and natural language processing rely on finding similar items from a large database of complex objects. Due to the very large scale of data involved (e.g., users' queries from commercial search engines), computing such near or nearest neighbors is a non-trivial task, as the computational cost grows significantly with the number of items. To address this challenge, we adopt Locality Sensitive Hashing (a.k.a. LSH) methods and evaluate four variants in a distributed computing environment (specifically, Hadoop). We identify several optimizations which improve performance, suitable for deployment in very large scale settings. The experimental results demonstrate that our variants of LSH achieve robust performance with better recall compared with "vanilla" LSH, even when using the same amount of space.
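As a rough illustration of the idea behind multi-probe LSH, the sketch below hashes vectors with random hyperplanes and, at query time, also inspects buckets whose signature differs from the query's in a few bits. This is a simplified stand-in, not the variants evaluated in the paper: the function names, the random-hyperplane hash family and the "flip up to max_flips bits" probing rule are all assumptions made for the example.

```python
import random
from collections import defaultdict
from itertools import combinations


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (hypothetical helper for this sketch)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, planes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_index(items, planes):
    """Group item keys by signature (one bucket per signature)."""
    index = defaultdict(list)
    for key, vec in items.items():
        index[signature(vec, planes)].append(key)
    return index


def probe_signatures(sig, max_flips):
    """Yield the query bucket first, then buckets within max_flips bit flips."""
    yield sig
    for r in range(1, max_flips + 1):
        for positions in combinations(range(len(sig)), r):
            probe = list(sig)
            for i in positions:
                probe[i] ^= 1
            yield tuple(probe)


def query(vec, index, planes, max_flips=1):
    """Collect candidate keys from the query bucket and its nearby buckets."""
    sig = signature(vec, planes)
    candidates = []
    for probe in probe_signatures(sig, max_flips):
        candidates.extend(index.get(probe, []))
    return candidates
```

Probing neighbouring buckets trades extra lookups for recall without adding hash tables, which is the space argument the abstract makes. A production multi-probe scheme would typically rank probes by how close the query lies to each hyperplane rather than enumerating all bit flips.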
VirSSPA- a virtual reality tool for surgical planning workflow.

PubMed

Suárez, C; Acha, B; Serrano, C; Parra, C; Gómez, T

2009-03-01

A virtual reality tool, called VirSSPA, was developed to optimize the planning of surgical processes. Segmentation algorithms were implemented for Computed Tomography (CT) images: a region growing procedure was used for soft tissues and a thresholding algorithm was implemented to segment bones. The algorithms operate semiautomatically since they only need seed selection with the mouse on each tissue segmented by the user. The novelty of the paper is the adaptation of an enhancement method based on histogram thresholding applied to CT images for surgical planning, which simplifies subsequent segmentation. A substantial improvement of the virtual reality tool VirSSPA was obtained with these algorithms. VirSSPA was used to optimize surgical planning, to decrease the time spent on surgical planning and to improve operative results. The success rate increases due to surgeons being able to see the exact extent of the patient's ailment. This tool can decrease operating room time, thus resulting in reduced costs. Virtual simulation was effective for optimizing surgical planning, which could, consequently, result in improved outcomes with reduced costs.
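The two segmentation steps named in the abstract can be sketched generically as follows. This is a minimal illustration on a small 2D grid, assuming a 4-connected neighbourhood and a fixed intensity tolerance around the seed value; the function names and parameters are hypothetical and do not reproduce the VirSSPA implementation.

```python
from collections import deque


def region_grow(image, seed, tolerance):
    """Grow a region from a seed pixel (row, col) over 4-connected neighbours
    whose intensity is within `tolerance` of the seed intensity."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


def threshold(image, level):
    """Binary mask: 1 where intensity >= level (e.g. dense tissue such as bone)."""
    return [[1 if value >= level else 0 for value in row] for row in image]
```

Region growing needs only one user click per tissue, which matches the semiautomatic workflow described above, while thresholding needs no seed at all when the target tissue is much brighter than its surroundings.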
Dosimetric comparison between VMAT and RC3D techniques: case of prostate treatment

NASA Astrophysics Data System (ADS)

Chemingui, Fatima Zohra; Benrachi, Fatima; Bali, Mohamed Saleh; Ladjal, Hamid

2017-09-01

Ranked second among cancers in men in Algeria, prostate cancer is treated with radiation in 70% of cases, which makes radiation therapy a key therapeutic weapon against it. Conformal radiotherapy in 3D is the most common technique [1-5]. Conventionally optimized treatment plans were compared with optimized VMAT treatment plans for a prostate cancer case. The evaluation of the two optimization strategies focused on the ability of the resulting plans to retain dose objectives under the influence of patient setup. The dose-volume histogram in the planning target volume and the dose in the organs at risk were used to calculate the conformity index and the evaluation ratio of irradiated volume, which are the main tools of comparison [6,7]. The situation was analysed systematically. The 14% dose increase in the target leads to a decrease in the dose in adjacent organs, with 39% in the bladder. Therefore, the criteria of better efficacy and lower toxicity indicate that VMAT is the best choice.
Enhancing tumor apparent diffusion coefficient histogram skewness stratifies the postoperative survival in recurrent glioblastoma multiforme patients undergoing salvage surgery.

PubMed

Zolal, Amir; Juratli, Tareq A; Linn, Jennifer; Podlesek, Dino; Sitoci Ficici, Kerim Hakan; Kitzler, Hagen H; Schackert, Gabriele; Sobottka, Stephan B; Rieger, Bernhard; Krex, Dietmar

2016-05-01

Objective To determine the value of apparent diffusion coefficient (ADC) histogram parameters for the prediction of individual survival in patients undergoing surgery for recurrent glioblastoma (GBM) in a retrospective cohort study. Methods Thirty-one patients who underwent surgery for first recurrence of a known GBM between 2008 and 2012 were included. The following parameters were collected: age, sex, enhancing tumor size, mean ADC, median ADC, ADC skewness, ADC kurtosis and fifth percentile of the ADC histogram, initial progression free survival (PFS), extent of second resection and further adjuvant treatment. The association of these parameters with survival and PFS after second surgery was analyzed using log-rank test and Cox regression. Results Using log-rank test, ADC histogram skewness of the enhancing tumor was significantly associated with both survival (p = 0.001) and PFS after second surgery (p = 0.005). Further parameters associated with prolonged survival after second surgery were: gross total resection at second surgery (p = 0.026), tumor size (0.040) and third surgery (p = 0.003). In the multivariate Cox analysis, ADC histogram skewness was shown to be an independent prognostic factor for survival after second surgery. Conclusion ADC histogram skewness of the enhancing lesion, enhancing lesion size, third surgery, as well as gross total resection have been shown to be associated with survival following the second surgery. ADC histogram skewness was an independent prognostic factor for survival in the multivariate analysis.
Evidential significance of automotive paint trace evidence using a pattern recognition based infrared library search engine for the Paint Data Query Forensic Database.

PubMed

Lavine, Barry K; White, Collin G; Allen, Matthew D; Fasasi, Ayuba; Weakley, Andrew

2016-10-01

A prototype library search engine has been further developed to search the infrared spectral libraries of the paint data query database to identify the line and model of a vehicle from the clear coat, surfacer-primer, and e-coat layers of an intact paint chip. For this study, search prefilters were developed from 1181 automotive paint systems spanning 3 manufacturers: General Motors, Chrysler, and Ford. The best match between each unknown and the spectra in the hit list generated by the search prefilters was identified using a cross-correlation library search algorithm that performed both a forward and backward search. In the forward search, spectra were divided into intervals and further subdivided into windows (which corresponds to the time lag for the comparison) within those intervals. The top five hits identified in each search window were compiled; a histogram was computed that summarized the frequency of occurrence for each library sample, with the IR spectra most similar to the unknown flagged. The backward search computed the frequency and occurrence of each line and model without regard to the identity of the individual spectra. Only those lines and models with a frequency of occurrence greater than or equal to 20% were included in the final hit list. If there was agreement between the forward and backward search results, the specific line and model common to both hit lists was always the correct assignment. Samples assigned to the same line and model by both searches are always well represented in the library and correlate well on an individual basis to specific library samples. For these samples, one can have confidence in the accuracy of the match. This was not the case for the results obtained using commercial library search algorithms, as the hit quality index scores for the top twenty hits were always greater than 99%. Copyright © 2016 Elsevier B.V. All rights reserved.
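The hit-compilation step can be pictured with a small frequency count. The sketch below assumes that each search window returns a list of library sample names and that "frequency of occurrence" means a sample's share of all compiled hits; the abstract does not define the denominator, so that choice, like the function names, is an assumption.

```python
from collections import Counter


def hit_frequencies(window_hits):
    """Histogram of how often each library entry appears across all windows,
    expressed as a fraction of all compiled hits (assumed definition)."""
    counts = Counter(hit for hits in window_hits for hit in hits)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def final_hit_list(window_hits, min_fraction=0.20):
    """Keep only entries whose frequency reaches the cut-off (20% in the abstract)."""
    freqs = hit_frequencies(window_hits)
    return sorted(name for name, f in freqs.items() if f >= min_fraction)
```

With three windows returning ["A", "B", "C"], ["A", "B", "D"] and ["A", "E", "F"], only A and B survive the 20% cut-off.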
LandEx - Fast, FOSS-Based Application for Query and Retrieval of Land Cover Patterns

NASA Astrophysics Data System (ADS)

Netzel, P.; Stepinski, T.

2012-12-01

The amount of satellite-based spatial data is continuously increasing, making the development of efficient data search tools a priority. The bulk of existing research on searching satellite-gathered data concentrates on images and is based on the concept of Content-Based Image Retrieval (CBIR); however, available solutions are not efficient and robust enough to be put to use as deployable web-based search tools. Here we report on the development of a practical, deployable tool that searches classified, rather than raw, images. LandEx (Landscape Explorer) is a GeoWeb-based tool for Content-Based Pattern Retrieval (CBPR) contained within the National Land Cover Dataset 2006 (NLCD2006). The USGS-developed NLCD2006 is derived from Landsat multispectral images; it covers the entire conterminous U.S. with a resolution of 30 meters/pixel and it depicts 16 land cover classes. The size of NLCD2006 is about 10 Gpixels (161,000 x 100,000 pixels). LandEx is a multi-tier GeoWeb application based on Open Source Software. Its main components are: GeoExt/OpenLayers (user interface), GeoServer (OGC WMS, WCS and WPS server), and GRASS (calculation engine). LandEx performs search using a query-by-example approach: the user selects a reference scene (exhibiting a chosen pattern of land cover classes) and the tool produces, in real time, a map indicating the degree of similarity between the reference pattern and all local patterns across the U.S. The scene pattern is encapsulated by a 2D histogram of classes and sizes of single-class clumps. Pattern similarity is based on the notion of mutual information. The resultant similarity map can be viewed and navigated in a web browser, or it can be downloaded as a GeoTIFF file for more in-depth analysis. LandEx is available at http://sil.uc.edu
Surface similarity-based molecular query-retrieval

PubMed Central

Singh, Rahul

2007-01-01

Background Discerning the similarity between molecules is a challenging problem in drug discovery as well as in molecular biology. The importance of this problem is due to the fact that the biochemical characteristics of a molecule are closely related to its structure. Therefore molecular similarity is a key notion in investigations targeting exploration of molecular structural space, query-retrieval in molecular databases, and structure-activity modelling. Determining molecular similarity is related to the choice of molecular representation. Currently, representations with high descriptive power and physical relevance like 3D surface-based descriptors are available. Information from such representations is both surface-based and volumetric. However, most techniques for determining molecular similarity tend to focus on idealized 2D graph-based descriptors due to the complexity that accompanies reasoning with more elaborate representations. Results This paper addresses the problem of determining similarity when molecules are described using complex surface-based representations. It proposes an intrinsic, spherical representation that systematically maps points on a molecular surface to points on a standard coordinate system (a sphere). Molecular surface properties such as shape, field strengths, and effects due to field super-positioning can then be captured as distributions on the surface of the sphere. Surface-based molecular similarity is subsequently determined by computing the similarity of the surface-property distributions using a novel formulation of histogram-intersection. The similarity formulation is not only sensitive to the 3D distribution of the surface properties, but is also highly efficient to compute.
Conclusion The proposed method obviates the computationally expensive step of molecular pose-optimisation, can incorporate conformational variations, and facilitates highly efficient determination of similarity by directly comparing molecular surfaces and surface-based properties. Retrieval performance, applications in structure-activity modeling of complex biological properties, and comparisons with existing research and commercial methods demonstrate the validity and effectiveness of the approach. PMID:17634096
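For orientation, the classic form of histogram intersection is shown below. The paper describes its own novel formulation, which the abstract does not spell out, so this sketch is only the textbook baseline: both histograms are normalized to unit mass and the overlap is summed bin by bin. The function name is hypothetical.

```python
def histogram_intersection(h1, h2):
    """Classic histogram intersection of two equally binned histograms.
    Returns a value in [0, 1]; 1 means identical normalized distributions."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    total1, total2 = sum(h1), sum(h2)
    if total1 == 0 or total2 == 0:
        return 0.0
    return sum(min(a / total1, b / total2) for a, b in zip(h1, h2))
```

The cost is linear in the number of bins, which is consistent with the efficiency claim made for intersection-style measures.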
Visualization of a variety of possible dosimetric outcomes in radiation therapy using dose-volume histogram bands.

PubMed

Trofimov, Alexei; Unkelbach, Jan; DeLaney, Thomas F; Bortfeld, Thomas

2012-01-01

Dose-volume histograms (DVH) are the most common tool used in the appraisal of the quality of a clinical treatment plan. However, when delivery uncertainties are present, the DVH may not always accurately describe the dose distribution actually delivered to the patient. We present a method, based on DVH formalism, to visualize the variability in the expected dosimetric outcome of a treatment plan. For a case of chordoma of the cervical spine, we compared 2 intensity modulated proton therapy plans. Treatment plan A was optimized based on dosimetric objectives alone (ie, desired target coverage, normal tissue tolerance). Plan B was created employing a published probabilistic optimization method that considered the uncertainties in patient setup and proton range in tissue. Dose distributions and DVH for both plans were calculated for the nominal delivery scenario, as well as for scenarios representing deviations from the nominal setup, and a systematic error in the estimate of range in tissue. The histograms from various scenarios were combined to create DVH bands to illustrate possible deviations from the nominal plan for the expected magnitude of setup and range errors. In the nominal scenario, the DVH from plan A showed superior dose coverage, higher dose homogeneity within the target, and improved sparing of the adjacent critical structure. However, when the dose distributions and DVH from plans A and B were recalculated for different error scenarios (eg, proton range underestimation by 3 mm), the plan quality, reflected by DVH, deteriorated significantly for plan A, while plan B was only minimally affected. In the DVH-band representation, plan A produced wider bands, reflecting its higher vulnerability to delivery errors, and uncertainty in the dosimetric outcome. The results illustrate that comparison of DVH for the nominal scenario alone does not provide any information about the relative sensitivity of dosimetric outcome to delivery uncertainties. 
Thus, such comparison may be misleading and may result in the selection of an inferior plan for delivery to a patient. A better-informed decision can be made if additional information about possible dosimetric variability is presented; for example, in the form of DVH bands. Copyright © 2012 American Society for Radiation Oncology. Published by Elsevier Inc. All rights reserved.
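A DVH band can be sketched as the envelope of cumulative DVH curves computed for each delivery scenario. The code below assumes equal-volume voxels, per-voxel doses supplied as plain lists, and a band defined as the pointwise minimum and maximum across scenarios; the authors' exact band definition is not given in the abstract, and the function names are hypothetical.

```python
def cumulative_dvh(doses, dose_levels):
    """Fraction of the structure's voxels receiving at least each dose level."""
    n = len(doses)
    if n == 0:
        raise ValueError("need at least one voxel dose")
    return [sum(1 for d in doses if d >= level) / n for level in dose_levels]


def dvh_band(scenario_doses, dose_levels):
    """Lower and upper envelope of the DVH curves over all scenarios."""
    curves = [cumulative_dvh(doses, dose_levels) for doses in scenario_doses]
    lower = [min(values) for values in zip(*curves)]
    upper = [max(values) for values in zip(*curves)]
    return lower, upper
```

A wide gap between the two envelopes at a given dose level indicates that coverage at that level is sensitive to setup or range errors, which is the property the abstract attributes to plan A.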
Object Detection Based on Template Matching through Use of Best-So-Far ABC

PubMed Central

2014-01-01

Best-so-far ABC is a modified version of the artificial bee colony (ABC) algorithm used for optimization tasks. This algorithm is one of the swarm intelligence (SI) algorithms proposed in recent literature, in which the results demonstrated that the best-so-far ABC can produce higher quality solutions with faster convergence than either the ordinary ABC or the current state-of-the-art ABC-based algorithm. In this work, we aim to apply the best-so-far ABC-based approach for object detection based on template matching by using the difference between the RGB level histograms corresponding to the target object and the template object as the objective function. Results confirm that the proposed method was successful in both detecting objects and optimizing the time used to reach the solution. PMID:24812556
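The objective function described here, the difference between RGB level histograms of a candidate window and the template, can be sketched as follows. To keep the example short, the search is an exhaustive scan over window positions, which stands in for the best-so-far ABC optimizer and is not the paper's method; the bin count, the L1 difference and all function names are assumptions.

```python
def rgb_histogram(pixels, bins=4):
    """Concatenated per-channel histogram of (r, g, b) pixels with values 0..255."""
    hist = [0] * (3 * bins)
    width = 256 // bins
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            hist[channel * bins + min(value // width, bins - 1)] += 1
    return hist


def histogram_difference(h1, h2):
    """L1 distance between two histograms (the objective to minimize)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def window_pixels(image, top, left, height, width):
    """Flatten a rectangular window of the image into a list of pixels."""
    return [image[r][c]
            for r in range(top, top + height)
            for c in range(left, left + width)]


def best_match(image, template):
    """Exhaustive search for the window minimizing the histogram difference.
    Returns (cost, top, left)."""
    th, tw = len(template), len(template[0])
    target = rgb_histogram(window_pixels(template, 0, 0, th, tw))
    best = None
    for top in range(len(image) - th + 1):
        for left in range(len(image[0]) - tw + 1):
            candidate = rgb_histogram(window_pixels(image, top, left, th, tw))
            cost = histogram_difference(candidate, target)
            if best is None or cost < best[0]:
                best = (cost, top, left)
    return best
```

A swarm optimizer would replace the two nested loops by sampling (top, left) positions and evaluating the same cost, which matters when the image is large and a full scan is too slow.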
Texture-based segmentation of temperate-zone woodland in panchromatic IKONOS imagery

NASA Astrophysics Data System (ADS)

Gagnon, Langis; Bugnet, Pierre; Cavayas, Francois

2003-08-01

We have performed a study to identify optimal texture parameters for woodland segmentation in a highly non-homogeneous urban area from a temperate-zone panchromatic IKONOS image. Texture images are produced with sum and difference histograms, which depend on two parameters: window size f and displacement step p. The four texture features yielding the best discrimination between classes are the mean, contrast, correlation and standard deviation. The f-p combinations 17-1, 17-2, 35-1 and 35-2 are those which give the best performance, with an average classification rate of 90%.
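Sum and difference histograms are built from pairs of pixels separated by a fixed displacement. The sketch below assumes a horizontal displacement of `step` pixels inside a single window and derives two of the four features named above, mean and contrast; the abstract does not give the feature formulas, so the definitions used here (half the expected sum, and the expected squared difference) are the commonly used ones and the function names are hypothetical.

```python
from collections import Counter


def sum_diff_histograms(window, step):
    """Normalized histograms of a+b and a-b for horizontally displaced pixel pairs."""
    sums, diffs = Counter(), Counter()
    pairs = 0
    for row in window:
        for c in range(len(row) - step):
            a, b = row[c], row[c + step]
            sums[a + b] += 1
            diffs[a - b] += 1
            pairs += 1
    if pairs == 0:
        raise ValueError("window is narrower than the displacement step")
    p_sum = {k: v / pairs for k, v in sums.items()}
    p_diff = {k: v / pairs for k, v in diffs.items()}
    return p_sum, p_diff


def texture_mean(p_sum):
    """Average grey level of the pixel pairs: half the expected sum."""
    return 0.5 * sum(i * p for i, p in p_sum.items())


def texture_contrast(p_diff):
    """Expected squared grey-level difference between displaced pixels."""
    return sum(j * j * p for j, p in p_diff.items())
```

In a segmentation setting these features would be evaluated in a sliding window of size f around every pixel, producing one texture image per feature.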
Histogram-based normalization technique on human brain magnetic resonance images from different acquisitions.

PubMed

Sun, Xiaofei; Shi, Lin; Luo, Yishan; Yang, Wei; Li, Hongpeng; Liang, Peipeng; Li, Kuncheng; Mok, Vincent C T; Chu, Winnie C W; Wang, Defeng

2015-07-28

Intensity normalization is an important preprocessing step in brain magnetic resonance image (MRI) analysis. During MR image acquisition, different scanners or parameters would be used for scanning different subjects or the same subject at a different time, which may result in large intensity variations. This intensity variation will greatly undermine the performance of subsequent MRI processing and population analysis, such as image registration, segmentation, and tissue volume measurement. In this work, we proposed a new histogram normalization method to reduce the intensity variation between MRIs obtained from different acquisitions. In our experiment, we scanned each subject twice on two different scanners using different imaging parameters. With noise estimation, the image with lower noise level was determined and treated as the high-quality reference image. Then the histogram of the low-quality image was normalized to the histogram of the high-quality image. The normalization algorithm includes two main steps: (1) intensity scaling (IS), where, for the high-quality reference image, the intensities of the image are first rescaled to a range between the low intensity region (LIR) value and the high intensity region (HIR) value; and (2) histogram normalization (HN), where the histogram of the low-quality input image is stretched to match the histogram of the reference image, so that the intensity range in the normalized image will also lie between LIR and HIR. We performed three sets of experiments to evaluate the proposed method, i.e., image registration, segmentation, and tissue volume measurement, and compared this with the existing intensity normalization method. It is then possible to validate that our histogram normalization framework can achieve better results in all the experiments. It is also demonstrated that the brain template with normalization preprocessing is of higher quality than the template with no normalization processing.
We have proposed a histogram-based MRI intensity normalization method. The method can normalize scans which were acquired on different MRI units. We have validated that the method can greatly improve the image analysis performance. Furthermore, it is demonstrated that with the help of our normalization method, we can create a higher quality Chinese brain template.
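The two steps, intensity scaling followed by histogram normalization, can be approximated with generic building blocks. The sketch below uses linear rescaling to a target range and rank-based (quantile) histogram matching; these are standard stand-ins chosen for illustration, not the authors' exact IS and HN procedures, and the function names and flat-list image representation are assumptions.

```python
from bisect import bisect_right


def rescale(values, low, high):
    """Linearly map intensities so that min -> low and max -> high
    (standing in for the LIR/HIR range of the reference image)."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        return [low for _ in values]
    scale = (high - low) / (vmax - vmin)
    return [low + (v - vmin) * scale for v in values]


def match_histogram(source, reference):
    """Map each source intensity to the reference intensity at the same quantile."""
    src_sorted = sorted(source)
    ref_sorted = sorted(reference)
    n, m = len(src_sorted), len(ref_sorted)
    matched = []
    for v in source:
        rank = bisect_right(src_sorted, v)      # number of source values <= v
        idx = (rank * m + n - 1) // n - 1       # ceil(rank * m / n) - 1
        matched.append(ref_sorted[idx])
    return matched
```

Because the output values are drawn from the reference image, the matched intensities automatically fall inside the reference range, which mirrors the stated goal that the normalized image lie between LIR and HIR.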
Hybrid Quantum-Classical Approach to Quantum Optimal Control.

PubMed

Li, Jun; Yang, Xiaodong; Peng, Xinhua; Sun, Chang-Pu

2017-04-14

A central challenge in quantum computing is to identify more computational problems for which utilization of quantum resources can offer significant speedup. Here, we propose a hybrid quantum-classical scheme to tackle the quantum optimal control problem. We show that the most computationally demanding part of gradient-based algorithms, namely, computing the fitness function and its gradient for a control input, can be accomplished by the process of evolution and measurement on a quantum simulator. By posing queries to and receiving answers from the quantum simulator, classical computing devices update the control parameters until an optimal control solution is found. To demonstrate the quantum-classical scheme in experiment, we use a seven-qubit nuclear magnetic resonance system, on which we have succeeded in optimizing state preparation without involving classical computation of the large Hilbert space evolution.
Image Enhancement via Subimage Histogram Equalization Based on Mean and Variance

PubMed Central

2017-01-01

This paper puts forward a novel image enhancement method via Mean and Variance based Subimage Histogram Equalization (MVSIHE), which effectively increases the contrast of the input image with brightness and details well preserved compared with some other methods based on histogram equalization (HE). Firstly, the histogram of input image is divided into four segments based on the mean and variance of luminance component, and the histogram bins of each segment are modified and equalized, respectively. Secondly, the result is obtained via the concatenation of the processed subhistograms. Lastly, the normalization method is deployed on intensity levels, and the integration of the processed image with the input image is performed. 100 benchmark images from a public image database named CVG-UGR-Database are used for comparison with other state-of-the-art methods. The experiment results show that the algorithm can not only enhance image information effectively but also well preserve brightness and details of the original image. PMID:29403529
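As background for the subimage variant described above, plain global histogram equalization is sketched below. It does not implement MVSIHE: there is no split into four segments by mean and variance, no bin modification and no blending with the input image. The function name and the flat list of integer grey levels are assumptions for the example.

```python
def equalize(pixels, levels=256):
    """Global histogram equalization of integer grey levels in 0..levels-1."""
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    n = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)   # CDF at the darkest occupied level
    if n == cdf_min:
        return list(pixels)                   # single grey level: nothing to stretch
    lookup = [
        round(max(c - cdf_min, 0) / (n - cdf_min) * (levels - 1))
        for c in cdf
    ]
    return [lookup[value] for value in pixels]
```

Subimage methods apply the same mapping separately to each histogram segment, which limits the shift in mean brightness that global equalization tends to cause.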
Image contrast enhancement using adjacent-blocks-based modification for local histogram equalization

NASA Astrophysics Data System (ADS)

Wang, Yang; Pan, Zhibin

2017-11-01

Infrared images usually have some non-ideal characteristics such as weak target-to-background contrast and strong noise. Because of these characteristics, it is necessary to apply a contrast enhancement algorithm to improve the visual quality of infrared images. The histogram equalization (HE) algorithm is a widely used contrast enhancement algorithm due to its effectiveness and simple implementation. A drawback of the HE algorithm, however, is that the local contrast of an image cannot be equally enhanced. Local histogram equalization algorithms have proved to be effective techniques for local image contrast enhancement. However, over-enhancement of noise and artifacts can be easily found in images enhanced by local histogram equalization. In this paper, a new contrast enhancement technique based on the local histogram equalization algorithm is proposed to overcome the drawbacks mentioned above. The input images are segmented into three kinds of overlapped sub-blocks according to their gradients. To overcome the over-enhancement effect, the histograms of these sub-blocks are then modified by adjacent sub-blocks. We pay more attention to improving the contrast of detail information while the brightness of the flat region in these sub-blocks is well preserved. It will be shown that the proposed algorithm outperforms other related algorithms by enhancing the local contrast without introducing over-enhancement effects and additional noise.
Value of MR histogram analyses for prediction of microvascular invasion of hepatocellular carcinoma.

PubMed

Huang, Ya-Qin; Liang, He-Yue; Yang, Zhao-Xia; Ding, Ying; Zeng, Meng-Su; Rao, Sheng-Xiang

2016-06-01

The objective is to explore the value of preoperative magnetic resonance (MR) histogram analyses in predicting microvascular invasion (MVI) of hepatocellular carcinoma (HCC). Fifty-one patients with histologically confirmed HCC who underwent diffusion-weighted and contrast-enhanced MR imaging were included. Histogram analyses were performed and mean, variance, skewness, kurtosis, 1st, 10th, 50th, 90th, and 99th percentiles were derived. Quantitative histogram parameters were compared between HCCs with and without MVI. Receiver operating characteristics (ROC) analyses were generated to compare the diagnostic performance of tumor size, histogram analyses of apparent diffusion coefficient (ADC) maps, and MR enhancement. The mean, 1st, 10th, and 50th percentiles of ADC maps, and the mean, variance, 1st, 10th, 50th, 90th, and 99th percentiles of the portal venous phase (PVP) images were significantly different between the groups with and without MVI (P < 0.05), with area under the ROC curves (AUCs) of 0.66 to 0.74 for ADC and 0.76 to 0.88 for PVP. The largest AUC of PVP (1st percentile) showed significantly higher accuracy compared with that of arterial phase (AP) or tumor size (P < 0.001). MR histogram analyses, in particular the 1st percentile for PVP images, held promise for prediction of MVI of HCC.
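The histogram parameters listed in this and the neighbouring imaging abstracts (mean, variance, skewness, kurtosis, percentiles) can be computed from the voxel values inside a region of interest. The sketch below is one common set of conventions, assumed for illustration: population moments, excess kurtosis, and linearly interpolated percentiles. The studies do not state which conventions their software used, and the function names are hypothetical.

```python
import math


def percentile(values, q):
    """q-th percentile (0..100) with linear interpolation between order statistics."""
    data = sorted(values)
    if not data:
        raise ValueError("need at least one value")
    pos = (len(data) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    if lo == hi:
        return data[lo]
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def histogram_parameters(values):
    """Mean, variance, skewness and excess kurtosis from population moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        return {"mean": mean, "variance": 0.0, "skewness": 0.0, "kurtosis": 0.0}
    return {
        "mean": mean,
        "variance": m2,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,
    }
```

Low percentiles summarize the darkest voxels of the region, which is why they can separate groups even when the mean does not.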
Histogram analysis of apparent diffusion coefficient maps for differentiating primary CNS lymphomas from tumefactive demyelinating lesions.

PubMed

Lu, Shan Shan; Kim, Sang Joon; Kim, Namkug; Kim, Ho Sung; Choi, Choong Gon; Lim, Young Min

2015-04-01

This study intended to investigate the usefulness of histogram analysis of apparent diffusion coefficient (ADC) maps for discriminating primary CNS lymphomas (PCNSLs), especially atypical PCNSLs, from tumefactive demyelinating lesions (TDLs). Forty-seven patients with PCNSLs and 18 with TDLs were enrolled in our study. Hyperintense lesions seen on T2-weighted images were defined as ROIs after ADC maps were registered to the corresponding T2-weighted image. ADC histograms were calculated from the ROIs containing the entire lesion on every section and on a voxel-by-voxel basis. The ADC histogram parameters were compared among all PCNSLs and TDLs as well as between the subgroup of atypical PCNSLs and TDLs. ROC curves were constructed to evaluate the diagnostic performance of the histogram parameters and to determine the optimum thresholds. The differences between the PCNSLs and TDLs were found in the minimum ADC values (ADCmin) and in the 5th and 10th percentiles (ADC5% and ADC10%) of the cumulative ADC histograms. However, no statistical significance was found in the mean ADC value or in the ADC value concerning the mode, kurtosis, and skewness. The ADCmin, ADC5%, and ADC10% were also lower in atypical PCNSLs than in TDLs. ADCmin was the best indicator for discriminating atypical PCNSLs from TDLs, with a threshold of 556×10⁻⁶ mm²/s (sensitivity, 81.3%; specificity, 88.9%). Histogram analysis of ADC maps may help to discriminate PCNSLs from TDLs and may be particularly useful in differentiating atypical PCNSLs from TDLs.
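The way a threshold on a histogram parameter yields sensitivity, specificity and an area under the ROC curve can be sketched as follows. The example assumes that lower values indicate the positive class, as reported for ADCmin in PCNSLs; the AUC is computed with the rank-based (Mann-Whitney) identity. Function names are hypothetical, and any numbers fed to these functions in a demo are illustrative values, not study data.

```python
def sensitivity_specificity(positives, negatives, threshold):
    """Classify a case as positive when its value is below the threshold."""
    true_pos = sum(1 for v in positives if v < threshold)
    true_neg = sum(1 for v in negatives if v >= threshold)
    return true_pos / len(positives), true_neg / len(negatives)


def auc_lower_is_positive(positives, negatives):
    """Probability that a random positive case has a lower value than a random
    negative case (ties count one half); equals the area under the ROC curve."""
    wins = sum(
        1.0 if p < n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

Sweeping the threshold over all observed values and picking the point that best balances the two rates is one common way to arrive at an "optimum threshold" of the kind quoted above.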
Assessment of histological differentiation in gastric cancers using whole-volume histogram analysis of apparent diffusion coefficient maps.

PubMed

Zhang, Yujuan; Chen, Jun; Liu, Song; Shi, Hua; Guan, Wenxian; Ji, Changfeng; Guo, Tingting; Zheng, Huanhuan; Guan, Yue; Ge, Yun; He, Jian; Zhou, Zhengyang; Yang, Xiaofeng; Liu, Tian

2017-02-01

To investigate the efficacy of histogram analysis of the entire tumor volume in apparent diffusion coefficient (ADC) maps for differentiating between histological grades in gastric cancer. Seventy-eight patients with gastric cancer were enrolled in a retrospective 3.0T magnetic resonance imaging (MRI) study. ADC maps were obtained at two different b values (0 and 1000 sec/mm²) for each patient. Tumors were delineated on each slice of the ADC maps, and a histogram for the entire tumor volume was subsequently generated. A series of histogram parameters (eg, skew and kurtosis) were calculated and correlated with the histological grade of the surgical specimen. The diagnostic performance of each parameter for distinguishing poorly from moderately well-differentiated gastric cancers was assessed by using the area under the receiver operating characteristic curve (AUC). There were significant differences in the 5th, 10th, 25th, and 50th percentiles, skew, and kurtosis between poorly and well-differentiated gastric cancers (P < 0.05). There were correlations between the degrees of differentiation and histogram parameters, including the 10th percentile, skew, kurtosis, and max frequency; the correlation coefficients were 0.273, -0.361, -0.339, and -0.370, respectively. Among all the histogram parameters, the max frequency had the largest AUC value, which was 0.675. Histogram analysis of the ADC maps on the basis of the entire tumor volume can be useful in differentiating between histological grades for gastric cancer. 4 J. Magn. Reson. Imaging 2017;45:440-449. © 2016 International Society for Magnetic Resonance in Medicine.
Macronuclear chromatin structure dynamics in Colpoda inflata (Protista, Ciliophora) resting encystment.

PubMed

Tiano, L; Chessa, M G; Carrara, S; Tagliafierro, G; Delmonte Corrado, M U

1999-01-01

The chromatin structure dynamics of the Colpoda inflata macronucleus have been investigated in relation to its functional condition, concerning chromatin body extrusion regulating activity. Samples of 2- and 25-day-old resting cysts derived from a standard culture, and of 1-year-old resting cysts derived from a senescent culture, were examined by means of histogram analysis performed on acquired optical microscopy images. Three groups of histograms were detected in each sample. Histogram classification, clustering and matching were assessed in order to obtain the mean histogram of each group. Comparative analysis of the mean histogram showed a similarity in the grey level range of 25-day- and 1-year-old cysts, unlike the wider grey level range found in 2-day-old cysts. Moreover, the respective mean histograms of the three cyst samples appeared rather similar in shape. All this implies that macronuclear chromatin structural features of 1-year-old cysts are common to both cyst standard cultures. The evaluation of the acquired images and their respective histograms evidenced a dynamic state of the macronuclear chromatin, appearing differently condensed in relation to the chromatin body extrusion regulating activity of the macronucleus. The coexistence of a chromatin-decondensed macronucleus with a pycnotic extrusion body suggests that chromatin unable to decondense, thus inactive, is extruded. This finding, along with the presence of chromatin structural features common to standard and senescent cyst populations, supports the occurrence of 'rejuvenated' cell lines from 1-year-old encysted senescent cells, a phenomenon which could be a result of accomplished macronuclear renewal.
Fast Query-Optimized Kernel-Machine Classification

NASA Technical Reports Server (NTRS)

Mazzoni, Dominic; DeCoste, Dennis

2004-01-01

A recently developed algorithm performs kernel-machine classification via incremental approximate nearest support vectors. The algorithm implements support-vector machines (SVMs) at speeds 10 to 100 times those attainable by use of conventional SVM algorithms. The algorithm offers potential benefits for classification of images, recognition of speech, recognition of handwriting, and diverse other applications in which there are requirements to discern patterns in large sets of data. SVMs constitute a subset of kernel machines (KMs), which have become popular as models for machine learning and, more specifically, for automated classification of input data on the basis of labeled training data. While similar in many ways to k-nearest-neighbors (k-NN) models and artificial neural networks (ANNs), SVMs tend to be more accurate. Using representations that scale only linearly in the numbers of training examples, while exploring nonlinear (kernelized) feature spaces that are exponentially larger than the original input dimensionality, KMs elegantly and practically overcome the classic curse of dimensionality. However, the price that one must pay for the power of KMs is that query-time complexity scales linearly with the number of training examples, making KMs often orders of magnitude more computationally expensive than are ANNs, decision trees, and other popular machine learning alternatives. The present algorithm treats an SVM classifier as a special form of a k-NN. The algorithm is based partly on an empirical observation that one can often achieve the same classification as that of an exact KM by using only small fraction of the nearest support vectors (SVs) of a query. The exact KM output is a weighted sum over the kernel values between the query and the SVs. In this algorithm, the KM output is approximated with a k-NN classifier, the output of which is a weighted sum only over the kernel values involving k selected SVs. 
Before query time, statistics are gathered about how misleading the output of the k-NN model can be, relative to the outputs of the exact KM for a representative set of examples, for each possible k from 1 to the total number of SVs. From these statistics, upper and lower thresholds are derived for each step k. These thresholds identify output levels for which the particular variant of the k-NN model already leans so strongly positively or negatively that a reversal in sign is unlikely, given the weaker SV neighbors still remaining. At query time, the partial output of each query is incrementally updated, stopping as soon as it exceeds the predetermined statistical thresholds of the current step. For an easy query, stopping can occur as early as step k = 1. For more difficult queries, stopping might not occur until nearly all SVs are touched. A key empirical observation is that this approach can tolerate very approximate nearest-neighbor orderings. In experiments, SVs and queries were projected to a subspace comprising the top few principal-component dimensions and neighbor orderings were computed in that subspace. This approach ensured that the overhead of the nearest-neighbor computations was insignificant, relative to that of the exact KM computation.
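The early-stopping idea described in this abstract can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the RBF kernel, the function names, and the per-step threshold arrays `lo` and `hi` are assumptions (the abstract derives the thresholds from statistics gathered before query time, which is not reproduced here), and exact distances are used for the neighbor ordering instead of the principal-component projection mentioned above.

```python
import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel value between two vectors (assumed kernel)."""
    diff = x - y
    return float(np.exp(-gamma * np.dot(diff, diff)))

def exact_output(query, svs, alphas, bias, gamma=0.5):
    """Exact KM output: weighted sum of kernel values over all SVs."""
    return bias + sum(a * rbf_kernel(query, sv, gamma) for sv, a in zip(svs, alphas))

def incremental_classify(query, svs, alphas, bias, lo, hi, gamma=0.5):
    """Add SVs nearest-first and stop once the partial sum leaves [lo[k], hi[k]].

    lo and hi are hypothetical per-step thresholds (lo[k] <= 0 <= hi[k]).
    Returns (predicted label, number of SVs touched).
    """
    order = np.argsort(np.linalg.norm(svs - query, axis=1))  # nearest SVs first
    partial = bias
    for k, idx in enumerate(order):
        partial += alphas[idx] * rbf_kernel(query, svs[idx], gamma)
        if partial > hi[k] or partial < lo[k]:
            # The output already leans strongly one way: stop early.
            return (1 if partial > 0 else -1), k + 1
    # No threshold was crossed: every SV was used, so this is the exact sign.
    return (1 if partial > 0 else -1), len(order)
```

With infinite thresholds the loop never stops early and the result matches the sign of `exact_output`; tighter thresholds trade a small risk of sign reversal for fewer kernel evaluations.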
Non-small cell lung cancer: Whole-lesion histogram analysis of the apparent diffusion coefficient for assessment of tumor grade, lymphovascular invasion and pleural invasion

PubMed Central

Tsuchiya, Naoko; Doai, Mariko; Usuda, Katsuo; Uramoto, Hidetaka

2017-01-01

Purpose: To investigate the diagnostic accuracy of histogram analyses of apparent diffusion coefficient (ADC) values for determining non-small cell lung cancer (NSCLC) tumor grades, lymphovascular invasion, and pleural invasion. Materials and methods: We studied 60 surgically diagnosed NSCLC patients. Diffusion-weighted imaging (DWI) was performed in the axial plane using a navigator-triggered single-shot, echo-planar imaging sequence with prospective acquisition correction. The ADC maps were generated, and we placed a volume-of-interest on the tumor to construct the whole-lesion histogram. Using the histogram, we calculated the mean, 5th, 10th, 25th, 50th, 75th, 90th, and 95th percentiles of ADC, skewness, and kurtosis. Histogram parameters were correlated with tumor grade, lymphovascular invasion, and pleural invasion. We performed a receiver operating characteristics (ROC) analysis to assess the diagnostic performance of histogram parameters for distinguishing different pathologic features. Results: The ADC mean, 10th, 25th, 50th, 75th, 90th, and 95th percentiles showed significant differences among the tumor grades. The ADC mean, 25th, 50th, 75th, 90th, and 95th percentiles were significant histogram parameters between high- and low-grade tumors. The ROC analysis between high- and low-grade tumors showed that the 95th percentile ADC achieved the highest area under curve (AUC) at 0.74. Lymphovascular invasion was associated with the ADC mean, 50th, 75th, 90th, and 95th percentiles, skewness, and kurtosis. Kurtosis achieved the highest AUC at 0.809. Pleural invasion was only associated with skewness, with an AUC of 0.648. Conclusions: ADC histogram analyses on the basis of the entire tumor volume are able to stratify NSCLCs' tumor grade, lymphovascular invasion and pleural invasion. PMID:28207858
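The histogram parameters listed in this abstract (mean, percentiles, skewness, kurtosis) can be computed from the voxel values inside a volume-of-interest along these lines. This is a minimal sketch under stated assumptions: the function name is hypothetical, the input is taken to be a flat collection of ADC values, and kurtosis is reported as excess kurtosis, a convention the abstract does not specify.

```python
import numpy as np

def histogram_metrics(values, percentiles=(5, 10, 25, 50, 75, 90, 95)):
    """Summary statistics of the voxel values inside a volume-of-interest."""
    v = np.asarray(values, dtype=float).ravel()
    mean = v.mean()
    std = v.std()  # population standard deviation
    if std == 0:
        raise ValueError("skewness and kurtosis are undefined for constant input")
    z = (v - mean) / std
    metrics = {
        "mean": float(mean),
        "skewness": float(np.mean(z ** 3)),
        "kurtosis": float(np.mean(z ** 4) - 3.0),  # excess kurtosis (assumed convention)
    }
    for p in percentiles:
        metrics[f"p{p}"] = float(np.percentile(v, p))
    return metrics
```

Each returned value could then be fed into a correlation or ROC analysis of the kind the abstract describes.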
Development and evaluation of a biomedical search engine using a predicate-based vector space model.

PubMed

Kwak, Myungjae; Leroy, Gondy; Martinez, Jesse D; Harwell, Jeffrey

2013-10-01

Although biomedical information available in articles and patents is increasing exponentially, we continue to rely on the same information retrieval methods and use very few keywords to search millions of documents. We are developing a fundamentally different approach for finding much more precise and complete information with a single query using predicates instead of keywords for both query and document representation. Predicates are triples that are more complex data structures than keywords and contain more structured information. To make optimal use of them, we developed a new predicate-based vector space model and query-document similarity function with adjusted tf-idf and boost function. Using a test bed of 107,367 PubMed abstracts, we evaluated the first essential function: retrieving information. Cancer researchers provided 20 realistic queries, for which the top 15 abstracts were retrieved using a predicate-based (new) and keyword-based (baseline) approach. Each abstract was evaluated, double-blind, by cancer researchers on a 0-5 point scale to calculate precision (0 versus higher) and relevance (0-5 score). Precision was significantly higher (p<.001) for the predicate-based (80%) than for the keyword-based (71%) approach. Relevance was higher with the predicate-based approach: 2.1 versus 1.6 without rank order adjustment (p<.001) and 1.34 versus 0.98 with rank order adjustment (p<.001) for the predicate- versus keyword-based approach, respectively. Predicates can support more precise searching than keywords, laying the foundation for rich and sophisticated information search. Copyright © 2013 Elsevier Inc. All rights reserved.
BioMart: a data federation framework for large collaborative projects.

PubMed

Zhang, Junjun; Haider, Syed; Baran, Joachim; Cros, Anthony; Guberman, Jonathan M; Hsu, Jack; Liang, Yong; Yao, Long; Kasprzyk, Arek

2011-01-01

BioMart is a freely available, open source, federated database system that provides unified access to disparate, geographically distributed data sources. It is designed to be data agnostic and platform independent, such that existing databases can easily be incorporated into the BioMart framework. BioMart allows databases hosted on different servers to be presented seamlessly to users, facilitating collaborative projects between different research groups. BioMart contains several levels of query optimization to efficiently manage large data sets and offers a diverse selection of graphical user interfaces and application programming interfaces to ensure that queries can be performed in whatever manner is most convenient for the user. The software has now been adopted by a large number of different biological databases spanning a wide range of data types and providing a rich source of annotation available to bioinformaticians and biologists alike.
An Information Retrieval and Recommendation System for Astronomical Observatories

NASA Astrophysics Data System (ADS)

Mukund, Nikhil; Thakur, Saurabh; Abraham, Sheelu; Aniyan, A. K.; Mitra, Sanjit; Sajeeth Philip, Ninan; Vaghmare, Kaustubh; Acharjya, D. P.

2018-03-01

We present a machine-learning-based information retrieval system for astronomical observatories that tries to address user-defined queries related to an instrument. In the modern instrumentation scenario where heterogeneous systems and talents are simultaneously at work, the ability to supply people with the right information helps speed up the tasks for detector operation, maintenance, and upgradation. The proposed method analyzes existing documented efforts at the site to intelligently group related information to a query and to present it online to the user. The user in response can probe the suggested content and explore previously developed solutions or probable ways to address the present situation optimally. We demonstrate natural language-processing-backed knowledge rediscovery by making use of the open source logbook data from the Laser Interferometric Gravitational Observatory (LIGO). We implement and test a web application that incorporates the above idea for LIGO Livingston, LIGO Hanford, and Virgo observatories.
Pse-Analysis: a python package for DNA/RNA and protein/peptide sequence analysis based on pseudo components and kernel methods.

PubMed

Liu, Bin; Wu, Hao; Zhang, Deyuan; Wang, Xiaolong; Chou, Kuo-Chen

2017-02-21

To expedite the pace in conducting genome/proteome analysis, we have developed a Python package called Pse-Analysis. The powerful package can automatically complete the following five procedures: (1) sample feature extraction, (2) optimal parameter selection, (3) model training, (4) cross validation, and (5) evaluating prediction quality. All the work a user needs to do is to input a benchmark dataset along with the query biological sequences concerned. Based on the benchmark dataset, Pse-Analysis will automatically construct an ideal predictor, followed by yielding the predicted results for the submitted query samples. All the aforementioned tedious jobs can be automatically done by the computer. Moreover, the multiprocessing technique was adopted to enhance computational speed about 6-fold. The Pse-Analysis Python package is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/Pse-Analysis/, and can be directly run on Windows, Linux, and Unix.
An evaluation of multi-probe locality sensitive hashing for computing similarities over web-scale query logs

PubMed Central

2018-01-01

Many modern applications of AI such as web search, mobile browsing, image processing, and natural language processing rely on finding similar items from a large database of complex objects. Due to the very large scale of data involved (e.g., users’ queries from commercial search engines), computing such near or nearest neighbors is a non-trivial task, as the computational cost grows significantly with the number of items. To address this challenge, we adopt Locality Sensitive Hashing (a.k.a. LSH) methods and evaluate four variants in a distributed computing environment (specifically, Hadoop). We identify several optimizations which improve performance, suitable for deployment in very large scale settings. The experimental results demonstrate that our variants of LSH achieve robust performance with better recall compared with “vanilla” LSH, even when using the same amount of space. PMID:29346410
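To make the LSH idea in this abstract concrete, the sketch below hashes vectors with random hyperplanes and optionally probes neighboring buckets. It is a single-machine illustration under assumptions: the abstract does not state which hash family or probing scheme its four variants use, so the random-hyperplane (cosine) family, the one-bit-flip probing, and the class name `HyperplaneLSH` are all chosen here for illustration and are not the paper's Hadoop implementation.

```python
import numpy as np
from collections import defaultdict

class HyperplaneLSH:
    """Single-table random-hyperplane LSH with optional one-bit multi-probe."""

    def __init__(self, dim, n_bits=8, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.standard_normal((n_bits, dim))  # one hyperplane per hash bit
        self.buckets = defaultdict(list)

    def _key(self, vec):
        # Each bit records on which side of a hyperplane the vector lies.
        bits = (self.planes @ np.asarray(vec, dtype=float)) >= 0
        return tuple(bool(b) for b in bits)

    def add(self, item_id, vec):
        self.buckets[self._key(vec)].append(item_id)

    def query(self, vec, multi_probe=False):
        """Return candidate ids from the query's bucket (plus neighbors if probing)."""
        key = self._key(vec)
        candidates = list(self.buckets.get(key, []))
        if multi_probe:
            # Also look in buckets whose key differs from the query's key in one bit.
            for i in range(len(key)):
                probe = key[:i] + (not key[i],) + key[i + 1:]
                candidates.extend(self.buckets.get(probe, []))
        return candidates
```

Probing extra buckets raises recall without adding hash tables, which is the general space-saving argument behind multi-probe schemes; candidates would still need an exact similarity check afterwards.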
Role of the parameters involved in the plan optimization based on the generalized equivalent uniform dose and radiobiological implications

NASA Astrophysics Data System (ADS)

Widesott, L.; Strigari, L.; Pressello, M. C.; Benassi, M.; Landoni, V.

2008-03-01

We investigated the role and the weight of the parameters involved in the intensity modulated radiation therapy (IMRT) optimization based on the generalized equivalent uniform dose (gEUD) method, for prostate and head-and-neck plans. We systematically varied the parameters (gEUDmax and weight) involved in the gEUD-based optimization of rectal wall and parotid glands. We found that the proper value of weight factor, still guaranteeing planning treatment volumes coverage, produced similar organs at risk (OARs) dose-volume (DV) histograms for different gEUDmax with fixed a = 1. Most of all, we formulated a simple relation that links the reference gEUDmax and the associated weight factor. As a secondary objective, we evaluated plans obtained with the gEUD-based optimization and ones based on DV criteria, using the normal tissue complication probability (NTCP) models. gEUD criteria seemed to improve sparing of rectum and parotid glands with respect to DV-based optimization: the mean dose, the V40 and V50 values to the rectal wall were decreased by about 10%, and the mean dose to parotids decreased by about 20-30%. But more than the OARs sparing, we underlined the halving of the OARs optimization time with the implementation of the gEUD-based cost function. Using NTCP models we enhanced differences between the two optimization criteria for parotid glands, but not for the rectal wall.
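For readers unfamiliar with the quantity, the generalized equivalent uniform dose is commonly defined as gEUD = (sum_i v_i * D_i^a)^(1/a), where v_i are fractional volumes and D_i the corresponding doses; with a = 1 it reduces to the mean dose. The sketch below implements that standard definition. The quadratic penalty function is only an assumed, illustrative form of a gEUD-based cost term; the abstract does not give the actual cost function or the relation between gEUDmax and the weight.

```python
import numpy as np

def geud(doses, a, volumes=None):
    """Generalized equivalent uniform dose for voxel doses and exponent a (a != 0)."""
    d = np.asarray(doses, dtype=float)
    if volumes is None:
        v = np.full(d.shape, 1.0 / d.size)  # equal voxel volumes assumed
    else:
        v = np.asarray(volumes, dtype=float)
        v = v / v.sum()  # normalize to fractional volumes
    return float(np.sum(v * d ** a) ** (1.0 / a))

def geud_penalty(doses, a, geud_max, weight):
    """Hypothetical one-sided quadratic penalty on gEUD exceeding geud_max."""
    excess = max(0.0, geud(doses, a) - geud_max)
    return weight * excess ** 2
```

Larger values of a push gEUD toward the maximum dose, which is why the exponent is usually chosen per organ type.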
Explicit optimization of plan quality measures in intensity-modulated radiation therapy treatment planning.

PubMed

Engberg, Lovisa; Forsgren, Anders; Eriksson, Kjell; Hårdemark, Björn

2017-06-01

To formulate convex planning objectives of treatment plan multicriteria optimization with explicit relationships to the dose-volume histogram (DVH) statistics used in plan quality evaluation. Conventional planning objectives are designed to minimize the violation of DVH statistics thresholds using penalty functions. Although successful in guiding the DVH curve towards these thresholds, conventional planning objectives offer limited control of the individual points on the DVH curve (doses-at-volume) used to evaluate plan quality. In this study, we abandon the usual penalty-function framework and propose planning objectives that more closely relate to DVH statistics. The proposed planning objectives are based on mean-tail-dose, resulting in convex optimization. We also demonstrate how to adapt a standard optimization method to the proposed formulation in order to obtain a substantial reduction in computational cost. We investigated the potential of the proposed planning objectives as tools for optimizing DVH statistics through juxtaposition with the conventional planning objectives on two patient cases. Sets of treatment plans with differently balanced planning objectives were generated using either the proposed or the conventional approach. Dominance in the sense of better distributed doses-at-volume was observed in plans optimized within the proposed framework. The initial computational study indicates that the DVH statistics are better optimized and more efficiently balanced using the proposed planning objectives than using the conventional approach. © 2017 American Association of Physicists in Medicine.
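The two quantities contrasted in this abstract, dose-at-volume (a single point on the DVH curve) and mean-tail-dose, can be illustrated for the upper tail as follows. This is a sketch under assumptions: equal voxel volumes, upper-tail versions only, and hypothetical function names; it is not the optimization formulation proposed in the paper.

```python
import numpy as np

def dose_at_volume(doses, v):
    """Dose received by at least a fraction v of the voxels (equal volumes assumed)."""
    d = np.sort(np.asarray(doses, dtype=float))[::-1]  # hottest voxels first
    n = max(1, int(np.ceil(v * d.size)))
    return float(d[n - 1])

def upper_mean_tail_dose(doses, v):
    """Mean dose over the hottest fraction v of the voxels."""
    d = np.sort(np.asarray(doses, dtype=float))[::-1]
    n = max(1, int(np.ceil(v * d.size)))
    return float(d[:n].mean())
```

The upper mean-tail-dose is never smaller than the dose-at-volume at the same fraction, so lowering the former also bounds the latter.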
An Energy-Efficient Approach to Enhance Virtual Sensors Provisioning in Sensor Clouds Environments

PubMed Central

Filho, Raimir Holanda; Rabêlo, Ricardo de Andrade L.; de Carvalho, Carlos Giovanni N.; Mendes, Douglas Lopes de S.; Costa, Valney da Gama

2018-01-01

Virtual sensors provisioning is a central issue for sensors cloud middleware since it is responsible for selecting physical nodes, usually from Wireless Sensor Networks (WSN) of different owners, to handle user’s queries or applications. Recent works perform provisioning by clustering sensor nodes based on the correlation measurements and then selecting as few nodes as possible to preserve WSN energy. However, such works consider only homogeneous nodes (same set of sensors). Therefore, those works are not entirely appropriate for sensor clouds, which in most cases comprise heterogeneous sensor nodes. In this paper, we propose ACxSIMv2, an approach to enhance the provisioning task by considering heterogeneous environments. Two main algorithms form ACxSIMv2. The first one, ACASIMv1, creates multi-dimensional clusters of sensor nodes, taking into account the measurements correlations instead of the physical distance between nodes, as most works in the literature do. Then, the second algorithm, ACOSIMv2, based on an Ant Colony Optimization system, selects an optimal set of sensor nodes to respond to users’ queries while attending to all parameters and preserving the overall energy consumption. Results from initial experiments show that the approach significantly reduces the sensor cloud energy consumption compared to traditional works, providing a solution to be considered in sensor cloud scenarios. PMID:29495406
An Energy-Efficient Approach to Enhance Virtual Sensors Provisioning in Sensor Clouds Environments.

PubMed

Lemos, Marcus Vinícius de S; Filho, Raimir Holanda; Rabêlo, Ricardo de Andrade L; de Carvalho, Carlos Giovanni N; Mendes, Douglas Lopes de S; Costa, Valney da Gama

2018-02-26

Virtual sensors provisioning is a central issue for sensors cloud middleware since it is responsible for selecting physical nodes, usually from Wireless Sensor Networks (WSN) of different owners, to handle user's queries or applications. Recent works perform provisioning by clustering sensor nodes based on the correlation measurements and then selecting as few nodes as possible to preserve WSN energy. However, such works consider only homogeneous nodes (same set of sensors). Therefore, those works are not entirely appropriate for sensor clouds, which in most cases comprise heterogeneous sensor nodes. In this paper, we propose ACxSIMv2, an approach to enhance the provisioning task by considering heterogeneous environments. Two main algorithms form ACxSIMv2. The first one, ACASIMv1, creates multi-dimensional clusters of sensor nodes, taking into account the measurements correlations instead of the physical distance between nodes, as most works in the literature do. Then, the second algorithm, ACOSIMv2, based on an Ant Colony Optimization system, selects an optimal set of sensor nodes to respond to users' queries while attending to all parameters and preserving the overall energy consumption. Results from initial experiments show that the approach significantly reduces the sensor cloud energy consumption compared to traditional works, providing a solution to be considered in sensor cloud scenarios.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Critchlow, Terence J.; Abdulla, Ghaleb; Becla, Jacek

Data management is the organization of information to support efficient access and analysis. For data intensive computing applications, the speed at which relevant data can be accessed is a limiting factor in terms of the size and complexity of computation that can be performed. Data access speed is impacted by the size of the relevant subset of the data, the complexity of the query used to define it, and the layout of the data relative to the query. As the underlying data sets become increasingly complex, the questions asked of it become more involved as well. For example, geospatial data associated with a city is no longer limited to the map data representing its streets, but now also includes layers identifying utility lines, key points, locations and types of businesses within the city limits, tax information for each land parcel, satellite imagery, and possibly even street-level views. As a result, queries have gone from simple questions, such as "how long is Main Street?", to much more complex questions such as "taking all other factors into consideration, are the property values of houses near parks higher than those under power lines, and if so, by what percentage". Answering these questions requires a coherent infrastructure, integrating the relevant data into a format optimized for the questions being asked.
Clustering and Flow Conservation Monitoring Tool for Software Defined Networks.

PubMed

Puente Fernández, Jesús Antonio; García Villalba, Luis Javier; Kim, Tai-Hoon

2018-04-03

Prediction systems present some challenges on two fronts: the relation between video quality and observed session features and, on the other hand, dynamic changes in the video quality. Software Defined Networks (SDN) is a new concept of network architecture that provides the separation of control plane (controller) and data plane (switches) in network devices. Due to the existence of the southbound interface, it is possible to deploy monitoring tools to obtain the network status and retrieve a statistics collection. Therefore, achieving the most accurate statistics depends on a strategy of monitoring and information requests of network devices. In this paper, we propose an enhanced algorithm for requesting statistics to measure the traffic flow in SDN networks. Such an algorithm is based on grouping network switches in clusters focusing on their number of ports to apply different monitoring techniques. Such grouping occurs by avoiding monitoring queries in network switches with common characteristics and then, by omitting redundant information. In this way, the present proposal decreases the number of monitoring queries to switches, improving the network traffic and preventing the switching overload. We have tested our optimization in a video streaming simulation using different types of videos. The experiments and comparison with traditional monitoring techniques demonstrate the feasibility of our proposal, maintaining similar values while decreasing the number of queries to the switches.
EnsMart: A Generic System for Fast and Flexible Access to Biological Data

PubMed Central

Kasprzyk, Arek; Keefe, Damian; Smedley, Damian; London, Darin; Spooner, William; Melsopp, Craig; Hammond, Martin; Rocca-Serra, Philippe; Cox, Tony; Birney, Ewan

2004-01-01

The EnsMart system (www.ensembl.org/EnsMart) provides a generic data warehousing solution for fast and flexible querying of large biological data sets and integration with third-party data and tools. The system consists of a query-optimized database and interactive, user-friendly interfaces. EnsMart has been applied to Ensembl, where it extends its genomic browser capabilities, facilitating rapid retrieval of customized data sets. A wide variety of complex queries, on various types of annotations, for numerous species are supported. These can be applied to many research problems, ranging from SNP selection for candidate gene screening, through cross-species evolutionary comparisons, to microarray annotation. Users can group and refine biological data according to many criteria, including cross-species analyses, disease links, sequence variations, and expression patterns. Both tabulated list data and biological sequence output can be generated dynamically, in HTML, text, Microsoft Excel, and compressed formats. A wide range of sequence types, such as cDNA, peptides, coding regions, UTRs, and exons, with additional upstream and downstream regions, can be retrieved. The EnsMart database can be accessed via a public Web site, or through a Java application suite. Both implementations and the database are freely available for local installation, and can be extended or adapted to `non-Ensembl' data sets. PMID:14707178

DWI-associated entire-tumor histogram analysis for the differentiation of low-grade prostate cancer from intermediate-high-grade prostate cancer.

PubMed

Wu, Chen-Jiang; Wang, Qing; Li, Hai; Wang, Xiao-Ning; Liu, Xi-Sheng; Shi, Hai-Bin; Zhang, Yu-Dong

2015-10-01

To investigate diagnostic efficiency of DWI using entire-tumor histogram analysis in differentiating the low-grade (LG) prostate cancer (PCa) from intermediate-high-grade (HG) PCa in comparison with conventional ROI-based measurement. DW images (b of 0-1400 s/mm²) from 126 pathology-confirmed PCa (diameter >0.5 cm) in 110 patients were retrospectively collected and processed by mono-exponential model. The measurement of tumor apparent diffusion coefficients (ADCs) was performed using histogram-based and ROI-based approaches, respectively. The diagnostic ability of ADCs from two methods for differentiating LG-PCa (Gleason score, GS ≤ 6) from HG-PCa (GS > 6) was determined by ROC regression, and compared by McNemar's test. There were 49 LG tumors and 77 HG tumors at pathologic findings. Histogram-based ADCs (mean, median, 10th and 90th) and ROI-based ADCs (mean) showed dominant relationships with ordinal GS of PCa (ρ = -0.225 to -0.406, p < 0.05). All above imaging indices reflected significant difference between LG-PCa and HG-PCa (all p values <0.01). Histogram 10th ADCs had dominantly high Az (0.738), Youden index (0.415), and positive likelihood ratio (LR+, 2.45) in stratifying tumor GS against mean, median and 90th ADCs, and ROI-based ADCs. Histogram mean, median, and 10th ADCs showed higher specificity (65.3%-74.1% vs. 44.9%, p < 0.01), but lower sensitivity (57.1%-71.3% vs. 84.4%, p < 0.05) than ROI-based ADCs in differentiating LG-PCa from HG-PCa. DWI-associated histogram analysis had higher specificity, Az, Youden index, and LR+ for differentiation of PCa Gleason grade than ROI-based approach.
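The diagnostic indices named in this abstract follow standard definitions: Youden index = sensitivity + specificity - 1, and LR+ = sensitivity / (1 - specificity). A small sketch of how they derive from confusion-matrix counts is given below; the function name and the example counts in the test are hypothetical and are not data from the study.

```python
def diagnostic_summary(tp, fn, tn, fp):
    """Sensitivity, specificity, Youden index and LR+ from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    if fp == 0:
        raise ValueError("LR+ is undefined when there are no false positives")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden": sensitivity + specificity - 1.0,
        "lr_plus": sensitivity / (1.0 - specificity),
    }
```

Because the Youden index weighs sensitivity and specificity equally, it is often used to pick the operating threshold on an ROC curve.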
Dynamic contrast-enhanced MR imaging of the rectum: Correlations between single-section and whole-tumor histogram analyses.

PubMed

Choi, M H; Oh, S N; Park, G E; Yeo, D-M; Jung, S E

2018-05-10

To evaluate the interobserver and intermethod correlations of histogram metrics of dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) parameters acquired by multiple readers using the single-section and whole-tumor volume methods. Four DCE parameters (Ktrans, Kep, Ve, Vp) were evaluated in 45 patients (31 men and 14 women; mean age, 61±11 years [range, 29-83 years]) with locally advanced rectal cancer using pre-chemoradiotherapy (CRT) MRI. Ten histogram metrics were extracted using two methods of lesion selection performed by three radiologists: the whole-tumor volume method for the whole tumor on axial section-by-section images and the single-section method for the entire area of the tumor on one axial image. The interobserver and intermethod correlations were evaluated using the intraclass correlation coefficients (ICCs). The ICCs showed excellent interobserver and intermethod correlations in most histogram metrics of the DCE parameters. The ICCs among the three readers were > 0.7 (P<0.001) for all histogram metrics, except for the minimum and maximum. The intermethod correlations for most of the histogram metrics were excellent for each radiologist, regardless of the differences in the radiologists' experience. The interobserver and intermethod correlations for most of the histogram metrics of the DCE parameters are excellent in rectal cancer. Therefore, the single-section method may be a potential alternative to the whole-tumor volume method using pre-CRT MRI, despite the fact that the high agreement between the two methods cannot be extrapolated to post-CRT MRI. Copyright © 2018 Société française de radiologie. Published by Elsevier Masson SAS. All rights reserved.
Measuring the apparent diffusion coefficient in primary rectal tumors: is there a benefit in performing histogram analyses?

PubMed

van Heeswijk, Miriam M; Lambregts, Doenja M J; Maas, Monique; Lahaye, Max J; Ayas, Z; Slenter, Jos M G M; Beets, Geerard L; Bakers, Frans C H; Beets-Tan, Regina G H

2017-06-01

The apparent diffusion coefficient (ADC) is a potential prognostic imaging marker in rectal cancer. Typically, mean ADC values are used, derived from precise manual whole-volume tumor delineations by experts. The aim was first to explore whether non-precise circular delineation combined with histogram analysis can be a less cumbersome alternative to acquire similar ADC measurements and second to explore whether histogram analyses provide additional prognostic information. Thirty-seven patients who underwent a primary staging MRI including diffusion-weighted imaging (DWI; b0, 25, 50, 100, 500, 1000; 1.5 T) were included. Volumes-of-interest (VOIs) were drawn on b1000-DWI: (a) precise delineation, manually tracing tumor boundaries (2 expert readers), and (b) non-precise delineation, drawing circular VOIs with a wide margin around the tumor (2 non-experts). Mean ADC and histogram metrics (mean, min, max, median, SD, skewness, kurtosis, 5th-95th percentiles) were derived from the VOIs and delineation time was recorded. Measurements were compared between the two methods and correlated with prognostic outcome parameters. Median delineation time reduced from 47-165 s (precise) to 21-43 s (non-precise). The 45th percentile of the non-precise delineation showed the best correlation with the mean ADC from the precise delineation as the reference standard (ICC 0.71-0.75). None of the mean ADC or histogram parameters showed significant prognostic value; only the total tumor volume (VOI) was significantly larger in patients with positive clinical N stage and mesorectal fascia involvement. When performing non-precise tumor delineation, histogram analysis (specifically the 45th ADC percentile) may be used as an alternative to obtain similar ADC values as with precise whole tumor delineation. Histogram analyses are not beneficial to obtain additional prognostic information.
The histogram analysis of diffusion-weighted intravoxel incoherent motion (IVIM) imaging for differentiating the gleason grade of prostate cancer.

PubMed

Zhang, Yu-Dong; Wang, Qing; Wu, Chen-Jiang; Wang, Xiao-Ning; Zhang, Jing; Liu, Hui; Liu, Xi-Sheng; Shi, Hai-Bin

2015-04-01

To evaluate histogram analysis of intravoxel incoherent motion (IVIM) for discriminating the Gleason grade of prostate cancer (PCa). A total of 48 patients pathologically confirmed as having clinically significant PCa (size > 0.5 cm) underwent preoperative DW-MRI (b of 0-900 s/mm²). Data was post-processed by monoexponential and IVIM model for quantitation of apparent diffusion coefficients (ADCs), perfusion fraction f, diffusivity D and pseudo-diffusivity D*. Histogram analysis was performed by outlining entire-tumour regions of interest (ROIs) from histological-radiological correlation. The ability of imaging indices to differentiate low-grade (LG, Gleason score (GS) ≤6) from intermediate/high-grade (HG, GS > 6) PCa was analysed by ROC regression. Eleven patients had LG tumours (18 foci) and 37 patients had HG tumours (42 foci) on pathology examination. HG tumours had significantly lower ADCs and D in terms of mean, median, 10th and 75th percentiles, combined with higher histogram kurtosis and skewness for ADCs, D and f, than LG PCa (p < 0.05). Histogram D showed relatively higher correlations (ρ = 0.641-0.668 vs. ADCs: 0.544-0.574) with ordinal GS of PCa; and its mean, median and 10th percentile performed better than ADCs did in distinguishing LG from HG PCa. It is feasible to stratify the pathological grade of PCa by IVIM with histogram metrics. D performed better in distinguishing LG from HG tumour than conventional ADCs. • GS had relatively higher correlation with tumour D than ADCs. • Difference of histogram D among two-grade tumours was statistically significant. • D yielded better individual features in demonstrating tumour grade than ADC. • D* and f failed to determine tumour grade of PCa.
Subtype Differentiation of Small (≤ 4 cm) Solid Renal Mass Using Volumetric Histogram Analysis of DWI at 3-T MRI.

PubMed

Li, Anqin; Xing, Wei; Li, Haojie; Hu, Yao; Hu, Daoyu; Li, Zhen; Kamel, Ihab R

2018-05-29

The purpose of this article is to evaluate the utility of volumetric histogram analysis of apparent diffusion coefficient (ADC) derived from reduced-FOV DWI for small (≤ 4 cm) solid renal mass subtypes at 3-T MRI. This retrospective study included 38 clear cell renal cell carcinomas (RCCs), 16 papillary RCCs, 18 chromophobe RCCs, 13 minimal fat angiomyolipomas (AMLs), and seven oncocytomas evaluated with preoperative MRI. Volumetric ADC maps were generated using all slices of the reduced-FOV DW images to obtain histogram parameters, including mean, median, 10th percentile, 25th percentile, 75th percentile, 90th percentile, and SD ADC values, as well as skewness, kurtosis, and entropy. Comparisons of these parameters were made by one-way ANOVA, t test, and ROC curves analysis. ADC histogram parameters differentiated eight of 10 pairs of renal tumors. Three subtype pairs (clear cell RCC vs papillary RCC, clear cell RCC vs chromophobe RCC, and clear cell RCC vs minimal fat AML) were differentiated by mean ADC. However, five other subtype pairs (clear cell RCC vs oncocytoma, papillary RCC vs minimal fat AML, papillary RCC vs oncocytoma, chromophobe RCC vs minimal fat AML, and chromophobe RCC vs oncocytoma) were differentiated by histogram distribution parameters exclusively (all p < 0.05). Mean ADC, median ADC, 75th and 90th percentile ADC, SD ADC, and entropy of malignant tumors were significantly higher than those of benign tumors (all p < 0.05). Combination of mean ADC with histogram parameters yielded the highest AUC (0.851; sensitivity, 80.0%; specificity, 86.1%). Quantitative volumetric ADC histogram analysis may help differentiate various subtypes of small solid renal tumors, including benign and malignant lesions.
Diffusion-weighted imaging: Apparent diffusion coefficient histogram analysis for detecting pathologic complete response to chemoradiotherapy in locally advanced rectal cancer.

PubMed

Choi, Moon Hyung; Oh, Soon Nam; Rha, Sung Eun; Choi, Joon-Il; Lee, Sung Hak; Jang, Hong Seok; Kim, Jun-Gi; Grimm, Robert; Son, Yohan

2016-07-01

This study investigated the usefulness of apparent diffusion coefficient (ADC) values derived from histogram analysis of the whole rectal cancer as a quantitative parameter for evaluating pathologic complete response (pCR) on preoperative magnetic resonance imaging (MRI). We enrolled a total of 86 consecutive patients who had undergone surgery for rectal cancer after neoadjuvant chemoradiotherapy (CRT) at our institution between July 2012 and November 2014. Two radiologists who were blinded to the final pathological results reviewed post-CRT MRI to evaluate tumor stage. Quantitative image analysis was performed using T2-weighted and diffusion-weighted images independently by two radiologists using dedicated software that performed histogram analysis to assess the distribution of ADC in the whole tumor. After surgery, 16 patients were confirmed to have achieved pCR (18.6%). All parameters from the pre- and post-CRT ADC histograms showed good or excellent agreement between the two readers. The minimum, 10th, 25th, 50th, and 75th percentile and mean ADC from the post-CRT ADC histogram were significantly higher in the pCR group than in the non-pCR group for both readers. The 25th percentile value from the ADC histogram in post-CRT MRI had the best diagnostic performance for detecting pCR, with an area under the receiver operating characteristic curve of 0.796. Low percentile values derived from the ADC histogram analysis of rectal cancer on MRI after CRT showed a significant difference between pCR and non-pCR groups, demonstrating the utility of the ADC value as a quantitative and objective marker to evaluate complete pathologic response to preoperative CRT in rectal cancer. J. Magn. Reson. Imaging 2016;44:212-220. © 2015 Wiley Periodicals, Inc.
Serial data acquisition for GEM-2D detector

NASA Astrophysics Data System (ADS)

Kolasinski, Piotr; Pozniak, Krzysztof T.; Czarski, Tomasz; Linczuk, Maciej; Byszuk, Adrian; Chernyshova, Maryna; Juszczyk, Bartlomiej; Kasprowicz, Grzegorz; Wojenski, Andrzej; Zabolotny, Wojciech; Zienkiewicz, Pawel; Mazon, Didier; Malard, Philippe; Herrmann, Albrecht; Vezinet, Didier

2014-11-01

This article discusses a fast data acquisition and histogramming method for the X-ray GEM detector. The whole histogramming process is performed by FPGA chips (Spartan-6 series from Xilinx). The results of the histogramming process are stored in internal FPGA memory and then sent to a PC, where the data are merged and processed in MATLAB. The structure of the firmware functionality implemented in the FPGAs is described. Examples of test measurements and results are presented.
Frequency distribution histograms for the rapid analysis of data

NASA Technical Reports Server (NTRS)

Burke, P. V.; Bullen, B. L.; Poff, K. L.

1988-01-01

The mean and standard error are good representations for the response of a population to an experimental parameter and are frequently used for this purpose. Frequency distribution histograms show, in addition, responses of individuals in the population. Both the statistics and a visual display of the distribution of the responses can be obtained easily using a microcomputer and available programs. The type of distribution shown by the histogram may suggest different mechanisms to be tested.
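The contrast drawn in this abstract, between summary statistics and the full frequency distribution, can be sketched as follows. The data and function names are hypothetical and chosen only to show a case where the mean hides a two-group response.

```python
import math
from collections import Counter

def mean_and_standard_error(xs):
    """Return the sample mean and the standard error of the mean."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)  # sample variance
    return mean, math.sqrt(var / n)

def frequency_histogram(xs, bin_width, origin=0.0):
    """Map each occupied bin's lower edge to the number of values in it."""
    counts = Counter(math.floor((x - origin) / bin_width) for x in xs)
    return {origin + k * bin_width: counts[k] for k in sorted(counts)}

# Hypothetical responses from a population made of two subgroups.
responses = [1, 1, 2, 2, 9, 9, 10, 10]
mean, sem = mean_and_standard_error(responses)
hist = frequency_histogram(responses, bin_width=2)
```

In this made-up example the mean falls between the two clusters, in a range where no individual responded, which is the kind of pattern a histogram exposes and a mean with standard error does not.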
A Monte Carlo study of the impact of the choice of rectum volume definition on estimates of equivalent uniform doses and the volume parameter

NASA Astrophysics Data System (ADS)

Kvinnsland, Yngve; Muren, Ludvig Paul; Dahl, Olav

2004-08-01

Calculations of normal tissue complication probability (NTCP) values for the rectum are difficult because it is a hollow, non-rigid, organ. Finding the true cumulative dose distribution for a number of treatment fractions requires a CT scan before each treatment fraction. This is labour intensive, and several surrogate distributions have therefore been suggested, such as dose wall histograms, dose surface histograms and histograms for the solid rectum, with and without margins. In this study, a Monte Carlo method is used to investigate the relationships between the cumulative dose distributions based on all treatment fractions and the above-mentioned histograms that are based on one CT scan only, in terms of equivalent uniform dose. Furthermore, the effect of a specific choice of histogram on estimates of the volume parameter of the probit NTCP model was investigated. It was found that the solid rectum and the rectum wall histograms (without margins) gave equivalent uniform doses with an expected value close to the values calculated from the cumulative dose distributions in the rectum wall. With the number of patients available in this study the standard deviations of the estimates of the volume parameter were large, and it was not possible to decide which volume gave the best estimates of the volume parameter, but there were distinct differences in the mean values of the values obtained.
Detection of Local Tumor Recurrence After Definitive Treatment of Head and Neck Squamous Cell Carcinoma: Histogram Analysis of Dynamic Contrast-Enhanced T1-Weighted Perfusion MRI.

PubMed

Choi, Sang Hyun; Lee, Jeong Hyun; Choi, Young Jun; Park, Ji Eun; Sung, Yu Sub; Kim, Namkug; Baek, Jung Hwan

2017-01-01

This study aimed to explore the added value of histogram analysis of the ratio of initial to final 90-second time-signal intensity AUC (AUCR) for differentiating local tumor recurrence from contrast-enhancing scar on follow-up dynamic contrast-enhanced T1-weighted perfusion MRI of patients treated for head and neck squamous cell carcinoma (HNSCC). AUCR histogram parameters were assessed among tumor recurrence (n = 19) and contrast-enhancing scar (n = 27) at primary sites and compared using the t test. ROC analysis was used to determine the best differentiating parameters. The added value of AUCR histogram parameters was assessed when they were added to inconclusive conventional MRI results. Histogram analysis showed statistically significant differences in the 50th, 75th, and 90th percentiles of the AUCR values between the two groups (p < 0.05). The 90th percentile of the AUCR values (AUCR90) was the best predictor of local tumor recurrence (AUC, 0.77; 95% CI, 0.64-0.91) with an estimated cutoff of 1.02. AUCR90 increased sensitivity by 11.7% over that of conventional MRI alone when added to inconclusive results. Histogram analysis of AUCR can improve the diagnostic yield for local tumor recurrence during surveillance after treatment for HNSCC.
Value of MR histogram analyses for prediction of microvascular invasion of hepatocellular carcinoma

PubMed Central

Huang, Ya-Qin; Liang, He-Yue; Yang, Zhao-Xia; Ding, Ying; Zeng, Meng-Su; Rao, Sheng-Xiang

2016-01-01

The objective is to explore the value of preoperative magnetic resonance (MR) histogram analyses in predicting microvascular invasion (MVI) of hepatocellular carcinoma (HCC). Fifty-one patients with histologically confirmed HCC who underwent diffusion-weighted and contrast-enhanced MR imaging were included. Histogram analyses were performed and the mean, variance, skewness, kurtosis, and 1st, 10th, 50th, 90th, and 99th percentiles were derived. Quantitative histogram parameters were compared between HCCs with and without MVI. Receiver operating characteristic (ROC) analyses were performed to compare the diagnostic performance of tumor size, histogram analyses of apparent diffusion coefficient (ADC) maps, and MR enhancement. The mean and the 1st, 10th, and 50th percentiles of ADC maps, and the mean, variance, and 1st, 10th, 50th, 90th, and 99th percentiles of the portal venous phase (PVP) images were significantly different between the groups with and without MVI (P < 0.05), with areas under the ROC curve (AUCs) of 0.66 to 0.74 for ADC and 0.76 to 0.88 for PVP. The largest AUC of PVP (1st percentile) showed significantly higher accuracy compared with that of arterial phase (AP) or tumor size (P < 0.001). MR histogram analyses, in particular the 1st percentile for PVP images, held promise for prediction of MVI of HCC. PMID:27368028
Effect of respiratory and cardiac gating on the major diffusion-imaging metrics

PubMed Central

Hamaguchi, Hiroyuki; Sugimori, Hiroyuki; Nakanishi, Mitsuhiro; Nakagawa, Shin; Fujiwara, Taro; Yoshida, Hirokazu; Takamori, Sayaka; Shirato, Hiroki

2016-01-01

The effect of respiratory gating on the major diffusion-imaging metrics and that of cardiac gating on mean kurtosis (MK) are not known. For evaluation of whether the major diffusion-imaging metrics—MK, fractional anisotropy (FA), and mean diffusivity (MD) of the brain—varied between gated and non-gated acquisitions, respiratory-gated, cardiac-gated, and non-gated diffusion-imaging of the brain were performed in 10 healthy volunteers. MK, FA, and MD maps were constructed for all acquisitions, and the histograms were constructed. The normalized peak height and location of the histograms were compared among the acquisitions by use of Friedman and post hoc Wilcoxon tests. The effect of the repetition time (TR) on the diffusion-imaging metrics was also tested, and we corrected for its variation among acquisitions, if necessary. The results showed a shift in the peak location of the MK and MD histograms to the right with an increase in TR (p ≤ 0.01). The corrected peak location of the MK histograms, the normalized peak height of the FA histograms, the normalized peak height and the corrected peak location of the MD histograms varied significantly between the gated and non-gated acquisitions (p < 0.05). These results imply an influence of respiration and cardiac pulsation on the major diffusion-imaging metrics. The gating conditions must be kept identical if reproducible results are to be achieved. PMID:27073115
FPFH-based graph matching for 3D point cloud registration

NASA Astrophysics Data System (ADS)

Zhao, Jiapeng; Li, Chen; Tian, Lihua; Zhu, Jihua

2018-04-01

Correspondence detection is a vital step in point cloud registration and helps to obtain a reliable initial alignment. In this paper, we put forward an advanced point feature-based graph matching algorithm to solve the initial alignment problem of rigid 3D point cloud registration with partial overlap. Specifically, Fast Point Feature Histograms are first used to determine the initial possible correspondences. Next, a new objective function is provided to make the graph matching more suitable for partially overlapping point clouds. The objective function is optimized by the simulated annealing algorithm to obtain the final group of correct correspondences. Finally, we present a novel set partitioning method which can transform the NP-hard optimization problem into an O(n^3)-solvable one. Experiments on the Stanford and UWA public data sets indicate that our method obtains better results in terms of both accuracy and time cost compared with other point cloud registration methods.
Regionally adaptive histogram equalization of the chest.

PubMed

Sherrier, R H; Johnson, G A

1987-01-01

Advances in the area of digital chest radiography have resulted in the acquisition of high-quality images of the human chest. With these advances, there arises a genuine need for image processing algorithms specific to the chest, in order to fully exploit this digital technology. We have implemented the well-known technique of histogram equalization, noting the problems encountered when it is adapted to chest images. These problems have been successfully solved with our regionally adaptive histogram equalization method. With this technique histograms are calculated locally and then modified according to both the mean pixel value of that region as well as certain characteristics of the cumulative distribution function. This process, which has allowed certain regions of the chest radiograph to be enhanced differentially, may also have broader implications for other image processing tasks.
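As a rough illustration of computing histograms locally, the sketch below applies ordinary histogram equalization independently to each tile of an 8-bit image. This is a simplified stand-in, not the authors' regionally adaptive method: it omits the modification by regional mean and by characteristics of the cumulative distribution function described above, and the tile size and function names are assumptions.

```python
def equalization_lut(pixels, levels=256):
    """Build a look-up table that equalizes the histogram of `pixels`."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    n = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)  # CDF at the darkest grey level present
    if n == cdf_min:                         # constant region: leave unchanged
        return list(range(levels))
    top = levels - 1
    return [max(0, round((c - cdf_min) * top / (n - cdf_min))) for c in cdf]

def equalize_tiles(image, tile):
    """Equalize each tile x tile block of a 2D list of ints independently."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for r0 in range(0, h, tile):
        for c0 in range(0, w, tile):
            rows = range(r0, min(r0 + tile, h))
            cols = range(c0, min(c0 + tile, w))
            lut = equalization_lut([image[r][c] for r in rows for c in cols])
            for r in rows:
                for c in cols:
                    out[r][c] = lut[image[r][c]]
    return out
```

Independent tiles produce visible seams at tile borders; practical local methods usually interpolate between neighbouring tile mappings, which this sketch does not attempt.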
Infrared face recognition based on LBP histogram and KW feature selection

NASA Astrophysics Data System (ADS)

Xie, Zhihua

2014-07-01

The conventional local binary pattern (LBP) histogram feature still has room for performance improvement. This paper focuses on the dimension reduction of LBP micro-patterns and proposes an improved infrared face recognition method based on LBP histogram representation. To extract locally robust features from infrared face images, LBP is used to obtain the composition of micro-patterns in sub-blocks. Based on statistical test theory, a Kruskal-Wallis (KW) feature selection method is proposed to select the LBP patterns that are suitable for infrared face recognition. The experimental results show that the combination of LBP and KW feature selection improves the performance of infrared face recognition, and that the proposed method outperforms traditional methods based on the LBP histogram, discrete cosine transform (DCT), or principal component analysis (PCA).
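For readers unfamiliar with the LBP histogram this abstract builds on, here is a minimal sketch of the basic 8-neighbour operator on a grey-level image stored as a 2D list. The bit ordering and the `>=` comparison are one common convention and are assumptions here; the sub-block division and the Kruskal-Wallis selection step of the paper are not shown.

```python
# Neighbour offsets, clockwise from the top-left pixel; bit 0 is top-left.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(img, r, c):
    """8-bit LBP code of pixel (r, c): one bit per neighbour >= centre."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

A feature selection step such as the one proposed in the abstract would then keep only a subset of these 256 bins per sub-block.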
The Physiology Constant Database of Teen-Agers in Beijing

PubMed Central

Wei-Qi, Wei; Guang-Jin, Zhu; Cheng-Li, Xu; Shao-Mei, Han; Bao-Shen, Qi; Li, Chen; Shu-Yu, Zu; Xiao-Mei, Zhou; Wen-Feng, Hu; Zheng-Guo, Zhang

2004-01-01

Physiology constants of adolescents are important for understanding growing living systems and are a useful reference in clinical and epidemiological research. Until recently, physiology constants were not available in China and therefore most physiologists, physicians, and nutritionists had to use data from abroad for reference. However, the very difference between the Eastern and Western races casts doubt on the usefulness of overseas data. We have therefore created a database system to provide a repository for the storage of physiology constants of teen-agers in Beijing. The several thousand data items are now divided into hematological biochemistry, lung function, and cardiac function, with all data manually checked before being transferred into the database. The database was built through the development of a web interface, scripts, and a relational database. The physiology data were integrated into the relational database system to provide flexible querying through combinations of various terms and parameters. A web browser interface was designed to facilitate searching. The database is available on the web. The statistical table, scatter diagram, and histogram of the data are available to both anonymous and registered users according to their queries, while only registered users can access details, including data download and advanced search. PMID:15258669
Large-scale binding ligand prediction by improved patch-based method Patch-Surfer2.0.

PubMed

Zhu, Xiaolei; Xiong, Yi; Kihara, Daisuke

2015-03-01

Ligand binding is a key aspect of the function of many proteins. Thus, binding ligand prediction provides important insight in understanding the biological function of proteins. Binding ligand prediction is also useful for drug design and examining potential drug side effects. We present a computational method named Patch-Surfer2.0, which predicts binding ligands for a protein pocket. By representing and comparing pockets at the level of small local surface patches that characterize physicochemical properties of the local regions, the method can identify binding pockets of the same ligand even if they do not share globally similar shapes. Properties of local patches are represented by an efficient mathematical representation, 3D Zernike Descriptor. Patch-Surfer2.0 has significant technical improvements over our previous prototype, which includes a new feature that captures approximate patch position with a geodesic distance histogram. Moreover, we constructed a large comprehensive database of ligand binding pockets that will be searched against by a query. The benchmark shows better performance of Patch-Surfer2.0 over existing methods. http://kiharalab.org/patchsurfer2.0/ Contact: dkihara@purdue.edu Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
Comparison of optimization algorithms in intensity-modulated radiation therapy planning

NASA Astrophysics Data System (ADS)

Kendrick, Rachel

Intensity-modulated radiation therapy is used to better conform the radiation dose to the target, which includes avoiding healthy tissue. Planning programs employ optimization methods to search for the best fluence of each photon beam, and therefore to create the best treatment plan. The Computational Environment for Radiotherapy Research (CERR), a program written in MATLAB, was used to examine some commonly used algorithms for one 5-beam plan. Algorithms include the genetic algorithm, quadratic programming, pattern search, constrained nonlinear optimization, simulated annealing, the optimization method used in Varian Eclipse™, and some hybrids of these. Quadratic programming, simulated annealing, and a quadratic/simulated annealing hybrid were also separately compared using different prescription doses. The results of each dose-volume histogram as well as the visual dose color wash were used to compare the plans. CERR's built-in quadratic programming provided the best overall plan, but avoidance of the organ-at-risk was rivaled by other programs. Hybrids of quadratic programming with some of these algorithms seem to suggest the possibility of better planning programs, as shown by the improved quadratic/simulated annealing plan when compared to the simulated annealing algorithm alone. Further experimentation will be done to improve cost functions and computational time.
Edge-SIFT: discriminative binary descriptor for scalable partial-duplicate mobile search.

PubMed

Zhang, Shiliang; Tian, Qi; Lu, Ke; Huang, Qingming; Gao, Wen

2013-07-01

As the basis of large-scale partial duplicate visual search on mobile devices, image local descriptor is expected to be discriminative, efficient, and compact. Our study shows that the popularly used histogram-based descriptors, such as scale invariant feature transform (SIFT) are not optimal for this task. This is mainly because histogram representation is relatively expensive to compute on mobile platforms and loses significant spatial clues, which are important for improving discriminative power and matching near-duplicate image patches. To address these issues, we propose to extract a novel binary local descriptor named Edge-SIFT from the binary edge maps of scale- and orientation-normalized image patches. By preserving both locations and orientations of edges and compressing the sparse binary edge maps with a boosting strategy, the final Edge-SIFT shows strong discriminative power with compact representation. Furthermore, we propose a fast similarity measurement and an indexing framework with flexible online verification. Hence, the Edge-SIFT allows an accurate and efficient image search and is ideal for computation sensitive scenarios such as a mobile image search. Experiments on a large-scale dataset manifest that the Edge-SIFT shows superior retrieval accuracy to Oriented BRIEF (ORB) and is superior to SIFT in the aspects of retrieval precision, efficiency, compactness, and transmission cost.
3D/2D image registration using weighted histogram of gradient directions

NASA Astrophysics Data System (ADS)

Ghafurian, Soheil; Hacihaliloglu, Ilker; Metaxas, Dimitris N.; Tan, Virak; Li, Kang

2015-03-01

Three-dimensional (3D) to two-dimensional (2D) image registration is crucial in many medical applications such as image-guided evaluation of musculoskeletal disorders. One of the key problems is to estimate the 3D CT-reconstructed bone model positions (translation and rotation) which maximize the similarity between the digitally reconstructed radiographs (DRRs) and the 2D fluoroscopic images using a registration method. This problem is computationally intensive due to a large search space and the complicated DRR generation process. Also, finding a similarity measure which converges to the global optimum instead of local optima adds to the challenge. To circumvent these issues, most existing registration methods need a manual initialization, which requires user interaction and is prone to human error. In this paper, we introduce a novel feature-based registration method using the weighted histogram of gradient directions of images. This method simplifies the computation by searching the parameter space (rotation and translation) sequentially rather than simultaneously. In our numeric simulation experiments, the proposed registration algorithm was able to achieve sub-millimeter and sub-degree accuracies. Moreover, our method is robust to the initial guess. It can tolerate up to ±90° rotation offset from the global optimal solution, which minimizes the need for human interaction to initialize the algorithm.

Multispectral histogram normalization contrast enhancement

NASA Technical Reports Server (NTRS)

Soha, J. M.; Schwartz, A. A.

1979-01-01

A multispectral histogram normalization or decorrelation enhancement which achieves effective color composites by removing interband correlation is described. The enhancement procedure employs either linear or nonlinear transformations to equalize principal component variances. An additional rotation to any set of orthogonal coordinates is thus possible, while full histogram utilization is maintained by avoiding the reintroduction of correlation. For the three-dimensional case, the enhancement procedure may be implemented with a lookup table. An application of the enhancement to Landsat multispectral scanning imagery is presented.
Remote logo detection using angle-distance histograms

NASA Astrophysics Data System (ADS)

Youn, Sungwook; Ok, Jiheon; Baek, Sangwook; Woo, Seongyoun; Lee, Chulhee

2016-05-01

Among all the various computer vision applications, automatic logo recognition has drawn great interest from industry as well as various academic institutions. In this paper, we propose an angle-distance map, which we used to develop a robust logo detection algorithm. The proposed angle-distance histogram is invariant against scale and rotation. The proposed method first used shape information and color characteristics to find the candidate regions and then applied the angle-distance histogram. Experiments show that the proposed method detected logos of various sizes and orientations.
Image Retrieval using Integrated Features of Binary Wavelet Transform

NASA Astrophysics Data System (ADS)

Agarwal, Megha; Maheshwari, R. P.

2011-12-01

In this paper a new approach for image retrieval is proposed with the application of binary wavelet transform. This new approach facilitates the feature calculation with the integration of histogram and correlogram features extracted from binary wavelet subbands. Experiments are performed to evaluate and compare the performance of proposed method with the published literature. It is verified that average precision and average recall of proposed method (69.19%, 41.78%) is significantly improved compared to optimal quantized wavelet correlogram (OQWC) [6] (64.3%, 38.00%) and Gabor wavelet correlogram (GWC) [10] (64.1%, 40.6%). All the experiments are performed on Corel 1000 natural image database [20].
Automated separation of merged Langerhans islets

NASA Astrophysics Data System (ADS)

Švihlík, Jan; Kybic, Jan; Habart, David

2016-03-01

This paper deals with the separation of merged Langerhans islets in segmentations in order to obtain a correct histogram of islet diameters. The distribution of islet diameters is useful for determining the feasibility of islet transplantation in diabetes. First, the merged islets in the training segmentations are manually separated by medical experts. Based on the single islets, the merged islets are identified and an SVM classifier is trained on both classes (merged/single islets). The testing segmentations are over-segmented using the watershed transform, and the most probable back-merging of islets is found using the trained SVM classifier. Finally, the optimized segmentation is compared with the ground-truth segmentation (correctly separated islets).
Methods in quantitative image analysis.

PubMed

Oberholzer, M; Ostreicher, M; Christen, H; Brühlmann, M

1996-05-01

The main steps of image analysis are image capturing, image storage (compression), correcting imaging defects (e.g. non-uniform illumination, electronic noise, glare effect), image enhancement, segmentation of objects in the image and image measurements. Digitisation is made by a camera. The most modern types include a frame-grabber, converting the analog signal into digital (numerical) information. The numerical information consists of the grey values describing the brightness of every point within the image, named a pixel. The information is stored in bits. Eight bits are summarised in one byte. Therefore, grey values can have a value between 0 and 255 (2^8 = 256 levels). The human eye seems to be quite content with a display of 5-bit images (corresponding to 64 different grey values). In a digitised image, the pixel grey values can vary within regions that are uniform in the original scene: the image is noisy. The noise is mainly manifested in the background of the image. For an optimal discrimination between different objects or features in an image, uniformity of illumination in the whole image is required. These defects can be minimised by shading correction [subtraction of a background (white) image from the original image, pixel per pixel, or division of the original image by the background image]. The brightness of an image represented by its grey values can be analysed for every single pixel or for a group of pixels. The most frequently used pixel-based image descriptors are optical density, integrated optical density, the histogram of the grey values, mean grey value and entropy. The distribution of the grey values existing within an image is one of the most important characteristics of the image. However, the histogram gives no information about the texture of the image. The simplest way to improve the contrast of an image is to expand the brightness scale by spreading the histogram out to the full available range. 
Rules for transforming the grey value histogram of an existing image (input image) into a new grey value histogram (output image) are most quickly handled by a look-up table (LUT). The histogram of an image can be influenced by gain, offset and gamma of the camera. Gain defines the voltage range, offset defines the reference voltage and gamma the slope of the regression line between the light intensity and the voltage of the camera. A very important descriptor of neighbourhood relations in an image is the co-occurrence matrix. The distance between the pixels (original pixel and its neighbouring pixel) can influence the various parameters calculated from the co-occurrence matrix. The main goals of image enhancement are elimination of surface roughness in an image (smoothing), correction of defects (e.g. noise), extraction of edges, identification of points, strengthening texture elements and improving contrast. In enhancement, two types of operations can be distinguished: pixel-based (point operations) and neighbourhood-based (matrix operations). The most important pixel-based operations are linear stretching of grey values, application of pre-stored LUTs and histogram equalisation. The neighbourhood-based operations work with so-called filters. These are organising elements with an original or initial point in their centre. Filters can be used to accentuate or to suppress specific structures within the image. Filters can work either in the spatial or in the frequency domain. The method used for analysing alterations of grey value intensities in the frequency domain is the Hartley transform. Filter operations in the spatial domain can be based on averaging or ranking the grey values occurring in the organising element. The most important filters, which are usually applied, are the Gaussian filter and the Laplace filter (both averaging filters), and the median filter, the top hat filter and the range operator (all ranking filters). 
Segmentation of objects is traditionally based on threshold grey values. (AB
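The abstract above describes two point operations: spreading the histogram over the full available range, and applying the transformation through a look-up table (LUT). A minimal sketch of both for 8-bit images stored as 2D lists follows; the function names are illustrative and the stretch uses the image minimum and maximum, which is only one possible choice of limits.

```python
def stretch_lut(lo, hi, levels=256):
    """LUT mapping grey values linearly so that lo -> 0 and hi -> levels - 1."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    top = levels - 1
    # Values outside [lo, hi] are clipped to the ends of the output range.
    return [min(top, max(0, round((g - lo) * top / (hi - lo))))
            for g in range(levels)]

def apply_lut(image, lut):
    """Apply a grey-value LUT to every pixel (a pure point operation)."""
    return [[lut[p] for p in row] for row in image]

def stretch_contrast(image):
    """Linear stretch between the darkest and brightest pixel of the image."""
    flat = [p for row in image for p in row]
    return apply_lut(image, stretch_lut(min(flat), max(flat)))
```

Because the LUT is computed once per image and then indexed per pixel, the cost of the transformation itself does not depend on how complicated the mapping rule is, which is the practical appeal of LUTs noted in the abstract.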
Impact of the radiotherapy technique on the correlation between dose-volume histograms of the bladder wall defined on MRI imaging and dose-volume/surface histograms in prostate cancer patients

NASA Astrophysics Data System (ADS)

Maggio, Angelo; Carillo, Viviana; Cozzarini, Cesare; Perna, Lucia; Rancati, Tiziana; Valdagni, Riccardo; Gabriele, Pietro; Fiorino, Claudio

2013-04-01

The aim of this study was to evaluate the correlation between the 'true' absolute and relative dose-volume histograms (DVHs) of the bladder wall, the dose-wall histogram (DWH) defined on MRI, and other surrogates of bladder dosimetry in prostate cancer patients, planned both with 3D-conformal and intensity-modulated radiation therapy (IMRT) techniques. For 17 prostate cancer patients, previously treated with radical intent, CT and MRI scans were acquired and matched. The contours of bladder walls were drawn by using MRI images. External bladder surfaces were then used to generate artificial bladder walls by performing automatic contractions of 5, 7 and 10 mm. For each patient a 3D conformal radiotherapy (3DCRT) and an IMRT treatment plan was generated with a prescription dose of 77.4 Gy (1.8 Gy/fr), and the DVH of the whole bladder, of the artificial walls (DVH-5/10) and dose-surface histograms (DSHs) were calculated and compared against the DWH in absolute and relative value, for both treatment planning techniques. Dedicated software (VODCA v. 4.4.0, MSS Inc.) was used for calculating the dose-volume/surface histograms. Correlation was quantified for selected dose-volume/surface parameters by the Spearman correlation coefficient. The agreement between %DWH and DVH5, DVH7 and DVH10 was found to be very good (maximum average deviations below 2%, SD < 5%): DVH5 showed the best agreement. The correlation was slightly better for absolute (R = 0.80-0.94) compared to relative (R = 0.66-0.92) histograms. The DSH was also found to be highly correlated with the DWH, although slightly higher deviations were generally found. The DVH was not a good surrogate of the DWH (R < 0.7 for most of the parameters). When comparing the two treatment techniques, more pronounced differences between relative histograms were seen for IMRT with respect to 3DCRT (p < 0.0001).
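The distinction between absolute and relative histograms in this abstract can be made concrete with a small sketch of a cumulative DVH. It assumes a structure is represented by a flat list of per-voxel doses with equal voxel volumes; the function name and the `voxel_volume` parameter are hypothetical and unrelated to the VODCA software mentioned above.

```python
def cumulative_dvh(voxel_doses, dose_levels, voxel_volume=None):
    """Cumulative dose-volume histogram of one structure.

    For each dose level D, report the volume receiving at least D:
    as a percentage of the structure (relative DVH) when voxel_volume
    is None, otherwise in the units of voxel_volume (absolute DVH).
    """
    n = len(voxel_doses)
    result = []
    for level in dose_levels:
        count = sum(1 for dose in voxel_doses if dose >= level)
        if voxel_volume is None:
            result.append(100.0 * count / n)
        else:
            result.append(count * voxel_volume)
    return result
```

The same counting applies to a wall or surface histogram if the list holds doses of wall voxels or surface elements instead of the whole organ, which is why the surrogates compared in the study differ only in which elements are included.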
LEAP into the Pfizer Global Virtual Library (PGVL) space: creation of readily synthesizable design ideas automatically.

PubMed

Hu, Qiyue; Peng, Zhengwei; Kostrowicki, Jaroslav; Kuki, Atsuo

2011-01-01

Pfizer Global Virtual Library (PGVL) of 10^13 readily synthesizable molecules offers a tremendous opportunity for lead optimization and scaffold hopping in drug discovery projects. However, mining into a chemical space of this size presents a challenge for the concomitant design informatics due to the fact that standard molecular similarity searches against a collection of explicit molecules cannot be utilized, since no chemical information system could create and manage more than 10^8 explicit molecules. Nevertheless, by accepting a tolerable level of false negatives in search results, we were able to bypass the need for full 10^13 enumeration and enabled the efficient similarity search and retrieval into this huge chemical space for practical usage by medicinal chemists. In this report, two search methods (LEAP1 and LEAP2) are presented. The first method uses PGVL reaction knowledge to disassemble the incoming search query molecule into a set of reactants and then uses reactant-level similarities into actual available starting materials to focus on a much smaller sub-region of the full virtual library compound space. This sub-region is then explicitly enumerated and searched via a standard similarity method using the original query molecule. The second method uses a fuzzy mapping onto candidate reactions and does not require exact disassembly of the incoming query molecule. Instead Basis Products (or capped reactants) are mapped into the query molecule and the resultant asymmetric similarity scores are used to prioritize the corresponding reactions and reactant sets. All sets of Basis Products are inherently indexed to specific reactions and specific starting materials. This again allows focusing on a much smaller sub-region for explicit enumeration and subsequent standard product-level similarity search. A set of validation studies were conducted. 
The results have shown that the level of false negatives for the disassembly-based method is acceptable when the query molecule can be recognized for exact disassembly, and the fuzzy reaction mapping method based on Basis Products has an even better performance in terms of lower false-negative rate because it is not limited by the requirement that the query molecule needs to be recognized by any disassembly algorithm. Both search methods have been implemented and accessed through a powerful desktop molecular design tool (see ref. (33) for details). The chapter will end with a comparison of published search methods against large virtual chemical space.
EquiX-A Search and Query Language for XML.

ERIC Educational Resources Information Center

Cohen, Sara; Kanza, Yaron; Kogan, Yakov; Sagiv, Yehoshua; Nutt, Werner; Serebrenik, Alexander

2002-01-01

Describes EquiX, a search language for XML that combines querying with searching to query the data and the meta-data content of Web pages. Topics include search engines; a data model for XML documents; search query syntax; search query semantics; an algorithm for evaluating a query on a document; and indexing EquiX queries. (LRW)
Rapid Prototyping of Application Specific Signal Processors Program

DTIC Science & Technology

1992-10-09

EREQ Query optimizer generator from the University of Colorado. In the five year time frame, this trend toward convergence makes it a non-issue ...related issues. TI’s RASSP vision plans to leverage and support CALS as a baseline for addressing data formatting and handling. Previously stated CALS goals ...of the U.S. Government. Distribution Statement A. Approved for public release; distribution is unlimited. Prepared By: Texas Instruments Integrated
Query Optimization in Distributed Databases.

DTIC Science & Technology

1982-10-01

general, the strategy a31 a11 a 3 is more time consuming than the strategy a, a, and usually we do not use it. Since the semijoin of R.XJ> RS requires...analytic behavior of those heuristic algorithms. Although some analytic results of worst case and average case analysis are difficult to obtain, some...is the study of the analytic behavior of those heuristic algorithms. Although some analytic results of worst case and average case analysis are
UMass Amherst and UT Austin @ The TREC 2009 Relevance Feedback Track

DTIC Science & Technology

2009-11-01

number of terms to select compared to our case. We chose AdaRank [Xu and Li, 2007] for the following reasons. It directly optimizes retrieval performance...and the number of topics containing at least one relevant document. query car parts dinosaurs espn sports atari cell phone hoboken dogs adoption auto...infraorder disney activision ringtone nj puppy body bird abc sega forum ny pet lowest extinct channel hardware wireless brook rottweiler cost
Minimizing Statistical Bias with Queries.

DTIC Science & Technology

1995-09-14

method for optimally selecting these points would offer enormous savings in time and money. An active learning system will typically attempt to select data...research in active learning assumes that the second term of Equation 2 is approximately zero, that is, that the learner is unbiased. If this is the case...outperforms the variance-minimizing algorithm and random exploration. and effective strategy for active learning. I have given empirical evidence that, with
Google it: obtaining information about local STD/HIV testing services online.

PubMed

Habel, Melissa A; Hood, Julia; Desai, Sheila; Kachur, Rachel; Buhi, Eric R; Liddon, Nicole

2011-04-01

Although the Internet is one of the most commonly accessed resources for health information, finding information on local sexual health services, such as sexually transmitted disease (STD) testing, can be challenging. Recognizing that most quests for online health information begin with search engines, the purpose of this exploratory study was to examine the extent to which online information about local STD/HIV testing services can be found using Google. Queries on STD and HIV testing services were executed in Google for 6 geographically unique locations across the United States. The first 3 websites that resulted from each query were coded for the following characteristics: (1) relevancy to the search topic, (2) domain and purpose, (3) rank in Google results, and (4) content. Websites hosted at .com (57.3%), .org (25.7%), and .gov (10.5%) domains were retrieved most frequently. Roughly half of all websites (n = 376) provided information relevant to the query, and about three-quarters (77.0%) of all queries yielded at least 1 relevant website within the first 3 results. Searches for larger cities were more likely to yield relevant results compared with smaller cities (odds ratio [OR] = 10.0, 95% confidence interval [CI] = 5.6, 17.9). On comparison with .com domains, .gov (OR = 2.9, 95% CI = 1.4, 5.6) and .org domains (OR = 2.9, 95% CI = 1.7, 4.8) were more likely to provide information of the location to get tested. Ease of online access to information about sexual health services varies by search topic and locale. Sexual health service providers must optimize their website placement so as to reach a greater proportion of the sexually active population who use web search engines.
DCMS: A data analytics and management system for molecular simulation.

PubMed

Kumar, Anand; Grupcev, Vladimir; Berrada, Meryem; Fogarty, Joseph C; Tu, Yi-Cheng; Zhu, Xingquan; Pandit, Sagar A; Xia, Yuni

Molecular Simulation (MS) is a powerful tool for studying physical/chemical features of large systems and has seen applications in many scientific and engineering domains. During the simulation process, the experiments generate a very large number of atoms and intend to observe their spatial and temporal relationships for scientific analysis. The sheer data volumes and their intensive interactions impose significant challenges for data access, management, and analysis. To date, existing MS software systems fall short on storage and handling of MS data, mainly because they lack a platform to support applications that involve intensive data access and analytical processing. In this paper, we present the database-centric molecular simulation (DCMS) system our team developed in the past few years. The main idea behind DCMS is to store MS data in a relational database management system (DBMS) to take advantage of the declarative query interface (i.e., SQL), data access methods, query processing, and optimization mechanisms of modern DBMSs. A unique challenge is to handle the analytical queries that are often compute-intensive. For that, we developed novel indexing and query processing strategies (including algorithms running on modern co-processors) as integrated components of the DBMS. As a result, researchers can upload and analyze their data using efficient functions implemented inside the DBMS. Index structures are generated to store analysis results that may be interesting to other users, so that the results are readily available without duplicating the analysis. We have developed a prototype of DCMS based on the PostgreSQL system, and experiments using real MS data and workloads show that DCMS significantly outperforms existing MS software systems. We also used it as a platform to test other data management issues such as security and compression.
Health information exchange policies of 11 diverse health systems and the associated impact on volume of exchange.

PubMed

Downing, N Lance; Adler-Milstein, Julia; Palma, Jonathan P; Lane, Steven; Eisenberg, Matthew; Sharp, Christopher; Longhurst, Christopher A

2017-01-01

Provider organizations increasingly have the ability to exchange patient health information electronically. Organizational health information exchange (HIE) policy decisions can impact the extent to which external information is readily available to providers, but this relationship has not been well studied. Our objective was to examine the relationship between electronic exchange of patient health information across organizations and organizational HIE policy decisions. We focused on 2 key decisions: whether to automatically search for information from other organizations and whether to require HIE-specific patient consent. We conducted a retrospective time series analysis of the effect of automatic querying and the patient consent requirement on the monthly volume of clinical summaries exchanged. We could not assess degree of use or usefulness of summaries, organizational decision-making processes, or generalizability to other vendors. Between 2013 and 2015, clinical summary exchange volume increased by 1349% across 11 organizations. Nine of the 11 systems were set up to enable auto-querying, and auto-querying was associated with a significant increase in the monthly rate of exchange (P = .006 for change in trend). Seven of the 11 organizations did not require patient consent specifically for HIE, and these organizations experienced a greater increase in volume of exchange over time compared to organizations that required consent. Automatic querying and limited consent requirements are organizational HIE policy decisions that impact the volume of exchange, and ultimately the information available to providers to support optimal care. Future efforts to ensure effective HIE may need to explicitly address these factors. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association.
Inherent smoothness of intensity patterns for intensity modulated radiation therapy generated by simultaneous projection algorithms

NASA Astrophysics Data System (ADS)

Xiao, Ying; Michalski, Darek; Censor, Yair; Galvin, James M.

2004-07-01

The efficient delivery of intensity modulated radiation therapy (IMRT) depends on finding optimized beam intensity patterns that produce dose distributions, which meet given constraints for the tumour as well as any critical organs to be spared. Many optimization algorithms that are used for beamlet-based inverse planning are susceptible to large variations of neighbouring intensities. Accurately delivering an intensity pattern with a large number of extrema can prove impossible given the mechanical limitations of standard multileaf collimator (MLC) delivery systems. In this study, we apply Cimmino's simultaneous projection algorithm to the beamlet-based inverse planning problem, modelled mathematically as a system of linear inequalities. We show that using this method allows us to arrive at a smoother intensity pattern. Including nonlinear terms in the simultaneous projection algorithm to deal with dose-volume histogram (DVH) constraints does not compromise this property from our experimental observation. The smoothness properties are compared with those from other optimization algorithms which include simulated annealing and the gradient descent method. The simultaneous property of these algorithms is ideally suited to parallel computing technologies.
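The simultaneous projection idea named in this abstract can be illustrated with a minimal sketch. It assumes the constraints are written as a generic system of linear inequalities a_i . x <= b_i and uses equal weights and a relaxation factor of 1; the function name `cimmino`, the parameter names, and the toy box constraints are illustrative assumptions, not the authors' planning code, and the nonlinear DVH terms are not modelled.

```python
def cimmino(A, b, x0, weights=None, relax=1.0, iters=500):
    """Cimmino-style simultaneous projections for the system a_i . x <= b_i.

    Each iteration projects the current point onto every violated half-space
    and moves to the weighted average of those projections.
    """
    m = len(A)
    w = weights or [1.0 / m] * m          # assumption: equal weights
    x = list(x0)
    for _ in range(iters):
        step = [0.0] * len(x)
        for a_i, b_i, w_i in zip(A, b, w):
            violation = sum(a * xj for a, xj in zip(a_i, x)) - b_i
            if violation > 0:              # satisfied constraints project to x itself
                scale = w_i * violation / sum(a * a for a in a_i)
                for j, a in enumerate(a_i):
                    step[j] -= scale * a
        x = [xj + relax * sj for xj, sj in zip(x, step)]
    return x


# Toy problem (hypothetical): find a point in the box 0 <= x1, x2 <= 1.
A = [[1, 0], [0, 1], [-1, 0], [0, -1]]
b = [1, 1, 0, 0]
solution = cimmino(A, b, x0=[5.0, -3.0])
```

Because every constraint is handled independently inside one iteration, the inner loop is the part that lends itself to the parallel computing the abstract mentions.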
Infrared and visible image fusion using discrete cosine transform and swarm intelligence for surveillance applications

NASA Astrophysics Data System (ADS)

Paramanandham, Nirmala; Rajendiran, Kishore

2018-01-01

A novel image fusion technique is presented for integrating infrared and visible images. Integration of images from the same or various sensing modalities can deliver the required information that cannot be delivered by viewing the sensor outputs individually and consecutively. In this paper, a swarm intelligence based image fusion technique using the discrete cosine transform (DCT) domain is proposed for surveillance applications; it integrates the infrared image with the visible image to generate a single informative fused image. Particle swarm optimization (PSO) is used in the fusion process for obtaining the optimized weighting factor. These optimized weighting factors are used for fusing the DCT coefficients of visible and infrared images. Inverse DCT is applied for obtaining the initial fused image. An enhanced fused image is obtained through adaptive histogram equalization for a better visual understanding and target detection. The proposed framework is evaluated using quantitative metrics such as standard deviation, spatial frequency, entropy and mean gradient. The experimental results demonstrate that the proposed algorithm outperforms many other state-of-the-art techniques reported in the literature.
Color Feature-Based Object Tracking through Particle Swarm Optimization with Improved Inertia Weight

PubMed Central

Guo, Siqiu; Zhang, Tao; Song, Yulong

2018-01-01

This paper presents a particle swarm tracking algorithm with improved inertia weight based on color features. The weighted color histogram is used as the target feature to reduce the contribution of target edge pixels in the target feature, which makes the algorithm insensitive to the target non-rigid deformation, scale variation, and rotation. Meanwhile, the influence of partial obstruction on the description of target features is reduced. The particle swarm optimization algorithm can complete the multi-peak search, which can cope well with the object occlusion tracking problem. This means that the target is located precisely where the similarity function appears multi-peak. When the particle swarm optimization algorithm is applied to the object tracking, the inertia weight adjustment mechanism has some limitations. This paper presents an improved method. The concept of particle maturity is introduced to improve the inertia weight adjustment mechanism, which could adjust the inertia weight in time according to the different states of each particle in each generation. Experimental results show that our algorithm achieves state-of-the-art performance in a wide range of scenarios. PMID:29690610
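A weighted histogram of the kind described here can be sketched as follows. The abstract does not say which weighting kernel is used, so this sketch assumes an Epanechnikov-style profile (weight 1 - r^2, falling to zero at the patch border) and a single quantized channel instead of full color; `weighted_histogram` and its parameters are hypothetical names.

```python
def weighted_histogram(patch, n_bins=8, max_level=256):
    """Normalized histogram in which pixels near the patch edge count less.

    patch is a 2-D list of integer levels in [0, max_level).
    Assumption: the weight is 1 - r^2, with r the normalized distance
    from the patch centre (an Epanechnikov-style profile).
    """
    h, w = len(patch), len(patch[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    hist = [0.0] * n_bins
    for y, row in enumerate(patch):
        for x, level in enumerate(row):
            r2 = ((y - cy) / (h / 2.0)) ** 2 + ((x - cx) / (w / 2.0)) ** 2
            weight = max(0.0, 1.0 - r2)
            hist[level * n_bins // max_level] += weight
    total = sum(hist)
    return [v / total for v in hist] if total else hist


# Hypothetical 3x3 target patch: bright centre pixel, dark surround.
patch = [[10, 10, 10],
         [10, 200, 10],
         [10, 10, 10]]
target_feature = weighted_histogram(patch)
```

In this toy patch the centre pixel's bin receives a larger share than the 1/9 it would get in an unweighted histogram, which is the edge-suppressing effect the abstract describes.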
Color Feature-Based Object Tracking through Particle Swarm Optimization with Improved Inertia Weight.

PubMed

Guo, Siqiu; Zhang, Tao; Song, Yulong; Qian, Feng

2018-04-23

This paper presents a particle swarm tracking algorithm with improved inertia weight based on color features. The weighted color histogram is used as the target feature to reduce the contribution of target edge pixels in the target feature, which makes the algorithm insensitive to the target non-rigid deformation, scale variation, and rotation. Meanwhile, the influence of partial obstruction on the description of target features is reduced. The particle swarm optimization algorithm can complete the multi-peak search, which can cope well with the object occlusion tracking problem. This means that the target is located precisely where the similarity function appears multi-peak. When the particle swarm optimization algorithm is applied to the object tracking, the inertia weight adjustment mechanism has some limitations. This paper presents an improved method. The concept of particle maturity is introduced to improve the inertia weight adjustment mechanism, which could adjust the inertia weight in time according to the different states of each particle in each generation. Experimental results show that our algorithm achieves state-of-the-art performance in a wide range of scenarios.
The effect of signal variability on the histograms of anthropomorphic channel outputs: factors resulting in non-normally distributed data

NASA Astrophysics Data System (ADS)

Elshahaby, Fatma E. A.; Ghaly, Michael; Jha, Abhinav K.; Frey, Eric C.

2015-03-01

Model Observers are widely used in medical imaging for the optimization and evaluation of instrumentation, acquisition parameters and image reconstruction and processing methods. The channelized Hotelling observer (CHO) is a commonly used model observer in nuclear medicine and has seen increasing use in other modalities. An anthropomorphic CHO consists of a set of channels that model some aspects of the human visual system and the Hotelling Observer, which is the optimal linear discriminant. The optimality of the CHO is based on the assumption that the channel outputs for data with and without the signal present have a multivariate normal distribution with equal class covariance matrices. The channel outputs result from the dot product of channel templates with input images and are thus the sum of a large number of random variables. The central limit theorem is thus often used to justify the assumption that the channel outputs are normally distributed. In this work, we aim to examine this assumption for realistically simulated nuclear medicine images when various types of signal variability are present.
Selection and evaluation of optimal two-dimensional CAIPIRINHA kernels applied to time-resolved three-dimensional CE-MRA.

PubMed

Weavers, Paul T; Borisch, Eric A; Riederer, Stephen J

2015-06-01

To develop and validate a method for choosing the optimal two-dimensional CAIPIRINHA kernel for subtraction contrast-enhanced MR angiography (CE-MRA) and estimate the degree of image quality improvement versus that of some reference acceleration parameter set at R ≥ 8. A metric based on patient-specific coil calibration information was defined for evaluating optimality of CAIPIRINHA kernels as applied to subtraction CE-MRA. Evaluation in retrospective studies using archived coil calibration data from abdomen, calf, foot, and hand CE-MRA exams was accomplished with an evaluation metric comparing the geometry factor (g-factor) histograms. Prospective calf, foot, and hand CE-MRA studies were evaluated with vessel signal-to-noise ratio (SNR). Retrospective studies show g-factor improvement for the selected CAIPIRINHA kernels was significant in the feet, moderate in the abdomen, and modest in the calves and hands. Prospective CE-MRA studies using optimal CAIPIRINHA show reduced noise amplification with identical acquisition time in studies of the feet, with minor improvements in the hands and calves. A method for selection of the optimal CAIPIRINHA kernel for high (R ≥ 8) acceleration CE-MRA exams given a specific patient and receiver array was demonstrated. CAIPIRINHA optimization appears valuable in accelerated CE-MRA of the feet and to a lesser extent in the abdomen. © 2014 Wiley Periodicals, Inc.
Histograms and Raisin Bread

ERIC Educational Resources Information Center

Leyden, Michael B.

1975-01-01

Describes various elementary school activities using a loaf of raisin bread to promote inquiry skills. Activities include estimating the number of raisins in the loaf by constructing histograms of the number of raisins in a slice. (MLH)
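The estimation activity can be written out as a short calculation. This is a minimal sketch under the assumption that the loaf total is estimated as the mean raisin count of the sampled slices times the number of slices; the counts and the 20-slice loaf are made-up illustration values, not data from the article.

```python
from collections import Counter


def raisin_histogram(counts_per_slice):
    """Map 'raisins in a slice' -> 'number of slices with that count'."""
    return dict(sorted(Counter(counts_per_slice).items()))


def estimate_total(counts_per_slice, slices_in_loaf):
    """Estimate raisins in the loaf from the mean count of sampled slices."""
    mean = sum(counts_per_slice) / len(counts_per_slice)
    return mean * slices_in_loaf


sampled = [4, 5, 5, 6, 5, 4]          # hypothetical counts from six slices
histogram = raisin_histogram(sampled)
estimate = estimate_total(sampled, slices_in_loaf=20)
```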
Infrared small target enhancement: grey level mapping based on improved sigmoid transformation and saliency histogram

NASA Astrophysics Data System (ADS)

Wan, Minjie; Gu, Guohua; Qian, Weixian; Ren, Kan; Chen, Qian

2018-06-01

Infrared (IR) small target enhancement plays a significant role in modern infrared search and track (IRST) systems and is the basic technique of target detection and tracking. In this paper, a coarse-to-fine grey level mapping method using improved sigmoid transformation and saliency histogram is designed to enhance IR small targets under different backgrounds. For the stage of rough enhancement, the intensity histogram is modified via an improved sigmoid function so as to narrow the regular intensity range of background as much as possible. For the part of further enhancement, a linear transformation is accomplished based on a saliency histogram constructed by averaging the cumulative saliency values provided by a saliency map. Compared with other typical methods, the presented method can achieve both better visual performances and quantitative evaluations.
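The rough-enhancement stage can be pictured with a plain logistic grey-level mapping. This is only a sketch: the abstract does not give the form of the improved sigmoid, so the standard logistic function is swapped in, and `center`, `gain`, and the function name are assumed illustration parameters.

```python
import math


def sigmoid_map(levels, center=0.5, gain=10.0, max_level=255):
    """Map grey levels through a logistic curve.

    Levels well below `center` (as a fraction of max_level) are squeezed
    into a narrow output range; levels near it are stretched.
    Assumption: plain logistic function, not the paper's improved variant.
    """
    out = []
    for g in levels:
        t = g / max_level
        s = 1.0 / (1.0 + math.exp(-gain * (t - center)))
        out.append(round(max_level * s))
    return out


mapped = sigmoid_map([0, 20, 40, 128, 200, 255])
```

With these assumed settings the dark background levels 20 and 40 land only a few output levels apart, while mid-range levels are spread out.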
A domain-knowledge-inspired mathematical framework for the description and classification of H&E stained histopathology images.

PubMed

Massar, Melody L; Bhagavatula, Ramamurthy; Ozolek, John A; Castro, Carlos A; Fickus, Matthew; Kovačević, Jelena

2011-10-19

We present the current state of our work on a mathematical framework for identification and delineation of histopathology images-local histograms and occlusion models. Local histograms are histograms computed over defined spatial neighborhoods whose purpose is to characterize an image locally. This unit of description is augmented by our occlusion models that describe a methodology for image formation. In the context of this image formation model, the power of local histograms with respect to appropriate families of images will be shown through various proved statements about expected performance. We conclude by presenting a preliminary study to demonstrate the power of the framework in the context of histopathology image classification tasks that, while differing greatly in application, both originate from what is considered an appropriate class of images for this framework.
[Research on K-means clustering segmentation method for MRI brain image based on selecting multi-peaks in gray histogram].

PubMed

Chen, Zhaoxue; Yu, Haizhong; Chen, Hao

2013-12-01

To solve the problem of traditional K-means clustering in which initial clustering centers are selected randomly, we proposed a new K-means segmentation algorithm based on robustly selecting 'peaks' standing for White Matter, Gray Matter and Cerebrospinal Fluid in the multi-peak gray histogram of an MRI brain image. The new algorithm takes the gray values of the selected histogram 'peaks' as the initial K-means clustering centers and can segment the MRI brain image into three tissue classes more effectively, accurately, and stably. Extensive experiments have shown that the proposed algorithm can overcome many shortcomings of the traditional K-means clustering method, such as low efficiency, poor accuracy and robustness, and long running time. The histogram 'peak' selecting idea of the proposed segmentation method is more widely applicable.
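The initialization idea can be sketched in one dimension. The abstract does not describe how peaks are selected robustly, so this sketch assumes a simple rule (tallest local maxima at least `min_gap` grey levels apart); `histogram_peaks`, `kmeans_1d`, and the synthetic grey values are hypothetical.

```python
def histogram_peaks(values, n_levels=256, k=3, min_gap=20):
    """Pick the k tallest local maxima of the grey histogram, min_gap apart.

    values must be integers in [0, n_levels); the two end levels are not
    considered as peaks in this simplified rule.
    """
    hist = [0] * n_levels
    for v in values:
        hist[v] += 1
    candidates = sorted(
        (g for g in range(1, n_levels - 1)
         if hist[g] > 0 and hist[g] >= hist[g - 1] and hist[g] >= hist[g + 1]),
        key=lambda g: hist[g], reverse=True)
    peaks = []
    for g in candidates:
        if all(abs(g - p) >= min_gap for p in peaks):
            peaks.append(g)
        if len(peaks) == k:
            break
    return sorted(peaks)


def kmeans_1d(values, centers, iters=50):
    """Standard K-means on scalar grey values, starting from given centers."""
    centers = [float(c) for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    return centers


# Synthetic grey values with three modes (illustrative, not MRI data).
grey = ([30] * 50 + [29] * 20 + [31] * 20 + [100] * 40 + [99] * 15
        + [101] * 15 + [200] * 30 + [199] * 10 + [201] * 10)
initial = histogram_peaks(grey)
final = kmeans_1d(grey, initial)
```

Starting from the histogram peaks makes the result deterministic, in contrast to the random initialization the abstract criticizes.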
Neutron camera employing row and column summations

DOEpatents

Clonts, Lloyd G.; Diawara, Yacouba; Donahue, Jr, Cornelius; Montcalm, Christopher A.; Riedel, Richard A.; Visscher, Theodore

2016-06-14

For each photomultiplier tube in an Anger camera, an R×S array of preamplifiers is provided to detect electrons generated within the photomultiplier tube. The outputs of the preamplifiers are digitized to measure the magnitude of the signals from each preamplifier. For each photomultiplier tube, a corresponding summation circuitry including R row summation circuits and S column summation circuits numerically add the magnitudes of the signals from preamplifiers for each row and for each column to generate histograms. For a P×Q array of photomultiplier tubes, P×Q summation circuitries generate P×Q row histograms including R entries and P×Q column histograms including S entries. The total set of histograms include P×Q×(R+S) entries, which can be analyzed by a position calculation circuit to determine the locations of events (detection of a neutron).
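The row and column summations, and the entry count P×Q×(R+S), can be checked with a small software sketch. The patent describes hardware circuits; the function names and the sample grid below are illustrative assumptions only.

```python
def row_column_histograms(signals):
    """Sum an R x S grid of digitized preamplifier magnitudes by row and by column.

    Returns (row_histogram with R entries, column_histogram with S entries).
    """
    row_hist = [sum(row) for row in signals]
    col_hist = [sum(col) for col in zip(*signals)]
    return row_hist, col_hist


def total_entries(p, q, r, s):
    """Histogram entries for a P x Q array of tubes, each with R rows and S columns."""
    return p * q * (r + s)


# Hypothetical 3 x 3 grid for one tube, with an event near the centre.
signals = [[0, 1, 0],
           [1, 5, 1],
           [0, 1, 0]]
rows, cols = row_column_histograms(signals)
entries = total_entries(2, 2, 3, 3)
```

Summing by row and column reduces R×S values per tube to R+S, which is why the total is P×Q×(R+S) and not P×Q×R×S.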
Evaluation of breast cancer using intravoxel incoherent motion (IVIM) histogram analysis: comparison with malignant status, histological subtype, and molecular prognostic factors.

PubMed

Cho, Gene Young; Moy, Linda; Kim, Sungheon G; Baete, Steven H; Moccaldi, Melanie; Babb, James S; Sodickson, Daniel K; Sigmund, Eric E

2016-08-01

To examine heterogeneous breast cancer through intravoxel incoherent motion (IVIM) histogram analysis. This HIPAA-compliant, IRB-approved retrospective study included 62 patients (age 48.44 ± 11.14 years, 50 malignant lesions and 12 benign) who underwent contrast-enhanced 3 T breast MRI and diffusion-weighted imaging. Apparent diffusion coefficient (ADC) and IVIM biomarkers of tissue diffusivity (Dt), perfusion fraction (fp), and pseudo-diffusivity (Dp) were calculated using voxel-based analysis for the whole lesion volume. Histogram analysis was performed to quantify tumour heterogeneity. Comparisons were made using Mann-Whitney tests between benign/malignant status, histological subtype, and molecular prognostic factor status while Spearman's rank correlation was used to characterize the association between imaging biomarkers and prognostic factor expression. The average values of the ADC and IVIM biomarkers, Dt and fp, showed significant differences between benign and malignant lesions. Additional significant differences were found in the histogram parameters among tumour subtypes and molecular prognostic factor status. IVIM histogram metrics, particularly fp and Dp, showed significant correlation with hormonal factor expression. Advanced diffusion imaging biomarkers show relationships with molecular prognostic factors and breast cancer malignancy. This analysis reveals novel diagnostic metrics that may explain some of the observed variability in treatment response among breast cancer patients. • Novel IVIM biomarkers characterize heterogeneous breast cancer. • Histogram analysis enables quantification of tumour heterogeneity. • IVIM biomarkers show relationships with breast cancer malignancy and molecular prognostic factors.
Whole-tumor histogram analysis of the cerebral blood volume map: tumor volume defined by 11C-methionine positron emission tomography image improves the diagnostic accuracy of cerebral glioma grading.

PubMed

Wu, Rongli; Watanabe, Yoshiyuki; Arisawa, Atsuko; Takahashi, Hiroto; Tanaka, Hisashi; Fujimoto, Yasunori; Watabe, Tadashi; Isohashi, Kayako; Hatazawa, Jun; Tomiyama, Noriyuki

2017-10-01

This study aimed to compare the tumor volume definition using conventional magnetic resonance (MR) and 11C-methionine positron emission tomography (MET/PET) images in the differentiation of the pre-operative glioma grade by using whole-tumor histogram analysis of normalized cerebral blood volume (nCBV) maps. Thirty-four patients with histopathologically proven primary brain low-grade gliomas (n = 15) and high-grade gliomas (n = 19) underwent pre-operative or pre-biopsy MET/PET, fluid-attenuated inversion recovery, dynamic susceptibility contrast perfusion-weighted magnetic resonance imaging, and contrast-enhanced T1-weighted at 3.0 T. The histogram distribution derived from the nCBV maps was obtained by co-registering the whole tumor volume delineated on conventional MR or MET/PET images, and eight histogram parameters were assessed. The mean nCBV value had the highest AUC value (0.906) based on MET/PET images. Diagnostic accuracy significantly improved when the tumor volume was measured from MET/PET images compared with conventional MR images for the parameters of mean, 50th, and 75th percentile nCBV value (p = 0.0246, 0.0223, and 0.0150, respectively). Whole-tumor histogram analysis of CBV map provides more valuable histogram parameters and increases diagnostic accuracy in the differentiation of pre-operative cerebral gliomas when the tumor volume is derived from MET/PET images.
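Three of the histogram parameters named above (mean, 50th and 75th percentile) can be computed from the voxel values inside a tumor mask as in the sketch below. The percentile convention (linear interpolation between order statistics) is an assumption, since the abstract does not state one, and the voxel values are made up.

```python
def percentile(sorted_vals, p):
    """p-th percentile by linear interpolation between order statistics."""
    pos = (len(sorted_vals) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def histogram_parameters(ncbv_voxels):
    """Mean, median, and 75th percentile of nCBV values within the tumor volume."""
    vals = sorted(ncbv_voxels)
    return {
        "mean": sum(vals) / len(vals),
        "p50": percentile(vals, 50),
        "p75": percentile(vals, 75),
    }


params = histogram_parameters([1.0, 2.0, 3.0, 4.0, 5.0])  # hypothetical voxels
```

Changing the tumor mask (conventional MR versus MET/PET in the study) changes which voxels enter `ncbv_voxels`, and therefore every parameter derived from them.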
Effect of respiratory and cardiac gating on the major diffusion-imaging metrics.

PubMed

Hamaguchi, Hiroyuki; Tha, Khin Khin; Sugimori, Hiroyuki; Nakanishi, Mitsuhiro; Nakagawa, Shin; Fujiwara, Taro; Yoshida, Hirokazu; Takamori, Sayaka; Shirato, Hiroki

2016-08-01

The effect of respiratory gating on the major diffusion-imaging metrics and that of cardiac gating on mean kurtosis (MK) are not known. For evaluation of whether the major diffusion-imaging metrics-MK, fractional anisotropy (FA), and mean diffusivity (MD) of the brain-varied between gated and non-gated acquisitions, respiratory-gated, cardiac-gated, and non-gated diffusion-imaging of the brain were performed in 10 healthy volunteers. MK, FA, and MD maps were constructed for all acquisitions, and the histograms were constructed. The normalized peak height and location of the histograms were compared among the acquisitions by use of Friedman and post hoc Wilcoxon tests. The effect of the repetition time (TR) on the diffusion-imaging metrics was also tested, and we corrected for its variation among acquisitions, if necessary. The results showed a shift in the peak location of the MK and MD histograms to the right with an increase in TR (p ≤ 0.01). The corrected peak location of the MK histograms, the normalized peak height of the FA histograms, the normalized peak height and the corrected peak location of the MD histograms varied significantly between the gated and non-gated acquisitions (p < 0.05). These results imply an influence of respiration and cardiac pulsation on the major diffusion-imaging metrics. The gating conditions must be kept identical if reproducible results are to be achieved. © The Author(s) 2016.
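The two histogram descriptors compared in this study can be sketched as follows, assuming that normalized peak height means the tallest bin's count divided by the number of voxels counted, and that peak location means the centre of that bin. Both definitions, the bin settings, and the sample values are assumptions for illustration.

```python
def histogram_peak(values, lo, hi, n_bins):
    """Return (normalized peak height, peak location) of a fixed-range histogram.

    Values outside [lo, hi) are ignored; at least one value must fall inside.
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lo <= v < hi:
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    total = sum(counts)
    peak_bin = max(range(n_bins), key=counts.__getitem__)
    return counts[peak_bin] / total, lo + (peak_bin + 0.5) * width


# Hypothetical FA-like values in [0, 1).
fa_values = [0.10, 0.15, 0.35, 0.36, 0.37, 0.38, 0.90]
height, location = histogram_peak(fa_values, lo=0.0, hi=1.0, n_bins=10)
```

A rightward shift of the histogram, such as the one reported with increasing TR, would show up here as a larger `location`.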
Spatial and symbolic queries for 3D image data

NASA Astrophysics Data System (ADS)

Benson, Daniel C.; Zick, Gregory L.

1992-04-01

We present a query system for an object-oriented biomedical imaging database containing 3-D anatomical structures and their corresponding 2-D images. The graphical interface facilitates the formation of spatial queries, nonspatial or symbolic queries, and combined spatial/symbolic queries. A query editor is used for the creation and manipulation of 3-D query objects as volumes, surfaces, lines, and points. Symbolic predicates are formulated through a combination of text fields and multiple choice selections. Query results, which may include images, image contents, composite objects, graphics, and alphanumeric data, are displayed in multiple views. Objects returned by the query may be selected directly within the views for further inspection or modification, or for use as query objects in subsequent queries. Our image database query system provides visual feedback and manipulation of spatial query objects, multiple views of volume data, and the ability to combine spatial and symbolic queries. The system allows for incremental enhancement of existing objects and the addition of new objects and spatial relationships. The query system is designed for databases containing symbolic and spatial data. This paper discusses its application to data acquired in biomedical 3-D image reconstruction, but it is applicable to other areas such as CAD/CAM, geographical information systems, and computer vision.
GenoQuery: a new querying module for functional annotation in a genomic warehouse

PubMed Central

Lemoine, Frédéric; Labedan, Bernard; Froidevaux, Christine

2008-01-01

Motivation: We have to cope with both a deluge of new genome sequences and a huge amount of data produced by high-throughput approaches used to exploit these genomic features. Crossing and comparing such heterogeneous and disparate data will help improving functional annotation of genomes. This requires designing elaborate integration systems such as warehouses for storing and querying these data. Results: We have designed a relational genomic warehouse with an original multi-layer architecture made of a databases layer and an entities layer. We describe a new querying module, GenoQuery, which is based on this architecture. We use the entities layer to define mixed queries. These mixed queries allow searching for instances of biological entities and their properties in the different databases, without specifying in which database they should be found. Accordingly, we further introduce the central notion of alternative queries. Such queries have the same meaning as the original mixed queries, while exploiting complementarities yielded by the various integrated databases of the warehouse. We explain how GenoQuery computes all the alternative queries of a given mixed query. We illustrate how useful this querying module is by means of a thorough example. Availability: http://www.lri.fr/~lemoine/GenoQuery/ Contact: chris@lri.fr, lemoine@lri.fr PMID:18586731
SPARK: Adapting Keyword Query to Semantic Search

NASA Astrophysics Data System (ADS)

Zhou, Qi; Wang, Chong; Xiong, Miao; Wang, Haofen; Yu, Yong

Semantic search promises to provide more accurate results than present-day keyword search. However, progress with semantic search has been delayed due to the complexity of its query languages. In this paper, we explore a novel approach of adapting keywords to querying the semantic web: the approach automatically translates keyword queries into formal logic queries so that end users can use familiar keywords to perform semantic search. A prototype system named 'SPARK' has been implemented in light of this approach. Given a keyword query, SPARK outputs a ranked list of SPARQL queries as the translation result. The translation in SPARK consists of three major steps: term mapping, query graph construction and query ranking. Specifically, a probabilistic query ranking model is proposed to select the most likely SPARQL query. In the experiment, SPARK achieved an encouraging translation result.
Multi-Case Knowledge-Based IMRT Treatment Planning in Head and Neck Cancer

NASA Astrophysics Data System (ADS)

Grzetic, Shelby Mariah

Head and neck cancer (HNC) IMRT treatment planning is a challenging process that relies heavily on the planner's experience. Previously, we used the single, best match from a library of manually planned cases to semi-automatically generate IMRT plans for a new patient. The current multi-case Knowledge Based Radiation Therapy (MC-KBRT) study utilized different matching cases for each of six individual organs-at-risk (OARs), then combined those six cases to create the new treatment plan. From a database of 103 patient plans created by experienced planners, MC-KBRT plans were created for 40 (17 unilateral and 23 bilateral) HNC "query" patients. For each case, 2D beam's-eye-view images were used to find similar geometric "match" patients separately for each of 6 OARs. Dose distributions for each OAR from the 6 matching cases were combined and then warped to suit the query case's geometry. The dose-volume constraints were used to create the new query treatment plan without the need for human decision-making throughout the IMRT optimization. The optimized MC-KBRT plans were compared against the clinically approved plans and Version 1 (previous KBRT using only one matching case with dose warping) using the dose metrics: mean, median, and maximum (brainstem and cord+5mm) doses. Compared to Version 1, MC-KBRT had no significant reduction of the dose to any of the OARs in either unilateral or bilateral cases. Compared to the manually planned unilateral cases, there was significant reduction of the oral cavity mean/median dose (>2Gy) at the expense of the contralateral parotid. Compared to the manually planned bilateral cases, reduction of dose was significant in the ipsilateral parotid, larynx, and oral cavity (>3Gy mean/median) while maintaining PTV coverage. MC-KBRT planning in head and neck cancer generates IMRT plans with better dose sparing than manually created plans. 
MC-KBRT using multiple case matches does not show significant dose reduction compared to using a single match case with dose warping.
Searching for rare diseases in PubMed: a blind comparison of Orphanet expert query and query based on terminological knowledge.

PubMed

Griffon, N; Schuers, M; Dhombres, F; Merabti, T; Kerdelhué, G; Rollin, L; Darmoni, S J

2016-08-02

Despite international initiatives like Orphanet, it remains difficult to find up-to-date information about rare diseases. The aim of this study is to propose an exhaustive set of queries for PubMed based on terminological knowledge and to evaluate it against the expert queries provided by the most frequently used resource in Europe: Orphanet. Four rare disease terminologies (MeSH, OMIM, HPO and HRDO) were manually mapped to each other, permitting the automatic creation of expanded terminological queries for rare diseases. For 30 rare diseases, 30 citations retrieved by Orphanet expert query and/or query based on terminological knowledge were assessed for relevance by two independent reviewers unaware of the query's origin. An adjudication procedure was used to resolve any discrepancy. Precision, relative recall and F-measure were all computed. For each Orphanet rare disease (n = 8982), there was a corresponding terminological query, in contrast with only 2284 queries provided by Orphanet. Only 553 citations were evaluated due to queries with 0 or only a few hits. There was no significant difference between the Orpha query and the terminological query in terms of precision, respectively 0.61 vs 0.52 (p = 0.13). Nevertheless, terminological queries retrieved more citations more often than Orpha queries (0.57 vs. 0.33; p = 0.01). Interestingly, Orpha queries seemed to retrieve older citations than terminological queries (p < 0.0001). The terminological queries proposed in this study are now available for all rare diseases. They may be a useful tool for both precision-oriented and recall-oriented literature searches.
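The three evaluation measures named in the abstract above can be sketched as follows. This is a minimal illustration, assuming the usual definitions: precision is the share of retrieved citations judged relevant, relative recall is the share of the pooled relevant citations (found by either query) that one query retrieved, and the F-measure is their harmonic mean. The function names and the counts in the usage note are hypothetical and do not come from the study.

```python
def precision(relevant_retrieved, retrieved):
    """Fraction of the retrieved citations that were judged relevant."""
    return relevant_retrieved / retrieved if retrieved else 0.0


def relative_recall(relevant_retrieved, pooled_relevant):
    """Fraction of the pooled relevant citations (from all compared queries)
    that this particular query retrieved."""
    return relevant_retrieved / pooled_relevant if pooled_relevant else 0.0


def f_measure(p, r):
    """Harmonic mean of precision and (relative) recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts for one query: 10 citations assessed, 6 relevant,
# and 12 relevant citations in the pool built from both queries.
p = precision(6, 10)
r = relative_recall(6, 12)
f = f_measure(p, r)
```

With these made-up counts the query has precision 0.6 and relative recall 0.5; the F-measure always lies between the two and is pulled towards the lower one.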
Pattern-histogram-based temporal change detection using personal chest radiographs

NASA Astrophysics Data System (ADS)

Ugurlu, Yucel; Obi, Takashi; Hasegawa, Akira; Yamaguchi, Masahiro; Ohyama, Nagaaki

1999-05-01

An accurate and reliable detection of temporal changes from a pair of images is of considerable interest in medical science. Traditional registration and subtraction techniques can be applied to extract temporal differences when the object is rigid or corresponding points are obvious. However, in radiological imaging, loss of the depth information, the elasticity of the object, the absence of clearly defined landmarks and three-dimensional positioning differences constrain the performance of conventional registration techniques. In this paper, we propose a new method to detect interval changes accurately without using an image registration technique. The method is based on the construction of a so-called pattern histogram and a comparison procedure. The pattern histogram is a graphic representation of the frequency counts of all allowable patterns in the multi-dimensional pattern vector space. The K-means algorithm is employed to partition the pattern vector space successively. Any differences in the pattern histograms imply that different patterns are involved in the scenes. In our experiment, a pair of chest radiographs of pneumoconiosis is employed and the changing histogram bins are visualized on both of the images. We found that the method can be used as an alternative way of temporal change detection, particularly when precise image registration is not available.
A Concise Guide to Feature Histograms with Applications to LIDAR-Based Spacecraft Relative Navigation

NASA Astrophysics Data System (ADS)

Rhodes, Andrew P.; Christian, John A.; Evans, Thomas

2017-12-01

With the availability and popularity of 3D sensors, it is advantageous to re-examine the use of point cloud descriptors for the purpose of pose estimation and spacecraft relative navigation. One popular descriptor is the oriented unique repeatable clustered viewpoint feature histogram (OUR-CVFH), which is most often utilized in personal and industrial robotics to simultaneously recognize and navigate relative to an object. Recent research into using the OUR-CVFH descriptor for spacecraft navigation has produced favorable results. Since OUR-CVFH is the most recent innovation in a large family of feature histogram point cloud descriptors, discussions of parameter settings and insights into its functionality are spread among various publications and online resources. This paper organizes the history of feature histogram point cloud descriptors for a straightforward explanation of their evolution. This article compiles all the requisite information needed to implement OUR-CVFH into one location, as well as providing useful suggestions on how to tune the generation parameters. This work is beneficial for anyone interested in using this histogram descriptor for object recognition or navigation - may it be personal robotics or spacecraft navigation.
Hybrid Histogram Descriptor: A Fusion Feature Representation for Image Retrieval.

PubMed

Feng, Qinghe; Hao, Qiaohong; Chen, Yuqi; Yi, Yugen; Wei, Ying; Dai, Jiangyan

2018-06-15

Currently, visual sensors are becoming increasingly affordable and fashionable, accelerating the growth of image data. Image retrieval has attracted increasing interest due to space exploration, industrial, and biomedical applications. Nevertheless, designing effective feature representation is acknowledged as a hard yet fundamental issue. This paper presents a fusion feature representation called a hybrid histogram descriptor (HHD) for image retrieval. The proposed descriptor comprises two histograms jointly: a perceptually uniform histogram which is extracted by exploiting the color and edge orientation information in perceptually uniform regions; and a motif co-occurrence histogram which is acquired by calculating the probability of a pair of motif patterns. To evaluate the performance, we benchmarked the proposed descriptor on RSSCN7, AID, Outex-00013, Outex-00014 and ETHZ-53 datasets. Experimental results suggest that the proposed descriptor is more effective and robust than ten recent fusion-based descriptors under the content-based image retrieval framework. The computational complexity was also analyzed to give an in-depth evaluation. Furthermore, compared with the state-of-the-art convolutional neural network (CNN)-based descriptors, the proposed descriptor also achieves comparable performance, but does not require any training process.
Improved LSB matching steganography with histogram characters reserved

NASA Astrophysics Data System (ADS)

Chen, Zhihong; Liu, Wenyao

2008-03-01

This letter builds on research into the LSB (least significant bit, i.e. the last bit of a binary pixel value) matching steganographic method and the steganalytic method that targets the histograms of cover images, and proposes a modification to LSB matching. In LSB matching, if the LSB of the next cover pixel matches the next bit of secret data, nothing is done; otherwise, one is added to or subtracted from the cover pixel value at random. In our improved method, a steganographic information table is defined that records the changes introduced by the embedded secret bits. Using the table, the next pixel with the same value whose LSB must change is dynamically assigned an addition or a subtraction of one, so that the change to the cover image's histogram is minimized. Therefore, the modified method allows embedding the same payload as LSB matching but with improved steganographic security and less vulnerability to attacks. The experimental results of the new method show that the histograms maintain their attributes, such as peak values and alternative trends, to an acceptable degree, and that the method performs better than LSB matching in terms of histogram distortion and resistance against existing steganalysis.
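The baseline LSB matching rule described in the abstract above (not the authors' table-based improvement) can be sketched in a few lines. This is an illustrative sketch only: it assumes 8-bit pixel values, and the handling of the boundary values 0 and 255 as well as the function names are assumptions that the abstract does not specify.

```python
import random


def lsb_match_embed(pixels, bits, rng=None):
    """Embed one secret bit per pixel using plain LSB matching.

    If a pixel's LSB already equals the secret bit, the pixel is left alone;
    otherwise +1 or -1 is chosen at random. At 0 and 255 the step is forced
    so the value stays in range (an assumption, not taken from the source).
    """
    if len(bits) > len(pixels):
        raise ValueError("payload is longer than the cover")
    rng = rng or random.Random()
    stego = list(pixels)
    for i, bit in enumerate(bits):
        p = stego[i]
        if (p & 1) == bit:
            continue  # LSB already matches: do nothing
        if p == 0:
            step = 1
        elif p == 255:
            step = -1
        else:
            step = rng.choice((-1, 1))
        stego[i] = p + step
    return stego


def lsb_extract(pixels, n_bits):
    """Read the secret bits back from the least significant bits."""
    return [p & 1 for p in pixels[:n_bits]]


cover = [0, 255, 100, 101, 37, 200]
secret = [1, 0, 1, 1, 0, 1]
stego = lsb_match_embed(cover, secret, random.Random(0))
recovered = lsb_extract(stego, len(secret))
```

Because an odd step always flips the least significant bit, the receiver recovers the payload regardless of which direction was chosen. The direction is the degree of freedom that the modified method in the abstract exploits to reduce histogram changes.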
SAMBA: Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ahlfeld, R., E-mail: r.ahlfeld14@imperial.ac.uk; Belkouchi, B.; Montomoli, F.

2016-09-01

A new arbitrary Polynomial Chaos (aPC) method is presented for moderately high-dimensional problems characterised by limited input data availability. The proposed methodology improves the algorithm of aPC and extends the method, that was previously only introduced as tensor product expansion, to moderately high-dimensional stochastic problems. The fundamental idea of aPC is to use the statistical moments of the input random variables to develop the polynomial chaos expansion. This approach provides the possibility to propagate continuous or discrete probability density functions and also histograms (data sets) as long as their moments exist, are finite and the determinant of the moment matrix is strictly positive. For cases with limited data availability, this approach avoids bias and fitting errors caused by wrong assumptions. In this work, an alternative way to calculate the aPC is suggested, which provides the optimal polynomials, Gaussian quadrature collocation points and weights from the moments using only a handful of matrix operations on the Hankel matrix of moments. It can therefore be implemented without requiring prior knowledge about statistical data analysis or a detailed understanding of the mathematics of polynomial chaos expansions. The extension to more input variables suggested in this work, is an anisotropic and adaptive version of Smolyak's algorithm that is solely based on the moments of the input probability distributions. It is referred to as SAMBA (PC), which is short for Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos. It is illustrated that for moderately high-dimensional problems (up to 20 different input variables or histograms) SAMBA can significantly simplify the calculation of sparse Gaussian quadrature rules. 
SAMBA's efficiency for multivariate functions with regard to data availability is further demonstrated by analysing higher order convergence and accuracy for a set of nonlinear test functions with 2, 5 and 10 different input distributions or histograms.
Nanomechanical characterization of heterogeneous and hierarchical biomaterials and tissues using nanoindentation: the role of finite mixture models.

PubMed

Zadpoor, Amir A

2015-03-01

Mechanical characterization of biological tissues and biomaterials at the nano-scale is often performed using nanoindentation experiments. The different constituents of the characterized materials will then appear in the histogram that shows the probability of measuring a certain range of mechanical properties. An objective technique is needed to separate the probability distributions that are mixed together in such a histogram. In this paper, finite mixture models (FMMs) are proposed as a tool capable of performing such types of analysis. Finite Gaussian mixture models assume that the measured probability distribution is a weighted combination of a finite number of Gaussian distributions with separate mean and standard deviation values. Dedicated optimization algorithms are available for fitting such a weighted mixture model to experimental data. Moreover, certain objective criteria are available to determine the optimum number of Gaussian distributions. In this paper, FMMs are used for interpreting the probability distribution functions representing the distributions of the elastic moduli of osteoarthritic human cartilage and co-polymeric microspheres. As for cartilage experiments, FMMs indicate that at least three mixture components are needed for describing the measured histogram. While the mechanical properties of the softer mixture components, often assumed to be associated with Glycosaminoglycans, were found to be more or less constant regardless of whether two or three mixture components were used, those of the second mixture component (i.e. collagen network) considerably changed depending on the number of mixture components. Regarding the co-polymeric microspheres, the optimum number of mixture components estimated by the FMM theory, i.e. 3, nicely matches the number of co-polymeric components used in the structure of the polymer. The computer programs used for the presented analyses are made freely available online for other researchers to use. 
Copyright © 2014 Elsevier B.V. All rights reserved.
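For reference, the finite Gaussian mixture model described in the abstract above is usually written as shown below. The symbols (K for the number of components, w_k for the weights, mu_k and sigma_k for the component means and standard deviations) are notation chosen here for illustration, not taken from the paper.

```latex
p(x) \;=\; \sum_{k=1}^{K} w_k \, \mathcal{N}\!\left(x \mid \mu_k, \sigma_k^{2}\right),
\qquad w_k \ge 0, \qquad \sum_{k=1}^{K} w_k = 1
```

In this notation the cartilage result reported above corresponds to needing at least K = 3, and the microsphere result corresponds to an estimated optimum of K = 3.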

SAMBA: Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos

NASA Astrophysics Data System (ADS)

Ahlfeld, R.; Belkouchi, B.; Montomoli, F.

2016-09-01

A new arbitrary Polynomial Chaos (aPC) method is presented for moderately high-dimensional problems characterised by limited input data availability. The proposed methodology improves the algorithm of aPC and extends the method, that was previously only introduced as tensor product expansion, to moderately high-dimensional stochastic problems. The fundamental idea of aPC is to use the statistical moments of the input random variables to develop the polynomial chaos expansion. This approach provides the possibility to propagate continuous or discrete probability density functions and also histograms (data sets) as long as their moments exist, are finite and the determinant of the moment matrix is strictly positive. For cases with limited data availability, this approach avoids bias and fitting errors caused by wrong assumptions. In this work, an alternative way to calculate the aPC is suggested, which provides the optimal polynomials, Gaussian quadrature collocation points and weights from the moments using only a handful of matrix operations on the Hankel matrix of moments. It can therefore be implemented without requiring prior knowledge about statistical data analysis or a detailed understanding of the mathematics of polynomial chaos expansions. The extension to more input variables suggested in this work, is an anisotropic and adaptive version of Smolyak's algorithm that is solely based on the moments of the input probability distributions. It is referred to as SAMBA (PC), which is short for Sparse Approximation of Moment-Based Arbitrary Polynomial Chaos. It is illustrated that for moderately high-dimensional problems (up to 20 different input variables or histograms) SAMBA can significantly simplify the calculation of sparse Gaussian quadrature rules. 
SAMBA's efficiency for multivariate functions with regard to data availability is further demonstrated by analysing higher order convergence and accuracy for a set of nonlinear test functions with 2, 5 and 10 different input distributions or histograms.
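The abstract above mentions obtaining Gaussian quadrature points and weights from matrix operations on the Hankel matrix of moments. One classical route of this kind is the Golub-Welsch construction sketched below; it is shown as background under that assumption, and the exact steps used in SAMBA may differ. The symbols (mu_k for the raw moments, R for the Cholesky factor, J for the Jacobi matrix) are notation introduced here.

```latex
% Hankel matrix of raw moments \mu_k = E[\xi^k], k = 0, ..., 2n
M = \begin{pmatrix}
  \mu_0  & \mu_1     & \cdots & \mu_n     \\
  \mu_1  & \mu_2     & \cdots & \mu_{n+1} \\
  \vdots & \vdots    & \ddots & \vdots    \\
  \mu_n  & \mu_{n+1} & \cdots & \mu_{2n}
\end{pmatrix},
\qquad M = R^{\top} R \quad (R \text{ upper triangular})

% Recurrence coefficients from the entries r_{i,j} of R (with r_{0,0} = 1, r_{0,1} = 0)
a_j = \frac{r_{j,j+1}}{r_{j,j}} - \frac{r_{j-1,j}}{r_{j-1,j-1}}, \quad j = 1, \dots, n,
\qquad
b_j = \frac{r_{j+1,j+1}}{r_{j,j}}, \quad j = 1, \dots, n-1

% Symmetric tridiagonal Jacobi matrix; its eigenpairs give the n-point rule
J = \begin{pmatrix}
  a_1 & b_1    &         &         \\
  b_1 & a_2    & \ddots  &         \\
      & \ddots & \ddots  & b_{n-1} \\
      &        & b_{n-1} & a_n
\end{pmatrix},
\qquad x_i = \lambda_i(J), \qquad w_i = \mu_0 \, v_{i,1}^{2}
```

Here v_{i,1} is the first component of the normalised eigenvector belonging to the eigenvalue lambda_i. As a quick check with the standard normal moments (1, 0, 1, 0, 3) and n = 2: R has rows (1, 0, 1), (0, 1, 0), (0, 0, sqrt 2), so a_1 = a_2 = 0 and b_1 = 1, giving nodes -1 and +1 with weights 1/2 each, which is the two-point Gauss-Hermite rule for that density. The Cholesky factorisation requires M to be positive definite, which is consistent with the positivity condition on the moment matrix stated in the abstract.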
Histograms and Frequency Density.

ERIC Educational Resources Information Center

Micromath, 2003

2003-01-01

Introduces exercises on histograms and frequency density. Guides pupils to Discovering Important Statistical Concepts Using Spreadsheets (DISCUSS), created at the University of Coventry. Includes curriculum points, teaching tips, activities, and internet address (http://www.coventry.ac.uk/discuss/). (KHR)
An advanced web query interface for biological databases

PubMed Central

Latendresse, Mario; Karp, Peter D.

2010-01-01

Although most web-based biological databases (DBs) offer some type of web-based form to allow users to author DB queries, these query forms are quite restricted in the complexity of DB queries that they can formulate. They can typically query only one DB, and can query only a single type of object at a time (e.g. genes) with no possible interaction between the objects—that is, in SQL parlance, no joins are allowed between DB objects. Writing precise queries against biological DBs is usually left to a programmer skillful enough in complex DB query languages like SQL. We present a web interface for building precise queries for biological DBs that can construct much more precise queries than most web-based query forms, yet that is user friendly enough to be used by biologists. It supports queries containing multiple conditions, and connecting multiple object types without using the join concept, which is unintuitive to biologists. This interactive web interface is called the Structured Advanced Query Page (SAQP). Users interactively build up a wide range of query constructs. Interactive documentation within the SAQP describes the schema of the queried DBs. The SAQP is based on BioVelo, a query language based on list comprehension. The SAQP is part of the Pathway Tools software and is available as part of several bioinformatics web sites powered by Pathway Tools, including the BioCyc.org site that contains more than 500 Pathway/Genome DBs. PMID:20624715
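The idea of connecting several object types through a list comprehension rather than an explicit join can be illustrated with ordinary Python. This is only an analogy: it is not BioVelo syntax, and the records and field names below are toy data invented for the example.

```python
# Toy records standing in for two object types in a pathway/genome database.
genes = [
    {"id": "G1", "name": "trpA", "product": "P1"},
    {"id": "G2", "name": "trpB", "product": "P2"},
    {"id": "G3", "name": "lacZ", "product": "P3"},
]
reactions = [
    {"id": "R1", "enzyme": "P1", "pathway": "tryptophan biosynthesis"},
    {"id": "R2", "enzyme": "P2", "pathway": "tryptophan biosynthesis"},
    {"id": "R3", "enzyme": "P3", "pathway": "lactose degradation"},
]


def genes_in_pathway(pathway):
    """Names of genes whose product catalyses a reaction in the given pathway.

    The two 'for' clauses range over both object types and the 'if' clause
    states how they are related, so no separate join step is written.
    """
    return [
        g["name"]
        for g in genes
        for r in reactions
        if r["enzyme"] == g["product"] and r["pathway"] == pathway
    ]


result = genes_in_pathway("tryptophan biosynthesis")
```

The query reads as "for each gene, for each reaction, keep the gene name when the two are linked and the condition holds", which is the style of formulation the abstract describes as more intuitive for biologists than SQL joins.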
Towards the novel reasoning among particles in PSO by the use of RDF and SPARQL.

PubMed

Fister, Iztok; Yang, Xin-She; Ljubič, Karin; Fister, Dušan; Brest, Janez; Fister, Iztok

2014-01-01

The significant development of the Internet has posed some new challenges and many new programming tools have been developed to address such challenges. Today, semantic web is a modern paradigm for representing and accessing knowledge data on the Internet. This paper tries to use the semantic tools such as resource definition framework (RDF) and RDF query language (SPARQL) for the optimization purpose. These tools are combined with particle swarm optimization (PSO) and the selection of the best solutions depends on its fitness. Instead of the local best solution, a neighborhood of solutions for each particle can be defined and used for the calculation of the new position, based on the key ideas from semantic web domain. The preliminary results by optimizing ten benchmark functions showed the promising results and thus this method should be investigated further.
ColorMoves: Optimizing Color's Potential for Exploration and Communication of Data

NASA Astrophysics Data System (ADS)

Samsel, F.

2017-12-01

Color is the most powerful perceptual channel available for exposing and communicating data. Most visualizations are rendered in one of a handful of common colormaps - the rainbow, cool-warm, heat map and viridis. These maps meet the basic criteria for encoding data - perceptual uniformity and reasonable discriminatory power. However, as the size and complexity of data grow, our need to optimize the potential of color grows. The ability to expose greater detail and differentiate between multiple variables becomes ever more important. To meet this need we have created ColorMoves, an interactive colormap construction tool that enables scientists to quickly and easily align a concentration contrast with the data ranges of interest. Perceptual research tells us that luminance is the strongest contrast and thus provides the highest degree of perceptual discrimination. However, the most commonly used colormaps contain a limited range of luminance contrast. ColorMoves enables interactive construction of colormaps, allowing one to distribute the luminance where it is most needed. The interactive interface enables optimal placement of the color scales. The ability to watch the changes on one's data in real time makes precision adjustment quick and easy. By enabling more precise placement and multiple ranges of luminance one can construct colormaps containing greater discriminatory power. By selecting from the wide range of color scale hues scientists can create colormaps intuitive to their subject. ColorMoves is comprised of four main components: a set of 40 color scales; a histogram of the data distribution; a viewing area showing the colormap on your data; and the controls section. The 40 color scales span the spectrum of hues, saturation levels and value distributions. The histogram of the data distribution enables placement of the color scales in precise locations. The viewing area shows the impact of changes on the data in real time. 
The controls section enables export of the constructed colormaps for use in tools such as ParaView and Matplotlib. For a clearer understanding of ColorMoves' capabilities we recommend trying it out at SciVisColor.org.
SPARQL Query Re-writing Using Partonomy Based Transformation Rules

NASA Astrophysics Data System (ADS)

Jain, Prateek; Yeh, Peter Z.; Verma, Kunal; Henson, Cory A.; Sheth, Amit P.

Often the information present in a spatial knowledge base is represented at a different level of granularity and abstraction than the query constraints. For querying ontologies containing spatial information, the precise relationships between spatial entities have to be specified in the basic graph pattern of the SPARQL query, which can result in long and complex queries. We present a novel approach to help users intuitively write SPARQL queries to query spatial data, rather than relying on knowledge of the ontology structure. Our framework re-writes queries, using transformation rules to exploit part-whole relations between geographical entities to address the mismatches between query constraints and knowledge base. Our experiments were performed on completely third-party datasets and queries. Evaluations were performed on the Geonames dataset using questions from National Geographic Bee serialized into SPARQL, and on the British Administrative Geography Ontology using questions from a popular trivia website. These experiments demonstrate high precision in retrieval of results and ease in writing queries.
The DataCube Server. Animate Agent Project Working Note 2, Version 1.0

DTIC Science & Technology

1993-11-01

before this can be called a histogram of all the needed levels must be made and their one band images must be made. Note if a levels backprojection...will not be used then the level does not need to be histogrammed. Any points outside the active region in a levels backprojection will be undefined...this can be called a histogram of all the needed levels must be made and their one band images must be made. Note if a levels backprojection will not
Implementation of Quantum Private Queries Using Nuclear Magnetic Resonance

NASA Astrophysics Data System (ADS)

Wang, Chuan; Hao, Liang; Zhao, Lian-Jie

2011-08-01

We present a modified protocol for the realization of a quantum private query process on a classical database. Using one-qubit query and CNOT operation, the query process can be realized in a two-mode database. In the query process, the data privacy is preserved as the sender would not reveal any information about the database besides her query information, and the database provider cannot retain any information about the query. We implement the quantum private query protocol in a nuclear magnetic resonance system. The density matrices of the memory registers are constructed.
Diffusion Profiling via a Histogram Approach Distinguishes Low-grade from High-grade Meningiomas, Can Reflect the Respective Proliferative Potential and Progesterone Receptor Status.

PubMed

Gihr, Georg Alexander; Horvath-Rizea, Diana; Garnov, Nikita; Kohlhof-Meinecke, Patricia; Ganslandt, Oliver; Henkes, Hans; Meyer, Hans Jonas; Hoffmann, Karl-Titus; Surov, Alexey; Schob, Stefan

2018-02-01

Presurgical grading, estimation of growth kinetics, and other prognostic factors are becoming increasingly important for selecting the best therapeutic approach for meningioma patients. Diffusion-weighted imaging (DWI) provides microstructural information and reflects tumor biology. A novel DWI approach, histogram profiling of apparent diffusion coefficient (ADC) volumes, provides more distinct information than conventional DWI. Therefore, our study investigated whether ADC histogram profiling distinguishes low-grade from high-grade lesions and reflects Ki-67 expression and progesterone receptor status. Pretreatment ADC volumes of 37 meningioma patients (28 low-grade, 9 high-grade) were used for histogram profiling. WHO grade, Ki-67 expression, and progesterone receptor status were evaluated. Comparative and correlative statistics investigating the association between histogram profiling and neuropathology were performed. The entire ADC profile (p10, p25, p75, p90, mean, median) was significantly lower in high-grade versus low-grade meningiomas. The lower percentiles, mean, and mode showed significant correlations with Ki-67 expression. Skewness and entropy of the ADC volumes were significantly associated with progesterone receptor status and Ki-67 expression. ROC analysis revealed entropy to be the most accurate parameter distinguishing low-grade from high-grade meningiomas. ADC histogram profiling provides a distinct set of parameters, which help differentiate low-grade versus high-grade meningiomas. Also, histogram metrics correlate significantly with histological surrogates of the respective proliferative potential. More specifically, entropy proved to be the most promising imaging biomarker for presurgical grading. Both entropy and skewness were significantly associated with progesterone receptor status and Ki-67 expression and therefore should be investigated further as predictors for prognostically relevant tumor biological features. 
Since absolute ADC values vary between MRI scanners of different vendors and field strengths, their use is more limited in the presurgical setting.
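A histogram profile of the kind listed in the abstract above (percentiles, mean, median, skewness, entropy) can be computed from a list of voxel values as sketched below. The study does not state its exact definitions, so the choices here are assumptions: linear-interpolation percentiles, population skewness, Shannon entropy in bits over equal-width bins, and a hypothetical default of 8 bins.

```python
import math
import statistics


def percentile(sorted_vals, q):
    """Percentile by linear interpolation; q is in [0, 100]."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def histogram_profile(values, n_bins=8):
    """Summary statistics of a set of voxel values (e.g. ADC values)."""
    if not values:
        raise ValueError("no voxel values given")
    vals = sorted(values)
    n = len(vals)
    mean = statistics.fmean(vals)
    sd = statistics.pstdev(vals)
    # Population skewness: third central moment divided by sd cubed.
    skew = sum((v - mean) ** 3 for v in vals) / (n * sd ** 3) if sd > 0 else 0.0
    # Equal-width bins between the minimum and maximum value.
    lo, hi = vals[0], vals[-1]
    width = (hi - lo) / n_bins or 1.0
    counts = [0] * n_bins
    for v in vals:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    # Shannon entropy (bits) of the bin occupancy distribution.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c)
    profile = {f"p{q}": percentile(vals, q) for q in (10, 25, 75, 90)}
    profile.update(mean=mean, median=statistics.median(vals),
                   skewness=skew, entropy=entropy)
    return profile


profile = histogram_profile([1, 2, 3, 4, 5], n_bins=5)
```

Entropy depends on the number of bins, so values are only comparable between lesions when the same binning is used throughout.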
Histogram Analysis of CT Perfusion of Hepatocellular Carcinoma for Predicting Response to Transarterial Radioembolization: Value of Tumor Heterogeneity Assessment.

PubMed

Reiner, Caecilia S; Gordic, Sonja; Puippe, Gilbert; Morsbach, Fabian; Wurnig, Moritz; Schaefer, Niklaus; Veit-Haibach, Patrick; Pfammatter, Thomas; Alkadhi, Hatem

2016-03-01

To evaluate, in patients with hepatocellular carcinoma (HCC), whether assessment of tumor heterogeneity by histogram analysis of computed tomography (CT) perfusion helps predict response to transarterial radioembolization (TARE). Sixteen patients (15 male; mean age 65 years; age range 47-80 years) with HCC underwent CT liver perfusion for treatment planning prior to TARE with Yttrium-90 microspheres. Arterial perfusion (AP) derived from CT perfusion was measured in the entire tumor volume, and heterogeneity was analyzed voxel-wise by histogram analysis. Response to TARE was evaluated on follow-up imaging (median follow-up, 129 days) based on modified Response Evaluation Criteria in Solid Tumors (mRECIST). Results of histogram analysis and mean AP values of the tumor were compared between responders and non-responders. Receiver operating characteristics were calculated to determine the parameters' ability to discriminate responders from non-responders. According to mRECIST, 8 patients (50%) were responders and 8 (50%) non-responders. Comparing responders and non-responders, the 50th and 75th percentiles of AP derived from histogram analysis were significantly different [AP 43.8/54.3 vs. 27.6/34.3 mL min(-1) 100 mL(-1); p < 0.05], while the mean AP of HCCs (43.5 vs. 27.9 mL min(-1) 100 mL(-1); p > 0.05) was not. Further heterogeneity parameters from histogram analysis (skewness, coefficient of variation, and 25th percentile) did not differ between responders and non-responders (p > 0.05). If the cut-off for the 75th percentile was set to an AP of 37.5 mL min(-1) 100 mL(-1), therapy response could be predicted with a sensitivity of 88% (7/8) and specificity of 75% (6/8). Voxel-wise histogram analysis of pretreatment CT perfusion indicating tumor heterogeneity of HCC improves the pretreatment prediction of response to TARE.
ADC histogram analysis for adrenal tumor: histogram analysis of apparent diffusion coefficient in differentiating adrenal adenoma from pheochromocytoma.

PubMed

Umanodan, Tomokazu; Fukukura, Yoshihiko; Kumagae, Yuichi; Shindo, Toshikazu; Nakajo, Masatoyo; Takumi, Koji; Nakajo, Masanori; Hakamada, Hiroto; Umanodan, Aya; Yoshiura, Takashi

2017-04-01

To determine the diagnostic performance of apparent diffusion coefficient (ADC) histogram analysis in diffusion-weighted (DW) magnetic resonance imaging (MRI) for differentiating adrenal adenoma from pheochromocytoma. We retrospectively evaluated 52 adrenal tumors (39 adenomas and 13 pheochromocytomas) in 47 patients (21 men, 26 women; mean age, 59.3 years; range, 16-86 years) who underwent DW 3.0T MRI. Histogram parameters of ADC (b-values of 0 and 200 [ADC200], 0 and 400 [ADC400], and 0 and 800 s/mm² [ADC800]), namely mean, variance, coefficient of variation (CV), kurtosis, skewness, and entropy, were compared between adrenal adenomas and pheochromocytomas, using the Mann-Whitney U-test. Receiver operating characteristic (ROC) curves for the histogram parameters were generated to differentiate adrenal adenomas from pheochromocytomas. Sensitivity and specificity were calculated by using a threshold criterion that would maximize the average of sensitivity and specificity. Variance and CV of ADC800 were significantly higher in pheochromocytomas than in adrenal adenomas (P < 0.001 and P = 0.001, respectively). With all b-value combinations, the entropy of ADC was significantly higher in pheochromocytomas than in adrenal adenomas (all P ≤ 0.001), and showed the highest area under the ROC curve among the ADC histogram parameters for diagnosing adrenal adenomas (ADC200, 0.82; ADC400, 0.87; and ADC800, 0.92), with sensitivity of 84.6% and specificity of 84.6% (cutoff, ≤2.82) with ADC200; sensitivity of 89.7% and specificity of 84.6% (cutoff, ≤2.77) with ADC400; and sensitivity of 94.9% and specificity of 92.3% (cutoff, ≤2.67) with ADC800. ADC histogram analysis of DW MRI can help differentiate adrenal adenoma from pheochromocytoma. 3 J. Magn. Reson. Imaging 2017;45:1195-1203. © 2016 International Society for Magnetic Resonance in Medicine.
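The threshold criterion mentioned in the abstract above, choosing the cutoff that maximizes the average of sensitivity and specificity, can be sketched as a simple scan over candidate cutoffs. The function name and the entropy values in the usage note are hypothetical; the sketch assumes, as in the abstract, that the target class (adenoma) is called when the value is at or below the cutoff.

```python
def best_cutoff(target_values, other_values):
    """Return (cutoff, sensitivity, specificity) maximizing their average.

    A case is called positive (target class) when its value <= cutoff.
    Every observed value is tried as a candidate cutoff; on ties the
    lowest cutoff is kept.
    """
    best = None
    for c in sorted(set(target_values) | set(other_values)):
        sens = sum(v <= c for v in target_values) / len(target_values)
        spec = sum(v > c for v in other_values) / len(other_values)
        score = (sens + spec) / 2
        if best is None or score > best[0]:
            best = (score, c, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical entropy values, invented for illustration only.
adenoma_entropy = [2.1, 2.3, 2.5, 2.6, 2.9]
pheo_entropy = [2.7, 3.0, 3.1, 3.2]
cutoff, sensitivity, specificity = best_cutoff(adenoma_entropy, pheo_entropy)
```

Maximizing the average of sensitivity and specificity selects the same cutoff as maximizing their sum, so it weights missed cases and false alarms equally regardless of how many tumors fall in each group.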
Iterative dataset optimization in automated planning: Implementation for breast and rectal cancer radiotherapy.

PubMed

Fan, Jiawei; Wang, Jiazhou; Zhang, Zhen; Hu, Weigang

2017-06-01

To develop a new automated treatment planning solution for breast and rectal cancer radiotherapy. The automated treatment planning solution developed in this study includes selection of the iterative optimized training dataset, dose volume histogram (DVH) prediction for the organs at risk (OARs), and automatic generation of clinically acceptable treatment plans. The iterative optimized training dataset is selected by an iterative optimization from 40 treatment plans for left-breast and rectal cancer patients who received radiation therapy. A two-dimensional kernel density estimation algorithm (noted as two parameters KDE) which incorporated two predictive features was implemented to produce the predicted DVHs. Finally, 10 additional new left-breast treatment plans are re-planned using the Pinnacle 3 Auto-Planning (AP) module (version 9.10, Philips Medical Systems) with the objective functions derived from the predicted DVH curves. Automatically generated re-optimized treatment plans are compared with the original manually optimized plans. By combining the iterative optimized training dataset methodology and two parameters KDE prediction algorithm, our proposed automated planning strategy improves the accuracy of the DVH prediction. The automatically generated treatment plans using the dose derived from the predicted DVHs can achieve better dose sparing for some OARs without compromising other metrics of plan quality. The proposed new automated treatment planning solution can be used to efficiently evaluate and improve the quality and consistency of the treatment plans for intensity-modulated breast and rectal cancer radiation therapy. © 2017 American Association of Physicists in Medicine.
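For readers unfamiliar with the dose volume histogram (DVH) that the abstract above predicts, a cumulative DVH reports, for each dose level, the fraction of a structure's volume receiving at least that dose. The sketch below assumes equal-volume voxels and uses invented dose values; it does not reproduce the study's kernel density estimation step.

```python
def cumulative_dvh(voxel_doses, dose_levels):
    """Fraction of the structure receiving at least each dose level.

    Assumes all voxels have the same volume, so volume fractions reduce
    to voxel counts divided by the total number of voxels.
    """
    if not voxel_doses:
        raise ValueError("structure has no voxels")
    n = len(voxel_doses)
    return [sum(d >= level for d in voxel_doses) / n for level in dose_levels]


# Hypothetical voxel doses (Gy) for one organ at risk.
oar_doses = [10.0, 20.0, 30.0, 40.0]
levels = [0.0, 20.0, 35.0, 50.0]
dvh = cumulative_dvh(oar_doses, levels)
```

The resulting curve starts at 1.0 for a dose of zero and never increases as the dose level rises, which is why a lower curve for an organ at risk indicates better sparing.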
SU-E-T-07: 4DCT Robust Optimization for Esophageal Cancer Using Intensity Modulated Proton Therapy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Liao, L; Department of Industrial Engineering, University of Houston, Houston, TX; Yu, J

2015-06-15

Purpose: To develop a 4DCT robust optimization method to reduce the dosimetric impact from respiratory motion in intensity modulated proton therapy (IMPT) for esophageal cancer. Methods: Four esophageal cancer patients were selected for this study. The different phases of CT from a set of 4DCT were incorporated into the worst-case dose distribution robust optimization algorithm. 4DCT robust treatment plans were designed and compared with the conventional non-robust plans. Resulting doses were calculated on the average and maximum inhale/exhale phases of 4DCT. Dose volume histogram (DVH) band graphics and ΔD95%, ΔD98%, ΔD5%, ΔD2% of CTV between different phases were used to evaluate the robustness of the plans. Results: Compared with the IMPT plans optimized using conventional methods, the 4DCT robust IMPT plans can achieve the same quality in nominal cases while yielding better robustness to breathing motion. The mean ΔD95%, ΔD98%, ΔD5% and ΔD2% of CTV are 6%, 3.2%, 0.9% and 1% for the robustly optimized plans vs. 16.2%, 11.8%, 1.6% and 3.3% for the conventional non-robust plans. Conclusion: A 4DCT robust optimization method was proposed for esophageal cancer using IMPT. We demonstrate that the 4DCT robust optimization can mitigate the dose deviation caused by the diaphragm motion.
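The DVH-based robustness metrics mentioned above can be illustrated with a small sketch. It assumes the usual definition of Dx% as the minimum dose received by the hottest x% of the volume, and it reports the spread between phases in absolute dose units; the abstract does not say how its percentages were normalized, so that step is left out. The function names and the voxel doses are hypothetical.

```python
def dose_at_volume(voxel_doses, percent):
    """Dx%: the dose that at least `percent` % of the voxels receive.

    `percent` is an integer so the index can be computed exactly.
    """
    ordered = sorted(voxel_doses, reverse=True)
    n = len(ordered)
    k = -(-percent * n // 100)  # integer ceiling of percent * n / 100
    return ordered[max(k, 1) - 1]


def delta_between_phases(phase_doses, percent):
    """Spread (max minus min) of Dx% across breathing phases."""
    values = [dose_at_volume(d, percent) for d in phase_doses]
    return max(values) - min(values)


# Hypothetical CTV voxel doses for two 4DCT phases (arbitrary units).
phase_a = list(range(1, 101))
phase_b = [d + 2 for d in phase_a]
d95_a = dose_at_volume(phase_a, 95)
delta_d95 = delta_between_phases([phase_a, phase_b], 95)
print(d95_a, delta_d95)
```

A smaller spread across phases indicates a plan whose target coverage changes less with breathing motion, which is the sense in which the abstract compares robust and non-robust plans.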
A study of medical and health queries to web search engines.

PubMed

Spink, Amanda; Yang, Yin; Jansen, Jim; Nykanen, Pirrko; Lorence, Daniel P; Ozmutlu, Seda; Ozmutlu, H Cenk

2004-03-01

This paper reports findings from an analysis of medical or health queries to different web search engines. We report results: (i) comparing samples of 10000 web queries taken randomly from 1.2 million query logs from the AlltheWeb.com and Excite.com commercial web search engines in 2001 for medical or health queries, (ii) comparing the 2001 findings from Excite and AlltheWeb.com users with results from a previous analysis of medical and health related queries from the Excite Web search engine for 1997 and 1999, and (iii) on medical or health advice-seeking queries beginning with the word 'should'. Findings suggest: (i) a small percentage of web queries are medical or health related, (ii) the top five categories of medical or health queries were: general health, weight issues, reproductive health and puberty, pregnancy/obstetrics, and human relationships, and (iii) over time, the medical and health queries may have declined as a proportion of all web queries, as the use of specialized medical/health websites and e-commerce-related queries has increased. Findings provide insights into medical and health-related web querying and suggest some implications for the use of the general web search engines when seeking medical/health information.
Monitoring Moving Queries inside a Safe Region

PubMed Central

Al-Khalidi, Haidar; Taniar, David; Alamri, Sultan

2014-01-01

With mobile moving range queries, there is a need to recalculate the relevant surrounding objects of interest whenever the query moves. Therefore, monitoring the moving query is very costly. The safe region is one method that has been proposed to minimise the communication and computation cost of continuously monitoring a moving range query. Inside the safe region the set of objects of interest to the query does not change; thus there is no need to update the query while it is inside its safe region. However, when the query leaves its safe region the mobile device has to reevaluate the query, necessitating communication with the server. Knowing when and where the mobile device will leave a safe region is widely recognized as a difficult problem. To solve this problem, we propose a novel method to monitor the position of the query over time using a linear function based on the direction of the query obtained by periodic monitoring of its position. Periodic monitoring ensures that the query is aware of its location all the time. This method reduces the costs associated with communications in client-server architecture. Computational results show that our method is successful in handling moving query patterns. PMID:24696652
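The idea of predicting when a moving query leaves its safe region with a linear function can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: the safe region is taken to be a circle, the direction is estimated from the first and last periodic samples, and the names `fit_velocity` and `exit_time` are hypothetical.

```python
import math


def fit_velocity(samples):
    """Estimate a constant velocity from periodic (t, x, y) samples,
    using the first and last observation."""
    (t0, x0, y0), (t1, x1, y1) = samples[0], samples[-1]
    dt = t1 - t0
    return (x1 - x0) / dt, (y1 - y0) / dt


def exit_time(pos, vel, center, radius):
    """Time until a point moving linearly from `pos` crosses the circle.

    Assumes `pos` lies inside the circle. Returns math.inf when the
    query is not moving.
    """
    px, py = pos[0] - center[0], pos[1] - center[1]
    vx, vy = vel
    a = vx * vx + vy * vy
    if a == 0:
        return math.inf
    b = 2 * (px * vx + py * vy)
    c = px * px + py * py - radius * radius  # <= 0 while inside
    disc = b * b - 4 * a * c
    # The larger root is the forward crossing of the boundary.
    return (-b + math.sqrt(disc)) / (2 * a)


# Hypothetical periodic position samples: (time, x, y).
samples = [(0, 0.0, 0.0), (1, 1.0, 0.0), (2, 2.0, 0.0)]
vel = fit_velocity(samples)
t_exit = exit_time((2.0, 0.0), vel, center=(0.0, 0.0), radius=5.0)
print(vel, t_exit)
```

With such an estimate the client only needs to contact the server around the predicted exit time instead of after every position update.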
Robust Audio Watermarking by Using Low-Frequency Histogram

NASA Astrophysics Data System (ADS)

Xiang, Shijun

Continuing earlier work in which the problem of time-scale modification (TSM) was studied [1] by modifying the shape of the audio time-domain histogram, here we consider the additional ingredient of resisting additive noise-like operations, such as Gaussian noise, lossy compression and low-pass filtering. In other words, we study the robustness of the watermark against both TSM and additive noise. To this end, in this paper we extract the histogram from a Gaussian-filtered low-frequency component for audio watermarking. The watermark is inserted by shaping the histogram so that two consecutive bins, taken as a group, hide one bit through the reassignment of their populations. The watermarked signals are perceptually similar to the original one. Compared with the previous time-domain watermarking scheme [1], the proposed watermarking method is more robust against additive noise, MP3 compression, low-pass filtering, etc.
[Image Feature Extraction and Discriminant Analysis of Xinjiang Uygur Medicine Based on Color Histogram].

PubMed

Hamit, Murat; Yun, Weikang; Yan, Chuanbo; Kutluk, Abdugheni; Fang, Yang; Alip, Elzat

2015-06-01

Image feature extraction is an important part of image processing and an important field of research and application of image processing technology. Uygur medicine is one branch of Chinese traditional medicine, and researchers are paying more attention to it. However, large amounts of Uygur medicine data have not been fully utilized. In this study, we extracted the image color histogram features of herbal and zooid medicines of Xinjiang Uygur. First, we performed preprocessing, including image color enhancement, size normalization and color space transformation. Then we extracted the color histogram features and analyzed them with statistical methods. Finally, we evaluated the classification ability of the features by Bayes discriminant analysis. Experimental results showed that high accuracy for Uygur medicine image classification was obtained by using color histogram features. This study should be of some help for content-based medical image retrieval for Xinjiang Uygur medicine.
LSAH: a fast and efficient local surface feature for point cloud registration

NASA Astrophysics Data System (ADS)

Lu, Rongrong; Zhu, Feng; Wu, Qingxiao; Kong, Yanzi

2018-04-01

Point cloud registration is a fundamental task in high-level three-dimensional applications. Noise, uneven point density and varying point cloud resolutions are the three main challenges for point cloud registration. In this paper, we design a robust and compact local surface descriptor called Local Surface Angles Histogram (LSAH) and propose an effective coarse-to-fine algorithm for point cloud registration. The LSAH descriptor is formed by concatenating five normalized sub-histograms into one histogram. Each of the five sub-histograms is created by accumulating a different type of angle from a local surface patch. The experimental results show that our LSAH is more robust to uneven point density and point cloud resolutions than four state-of-the-art local descriptors in terms of feature matching. Moreover, we tested our LSAH-based coarse-to-fine algorithm for point cloud registration. The experimental results demonstrate that our algorithm is robust and efficient as well.
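The construction "concatenate five normalized sub-histograms" can be sketched generically. The abstract does not say which five angles are measured or how many bins are used, so the angle sets, the bin count, the [0, π] range and the function names below are placeholders chosen only for illustration.

```python
import math


def normalized_histogram(angles, bins, lo=0.0, hi=math.pi):
    """Histogram of angles in [lo, hi], scaled so the bins sum to 1."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for a in angles:
        idx = int((a - lo) / width)
        idx = max(0, min(idx, bins - 1))  # clamp the hi edge into the last bin
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins


def concat_descriptor(angle_sets, bins=4):
    """Concatenate one normalized sub-histogram per angle type."""
    descriptor = []
    for angles in angle_sets:
        descriptor.extend(normalized_histogram(angles, bins))
    return descriptor


# Five hypothetical angle sets measured on one local surface patch.
patch_angles = [[0.1, 0.2, 3.0]] * 5
descriptor = concat_descriptor(patch_angles, bins=4)
print(len(descriptor))
```

Normalizing each sub-histogram separately keeps the descriptor comparable between patches that contain different numbers of points, which is one plausible reason such a design tolerates uneven point density.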
Advanced concentration analysis of atom probe tomography data: Local proximity histograms and pseudo-2D concentration maps.

PubMed

Felfer, Peter; Cairney, Julie

2018-06-01

Analysing the distribution of selected chemical elements with respect to interfaces is one of the most common tasks in data mining in atom probe tomography. This can be represented by 1D concentration profiles, 2D concentration maps or proximity histograms, which represent concentration, density etc. of selected species as a function of the distance from a reference surface/interface. These are some of the most useful tools for the analysis of solute distributions in atom probe data. In this paper, we present extensions to the proximity histogram in the form of 'local' proximity histograms, calculated for selected parts of a surface, and pseudo-2D concentration maps, which are 2D concentration maps calculated on non-flat surfaces. This way, local concentration changes at interfaces and other structures can be assessed more effectively. Copyright © 2018 Elsevier B.V. All rights reserved.
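A proximity histogram in the sense described above (concentration of a selected species as a function of distance from a reference interface) can be sketched in a few lines. The sketch assumes each atom already carries a signed distance to the interface; how that distance is computed, and the function and variable names, are not taken from the paper.

```python
import math


def proximity_histogram(distances, is_solute, bin_width, max_dist):
    """Solute fraction per distance bin over [-max_dist, max_dist).

    Returns a list of (bin_center, concentration); concentration is
    None for bins that contain no atoms.
    """
    n_bins = int(round(2 * max_dist / bin_width))
    totals = [0] * n_bins
    solutes = [0] * n_bins
    for d, s in zip(distances, is_solute):
        idx = math.floor((d + max_dist) / bin_width)
        if 0 <= idx < n_bins:
            totals[idx] += 1
            solutes[idx] += 1 if s else 0
    profile = []
    for i in range(n_bins):
        center = -max_dist + (i + 0.5) * bin_width
        conc = solutes[i] / totals[i] if totals[i] else None
        profile.append((center, conc))
    return profile


# Hypothetical signed distances (nm) and solute flags for six atoms.
distances = [-1.5, -0.5, -0.2, 0.3, 0.6, 1.4]
is_solute = [False, False, True, True, True, False]
profile = proximity_histogram(distances, is_solute, bin_width=1.0, max_dist=2.0)
print(profile)
```

A 'local' variant would apply the same binning after first keeping only the atoms whose nearest interface point lies on the selected part of the surface.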
RDF-GL: A SPARQL-Based Graphical Query Language for RDF

NASA Astrophysics Data System (ADS)

Hogenboom, Frederik; Milea, Viorel; Frasincar, Flavius; Kaymak, Uzay

This chapter presents RDF-GL, a graphical query language (GQL) for RDF. The GQL is based on the textual query language SPARQL and mainly focuses on SPARQL SELECT queries. The advantage of a GQL over textual query languages is that complexity is hidden through the use of graphical symbols. RDF-GL is supported by a Java-based editor, SPARQLinG, which is presented as well. The editor does not only allow for RDF-GL query creation, but also converts RDF-GL queries to SPARQL queries and is able to subsequently execute these. Experiments show that using the GQL in combination with the editor makes RDF querying more accessible for end users.

Implementation of a dose gradient method into optimization of dose distribution in prostate cancer 3D-CRT plans

PubMed Central

Giżyńska, Marta K.; Kukołowicz, Paweł F.; Kordowski, Paweł

2014-01-01

Aim: The aim of this work is to present a method of beam weight and wedge angle optimization for patients with prostate cancer. Background: 3D-CRT is usually realized with forward planning based on a trial and error method. Several authors have published methods of beam weight optimization applicable to 3D-CRT. Still, none of these methods is in common use. Materials and methods: Optimization is based on the assumption that the best plan is achieved if the dose gradient at the ICRU point is equal to zero. Our optimization algorithm requires the beam quality index, depth of maximum dose, profiles of wedged fields and maximum dose to femoral heads. The method was tested for 10 patients with prostate cancer, treated with the 3-field technique. Optimized plans were compared with plans prepared by 12 experienced planners. Dose standard deviation in target volume, and minimum and maximum doses were analyzed. Results: The quality of plans obtained with the proposed optimization algorithms was comparable to that of plans prepared by experienced planners. Mean difference in target dose standard deviation was 0.1% in favor of the plans prepared by planners for optimization of beam weights and wedge angles. Introducing a correction factor for patient body outline for dose gradient at the ICRU point improved dose distribution homogeneity. On average, a 0.1% lower standard deviation was achieved with the optimization algorithm. No significant difference in mean dose–volume histogram for the rectum was observed. Conclusions: Optimization greatly shortens planning time. The average planning time was 5 min for forward planning and less than a minute for computer optimization. PMID:25337411
A Hybrid Spatio-Temporal Data Indexing Method for Trajectory Databases

PubMed Central

Ke, Shengnan; Gong, Jun; Li, Songnian; Zhu, Qing; Liu, Xintao; Zhang, Yeting

2014-01-01

In recent years, there has been tremendous growth in the field of indoor and outdoor positioning sensors continuously producing huge volumes of trajectory data that has been used in many fields such as location-based services or location intelligence. Trajectory data is massively increased and semantically complicated, which poses a great challenge on spatio-temporal data indexing. This paper proposes a spatio-temporal data indexing method, named HBSTR-tree, which is a hybrid index structure comprising spatio-temporal R-tree, B*-tree and Hash table. To improve the index generation efficiency, rather than directly inserting trajectory points, we group consecutive trajectory points as nodes according to their spatio-temporal semantics and then insert them into spatio-temporal R-tree as leaf nodes. Hash table is used to manage the latest leaf nodes to reduce the frequency of insertion. A new spatio-temporal interval criterion and a new node-choosing sub-algorithm are also proposed to optimize spatio-temporal R-tree structures. In addition, a B*-tree sub-index of leaf nodes is built to query the trajectories of targeted objects efficiently. Furthermore, a database storage scheme based on a NoSQL-type DBMS is also proposed for the purpose of cloud storage. Experimental results prove that HBSTR-tree outperforms TB*-tree in some aspects such as generation efficiency, query performance and query type. PMID:25051028
A hybrid spatio-temporal data indexing method for trajectory databases.

PubMed

Ke, Shengnan; Gong, Jun; Li, Songnian; Zhu, Qing; Liu, Xintao; Zhang, Yeting

2014-07-21

In recent years, there has been tremendous growth in the field of indoor and outdoor positioning sensors continuously producing huge volumes of trajectory data that has been used in many fields such as location-based services or location intelligence. Trajectory data is massively increased and semantically complicated, which poses a great challenge on spatio-temporal data indexing. This paper proposes a spatio-temporal data indexing method, named HBSTR-tree, which is a hybrid index structure comprising spatio-temporal R-tree, B*-tree and Hash table. To improve the index generation efficiency, rather than directly inserting trajectory points, we group consecutive trajectory points as nodes according to their spatio-temporal semantics and then insert them into spatio-temporal R-tree as leaf nodes. Hash table is used to manage the latest leaf nodes to reduce the frequency of insertion. A new spatio-temporal interval criterion and a new node-choosing sub-algorithm are also proposed to optimize spatio-temporal R-tree structures. In addition, a B*-tree sub-index of leaf nodes is built to query the trajectories of targeted objects efficiently. Furthermore, a database storage scheme based on a NoSQL-type DBMS is also proposed for the purpose of cloud storage. Experimental results prove that HBSTR-tree outperforms TB*-tree in some aspects such as generation efficiency, query performance and query type.
Clustering and Flow Conservation Monitoring Tool for Software Defined Networks

PubMed Central

Puente Fernández, Jesús Antonio

2018-01-01

Prediction systems present challenges on two fronts: the relation between video quality and observed session features and, on the other hand, dynamic changes in video quality. Software Defined Networks (SDN) is a new concept of network architecture that provides the separation of control plane (controller) and data plane (switches) in network devices. Due to the existence of the southbound interface, it is possible to deploy monitoring tools to obtain the network status and retrieve a statistics collection. Therefore, achieving the most accurate statistics depends on a strategy of monitoring and information requests of network devices. In this paper, we propose an enhanced algorithm for requesting statistics to measure the traffic flow in SDN networks. The algorithm groups network switches into clusters according to their number of ports in order to apply different monitoring techniques. The grouping avoids monitoring queries to network switches with common characteristics and thereby omits redundant information. In this way, the present proposal decreases the number of monitoring queries to switches, improving the network traffic and preventing switch overload. We have tested our optimization in a video streaming simulation using different types of videos. The experiments and comparison with traditional monitoring techniques demonstrate the feasibility of our proposal, which maintains similar values while decreasing the number of queries to the switches. PMID:29614049
The EarthServer Federation: State, Role, and Contribution to GEOSS

NASA Astrophysics Data System (ADS)

Merticariu, Vlad; Baumann, Peter

2016-04-01

The intercontinental EarthServer initiative has established a European datacube platform with proven scalability: known databases exceed 100 TB, and single queries have been split across more than 1,000 cloud nodes. Its service interface being rigorously based on the OGC "Big Geo Data" standards, Web Coverage Service (WCS) and Web Coverage Processing Service (WCPS), a series of clients can dock into the services, ranging from open-source OpenLayers and QGIS over open-source NASA WorldWind to proprietary ESRI ArcGIS. Datacube fusion in a "mix and match" style is supported by the platform technology, the rasdaman Array Database System, which transparently federates queries so that users simply approach any node of the federation to access any data item, internally optimized for minimal data transfer. Notably, rasdaman is part of GEOSS GCI. NASA is contributing its Web WorldWind virtual globe for user-friendly data extraction, navigation, and analysis. Integrated datacube / metadata queries are contributed by CITE. Current federation members include ESA (managed by MEEO sr.l.), Plymouth Marine Laboratory (PML), the European Centre for Medium-Range Weather Forecast (ECMWF), Australia's National Computational Infrastructure, and Jacobs University (adding in Planetary Science). Further data centers have expressed interest in joining. We present the EarthServer approach, discuss its underlying technology, and illustrate the contribution this datacube platform can make to GEOSS.
Shark: SQL and Rich Analytics at Scale

DTIC Science & Technology

2012-11-26

learning programs up to 100× faster than Hadoop. Unlike previous systems, Shark shows that it is possible to achieve these speedups while retaining a...Shark to run SQL queries up to 100× faster than Apache Hive, and machine learning programs up to 100× faster than Hadoop. Unlike previous systems, Shark...so using a runtime that is optimized for such workloads and a programming model that is designed to express machine learning algorithms.
Cumulative query method for influenza surveillance using search engine data.

PubMed

Seo, Dong-Woo; Jo, Min-Woo; Sohn, Chang Hwan; Shin, Soo-Yong; Lee, JaeHo; Yu, Maengsoo; Kim, Won Young; Lim, Kyoung Soo; Lee, Sang-Il

2014-12-16

Internet search queries have become an important data source in syndromic surveillance system. However, there is currently no syndromic surveillance system using Internet search query data in South Korea. The objective of this study was to examine correlations between our cumulative query method and national influenza surveillance data. Our study was based on the local search engine, Daum (approximately 25% market share), and influenza-like illness (ILI) data from the Korea Centers for Disease Control and Prevention. A quota sampling survey was conducted with 200 participants to obtain popular queries. We divided the study period into two sets: Set 1 (the 2009/10 epidemiological year for development set 1 and 2010/11 for validation set 1) and Set 2 (2010/11 for development Set 2 and 2011/12 for validation Set 2). Pearson's correlation coefficients were calculated between the Daum data and the ILI data for the development set. We selected the combined queries for which the correlation coefficients were .7 or higher and listed them in descending order. Then, we created a cumulative query method n representing the number of cumulative combined queries in descending order of the correlation coefficient. In validation set 1, 13 cumulative query methods were applied, and 8 had higher correlation coefficients (min=.916, max=.943) than that of the highest single combined query. Further, 11 of 13 cumulative query methods had an r value of ≥.7, but 4 of 13 combined queries had an r value of ≥.7. In validation set 2, 8 of 15 cumulative query methods showed higher correlation coefficients (min=.975, max=.987) than that of the highest single combined query. All 15 cumulative query methods had an r value of ≥.7, but 6 of 15 combined queries had an r value of ≥.7. Cumulative query method showed relatively higher correlation with national influenza surveillance data than combined queries in the development and validation set.
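One reading of the cumulative query method is sketched below: rank the candidate queries by their Pearson correlation with the ILI series, keep those at or above .7, and let method n be the summed search volume of the top n queries. The abstract does not spell out how the queries are combined, so summation is an assumption, as are the function names and the toy weekly counts. The sketch also uses a single data set, whereas the study selected queries on a development set and evaluated them on a separate validation set.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cumulative_query_methods(query_series, ili, threshold=0.7):
    """Return [(n, r)] where r correlates the summed top-n queries with ILI."""
    ranked = sorted(
        ((pearson(series, ili), name) for name, series in query_series.items()),
        reverse=True,
    )
    ranked = [(r, name) for r, name in ranked if r >= threshold]
    methods = []
    running = [0.0] * len(ili)
    for n, (_, name) in enumerate(ranked, start=1):
        running = [a + b for a, b in zip(running, query_series[name])]
        methods.append((n, pearson(running, ili)))
    return methods


# Hypothetical weekly ILI rates and search volumes (not study data).
ili = [1, 2, 3, 4, 5]
queries = {
    "flu": [2, 4, 6, 8, 10],
    "fever": [1, 2, 2, 4, 5],
    "noise": [5, 1, 4, 2, 3],
}
methods = cumulative_query_methods(queries, ili)
print(methods)
```

In this toy data the weakly correlated query is filtered out, and each cumulative method can then be compared with the best single query, mirroring the comparison reported in the abstract.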
[Queries related to the technology of soybean seed inoculation with Bradyrhizobium spp].

PubMed

Lodeiro, Aníbal R

2015-01-01

With the aim of exploiting symbiotic nitrogen fixation, soybean crops are inoculated with selected strains of Bradyrhizobium japonicum, Bradyrhizobium diazoefficiens or Bradyrhizobium elkanii (collectively referred to as Bradyrhizobium spp.). The most common method of inoculation used is seed inoculation, whether performed immediately before sowing or using preinoculated seeds or pretreated seeds by the professional seed treatment. The methodology of inoculation should not only cover the seeds with living rhizobia, but must also optimize the chances of these rhizobia to infect the roots and nodulate. To this end, inoculated rhizobia must be in such an amount and condition that would allow them to overcome the competition exerted by the rhizobia of the allochthonous population of the soil, which are usually less effective for nitrogen fixation and thus dilute the effect of inoculation on yield. This optimization requires solving some queries related to the current knowledge of seed inoculation, which are addressed in this article. I conclude that the aspects that require further research are the adhesion and survival of rhizobia on seeds, the release of rhizobia once the seeds are deposited in the soil, and the movement of rhizobia from the vicinity of the seeds to the infection sites in the roots. Copyright © 2015 Asociación Argentina de Microbiología. Publicado por Elsevier España, S.L.U. All rights reserved.
Histogram contrast analysis and the visual segregation of IID textures.

PubMed

Chubb, C; Econopouly, J; Landy, M S

1994-09-01

A new psychophysical methodology is introduced, histogram contrast analysis, that allows one to measure stimulus transformations, f, used by the visual system to draw distinctions between different image regions. The method involves the discrimination of images constructed by selecting texture micropatterns randomly and independently (across locations) on the basis of a given micropattern histogram. Different components of f are measured by use of different component functions to modulate the micropattern histogram until the resulting textures are discriminable. When no discrimination threshold can be obtained for a given modulating component function, a second titration technique may be used to measure the contribution of that component to f. The method includes several strong tests of its own assumptions. An example is given of the method applied to visual textures composed of small, uniform squares with randomly chosen gray levels. In particular, for a fixed mean gray level μ and a fixed gray-level variance σ², histogram contrast analysis is used to establish that the class S of all textures composed of small squares with jointly independent, identically distributed gray levels with mean μ and variance σ² is perceptually elementary in the following sense: there exists a single, real-valued function f_S of gray level, such that two textures I and J in S are discriminable only if the average value of f_S applied to the gray levels in I is significantly different from the average value of f_S applied to the gray levels in J. Finally, histogram contrast analysis is used to obtain a seventh-order polynomial approximation of f_S.
A Query Integrator and Manager for the Query Web

PubMed Central

Brinkley, James F.; Detwiler, Landon T.

2012-01-01

We introduce two concepts: the Query Web as a layer of interconnected queries over the document web and the semantic web, and a Query Web Integrator and Manager (QI) that enables the Query Web to evolve. QI permits users to write, save and reuse queries over any web accessible source, including other queries saved in other installations of QI. The saved queries may be in any language (e.g. SPARQL, XQuery); the only condition for interconnection is that the queries return their results in some form of XML. This condition allows queries to chain off each other, and to be written in whatever language is appropriate for the task. We illustrate the potential use of QI for several biomedical use cases, including ontology view generation using a combination of graph-based and logical approaches, value set generation for clinical data management, image annotation using terminology obtained from an ontology web service, ontology-driven brain imaging data integration, small-scale clinical data integration, and wider-scale clinical data integration. Such use cases illustrate the current range of applications of QI and lead us to speculate about the potential evolution from smaller groups of interconnected queries into a larger query network that layers over the document and semantic web. The resulting Query Web could greatly aid researchers and others who now have to manually navigate through multiple information sources in order to answer specific questions. PMID:22531831
Predicting pathologic tumor response to chemoradiotherapy with histogram distances characterizing longitudinal changes in 18F-FDG uptake patterns

PubMed Central

Tan, Shan; Zhang, Hao; Zhang, Yongxue; Chen, Wengen; D’Souza, Warren D.; Lu, Wei

2013-01-01

Purpose: A family of fluorine-18 (18F)-fluorodeoxyglucose (18F-FDG) positron-emission tomography (PET) features based on histogram distances is proposed for predicting pathologic tumor response to neoadjuvant chemoradiotherapy (CRT). These features describe the longitudinal change of FDG uptake distribution within a tumor. Methods: Twenty patients with esophageal cancer treated with CRT plus surgery were included in this study. All patients underwent PET/CT scans before (pre-) and after (post-) CRT. The two scans were first rigidly registered, and the original tumor sites were then manually delineated on the pre-PET/CT by an experienced nuclear medicine physician. Two histograms representing the FDG uptake distribution were extracted from the pre- and the registered post-PET images, respectively, both within the delineated tumor. Distances between the two histograms quantify longitudinal changes in FDG uptake distribution resulting from CRT, and thus are potential predictors of tumor response. A total of 19 histogram distances were examined and compared to both traditional PET response measures and Haralick texture features. Receiver operating characteristic analyses and Mann-Whitney U test were performed to assess their predictive ability. Results: Among all tested histogram distances, seven bin-to-bin and seven crossbin distances outperformed traditional PET response measures using maximum standardized uptake value (AUC = 0.70) or total lesion glycolysis (AUC = 0.80). The seven bin-to-bin distances were: L2 distance (AUC = 0.84), χ2 distance (AUC = 0.83), intersection distance (AUC = 0.82), cosine distance (AUC = 0.83), squared Euclidean distance (AUC = 0.83), L1 distance (AUC = 0.82), and Jeffrey distance (AUC = 0.82). 
The seven crossbin distances were: quadratic-chi distance (AUC = 0.89), earth mover distance (AUC = 0.86), fast earth mover distance (AUC = 0.86), diffusion distance (AUC = 0.88), Kolmogorov-Smirnov distance (AUC = 0.88), quadratic form distance (AUC = 0.87), and match distance (AUC = 0.84). These crossbin histogram distance features showed slightly higher prediction accuracy than texture features on post-PET images. Conclusions: The results suggest that longitudinal patterns in 18F-FDG uptake characterized using histogram distances provide useful information for predicting the pathologic response of esophageal cancer to CRT. PMID:24089897
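The difference between bin-to-bin and crossbin histogram distances can be made concrete with a short sketch. The formulas below are common textbook forms (for example, the χ² distance written with a factor of one half, and the match distance as the L1 distance between cumulative histograms); the abstract does not give the exact definitions used in the study, and the toy histograms are invented.

```python
import math


def l1(p, q):
    """Bin-to-bin L1 distance."""
    return sum(abs(a - b) for a, b in zip(p, q))


def l2(p, q):
    """Bin-to-bin L2 (Euclidean) distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def chi2(p, q):
    """Bin-to-bin chi-square distance, skipping empty bin pairs."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0)


def intersection_distance(p, q):
    """One minus the histogram intersection (for normalized histograms)."""
    return 1.0 - sum(min(a, b) for a, b in zip(p, q))


def cumulative(h):
    out, total = [], 0.0
    for v in h:
        total += v
        out.append(total)
    return out


def ks_distance(p, q):
    """Crossbin: largest gap between the cumulative histograms."""
    return max(abs(a - b) for a, b in zip(cumulative(p), cumulative(q)))


def match_distance(p, q):
    """Crossbin: L1 distance between the cumulative histograms."""
    return l1(cumulative(p), cumulative(q))


# Toy normalized uptake histograms: all mass in one bin.
pre = [1.0, 0.0, 0.0, 0.0]
post_near = [0.0, 1.0, 0.0, 0.0]  # mass shifted by one bin
post_far = [0.0, 0.0, 0.0, 1.0]   # mass shifted by three bins
print(l1(pre, post_near), l1(pre, post_far))
print(match_distance(pre, post_near), match_distance(pre, post_far))
```

The bin-to-bin L1 distance cannot tell the small shift from the large one, while the match distance grows with the size of the shift. This sensitivity to how far uptake values move is the property that distinguishes the crossbin family.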
[Characteristics of high resolution diffusion weighted imaging apparent diffusion coefficient histogram and its correlations with cancer stages in patients with nasopharyngeal carcinoma].

PubMed

Wang, G J; Wang, Y; Ye, Y; Chen, F; Lu, Y T; Li, S L

2017-11-07

Objective: To investigate the features of apparent diffusion coefficient (ADC) histogram parameters based on entire tumor volume data in high resolution diffusion weighted imaging of nasopharyngeal carcinoma (NPC) and to evaluate their correlations with cancer stages. Methods: This retrospective study included 154 NPC patients [102 males and 52 females, mean age (48±11) years] who had received a readout segmentation of long variable echo trains MRI scan before radiation therapy. The area of tumor was delineated on each section of axial ADC maps to generate ADC histograms by using Image J. The ADC histogram of the entire tumor, along with the histogram parameters (tumor voxels, ADC(mean), ADC(25%), ADC(50%), ADC(75%), skewness and kurtosis), was obtained by merging all sections with SPSS 22.0 software. Intra-observer repeatability was assessed by using intra-class correlation coefficients (ICC). The patients were subdivided into two groups according to cancer volume: small cancer group (<305 voxels, about 2 cm³) and large cancer group (≥2 cm³). The correlation between ADC histogram parameters and cancer stages was evaluated with the Spearman test. Results: The ICC of measuring the ADC histogram parameters tumor voxels, ADC(mean), ADC(25%), ADC(50%), ADC(75%), skewness and kurtosis was 0.938, 0.861, 0.885, 0.838, 0.836, 0.358 and 0.456, respectively. The number of tumor voxels was positively correlated with T staging (r = 0.368, P < 0.05). There were significant differences in tumor voxels among patients with different T stages (K = 22.306, P < 0.05). There were significant differences in ADC(mean), ADC(25%) and ADC(50%) among patients with different T stages in the small cancer group (K = 8.409, 8.187, 8.699, all P < 0.05), and these three indices were positively correlated with T staging (r = 0.221, 0.209, 0.235, all P < 0.05). Skewness and kurtosis differed significantly between the groups with different cancer volume (t = -2.987, Z = -3.770, both P < 0.05). Conclusion: Tumor volume and tissue uniformity of NPC are important factors affecting ADC and cancer stage. The ADC histogram parameters ADC(mean), ADC(25%) and ADC(50%) increase with T staging in NPC smaller than 2 cm³.
A formulation of a matrix sparsity approach for the quantum ordered search algorithm

NASA Astrophysics Data System (ADS)

Parmar, Jupinder; Rahman, Saarim; Thiara, Jaskaran

One specific subset of quantum algorithms is Grover's Ordered Search Problem (OSP), the quantum counterpart of the classical binary search algorithm, which utilizes oracle functions to produce a specified value within an ordered database. Classically, the optimal algorithm is known to have a log₂N complexity; however, Grover's algorithm has been found to have an optimal complexity between the lower bound of (ln N − 1)/π ≈ 0.221 log₂N and the upper bound of 0.433 log₂N. We sought to lower the known upper bound of the OSP. Following Farhi et al. [MITCTP 2815 (1999), arXiv:quant-ph/9901059], we see that the OSP can be resolved into a translationally invariant algorithm to create quantum query algorithm constraints. With these constraints, one can find Laurent polynomials for various k (queries) and N (database sizes), thus finding larger recursive sets to solve the OSP and effectively reducing the upper bound. These polynomials are found to be convex functions, allowing one to make use of convex optimization to find an improvement on the known bounds. According to Childs et al. [Phys. Rev. A 75 (2007) 032335], semidefinite programming, a subset of convex optimization, can solve the particular problem represented by the constraints. We were able to implement a program abiding by their formulation of a semidefinite program (SDP), leading us to find that it takes an immense amount of storage and time to compute. To combat this setback, we then formulated an approach to improve results of the SDP using matrix sparsity. Through the development of this approach, along with an implementation of a rudimentary solver, we demonstrate how matrix sparsity reduces the amount of time and storage required to compute the SDP, suggesting that further improvements toward the theorized lower bound are likely.
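The constant 0.221 in the quoted lower bound follows from a change of logarithm base. A short worked check, using only ln N = (ln 2) log₂N and keeping the constant offset that the approximation in the abstract drops:

```latex
\frac{\ln N - 1}{\pi}
  = \frac{\ln 2}{\pi}\,\log_2 N - \frac{1}{\pi}
  \approx 0.2206\,\log_2 N - 0.318
```

For large N the offset is negligible, which gives the leading coefficient of about 0.221 stated above, compared with 0.433 for the quoted upper bound.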
Designing a Bioengine for Detection and Analysis of Base String on an Affected Sequence in High-Concentration Regions

PubMed Central

Mandal, Bijoy Kumar; Kim, Tai-hoon

2013-01-01

We design an algorithm for a bioengine. The program enables searching for optimal alignments between two sequences: the host sequence (normal plant) and the query sequence (virus). Searching for homologues has become a routine operation on biological sequences in 4 × 4 combination with different subsequences (word sizes). The program takes advantage of the high degree of homology between such sequences to construct an alignment of the matching regions. Its main aim is to detect overlapping reading frames. The program also makes it possible to find the highly infected colones, selecting the highest matching region with minimum gap or mismatch zones and unique virus colones matches. It is a small, portable, interactive front-end program intended to find the regions of matching between the host sequence and query subsequences. All operations are carried out in fractions of a second, depending on the required task and on the sequence length. PMID:24000321
Active learning based segmentation of Crohn's disease from abdominal MRI.

PubMed

Mahapatra, Dwarikanath; Vos, Franciscus M; Buhmann, Joachim M

2016-05-01

This paper proposes a novel active learning (AL) framework, and combines it with semi-supervised learning (SSL) for segmenting Crohn's disease (CD) tissues from abdominal magnetic resonance (MR) images. Robust fully supervised learning (FSL) based classifiers require lots of labeled data of different disease severities. Obtaining such data is time consuming and requires considerable expertise. SSL methods use a few labeled samples, and leverage the information from many unlabeled samples to train an accurate classifier. AL queries labels of most informative samples and maximizes gain from the labeling effort. Our primary contribution is in designing a query strategy that combines novel context information with classification uncertainty and feature similarity. Combining SSL and AL gives a robust segmentation method that: (1) optimally uses few labeled samples and many unlabeled samples; and (2) requires lower training time. Experimental results show our method achieves higher segmentation accuracy than FSL methods with fewer samples and reduced training effort. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
The Selection and Placement Method of Materialized Views on Big Data Platform of Equipment Condition Assessment

NASA Astrophysics Data System (ADS)

Ma, Yan; Yao, Jinxia; Gu, Chao; Chen, Yufeng; Yang, Yi; Zou, Lida

2017-05-01

With the formation of electric big data environment, more and more big data analyses emerge. In the complicated data analysis on equipment condition assessment, there exist many join operations, which are time-consuming. In order to save time, the approach of materialized view is usually used. It places part of common and critical join results on external storage and avoids the frequent join operation. In the paper we propose the methods of selecting and placing materialized views to reduce the query time of electric transmission and transformation equipment, and make the profits of service providers maximal. In selection method we design a computation way for the value of non-leaf node based on MVPP structure chart. In placement method we use relevance weights to place the selected materialized views, which help reduce the network transmission time. Our experiments show that the proposed selection and placement methods have a high throughput and good optimization ability of query time for electric transmission and transformation equipment.
Intelligent Data Granulation on Load: Improving Infobright's Knowledge Grid

NASA Astrophysics Data System (ADS)

Ślęzak, Dominik; Kowalski, Marcin

One of the major aspects of Infobright's relational database technology is automatic decomposition of each of data tables onto Rough Rows, each consisting of 64K of original rows. Rough Rows are automatically annotated by Knowledge Nodes that represent compact information about the rows' values. Query performance depends on the quality of Knowledge Nodes, i.e., their efficiency in minimizing the access to the compressed portions of data stored on disk, according to the specific query optimization procedures. We show how to implement the mechanism of organizing the incoming data into such Rough Rows that maximize the quality of the corresponding Knowledge Nodes. Given clear business-driven requirements, the implemented mechanism needs to be fully integrated with the data load process, causing no decrease in the data load speed. The performance gain resulting from better data organization is illustrated by some tests over our benchmark data. The differences between the proposed mechanism and some well-known procedures of database clustering or partitioning are discussed. The paper is a continuation of our patent application [22].
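A minimal sketch of why the quality of Knowledge Nodes matters for query performance, assuming the simplest possible node: the minimum and maximum of each Rough Row. The abstract does not specify what the nodes contain, so the min/max choice, the function names, and the range-count query are all illustrative assumptions rather than Infobright's implementation.

```python
PACK_SIZE = 65536  # 64K rows per Rough Row, as stated in the abstract


def build_knowledge_nodes(values, pack_size=PACK_SIZE):
    """Annotate each pack of rows with (start_offset, min, max). Hypothetical node layout."""
    nodes = []
    for start in range(0, len(values), pack_size):
        pack = values[start:start + pack_size]
        nodes.append((start, min(pack), max(pack)))
    return nodes


def count_in_range(values, nodes, low, high, pack_size=PACK_SIZE):
    """Count rows with low <= value <= high.

    Returns (count, packs_scanned); only packs that the node metadata
    cannot resolve are actually read.
    """
    count = 0
    packs_scanned = 0
    for start, vmin, vmax in nodes:
        size = min(pack_size, len(values) - start)
        if vmax < low or vmin > high:
            continue                      # irrelevant pack: skipped entirely
        if low <= vmin and vmax <= high:
            count += size                 # fully relevant: answered from metadata
            continue
        packs_scanned += 1                # suspect pack: must be read
        pack = values[start:start + size]
        count += sum(1 for v in pack if low <= v <= high)
    return count, packs_scanned
```

With well-organized (for example, sorted) data most packs are either skipped or answered from metadata alone; with poorly organized data nearly every pack has a wide min/max range and must be scanned, which is the effect that organizing rows on load is meant to avoid.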
Final Report: Efficient Databases for MPC Microdata

DOE Office of Scientific and Technical Information (OSTI.GOV)

Michael A. Bender; Martin Farach-Colton; Bradley C. Kuszmaul

2011-08-31

The purpose of this grant was to develop the theory and practice of high-performance databases for massive streamed datasets. Over the last three years, we have developed fast indexing technology, that is, technology for rapidly ingesting data and storing that data so that it can be efficiently queried and analyzed. During this project we developed the technology so that high-bandwidth data streams can be indexed and queried efficiently. Our technology has been proven to work on data sets composed of tens of billions of rows when the data stream arrives at over 40,000 rows per second. We achieved these numbers even on a single disk driven by two cores. Our work comprised (1) new write-optimized data structures with better asymptotic complexity than traditional structures, (2) implementation, and (3) benchmarking. We furthermore developed a prototype of TokuFS, a middleware layer that can handle microdata I/O packaged up in an MPI-IO abstraction.
Clinician-Oriented Access to Data - C.O.A.D.: A Natural Language Interface to a VA DHCP Database

PubMed Central

Levy, Christine; Rogers, Elizabeth

1995-01-01

Hospitals collect enormous amounts of data related to the on-going care of patients. Unfortunately, a clinician's access to the data is limited by complexities of the database structure and/or programming skills required to access the database. The COAD project attempts to bridge the gap between the clinical user's need for specific information from the database, and the wealth of data residing in the hospital information system. The project design includes a natural language interface to data contained in a VA DHCP database. We have developed a prototype which links natural language software to certain DHCP data elements, including patient demographics, prescriptions, diagnoses, laboratory data, and provider information. English queries can be typed into the system, and answers to the questions are returned. Future work includes refinement of natural language/DHCP connections to enable more sophisticated queries, and optimization of the system to reduce response time to user questions.
Intraoperative virtual brain counseling

NASA Astrophysics Data System (ADS)

Jiang, Zhaowei; Grosky, William I.; Zamorano, Lucia J.; Muzik, Otto; Diaz, Fernando

1997-06-01

Our objective is to offer online real-time intelligent guidance to the neurosurgeon. Different from traditional image-guidance technologies that offer intra-operative visualization of medical images or atlas images, virtual brain counseling goes one step further. It can distinguish related brain structures and provide information about them intra-operatively. Virtual brain counseling is the foundation for surgical planning optimization and on-line surgical reference. It can provide a warning system that alerts the neurosurgeon if the chosen trajectory will pass through eloquent brain areas. In order to fulfill this objective, tracking techniques are involved for intra-operativity. Most importantly, a 3D virtual brain environment, different from traditional 3D digitized atlases, is an object-oriented model of the brain that stores information about different brain structures together with their related information. An object-oriented hierarchical hyper-voxel space (HHVS) is introduced to integrate anatomical and functional structures. Spatial queries based on position of interest, line segment of interest, and volume of interest are introduced in this paper. The virtual brain environment is integrated with existing surgical pre-planning and intra-operative tracking systems to provide information for planning optimization and on-line surgical guidance. The neurosurgeon is alerted automatically if the planned treatment affects any critical structures. Architectures such as HHVS and algorithms, such as spatial querying, normalizing, and warping are presented in the paper. A prototype has shown that the virtual brain is intuitive in its hierarchical 3D appearance. It also showed that HHVS, as the key structure for virtual brain counseling, efficiently integrates multi-scale brain structures based on their spatial relationships. This is a promising development for optimization of treatment plans and online surgical intelligent guidance.

A web-based data-querying tool based on ontology-driven methodology and flowchart-based model.

PubMed

Ping, Xiao-Ou; Chung, Yufang; Tseng, Yi-Ju; Liang, Ja-Der; Yang, Pei-Ming; Huang, Guan-Tarn; Lai, Feipei

2013-10-08

Because of the increased adoption rate of electronic medical record (EMR) systems, more health care records have been increasingly accumulating in clinical data repositories. Therefore, querying the data stored in these repositories is crucial for retrieving the knowledge from such large volumes of clinical data. The aim of this study is to develop a Web-based approach for enriching the capabilities of the data-querying system along the three following considerations: (1) the interface design used for query formulation, (2) the representation of query results, and (3) the models used for formulating query criteria. The Guideline Interchange Format version 3.5 (GLIF3.5), an ontology-driven clinical guideline representation language, was used for formulating the query tasks based on the GLIF3.5 flowchart in the Protégé environment. The flowchart-based data-querying model (FBDQM) query execution engine was developed and implemented for executing queries and presenting the results through a visual and graphical interface. To examine a broad variety of patient data, the clinical data generator was implemented to automatically generate the clinical data in the repository, and the generated data, thereby, were employed to evaluate the system. The accuracy and time performance of the system for three medical query tasks relevant to liver cancer were evaluated based on the clinical data generator in the experiments with varying numbers of patients. In this study, a prototype system was developed to test the feasibility of applying a methodology for building a query execution engine using FBDQMs by formulating query tasks using the existing GLIF. The FBDQM-based query execution engine was used to successfully retrieve the clinical data based on the query tasks formatted using the GLIF3.5 in the experiments with varying numbers of patients. 
The accuracy of the three queries (ie, "degree of liver damage," "degree of liver damage when applying a mutually exclusive setting," and "treatments for liver cancer") was 100% for all four experiments (10 patients, 100 patients, 1000 patients, and 10,000 patients). Among the three measured query phases, (1) structured query language operations, (2) criteria verification, and (3) other, the first two had the longest execution time. The ontology-driven FBDQM-based approach enriched the capabilities of the data-querying system. The adoption of the GLIF3.5 increased the potential for interoperability, shareability, and reusability of the query tasks.
Microbubble cloud characterization by nonlinear frequency mixing.

PubMed

Cavaro, M; Payan, C; Moysan, J; Baqué, F

2011-05-01

In the frame of the fourth generation forum, France decided to develop sodium fast nuclear reactors. French Safety Authority requests the associated monitoring of argon gas into sodium. This implies to estimate the void fraction, and a histogram indicating the bubble population. In this context, the present letter studies the possibility of achieving an accurate determination of the histogram with acoustic methods. A nonlinear, two-frequency mixing technique has been implemented, and a specific optical device has been developed in order to validate the experimental results. The acoustically reconstructed histograms are in excellent agreement with those obtained using optical methods.
The ISI distribution of the stochastic Hodgkin-Huxley neuron.

PubMed

Rowat, Peter F; Greenwood, Priscilla E

2014-01-01

The simulation of ion-channel noise has an important role in computational neuroscience. In recent years several approximate methods of carrying out this simulation have been published, based on stochastic differential equations, and all giving slightly different results. The obvious, and essential, question is: which method is the most accurate and which is most computationally efficient? Here we make a contribution to the answer. We compare interspike interval histograms from simulated data using four different approximate stochastic differential equation (SDE) models of the stochastic Hodgkin-Huxley neuron, as well as the exact Markov chain model simulated by the Gillespie algorithm. One of the recent SDE models is the same as the Kurtz approximation first published in 1978. All the models considered give similar ISI histograms over a wide range of deterministic and stochastic input. Three features of these histograms are an initial peak, followed by one or more bumps, and then an exponential tail. We explore how these features depend on deterministic input and on level of channel noise, and explain the results using the stochastic dynamics of the model. We conclude with a rough ranking of the four SDE models with respect to the similarity of their ISI histograms to the histogram of the exact Markov chain model.
Histogram equalization with Bayesian estimation for noise robust speech recognition.

PubMed

Suh, Youngjoo; Kim, Hoirin

2018-02-01

The histogram equalization approach is an efficient feature normalization technique for noise robust automatic speech recognition. However, it suffers from performance degradation when some fundamental conditions are not satisfied in the test environment. To remedy these limitations of the original histogram equalization methods, class-based histogram equalization approach has been proposed. Although this approach showed substantial performance improvement under noise environments, it still suffers from performance degradation due to the overfitting problem when test data are insufficient. To address this issue, the proposed histogram equalization technique employs the Bayesian estimation method in the test cumulative distribution function estimation. It was reported in a previous study conducted on the Aurora-4 task that the proposed approach provided substantial performance gains in speech recognition systems based on the acoustic modeling of the Gaussian mixture model-hidden Markov model. In this work, the proposed approach was examined in speech recognition systems with deep neural network-hidden Markov model (DNN-HMM), the current mainstream speech recognition approach where it also showed meaningful performance improvement over the conventional maximum likelihood estimation-based method. The fusion of the proposed features with the mel-frequency cepstral coefficients provided additional performance gains in DNN-HMM systems, which otherwise suffer from performance degradation in the clean test condition.
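The basic idea behind histogram equalization as a feature normalization step can be sketched as follows: each test feature value is passed through the test cumulative distribution function (CDF) and then through the inverse CDF of a reference distribution. The sketch below uses a plain rank-based empirical CDF and a standard normal reference; both are assumptions made for illustration. It does not reproduce the class-based variant or the Bayesian estimation of the test CDF that the abstract proposes.

```python
from statistics import NormalDist


def histogram_equalize(features, reference=NormalDist(0.0, 1.0)):
    """Map one feature dimension onto a reference distribution.

    Uses a rank-based empirical CDF of the test data (an assumption; the
    abstract's method estimates this CDF with a Bayesian approach instead).
    """
    n = len(features)
    # Indices of the features sorted by value; rank = position in this order.
    order = sorted(range(n), key=lambda i: features[i])
    equalized = [0.0] * n
    for rank, i in enumerate(order):
        cdf = (rank + 0.5) / n               # strictly inside (0, 1)
        equalized[i] = reference.inv_cdf(cdf)
    return equalized
```

Because the empirical CDF is built from ranks alone, it becomes unreliable when only a few test frames are available, which is the insufficient-test-data problem the abstract attributes to the earlier methods.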
Dynamic Contrast-enhanced MR Imaging in Renal Cell Carcinoma: Reproducibility of Histogram Analysis on Pharmacokinetic Parameters

PubMed Central

Wang, Hai-yi; Su, Zi-hua; Xu, Xiao; Sun, Zhi-peng; Duan, Fei-xue; Song, Yuan-yuan; Li, Lu; Wang, Ying-wei; Ma, Xin; Guo, Ai-tao; Ma, Lin; Ye, Hui-yi

2016-01-01

Pharmacokinetic parameters derived from dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) have been increasingly used to evaluate the permeability of tumor vessel. Histogram metrics are a recognized promising method of quantitative MR imaging that has been recently introduced in analysis of DCE-MRI pharmacokinetic parameters in oncology due to tumor heterogeneity. In this study, 21 patients with renal cell carcinoma (RCC) underwent paired DCE-MRI studies on a 3.0 T MR system. Extended Tofts model and population-based arterial input function were used to calculate kinetic parameters of RCC tumors. Mean value and histogram metrics (Mode, Skewness and Kurtosis) of each pharmacokinetic parameter were generated automatically using ImageJ software. Intra- and inter-observer reproducibility and scan–rescan reproducibility were evaluated using intra-class correlation coefficients (ICCs) and coefficient of variation (CoV). Our results demonstrated that the histogram method (Mode, Skewness and Kurtosis) was not superior to the conventional Mean value method in reproducibility evaluation on DCE-MRI pharmacokinetic parameters (K trans & Ve) in renal cell carcinoma, especially for Skewness and Kurtosis which showed lower intra-, inter-observer and scan-rescan reproducibility than Mean value. Our findings suggest that additional studies are necessary before wide incorporation of histogram metrics in quantitative analysis of DCE-MRI pharmacokinetic parameters. PMID:27380733
A study of data representation in Hadoop to optimize data storage and search performance for the ATLAS EventIndex

NASA Astrophysics Data System (ADS)

Baranowski, Z.; Canali, L.; Toebbicke, R.; Hrivnac, J.; Barberis, D.

2017-10-01

This paper reports on the activities aimed at improving the architecture and performance of the ATLAS EventIndex implementation in Hadoop. The EventIndex contains tens of billions of event records, each of which consists of ∼100 bytes, all having the same probability to be searched or counted. Data formats represent one important area for optimizing the performance and storage footprint of applications based on Hadoop. This work reports on the production usage and on tests using several data formats including Map Files, Apache Parquet, Avro, and various compression algorithms. The query engine plays also a critical role in the architecture. We report also on the use of HBase for the EventIndex, focussing on the optimizations performed in production and on the scalability tests. Additional engines that have been tested include Cloudera Impala, in particular for its SQL interface, and the optimizations for data warehouse workloads and reports.
Fabrication and characterization of a co-planar detector in diamond for low energy single ion implantation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Abraham, John Bishoy Sam; Pacheco, Jose L.; Aguirre, Brandon Adrian

2016-08-09

We demonstrate low energy single ion detection using a co-planar detector fabricated on a diamond substrate and characterized by ion beam induced charge collection. Histograms are taken with low fluence ion pulses illustrating quantized ion detection down to a single ion with a signal-to-noise ratio of approximately 10. We anticipate that this detection technique can serve as a basis to optimize the yield of single color centers in diamond. In conclusion, the ability to count ions into a diamond substrate is expected to reduce the uncertainty in the yield of color center formation by removing Poisson statistics from the implantation process.
An optimization model for infrared image enhancement method based on p-q norm constrained by saliency value

NASA Astrophysics Data System (ADS)

Fan, Fan; Ma, Yong; Dai, Xiaobing; Mei, Xiaoguang

2018-04-01

Infrared image enhancement is an important and necessary task in the infrared imaging system. In this paper, by defining the contrast in terms of the area between adjacent non-zero histogram, a novel analytical model is proposed to enlarge the areas so that the contrast can be increased. In addition, the analytical model is regularized by a penalty term based on the saliency value to enhance the salient regions as well. Thus, both of the whole images and salient regions can be enhanced, and the rank consistency can be preserved. The comparisons on 8-bit images show that the proposed method can enhance the infrared images with more details.
Comparative Analysis of Online Health Queries Originating From Personal Computers and Smart Devices on a Consumer Health Information Portal

PubMed Central

Jadhav, Ashutosh; Andrews, Donna; Fiksdal, Alexander; Kumbamu, Ashok; McCormick, Jennifer B; Misitano, Andrew; Nelsen, Laurie; Ryu, Euijung; Sheth, Amit; Wu, Stephen

2014-01-01

Background: The number of people using the Internet and mobile/smart devices for health information seeking is increasing rapidly. Although the user experience for online health information seeking varies with the device used, for example, smart devices (SDs) like smartphones/tablets versus personal computers (PCs) like desktops/laptops, very few studies have investigated how online health information seeking behavior (OHISB) may differ by device. Objective: The objective of this study is to examine differences in OHISB between PCs and SDs through a comparative analysis of large-scale health search queries submitted through Web search engines from both types of devices. Methods: Using the Web analytics tool, IBM NetInsight OnDemand, and based on the type of devices used (PCs or SDs), we obtained the most frequent health search queries between June 2011 and May 2013 that were submitted on Web search engines and directed users to the Mayo Clinic’s consumer health information website. We performed analyses on “Queries with considering repetition counts (QwR)” and “Queries without considering repetition counts (QwoR)”. The dataset contains (1) 2.74 million and 3.94 million QwoR, respectively for PCs and SDs, and (2) more than 100 million QwR for both PCs and SDs. We analyzed structural properties of the queries (length of the search queries, usage of query operators and special characters in health queries), types of search queries (keyword-based, wh-questions, yes/no questions), categorization of the queries based on health categories and information mentioned in the queries (gender, age-groups, temporal references), misspellings in the health queries, and the linguistic structure of the health queries. Results: Query strings used for health information searching via PCs and SDs differ by almost 50%. The most searched health categories are “Symptoms” (1 in 3 search queries), “Causes”, and “Treatments & Drugs”. 
The distribution of search queries for different health categories differs with the device used for the search. Health queries tend to be longer and more specific than general search queries. Health queries from SDs are longer and have slightly fewer spelling mistakes than those from PCs. Users specify words related to women and children more often than that of men and any other age group. Most of the health queries are formulated using keywords; the second-most common are wh- and yes/no questions. Users ask more health questions using SDs than PCs. Almost all health queries have at least one noun and health queries from SDs are more descriptive than those from PCs. Conclusions: This study is a large-scale comparative analysis of health search queries to understand the effects of device type (PCs vs SDs) used on OHISB. The study indicates that the device used for online health information search plays an important role in shaping how health information searches by consumers and patients are executed. PMID:25000537
Comparative analysis of online health queries originating from personal computers and smart devices on a consumer health information portal.

PubMed

Jadhav, Ashutosh; Andrews, Donna; Fiksdal, Alexander; Kumbamu, Ashok; McCormick, Jennifer B; Misitano, Andrew; Nelsen, Laurie; Ryu, Euijung; Sheth, Amit; Wu, Stephen; Pathak, Jyotishman

2014-07-04

The number of people using the Internet and mobile/smart devices for health information seeking is increasing rapidly. Although the user experience for online health information seeking varies with the device used, for example, smart devices (SDs) like smartphones/tablets versus personal computers (PCs) like desktops/laptops, very few studies have investigated how online health information seeking behavior (OHISB) may differ by device. The objective of this study is to examine differences in OHISB between PCs and SDs through a comparative analysis of large-scale health search queries submitted through Web search engines from both types of devices. Using the Web analytics tool, IBM NetInsight OnDemand, and based on the type of devices used (PCs or SDs), we obtained the most frequent health search queries between June 2011 and May 2013 that were submitted on Web search engines and directed users to the Mayo Clinic's consumer health information website. We performed analyses on "Queries with considering repetition counts (QwR)" and "Queries without considering repetition counts (QwoR)". The dataset contains (1) 2.74 million and 3.94 million QwoR, respectively for PCs and SDs, and (2) more than 100 million QwR for both PCs and SDs. We analyzed structural properties of the queries (length of the search queries, usage of query operators and special characters in health queries), types of search queries (keyword-based, wh-questions, yes/no questions), categorization of the queries based on health categories and information mentioned in the queries (gender, age-groups, temporal references), misspellings in the health queries, and the linguistic structure of the health queries. Query strings used for health information searching via PCs and SDs differ by almost 50%. The most searched health categories are "Symptoms" (1 in 3 search queries), "Causes", and "Treatments & Drugs". 
The distribution of search queries for different health categories differs with the device used for the search. Health queries tend to be longer and more specific than general search queries. Health queries from SDs are longer and have slightly fewer spelling mistakes than those from PCs. Users specify words related to women and children more often than that of men and any other age group. Most of the health queries are formulated using keywords; the second-most common are wh- and yes/no questions. Users ask more health questions using SDs than PCs. Almost all health queries have at least one noun and health queries from SDs are more descriptive than those from PCs. This study is a large-scale comparative analysis of health search queries to understand the effects of device type (PCs vs. SDs) used on OHISB. The study indicates that the device used for online health information search plays an important role in shaping how health information searches by consumers and patients are executed.
Evaluation of Matrix9 silicon photomultiplier array for small-animal PET.

PubMed

Du, Junwei; Schmall, Jeffrey P; Yang, Yongfeng; Di, Kun; Roncali, Emilie; Mitchell, Gregory S; Buckley, Steve; Jackson, Carl; Cherry, Simon R

2015-02-01

The MatrixSL-9-30035-OEM (Matrix9) from SensL is a large-area silicon photomultiplier (SiPM) photodetector module consisting of a 3 × 3 array of 4 × 4 element SiPM arrays (total of 144 SiPM pixels) and incorporates SensL's front-end electronics board and coincidence board. Each SiPM pixel measures 3.16 × 3.16 mm² and the total size of the detector head is 47.8 × 46.3 mm². Using 8 × 8 polished LSO/LYSO arrays (pitch 1.5 mm) the performance of this detector system (SiPM array and readout electronics) was evaluated with a view for its eventual use in small-animal positron emission tomography (PET). Measurements of noise, signal, signal-to-noise ratio, energy resolution, flood histogram quality, timing resolution, and array trigger error were obtained at different bias voltages (28.0-32.5 V in 0.5 V intervals) and at different temperatures (5 °C-25 °C in 5 °C steps) to find the optimal operating conditions. The best measured signal-to-noise ratio and flood histogram quality for 511 keV gamma photons were obtained at a bias voltage of 30.0 V and a temperature of 5 °C. The energy resolution and timing resolution under these conditions were 14.2% ± 0.1% and 4.2 ± 0.1 ns, respectively. The flood histograms show that all the crystals in the 1.5 mm pitch LSO array can be clearly identified and that smaller crystal pitches can also be resolved. Flood histogram quality was also calculated using different center of gravity based positioning algorithms. Improved and more robust results were achieved using the local 9 pixels for positioning along with an energy offset calibration. To evaluate the front-end detector readout, and multiplexing efficiency, an array trigger error metric is introduced and measured at different lower energy thresholds. Using a lower energy threshold greater than 150 keV effectively eliminates any mispositioning between SiPM arrays. 
In summary, the Matrix9 detector system can resolve high-resolution scintillator arrays common in small-animal PET with adequate energy resolution and timing resolution over a large detector area. The modular design of the Matrix9 detector allows it to be used as a building block for simple, low channel-count, yet high performance, small animal PET or PET/MRI systems.
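The difference between a global and a local center-of-gravity position estimate can be illustrated with a short sketch. It assumes a 2D grid of non-negative SiPM pixel signals with at least one positive value, and it interprets "the local 9 pixels" as the 3 × 3 neighborhood around the brightest pixel; that interpretation, the function name, and the pixel-index coordinate system are assumptions. The energy offset calibration mentioned in the abstract is not modeled.

```python
def center_of_gravity(signals, local=True):
    """Estimate the (x, y) interaction position from a 2D grid of pixel signals.

    local=True  : use only the 3 x 3 neighborhood around the brightest pixel
                  (an assumed reading of "local 9 pixels").
    local=False : use every pixel in the grid.
    Coordinates are in pixel-index units (x = column, y = row).
    """
    rows, cols = len(signals), len(signals[0])
    if local:
        # Locate the pixel with the largest signal.
        r0, c0 = max(
            ((r, c) for r in range(rows) for c in range(cols)),
            key=lambda rc: signals[rc[0]][rc[1]],
        )
        cells = [
            (r, c)
            for r in range(max(0, r0 - 1), min(rows, r0 + 2))
            for c in range(max(0, c0 - 1), min(cols, c0 + 2))
        ]
    else:
        cells = [(r, c) for r in range(rows) for c in range(cols)]
    total = sum(signals[r][c] for r, c in cells)
    x = sum(c * signals[r][c] for r, c in cells) / total
    y = sum(r * signals[r][c] for r, c in cells) / total
    return x, y
```

Restricting the sum to the local neighborhood keeps distant noisy pixels from pulling the estimate away from the true interaction point, which is consistent with the more robust flood histograms the abstract reports for local positioning.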
Evaluation of Matrix9 silicon photomultiplier array for small-animal PET

PubMed Central

Du, Junwei; Schmall, Jeffrey P.; Yang, Yongfeng; Di, Kun; Roncali, Emilie; Mitchell, Gregory S.; Buckley, Steve; Jackson, Carl; Cherry, Simon R.

2015-01-01

Purpose: The MatrixSL-9-30035-OEM (Matrix9) from SensL is a large-area silicon photomultiplier (SiPM) photodetector module consisting of a 3 × 3 array of 4 × 4 element SiPM arrays (total of 144 SiPM pixels) and incorporates SensL’s front-end electronics board and coincidence board. Each SiPM pixel measures 3.16 × 3.16 mm² and the total size of the detector head is 47.8 × 46.3 mm². Using 8 × 8 polished LSO/LYSO arrays (pitch 1.5 mm) the performance of this detector system (SiPM array and readout electronics) was evaluated with a view for its eventual use in small-animal positron emission tomography (PET). Methods: Measurements of noise, signal, signal-to-noise ratio, energy resolution, flood histogram quality, timing resolution, and array trigger error were obtained at different bias voltages (28.0–32.5 V in 0.5 V intervals) and at different temperatures (5 °C–25 °C in 5 °C steps) to find the optimal operating conditions. Results: The best measured signal-to-noise ratio and flood histogram quality for 511 keV gamma photons were obtained at a bias voltage of 30.0 V and a temperature of 5 °C. The energy resolution and timing resolution under these conditions were 14.2% ± 0.1% and 4.2 ± 0.1 ns, respectively. The flood histograms show that all the crystals in the 1.5 mm pitch LSO array can be clearly identified and that smaller crystal pitches can also be resolved. Flood histogram quality was also calculated using different center of gravity based positioning algorithms. Improved and more robust results were achieved using the local 9 pixels for positioning along with an energy offset calibration. To evaluate the front-end detector readout, and multiplexing efficiency, an array trigger error metric is introduced and measured at different lower energy thresholds. Using a lower energy threshold greater than 150 keV effectively eliminates any mispositioning between SiPM arrays. 
Conclusions: In summary, the Matrix9 detector system can resolve high-resolution scintillator arrays common in small-animal PET with adequate energy resolution and timing resolution over a large detector area. The modular design of the Matrix9 detector allows it to be used as a building block for simple, low channel-count, yet high performance, small animal PET or PET/MRI systems. PMID:25652479
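The abstract above mentions center-of-gravity positioning restricted to "the local 9 pixels" together with an energy offset calibration. The following is a minimal sketch of that idea, not the authors' implementation: it assumes the local 9 pixels are the 3 × 3 block around the brightest pixel, and the function name `local_cog` and the per-pixel `offsets` table are hypothetical.

```python
def local_cog(signals, offsets=None):
    """Center-of-gravity position (row, col) from the 3x3 block around the
    brightest pixel. `signals` is a 2-D list of pixel amplitudes; `offsets`
    is an optional (hypothetical) per-pixel offset table subtracted first."""
    n_rows, n_cols = len(signals), len(signals[0])
    # Apply the assumed offset calibration and clamp negative values to zero.
    s = [[max(signals[r][c] - (offsets[r][c] if offsets else 0.0), 0.0)
          for c in range(n_cols)] for r in range(n_rows)]
    # Locate the brightest pixel (first one wins on ties).
    peak_r, peak_c = max(((r, c) for r in range(n_rows) for c in range(n_cols)),
                         key=lambda rc: s[rc[0]][rc[1]])
    total = sum_r = sum_c = 0.0
    # Signal-weighted mean of the indices over the neighbourhood, clipped at edges.
    for r in range(max(peak_r - 1, 0), min(peak_r + 2, n_rows)):
        for c in range(max(peak_c - 1, 0), min(peak_c + 2, n_cols)):
            w = s[r][c]
            total += w
            sum_r += r * w
            sum_c += c * w
    if total == 0:
        raise ValueError("no signal above offset")
    return sum_r / total, sum_c / total


# A 12 x 12 pixel frame with light shared equally between two neighbours.
frame = [[0.0] * 12 for _ in range(12)]
frame[5][5] = 4.0
frame[5][6] = 4.0
print(local_cog(frame))
```

Accumulating such positions over many events into a 2-D histogram would give a flood histogram. Limiting the sum to the neighbourhood of the peak keeps noise from distant pixels out of the estimate.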
Feasibility Study of ASTER SWIR data prediction

NASA Astrophysics Data System (ADS)

Yamamoto, H.; Gonzalez, L.

2017-12-01

Observations from the ASTER SWIR spectral bands have been unavailable since 2008 due to anomalously high SWIR detector temperatures, but the ASTER VNIR and TIR spectral bands are still valid. The SWIR wavelength region is, however, very useful for determining land cover, discriminating rock types, etc. In this work, we present the results of a feasibility study for the prediction of ASTER SWIR bands with artificial neural networks (ANN) using the valid ASTER bands. The latter are selected over three types of ground data sets, representing desert, rocky and vegetated areas. The ASTER VNIR bands are atmospherically corrected, using the US standard 62 model, without aerosol correction. To optimize the training of the ANN, it is crucial to categorize the input data. To this end, we built a histogram using a simple linear combination of the 3 VNIR bands, which we call a contrast histogram, to split the input ASTER data into 4 areas. For each of these 4 areas, we built six ANNs, one for each SWIR band to retrieve, each with 3 inputs, two layers of 5 hidden nodes each, and one output layer. The ANNs are trained using ASTER pixels selected from several million pixels in representative desert, green and rocky areas. The analysis of the ANN results demonstrates that 99% of the pixels are reconstructed with less than 20% error in desert areas. In rocky areas, the errors do not exceed 30%. However, the errors can exceed 50% in vegetated areas. This led us to improve the ANN by introducing new spectral bands (1.24, 1.64, 2.13 μm) from TERRA MODIS, which is time synchronized with ASTER. The measurements are pan-sharpened to match the ASTER spatial resolution. Instead of a contrast histogram, an NDVI histogram is used to classify the input data.
With the newly constructed ANNs, the improvement in the quality of the retrieved SWIR values is perceptible, in particular over vegetation (45% of the points with less than 20% error), and even more so over the desert and rocky areas (75% of the points with less than 10% error). We demonstrate that it is possible to build ANNs capable of regenerating, with a reasonable error, the SWIR bands in desert and mountainous areas, while SWIR reconstruction in vegetated areas is more difficult. Improvements can be envisaged by introducing missing elements such as snow or ice, along with better discrimination of the vegetation.
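The NDVI-based categorization of input pixels described above can be sketched as follows. NDVI is the standard normalized difference of near-infrared and red reflectance. The class edges (0.1, 0.3, 0.5) and the function names are illustrative assumptions; the abstract does not state how the NDVI histogram was partitioned.

```python
from bisect import bisect_right


def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - red) / (NIR + red)."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom


def classify(pixels, edges=(0.1, 0.3, 0.5)):
    """Assign each (nir, red) pixel to a class index 0..len(edges).
    The edges are hypothetical; one ANN per class would then be trained."""
    return [bisect_right(edges, ndvi(nir, red)) for nir, red in pixels]


# Sparse vegetation, dense vegetation, bare surface (made-up reflectances).
print(classify([(0.5, 0.4), (0.6, 0.1), (0.3, 0.3)]))
```

Each class index would select the network trained for that surface type, which is the role the abstract gives to the histogram split.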
Evaluation of Matrix9 silicon photomultiplier array for small-animal PET

DOE Office of Scientific and Technical Information (OSTI.GOV)

Du, Junwei, E-mail: jwdu@ucdavis.edu; Schmall, Jeffrey P.; Yang, Yongfeng

Purpose: The MatrixSL-9-30035-OEM (Matrix9) from SensL is a large-area silicon photomultiplier (SiPM) photodetector module that consists of a 3 × 3 array of 4 × 4 element SiPM arrays (total of 144 SiPM pixels) and incorporates SensL’s front-end electronics board and coincidence board. Each SiPM pixel measures 3.16 × 3.16 mm² and the total size of the detector head is 47.8 × 46.3 mm². Using 8 × 8 polished LSO/LYSO arrays (pitch 1.5 mm), the performance of this detector system (SiPM array and readout electronics) was evaluated with a view to its eventual use in small-animal positron emission tomography (PET). Methods: Measurements of noise, signal, signal-to-noise ratio, energy resolution, flood histogram quality, timing resolution, and array trigger error were obtained at different bias voltages (28.0–32.5 V in 0.5 V intervals) and at different temperatures (5 °C–25 °C in 5 °C steps) to find the optimal operating conditions. Results: The best measured signal-to-noise ratio and flood histogram quality for 511 keV gamma photons were obtained at a bias voltage of 30.0 V and a temperature of 5 °C. The energy resolution and timing resolution under these conditions were 14.2% ± 0.1% and 4.2 ± 0.1 ns, respectively. The flood histograms show that all the crystals in the 1.5 mm pitch LSO array can be clearly identified and that smaller crystal pitches can also be resolved. Flood histogram quality was also calculated using different center-of-gravity-based positioning algorithms. Improved and more robust results were achieved using the local 9 pixels for positioning along with an energy offset calibration. To evaluate the front-end detector readout and multiplexing efficiency, an array trigger error metric is introduced and measured at different lower energy thresholds. Using a lower energy threshold greater than 150 keV effectively eliminates any mispositioning between SiPM arrays.
Conclusions: In summary, the Matrix9 detector system can resolve high-resolution scintillator arrays common in small-animal PET with adequate energy resolution and timing resolution over a large detector area. The modular design of the Matrix9 detector allows it to be used as a building block for simple, low channel-count, yet high performance, small animal PET or PET/MRI systems.
Variability in the microcanonical cascades parameters among gauges of urban precipitation monitoring network

NASA Astrophysics Data System (ADS)

Licznar, Paweł; Rupp, David; Adamowski, Witold

2013-04-01

In the fall of 2008, the Municipal Water Supply and Sewerage Company (MWSSC) in Warsaw began operating the first large precipitation monitoring network dedicated to urban hydrology in Poland. The process of establishing the network, as well as the preliminary phase of its operation, raised a number of questions concerning optimal gauge location and density and revealed the urgent need for new data processing techniques. When considering the full-field precipitation as input to hydrodynamic models of stormwater and combined sewage systems, standard processing techniques developed previously for single gauges, concentrating mainly on the analysis of maximum rainfall rates and the development of intensity-duration-frequency (IDF) curves, were found inadequate. We used a multifractal rainfall modeling framework based on microcanonical multiplicative random cascades to analyze properties of Warsaw precipitation. We calculated breakdown coefficients (BDC) for the hierarchy of timescales from λ=1 (5-min) up to λ=128 (1280-min) for all 25 gauges in the network. At small timescales, histograms of BDCs were strongly deformed due to the recording precision of rainfall amounts. A randomization procedure statistically removed the artifacts due to precision errors in the original series. At large timescales, BDC values were sparse due to the relatively short period of observations (2008-2011). An algorithm with a moving window was proposed to increase the number of BDC values at large timescales and to smooth their histograms. The resulting empirical BDC histograms were modeled by a theoretical "2N-B" distribution, which combined 2 separate normal (N) distributions and one beta (B) distribution. A clear evolution of BDC histograms from a 2N-B distribution for small timescales to an N-B distribution for intermediate timescales and finally to a single beta distribution for large timescales was observed for all gauges.
Cluster analysis revealed close patterns of BDC distributions among almost all gauges and timescales, with the exception of two gauges located at the city limits (one gauge was located at Okęcie airport). We evaluated the performance of the microcanonical cascades at disaggregating 1280-min (quasi-daily) precipitation totals into 5-min rainfall data for selected gauges. Synthetic time series were analyzed with respect to their intermittency and variability of rainfall intensities and compared to observational series. We showed that microcanonical cascade models could be used in practice for generating synthetic rainfall time series suitable as input to urban hydrology models in Warsaw.
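To make the breakdown-coefficient idea concrete, here is a minimal sketch under simplifying assumptions. It takes a branching number of 2, computes each BDC as the share of a parent interval's rainfall falling in its first half, and disaggregates with weights drawn from a single symmetric beta distribution. The abstract's 2N-B and N-B models and its treatment of intermittency are not reproduced, and the function names and beta parameter are hypothetical.

```python
import random


def breakdown_coefficients(series):
    """BDCs for non-overlapping pairs: first-half amount / parent amount.
    Pairs with zero total rainfall are skipped (the ratio is undefined)."""
    bdcs = []
    for i in range(0, len(series) - 1, 2):
        parent = series[i] + series[i + 1]
        if parent > 0:
            bdcs.append(series[i] / parent)
    return bdcs


def disaggregate(total, levels, a=2.0, rng=None):
    """Microcanonical cascade: each interval splits into two children with
    weights w and 1 - w, so mass is conserved at every level.
    w ~ Beta(a, a) is an illustrative choice, not the fitted model."""
    rng = rng or random.Random(0)
    values = [total]
    for _ in range(levels):
        children = []
        for v in values:
            w = rng.betavariate(a, a)
            children.extend((v * w, v * (1.0 - w)))
        values = children
    return values


print(breakdown_coefficients([1.0, 3.0, 0.0, 0.0, 2.0, 2.0]))
# 1280 min / 5 min = 256 = 2**8, so eight halvings reach 5-min resolution.
fine = disaggregate(10.0, levels=8)
print(len(fine), sum(fine))
```

The microcanonical constraint is the `w` and `1 - w` pairing: the children always sum to the parent, which a canonical cascade only guarantees on average.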
A high-performance spatial database based approach for pathology imaging algorithm evaluation

PubMed Central

Wang, Fusheng; Kong, Jun; Gao, Jingjing; Cooper, Lee A.D.; Kurc, Tahsin; Zhou, Zhengwen; Adler, David; Vergara-Niedermayr, Cristobal; Katigbak, Bryan; Brat, Daniel J.; Saltz, Joel H.

2013-01-01

Background: Algorithm evaluation provides a means to characterize variability across image analysis algorithms, validate algorithms by comparison with human annotations, combine results from multiple algorithms for performance improvement, and facilitate algorithm sensitivity studies. The sizes of images and image analysis results in pathology image analysis pose significant challenges in algorithm evaluation. We present an efficient parallel spatial database approach to model, normalize, manage, and query large volumes of analytical image result data. This provides an efficient platform for algorithm evaluation. Our experiments with a set of brain tumor images demonstrate the application, scalability, and effectiveness of the platform. Context: The paper describes an approach and platform for evaluation of pathology image analysis algorithms. The platform facilitates algorithm evaluation through a high-performance database built on the Pathology Analytic Imaging Standards (PAIS) data model. Aims: (1) Develop a framework to support algorithm evaluation by modeling and managing analytical results and human annotations from pathology images; (2) Create a robust data normalization tool for converting, validating, and fixing spatial data from algorithm or human annotations; (3) Develop a set of queries to support data sampling and result comparisons; (4) Achieve high performance computation capacity via a parallel data management infrastructure, parallel data loading and spatial indexing optimizations in this infrastructure. Materials and Methods: We have considered two scenarios for algorithm evaluation: (1) algorithm comparison where multiple result sets from different methods are compared and consolidated; and (2) algorithm validation where algorithm results are compared with human annotations. We have developed a spatial normalization toolkit to validate and normalize spatial boundaries produced by image analysis algorithms or human annotations. 
The validated data were formatted based on the PAIS data model and loaded into a spatial database. To support efficient data loading, we have implemented a parallel data loading tool that takes advantage of multi-core CPUs to accelerate data injection. The spatial database manages both geometric shapes and image features or classifications, and enables spatial sampling, result comparison, and result aggregation through expressive structured query language (SQL) queries with spatial extensions. To provide scalable and efficient query support, we have employed a shared-nothing parallel database architecture, which distributes data homogeneously across multiple database partitions to take advantage of parallel computation power and implements spatial indexing to achieve high I/O throughput. Results: Our work proposes a high-performance, parallel spatial database platform for algorithm validation and comparison. This platform was evaluated by storing, managing, and comparing analysis results from a set of brain tumor whole slide images. The tools we develop are open source and available to download. Conclusions: Pathology image algorithm validation and comparison are essential to iterative algorithm development and refinement. One critical component is the support for queries involving spatial predicates and comparisons. In our work, we develop an efficient data model and parallel database approach to model, normalize, manage and query large volumes of analytical image result data. Our experiments demonstrate that the data partitioning strategy and the grid-based indexing result in good data distribution across database nodes and reduce I/O overhead in spatial join queries through parallel retrieval of relevant data and quick subsetting of datasets. The set of tools in the framework provides a full pipeline to normalize, load, manage and query analytical results for algorithm evaluation. PMID:23599905
SkyQuery - A Prototype Distributed Query and Cross-Matching Web Service for the Virtual Observatory

NASA Astrophysics Data System (ADS)

Thakar, A. R.; Budavari, T.; Malik, T.; Szalay, A. S.; Fekete, G.; Nieto-Santisteban, M.; Haridas, V.; Gray, J.

2002-12-01

We have developed a prototype distributed query and cross-matching service for the VO community, called SkyQuery, which is implemented with hierarchical Web Services. SkyQuery enables astronomers to run combined queries on existing distributed heterogeneous astronomy archives. SkyQuery provides a simple, user-friendly interface to run distributed queries over the federation of registered astronomical archives in the VO. The SkyQuery client connects to the portal Web Service, which farms the query out to the individual archives, which are also Web Services called SkyNodes. The cross-matching algorithm is run recursively on each SkyNode. Each archive is a relational DBMS with an HTM index for fast spatial lookups. The results of the distributed query are returned as an XML DataSet that is automatically rendered by the client. SkyQuery also returns the image cutout corresponding to the query result. SkyQuery finds not only matches between the various catalogs, but also dropouts - objects that exist in some of the catalogs but not in others. This is often as important as finding matches. We demonstrate the utility of SkyQuery with a brown-dwarf search between SDSS and 2MASS, and a search for radio-quiet quasars in SDSS, 2MASS and FIRST. The importance of a service like SkyQuery for the worldwide astronomical community cannot be overstated: data on the same objects in various archives is mapped in different wavelength ranges and looks very different due to different errors, instrument sensitivities and other peculiarities of each archive. Our cross-matching algorithm performs a fuzzy spatial join across multiple catalogs. This type of cross-matching is currently often done by eye, one object at a time. A static cross-identification table for a set of archives would become obsolete by the time it was built - the exponential growth of astronomical data means that a dynamic cross-identification mechanism like SkyQuery is the only viable option.
SkyQuery was funded by a grant from the NASA AISR program.
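As a rough illustration of what a fuzzy spatial join with dropouts involves, the sketch below matches two small catalogs by angular separation. It is a brute-force nearest-neighbour search with a fixed tolerance, which is an assumption made for brevity. It does not reproduce SkyQuery's recursive algorithm, its error handling, or its HTM index, and the catalog layout `(id, ra_deg, dec_deg)` and function names are hypothetical.

```python
import math


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula) in arcseconds; inputs in degrees."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def cross_match(cat_a, cat_b, tol_arcsec=2.0):
    """Return (matches, dropouts): pairs of ids within the tolerance, and ids
    from cat_a that have no counterpart in cat_b."""
    matches, dropouts = [], []
    for id_a, ra, dec in cat_a:
        # Nearest object in cat_b, or None when cat_b is empty.
        best = min(cat_b,
                   key=lambda o: angular_sep_arcsec(ra, dec, o[1], o[2]),
                   default=None)
        if best is not None and angular_sep_arcsec(ra, dec, best[1], best[2]) <= tol_arcsec:
            matches.append((id_a, best[0]))
        else:
            dropouts.append(id_a)
    return matches, dropouts


optical = [("a1", 10.0, 20.0), ("a2", 50.0, -5.0)]
infrared = [("b1", 10.0002, 20.0001)]
print(cross_match(optical, infrared))
```

The dropout list is the part the abstract singles out as often being as important as the matches, for example objects detected in one survey but not in another.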
Fast Marching Tree: a Fast Marching Sampling-Based Method for Optimal Motion Planning in Many Dimensions*

PubMed Central

Janson, Lucas; Schmerling, Edward; Clark, Ashley; Pavone, Marco

2015-01-01

In this paper we present a novel probabilistic sampling-based motion planning algorithm called the Fast Marching Tree algorithm (FMT*). The algorithm is specifically aimed at solving complex motion planning problems in high-dimensional configuration spaces. This algorithm is proven to be asymptotically optimal and is shown to converge to an optimal solution faster than its state-of-the-art counterparts, chiefly PRM* and RRT*. The FMT* algorithm performs a “lazy” dynamic programming recursion on a predetermined number of probabilistically-drawn samples to grow a tree of paths, which moves steadily outward in cost-to-arrive space. As such, this algorithm combines features of both single-query algorithms (chiefly RRT) and multiple-query algorithms (chiefly PRM), and is reminiscent of the Fast Marching Method for the solution of Eikonal equations. As a departure from previous analysis approaches that are based on the notion of almost sure convergence, the FMT* algorithm is analyzed under the notion of convergence in probability: the extra mathematical flexibility of this approach allows for convergence rate bounds, the first in the field of optimal sampling-based motion planning. Specifically, for a certain selection of tuning parameters and configuration spaces, we obtain a convergence rate bound of order O(n^(−1/d+ρ)), where n is the number of sampled points, d is the dimension of the configuration space, and ρ is an arbitrarily small constant. We go on to demonstrate asymptotic optimality for a number of variations on FMT*, namely when the configuration space is sampled non-uniformly, when the cost is not arc length, and when connections are made based on the number of nearest neighbors instead of a fixed connection radius.
Numerical experiments over a range of dimensions and obstacle configurations confirm our theoretical and heuristic arguments by showing that FMT*, for a given execution time, returns substantially better solutions than either PRM* or RRT*, especially in high-dimensional configuration spaces and in scenarios where collision-checking is expensive. PMID:27003958
Linear feasibility algorithms for treatment planning in interstitial photodynamic therapy

NASA Astrophysics Data System (ADS)

Rendon, A.; Beck, J. C.; Lilge, Lothar

2008-02-01

Interstitial photodynamic therapy (IPDT) has been under intense investigation in recent years, with multiple clinical trials underway. This effort has demanded the development of optimization strategies that determine the best locations and output powers for light sources (cylindrical or point diffusers) to achieve an optimal light delivery. Furthermore, we have recently introduced cylindrical diffusers with customizable emission profiles, placing additional requirements on the optimization algorithms, particularly in terms of the stability of the inverse problem. Here, we present a general class of linear feasibility algorithms and their properties. Moreover, we compare two particular instances of these algorithms, which have been used in the context of IPDT: the Cimmino algorithm and a weighted gradient descent (WGD) algorithm. The algorithms were compared in terms of their convergence properties, the cost function they minimize in the infeasible case, their ability to regularize the inverse problem, and the resulting optimal light dose distributions. Our results show that the WGD algorithm overall performs slightly better than the Cimmino algorithm and that it converges to a minimizer of a clinically relevant cost function in the infeasible case. Interestingly, however, treatment plans resulting from either algorithm were very similar in terms of the resulting fluence maps and dose volume histograms, once the diffuser powers were adjusted to achieve equal prostate coverage.
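For readers unfamiliar with linear feasibility methods, the following is a minimal sketch of a Cimmino-style iteration for a system of inequalities A x ≤ b. Each step moves x by the equal-weight average of its projections onto the violated half-spaces. The equal weights, the relaxation value and the toy constraints are assumptions for illustration. In a treatment-planning setting x would be the source powers and each row a dose bound at a tissue point, but the abstract does not give that formulation, and the WGD variant is not shown.

```python
def cimmino(A, b, x0, relax=1.0, iters=500):
    """Seek x with A x <= b by simultaneous projections (equal weights 1/m)."""
    m = len(A)
    x = list(x0)
    for _ in range(iters):
        step = [0.0] * len(x)
        for row, bound in zip(A, b):
            violation = sum(a * xi for a, xi in zip(row, x)) - bound
            norm2 = sum(a * a for a in row)
            if violation > 0 and norm2 > 0:
                # Projection onto the half-space {y : row . y <= bound},
                # accumulated with weight 1/m.
                for j, a in enumerate(row):
                    step[j] -= violation * a / (norm2 * m)
        x = [xi + relax * sj for xi, sj in zip(x, step)]
    return x


# Toy box 1 <= x <= 2, 1 <= y <= 2, written as four inequalities A x <= b.
A = [[-1.0, 0.0], [1.0, 0.0], [0.0, -1.0], [0.0, 1.0]]
b = [-1.0, 2.0, -1.0, 2.0]
print(cimmino(A, b, x0=[0.0, 0.0]))
```

Because all projections are computed from the same iterate and then averaged, the update is easy to parallelize, and in the infeasible case the iteration still settles on a compromise point instead of cycling.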
An Adaptive Image Enhancement Technique by Combining Cuckoo Search and Particle Swarm Optimization Algorithm

PubMed Central

Ye, Zhiwei; Wang, Mingwei; Hu, Zhengbing; Liu, Wei

2015-01-01

Image enhancement is an important procedure of image processing and analysis. This paper presents a new technique using a modified measure and a blending of cuckoo search and particle swarm optimization (CS-PSO) to adaptively enhance low-contrast images. In this way, contrast enhancement is obtained by global transformation of the input intensities; it employs the incomplete Beta function as the transformation function and a novel criterion for measuring image quality that considers three factors: threshold, entropy value, and gray-level probability density of the image. The enhancement process is a nonlinear optimization problem with several constraints. CS-PSO is utilized to maximize the objective fitness criterion in order to enhance the contrast and detail in an image by adapting the parameters of a novel extension to a local enhancement technique. The performance of the proposed method has been compared with other existing techniques such as linear contrast stretching, histogram equalization, and evolutionary computing based image enhancement methods like backtracking search algorithm, differential search algorithm, genetic algorithm, and particle swarm optimization in terms of processing time and image quality. Experimental results demonstrate that the proposed method is robust and adaptive and exhibits better performance than the other methods compared in the paper. PMID:25784928
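The global transformation named above, a gray-level mapping through the regularized incomplete Beta function, can be sketched as follows. This is an illustration under assumptions: the integral is approximated with a simple midpoint rule, the shape parameters are restricted to a, b ≥ 1 to avoid endpoint singularities, and the parameters are supplied by hand. The CS-PSO search over them and the paper's quality criterion are not reproduced, and the function names are hypothetical.

```python
def incomplete_beta(x, a, b, steps=2000):
    """Regularized incomplete Beta function I_x(a, b) by midpoint integration."""
    if a < 1 or b < 1:
        raise ValueError("this sketch assumes a >= 1 and b >= 1")

    def integral(upper):
        h = upper / steps
        return h * sum(((i + 0.5) * h) ** (a - 1) * (1 - (i + 0.5) * h) ** (b - 1)
                       for i in range(steps))

    return integral(x) / integral(1.0)


def enhance(pixels, a, b, levels=256):
    """Map gray levels through I_u(a, b) after normalizing them to [0, 1]."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        return list(pixels)  # flat image: nothing to stretch
    lut = {}  # cache one lookup entry per distinct gray level
    out = []
    for p in pixels:
        if p not in lut:
            u = (p - lo) / (hi - lo)
            lut[p] = round(incomplete_beta(u, a, b) * (levels - 1))
        out.append(lut[p])
    return out


# a = b = 2 gives an S-shaped curve that stretches mid-tones.
print(enhance([50, 75, 100, 125, 150], a=2.0, b=2.0))
```

Different (a, b) pairs brighten, darken, or stretch the mid-tones. That two-parameter family is what makes the transformation a natural target for a parameter search such as the one the abstract describes.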
