Sample records for web document clustering

  1. Information Clustering Based on Fuzzy Multisets.

    ERIC Educational Resources Information Center

    Miyamoto, Sadaaki

    2003-01-01

    Proposes a fuzzy multiset model for information clustering with application to information retrieval on the World Wide Web. Highlights include search engines; term clustering; document clustering; algorithms for calculating cluster centers; theoretical properties concerning clustering algorithms; and examples to show how the algorithms work.…

  2. Model-based document categorization employing semantic pattern analysis and local structure clustering

    NASA Astrophysics Data System (ADS)

    Fume, Kosei; Ishitani, Yasuto

    2008-01-01

    We propose a document categorization method based on a document model that can be defined externally for each task and that categorizes Web content or business documents into a target category in accordance with the similarity of the model. The main feature of the proposed method consists of two aspects of semantics extraction from an input document. The semantics of terms are extracted by the semantic pattern analysis and implicit meanings of document substructure are specified by a bottom-up text clustering technique focusing on the similarity of text line attributes. We have constructed a system based on the proposed method for trial purposes. The experimental results show that the system achieves more than 80% classification accuracy in categorizing Web content and business documents into 15 or 70 categories.

  3. BioTextQuest: a web-based biomedical text mining suite for concept discovery.

    PubMed

    Papanikolaou, Nikolas; Pafilis, Evangelos; Nikolaou, Stavros; Ouzounis, Christos A; Iliopoulos, Ioannis; Promponas, Vasilis J

    2011-12-01

    BioTextQuest combines automated discovery of significant terms in article clusters with structured knowledge annotation, via Named Entity Recognition services, offering interactive user-friendly visualization. A tag-cloud-based illustration of the terms labeling each document cluster is semantically annotated according to biological entity type, and a list of document titles enables users to simultaneously compare terms and documents of each cluster, facilitating concept association and hypothesis generation. BioTextQuest allows customization of analysis parameters, e.g. clustering/stemming algorithms and exclusion of documents/significant terms, to better match the biological question addressed. http://biotextquest.biol.ucy.ac.cy vprobon@ucy.ac.cy; iliopj@med.uoc.gr Supplementary data are available at Bioinformatics online.

  4. Automatic document classification of biological literature

    PubMed Central

    Chen, David; Müller, Hans-Michael; Sternberg, Paul W

    2006-01-01

    Background Document classification is a wide-spread problem with many applications, from organizing search engine snippets to spam filtering. We previously described Textpresso, a text-mining system for biological literature, which marks up full text according to a shallow ontology that includes terms of biological interest. This project investigates document classification in the context of biological literature, making use of the Textpresso markup of a corpus of Caenorhabditis elegans literature. Results We present a two-step text categorization algorithm to classify a corpus of C. elegans papers. Our classification method first uses a support vector machine-trained classifier, followed by a novel, phrase-based clustering algorithm. This clustering step autonomously creates cluster labels that are descriptive and understandable by humans. This clustering engine performed better on a standard test-set (Reuters 21578) compared to previously published results (F-value of 0.55 vs. 0.49), while producing cluster descriptions that appear more useful. A web interface allows researchers to quickly navigate through the hierarchy and look for documents that belong to a specific concept. Conclusion We have demonstrated a simple method to classify biological documents that embodies an improvement over current methods. While the classification results are currently optimized for Caenorhabditis elegans papers by human-created rules, the classification engine can be adapted to different types of documents. We have demonstrated this by presenting a web interface that allows researchers to quickly navigate through the hierarchy and look for documents that belong to a specific concept. PMID:16893465
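
    The two-step design described here (a supervised SVM filter followed by descriptive clustering) can be sketched with scikit-learn. This is an illustrative toy, not the Textpresso code: the corpus and labels are fabricated, and the phrase-based cluster labeling is approximated by top TF-IDF terms.

    ```python
    # Sketch of a two-step classify-then-cluster pipeline (toy data, not Textpresso).
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC
    from sklearn.cluster import KMeans
    import numpy as np

    docs = ["rnai knockdown in c. elegans", "spam offer click here",
            "expression of unc-22 in muscle", "cheap pills online"]
    labels = [1, 0, 1, 0]  # 1 = relevant biological paper, 0 = not

    vec = TfidfVectorizer()
    X = vec.fit_transform(docs)

    # Step 1: SVM filter keeps documents classified as relevant.
    clf = LinearSVC().fit(X, labels)
    relevant = [d for d, y in zip(docs, clf.predict(X)) if y == 1]

    # Step 2: cluster the relevant subset and label clusters by top terms
    # (a crude stand-in for the paper's phrase-based cluster labeling).
    Xr = vec.transform(relevant)
    km = KMeans(n_clusters=2, n_init=10).fit(Xr)
    terms = np.array(vec.get_feature_names_out())
    for c in range(km.n_clusters):
        top = terms[np.argsort(km.cluster_centers_[c])[::-1][:3]]
        print(c, top)
    ```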

  5. Fuzzy Document Clustering Approach using WordNet Lexical Categories

    NASA Astrophysics Data System (ADS)

    Gharib, Tarek F.; Fouad, Mohammed M.; Aref, Mostafa M.

    Text mining refers generally to the process of extracting interesting information and knowledge from unstructured text. This area is growing rapidly, mainly because of the strong need to analyse the huge amount of textual data that resides on internal file systems and the Web. Text document clustering provides an effective navigation mechanism for organizing this large amount of data by grouping documents into a small number of meaningful classes. In this paper we propose a fuzzy text document clustering approach using WordNet lexical categories and the Fuzzy c-Means algorithm. Experiments are performed to compare the efficiency of the proposed approach with recently reported approaches. The results show that fuzzy clustering performs well: Fuzzy c-Means outperforms classical clustering algorithms such as k-means and bisecting k-means in both clustering quality and running-time efficiency.
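
    A minimal fuzzy c-means in plain NumPy, assuming random stand-in vectors in place of the paper's WordNet-enriched TF-IDF representation; the membership update follows the standard FCM formula, not the authors' code.

    ```python
    # Minimal fuzzy c-means (illustrative; the paper's WordNet category
    # mapping and evaluation are omitted).
    import numpy as np

    def fuzzy_cmeans(X, c=3, m=2.0, iters=100, eps=1e-6, seed=0):
        rng = np.random.default_rng(seed)
        U = rng.random((len(X), c))
        U /= U.sum(axis=1, keepdims=True)          # memberships sum to 1 per doc
        for _ in range(iters):
            Um = U ** m
            centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
            d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
            Unew = 1.0 / (d ** (2 / (m - 1)))      # standard FCM membership update
            Unew /= Unew.sum(axis=1, keepdims=True)
            if np.abs(Unew - U).max() < eps:
                U = Unew
                break
            U = Unew
        return U, centers

    X = np.random.default_rng(1).random((20, 5))   # stand-in for TF-IDF vectors
    U, centers = fuzzy_cmeans(X)
    print(U.argmax(axis=1))                        # hard assignment per document
    ```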

  6. Thematic clustering of text documents using an EM-based approach

    PubMed Central

    2012-01-01

    Clustering textual contents is an important step in mining useful information on the web or other text-based resources. The common task in text clustering is to handle text in a multi-dimensional space, and to partition documents into groups, where each group contains documents that are similar to each other. However, this strategy lacks a comprehensive view for humans in general since it cannot explain the main subject of each cluster. Utilizing semantic information can solve this problem, but it needs a well-defined ontology or pre-labeled gold standard set. In this paper, we present a thematic clustering algorithm for text documents. Given text, subject terms are extracted and used for clustering documents in a probabilistic framework. An EM approach is used to ensure documents are assigned to correct subjects, hence it converges to a locally optimal solution. The proposed method is distinctive because its results are sufficiently explanatory for human understanding as well as efficient for clustering performance. The experimental results show that the proposed method provides a competitive performance compared to other state-of-the-art approaches. We also show that the extracted themes from the MEDLINE® dataset represent the subjects of clusters reasonably well. PMID:23046528
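
    A minimal EM for a mixture of multinomials over term counts shows the clustering machinery the abstract describes; the subject-term extraction that makes the method "thematic" is omitted, and all data here are fabricated.

    ```python
    # Minimal EM for a mixture of multinomials over term counts — a generic
    # stand-in for the paper's theme-based model. Converges to a local optimum.
    import numpy as np

    def em_multinomial(X, k=2, iters=50, seed=0):
        # X: (n_docs, n_terms) count matrix
        n, v = X.shape
        rng = np.random.default_rng(seed)
        pi = np.full(k, 1.0 / k)                    # mixing weights
        theta = rng.dirichlet(np.ones(v), size=k)   # per-cluster term distributions
        for _ in range(iters):
            # E-step: log responsibilities, shifted for numerical stability
            logr = np.log(pi) + X @ np.log(theta).T
            logr -= logr.max(axis=1, keepdims=True)
            r = np.exp(logr)
            r /= r.sum(axis=1, keepdims=True)
            # M-step
            pi = r.mean(axis=0)
            theta = (r.T @ X) + 1.0                 # +1: Laplace smoothing
            theta /= theta.sum(axis=1, keepdims=True)
        return r.argmax(axis=1), theta

    X = np.random.default_rng(1).integers(0, 5, size=(30, 50))
    assign, theta = em_multinomial(X)
    print(assign)
    ```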

  7. Pipelining Architecture of Indexing Using Agglomerative Clustering

    NASA Astrophysics Data System (ADS)

    Goyal, Deepika; Goyal, Deepti; Gupta, Parul

    2010-11-01

    The World Wide Web is an interlinked collection of billions of documents. Ironically, the huge size of this collection has become an obstacle to information retrieval: users reach it through search engines, which retrieve pages from an indexer. This paper introduces a novel pipelining technique for structuring the core index-building system that substantially reduces index construction time, together with a clustering algorithm that partitions the document set into ordered clusters so that documents within the same cluster are similar and are assigned nearby document identifiers. After cluster assignment, a hierarchy of indexes is built so that searching is efficient: clusters are merged pairwise into super clusters and then mega clusters. The resulting index is efficient in both space and time, since a search is directed from higher to lower levels of the index hierarchy, and because each higher-level cluster is formed from only two lower-level clusters, the search at each level is limited to two candidate clusters.
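
    The idea of assigning nearby document identifiers to similar documents can be sketched with SciPy's agglomerative clustering: ordering documents by dendrogram leaf position keeps similar documents close in ID space. A sketch of the idea under toy data, not the paper's pipelined indexer.

    ```python
    # Assigning close document IDs to similar documents via agglomerative
    # clustering (illustrative only).
    from scipy.cluster.hierarchy import linkage, leaves_list
    from sklearn.feature_extraction.text import TfidfVectorizer

    docs = ["web search engine index", "search engine ranking",
            "protein sequence alignment", "genome sequence assembly"]
    X = TfidfVectorizer().fit_transform(docs).toarray()

    Z = linkage(X, method="average", metric="cosine")  # bottom-up merge tree
    order = leaves_list(Z)                             # dendrogram leaf order
    doc_ids = {int(doc): i for i, doc in enumerate(order)}
    for d, i in sorted(doc_ids.items()):
        print(f"doc {d} -> id {i}")                    # similar docs get close IDs
    ```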

  8. RSAT 2015: Regulatory Sequence Analysis Tools

    PubMed Central

    Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A.; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M.; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

    2015-01-01

    RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/. PMID:25904632

  9. RSAT 2015: Regulatory Sequence Analysis Tools.

    PubMed

    Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

    2015-07-01

    RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. Assessing the Amazon Cloud Suitability for CLARREO's Computational Needs

    NASA Technical Reports Server (NTRS)

    Goldin, Daniel; Vakhnin, Andrei A.; Currey, Jon C.

    2015-01-01

    In this document we compare the performance of the Amazon Web Services (AWS), also known as Amazon Cloud, with the CLARREO (Climate Absolute Radiance and Refractivity Observatory) cluster and assess its suitability for the computational needs of the CLARREO mission. A benchmark executable to process one month and one year of PARASOL (Polarization and Anisotropy of Reflectances for Atmospheric Sciences coupled with Observations from a Lidar) data was used. With the optimal AWS configuration, adequate data-processing times, comparable to the CLARREO cluster, were found. The assessment of alternatives to the CLARREO cluster continues and several options, such as a NASA-based cluster, are being considered.

  11. The BioPrompt-box: an ontology-based clustering tool for searching in biological databases.

    PubMed

    Corsi, Claudio; Ferragina, Paolo; Marangoni, Roberto

    2007-03-08

    High-throughput molecular biology provides new data at an incredible rate, so that the increase in the size of biological databanks is enormous and very rapid. This scenario generates severe problems not only at indexing time, where suitable algorithmic techniques for data indexing and retrieval are required, but also at query time, since a user query may produce such a large set of results that their browsing and "understanding" becomes humanly impractical. This problem is well known to the Web community, where a new generation of Web search engines is being developed, like Vivisimo. These tools organize on-the-fly the results of a user query in a hierarchy of labeled folders that ease their browsing and knowledge extraction. We investigate this approach on biological data, and propose the so-called BioPrompt-box software system, which deploys ontology-driven clustering strategies for making the searching process of biologists more efficient and effective. The BioPrompt-box (Bpb) defines a document as a biological sequence plus its associated meta-data taken from the underlying databank--like references to ontologies or to external databanks, and plain texts such as comments of researchers and (titles, abstracts or even bodies of) papers. Bpb offers several tools to customize the search and the clustering process over its indexed documents. The user can search a set of keywords within a specific field of the document schema, or can execute Blast to find documents relative to homologous sequences. In both cases the search task returns a set of documents (hits) which constitute the answer to the user query. Since the number of hits may be large, Bpb clusters them into groups of homogeneous content, organized as a hierarchy of labeled clusters. The user can choose among several ontology-based hierarchical clustering strategies, each offering a different "view" of the returned hits. Bpb computes these views by exploiting the meta-data present within the retrieved documents, such as the references to Gene Ontology, the taxonomy lineage, the organism and the keywords. Of course, the approach is flexible enough to leave room for future additions of other meta-information. The ultimate goal of the clustering process is to provide the user with several different readings of the (maybe numerous) query results and show possible hidden correlations among them, thus improving their browsing and understanding. Bpb is a powerful search engine that makes it very easy to perform complex queries over the indexed databanks (currently only UNIPROT is considered). The ontology-based clustering approach is efficient and effective, and could thus be applied successfully to larger databanks, like GenBank or EMBL.

  12. The BioPrompt-box: an ontology-based clustering tool for searching in biological databases

    PubMed Central

    Corsi, Claudio; Ferragina, Paolo; Marangoni, Roberto

    2007-01-01

    Background High-throughput molecular biology provides new data at an incredible rate, so that the increase in the size of biological databanks is enormous and very rapid. This scenario generates severe problems not only at indexing time, where suitable algorithmic techniques for data indexing and retrieval are required, but also at query time, since a user query may produce such a large set of results that their browsing and "understanding" becomes humanly impractical. This problem is well known to the Web community, where a new generation of Web search engines is being developed, like Vivisimo. These tools organize on-the-fly the results of a user query in a hierarchy of labeled folders that ease their browsing and knowledge extraction. We investigate this approach on biological data, and propose the so-called BioPrompt-box software system, which deploys ontology-driven clustering strategies for making the searching process of biologists more efficient and effective. Results The BioPrompt-box (Bpb) defines a document as a biological sequence plus its associated meta-data taken from the underlying databank – like references to ontologies or to external databanks, and plain texts such as comments of researchers and (titles, abstracts or even bodies of) papers. Bpb offers several tools to customize the search and the clustering process over its indexed documents. The user can search a set of keywords within a specific field of the document schema, or can execute Blast to find documents relative to homologous sequences. In both cases the search task returns a set of documents (hits) which constitute the answer to the user query. Since the number of hits may be large, Bpb clusters them into groups of homogeneous content, organized as a hierarchy of labeled clusters. The user can choose among several ontology-based hierarchical clustering strategies, each offering a different "view" of the returned hits. Bpb computes these views by exploiting the meta-data present within the retrieved documents, such as the references to Gene Ontology, the taxonomy lineage, the organism and the keywords. Of course, the approach is flexible enough to leave room for future additions of other meta-information. The ultimate goal of the clustering process is to provide the user with several different readings of the (maybe numerous) query results and show possible hidden correlations among them, thus improving their browsing and understanding. Conclusion Bpb is a powerful search engine that makes it very easy to perform complex queries over the indexed databanks (currently only UNIPROT is considered). The ontology-based clustering approach is efficient and effective, and could thus be applied successfully to larger databanks, like GenBank or EMBL. PMID:17430575

  13. Font adaptive word indexing of modern printed documents.

    PubMed

    Marinai, Simone; Marino, Emanuele; Soda, Giovanni

    2006-08-01

    We propose an approach for the word-level indexing of modern printed documents which are difficult to recognize using current OCR engines. By means of word-level indexing, it is possible to retrieve the position of words in a document, enabling queries involving proximity of terms. Web search engines implement this kind of indexing, allowing users to retrieve Web pages on the basis of their textual content. Nowadays, digital libraries hold collections of digitized documents that can be retrieved either by browsing the document images or relying on appropriate metadata assembled by domain experts. Word indexing tools would therefore increase the access to these collections. The proposed system is designed to index homogeneous document collections by automatically adapting to different languages and font styles without relying on OCR engines for character recognition. The approach is based on three main ideas: the use of Self Organizing Maps (SOM) to perform unsupervised character clustering, the definition of a suitable vector-based word representation whose size depends on the word aspect-ratio, and the run-time alignment of the query word with indexed words to deal with broken and touching characters. The most appropriate applications are for processing modern printed documents (17th to 19th centuries) where current OCR engines are less accurate. Our experimental analysis addresses six data sets containing documents ranging from books of the 17th century to contemporary journals.
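
    A minimal self-organizing map in NumPy illustrates the unsupervised character clustering step; the glyph features, grid size and learning schedules below are placeholders, not the authors' configuration.

    ```python
    # Minimal self-organizing map (SOM) for clustering character-image feature
    # vectors — a sketch, not the paper's indexing system.
    import numpy as np

    def train_som(X, grid=(8, 8), iters=2000, lr0=0.5, sigma0=3.0, seed=0):
        rng = np.random.default_rng(seed)
        h, w = grid
        W = rng.random((h, w, X.shape[1]))                 # codebook vectors
        ys, xs = np.mgrid[0:h, 0:w]
        for t in range(iters):
            x = X[rng.integers(len(X))]
            d = np.linalg.norm(W - x, axis=2)
            by, bx = np.unravel_index(d.argmin(), d.shape) # best-matching unit
            lr = lr0 * np.exp(-t / iters)                  # decaying learning rate
            sigma = sigma0 * np.exp(-t / iters)            # shrinking neighborhood
            g = np.exp(-((ys - by) ** 2 + (xs - bx) ** 2) / (2 * sigma ** 2))
            W += lr * g[..., None] * (x - W)               # pull neighborhood toward x
        return W

    X = np.random.default_rng(1).random((500, 64))         # e.g. 8x8 glyph patches
    W = train_som(X)
    print(W.shape)                                         # one prototype per map cell
    ```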

  14. MR-Tandem: parallel X!Tandem using Hadoop MapReduce on Amazon Web Services.

    PubMed

    Pratt, Brian; Howbert, J Jeffry; Tasman, Natalie I; Nilsson, Erik J

    2012-01-01

    MR-Tandem adapts the popular X!Tandem peptide search engine to work with Hadoop MapReduce for reliable parallel execution of large searches. MR-Tandem runs on any Hadoop cluster but offers special support for Amazon Web Services for creating inexpensive on-demand Hadoop clusters, enabling search volumes that might not otherwise be feasible with the compute resources a researcher has at hand. MR-Tandem is designed to drop in wherever X!Tandem is already in use and requires no modification to existing X!Tandem parameter files, and only minimal modification to X!Tandem-based workflows. MR-Tandem is implemented as a lightly modified X!Tandem C++ executable and a Python script that drives Hadoop clusters including Amazon Web Services (AWS) Elastic Map Reduce (EMR), using the modified X!Tandem program as a Hadoop Streaming mapper and reducer. The modified X!Tandem C++ source code is Artistic licensed, supports pluggable scoring, and is available as part of the Sashimi project at http://sashimi.svn.sourceforge.net/viewvc/sashimi/trunk/trans_proteomic_pipeline/extern/xtandem/. The MR-Tandem Python script is Apache licensed and available as part of the Insilicos Cloud Army project at http://ica.svn.sourceforge.net/viewvc/ica/trunk/mr-tandem/. Full documentation and a Windows installer that configures MR-Tandem, Python and all necessary packages are available at this same URL. brian.pratt@insilicos.com

  15. ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis

    PubMed Central

    Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas

    2016-01-01

    Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and are reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/. PMID:26882475

  16. Document clustering methods, document cluster label disambiguation methods, document clustering apparatuses, and articles of manufacture

    DOEpatents

    Sanfilippo, Antonio [Richland, WA; Calapristi, Augustin J [West Richland, WA; Crow, Vernon L [Richland, WA; Hetzler, Elizabeth G [Kennewick, WA; Turner, Alan E [Kennewick, WA

    2009-12-22

    Document clustering methods, document cluster label disambiguation methods, document clustering apparatuses, and articles of manufacture are described. In one aspect, a document clustering method includes providing a document set comprising a plurality of documents, providing a cluster comprising a subset of the documents of the document set, using a plurality of terms of the documents, providing a cluster label indicative of subject matter content of the documents of the cluster, wherein the cluster label comprises a plurality of word senses, and selecting one of the word senses of the cluster label.

  17. Extracting Related Words from Anchor Text Clusters by Focusing on the Page Designer's Intention

    NASA Astrophysics Data System (ADS)

    Liu, Jianquan; Chen, Hanxiong; Furuse, Kazutaka; Ohbo, Nobuo

    Approaches that extract related words (terms) by co-occurrence sometimes work poorly: two words that frequently co-occur in the same documents are considered related, yet they may not be related at all, sharing neither common meaning nor similar semantics. We address this problem by considering the page designer’s intention and propose a new model to extract related words. Our approach is based on the idea that web page designers usually place correlated hyperlinks in close zones on the browser. We developed a browser-based crawler to collect “geographically” near hyperlinks, and then, by clustering these hyperlinks based on their pixel coordinates, we extract related words which reflect the designer’s intention well. Experimental results show that our method can represent the intention of the web page designer with extremely high precision. Moreover, the experiments indicate that our extraction method obtains related words with high average precision.
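
    The pixel-coordinate clustering step can be approximated with DBSCAN over anchor positions; the coordinates and anchor texts below are fabricated, and DBSCAN is a stand-in for whatever clustering procedure the authors actually used.

    ```python
    # Grouping hyperlinks by rendered pixel position (toy coordinates).
    from sklearn.cluster import DBSCAN
    import numpy as np

    # (x, y) top-left pixel coordinates of anchor elements on a rendered page
    coords = np.array([[10, 100], [10, 120], [12, 140],     # a navigation block
                       [400, 95], [402, 118],               # a related-links box
                       [700, 600]])                         # an isolated link
    anchors = ["home", "news", "sports", "weather", "forecast", "contact"]

    labels = DBSCAN(eps=50, min_samples=2).fit_predict(coords)
    for cluster in set(labels) - {-1}:                      # -1 = unclustered noise
        print(cluster, [a for a, l in zip(anchors, labels) if l == cluster])
    ```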

  18. MR-Tandem: parallel X!Tandem using Hadoop MapReduce on Amazon Web Services

    PubMed Central

    Pratt, Brian; Howbert, J. Jeffry; Tasman, Natalie I.; Nilsson, Erik J.

    2012-01-01

    Summary: MR-Tandem adapts the popular X!Tandem peptide search engine to work with Hadoop MapReduce for reliable parallel execution of large searches. MR-Tandem runs on any Hadoop cluster but offers special support for Amazon Web Services for creating inexpensive on-demand Hadoop clusters, enabling search volumes that might not otherwise be feasible with the compute resources a researcher has at hand. MR-Tandem is designed to drop in wherever X!Tandem is already in use and requires no modification to existing X!Tandem parameter files, and only minimal modification to X!Tandem-based workflows. Availability and implementation: MR-Tandem is implemented as a lightly modified X!Tandem C++ executable and a Python script that drives Hadoop clusters including Amazon Web Services (AWS) Elastic Map Reduce (EMR), using the modified X!Tandem program as a Hadoop Streaming mapper and reducer. The modified X!Tandem C++ source code is Artistic licensed, supports pluggable scoring, and is available as part of the Sashimi project at http://sashimi.svn.sourceforge.net/viewvc/sashimi/trunk/trans_proteomic_pipeline/extern/xtandem/. The MR-Tandem Python script is Apache licensed and available as part of the Insilicos Cloud Army project at http://ica.svn.sourceforge.net/viewvc/ica/trunk/mr-tandem/. Full documentation and a Windows installer that configures MR-Tandem, Python and all necessary packages are available at this same URL. Contact: brian.pratt@insilicos.com PMID:22072385

  19. Development of a model of the tobacco industry's interference with tobacco control programmes

    PubMed Central

    Trochim, W; Stillman, F; Clark, P; Schmitt, C

    2003-01-01

    Objective: To construct a conceptual model of tobacco industry tactics to undermine tobacco control programmes for the purposes of: (1) developing measures to evaluate industry tactics, (2) improving tobacco control planning, and (3) supplementing current or future frameworks used to classify and analyse tobacco industry documents. Design: Web based concept mapping was conducted, including expert brainstorming, sorting, and rating of statements describing industry tactics. Statistical analyses used multidimensional scaling and cluster analysis. Interpretation of the resulting maps was accomplished by an expert panel during a face-to-face meeting. Subjects: 34 experts, selected because of their previous encounters with industry resistance or because of their research into industry tactics, took part in some or all phases of the project. Results: Maps with eight non-overlapping clusters in two dimensional space were developed, with importance ratings of the statements and clusters. Cluster and quadrant labels were agreed upon by the experts. Conclusions: The conceptual maps summarise the tactics used by the industry and their relationships to each other, and suggest a possible hierarchy for measures that can be used in statistical modelling of industry tactics and for review of industry documents. Finally, the maps enable hypotheses about a likely progression of industry reactions as public health programmes become more successful, and therefore more threatening to industry profits. PMID:12773723

  20. WEBCAP: Web Scheduler for Distance Learning Multimedia Documents with Web Workload Considerations

    ERIC Educational Resources Information Center

    Habib, Sami; Safar, Maytham

    2008-01-01

    In many web applications, such as the distance learning, the frequency of refreshing multimedia web documents places a heavy burden on the WWW resources. Moreover, the updated web documents may encounter inordinate delays, which make it difficult to retrieve web documents in time. Here, we present an Internet tool called WEBCAP that can schedule…

  1. Mining a Web Citation Database for Author Co-Citation Analysis.

    ERIC Educational Resources Information Center

    He, Yulan; Hui, Siu Cheung

    2002-01-01

    Proposes a mining process to automate author co-citation analysis based on the Web Citation Database, a data warehouse for storing citation indices of Web publications. Describes the use of agglomerative hierarchical clustering for author clustering and multidimensional scaling for displaying author cluster maps, and explains PubSearch, a…
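
    The combination the abstract names, agglomerative hierarchical clustering of a co-citation matrix plus multidimensional scaling for an author cluster map, can be sketched with SciPy and scikit-learn; the co-citation counts below are toy values, and this is not the PubSearch system.

    ```python
    # Author co-citation analysis sketch: cluster authors and project them to
    # 2-D with MDS (fabricated counts).
    import numpy as np
    from scipy.cluster.hierarchy import linkage, fcluster
    from sklearn.manifold import MDS

    C = np.array([[0, 8, 1, 0],     # symmetric co-citation counts, 4 authors
                  [8, 0, 2, 1],
                  [1, 2, 0, 9],
                  [0, 1, 9, 0]], dtype=float)
    D = C.max() - C                 # turn counts into a dissimilarity
    np.fill_diagonal(D, 0)

    # condensed upper-triangle form for agglomerative clustering
    Z = linkage(D[np.triu_indices(4, k=1)], method="average")
    print(fcluster(Z, t=2, criterion="maxclust"))   # two author clusters

    # 2-D coordinates for an author cluster map
    xy = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(D)
    print(xy)
    ```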

  2. ClusterControl: a web interface for distributing and monitoring bioinformatics applications on a Linux cluster.

    PubMed

    Stocker, Gernot; Rieder, Dietmar; Trajanoski, Zlatko

    2004-03-22

    ClusterControl is a web interface that simplifies distributing and monitoring bioinformatics applications on Linux cluster systems. We have developed a modular concept that enables integration of command-line-oriented programs into the application framework of ClusterControl. The system facilitates the integration of different applications, accessed through one interface and executed on a distributed cluster system. The package is based on freely available technologies like Apache as web server, PHP as server-side scripting language and OpenPBS as queuing system, and is available free of charge for academic and non-profit institutions. http://genome.tugraz.at/Software/ClusterControl

  3. A Web service substitution method based on service cluster nets

    NASA Astrophysics Data System (ADS)

    Du, YuYue; Gai, JunJing; Zhou, MengChu

    2017-11-01

    Service substitution is an important research topic in the fields of Web services and service-oriented computing. This work presents a novel method to analyse and substitute Web services. A new concept, called a Service Cluster Net Unit, is proposed based on Web service clusters. A service cluster is converted into a Service Cluster Net Unit. Then it is used to analyse whether the services in the cluster can satisfy some service requests. Meanwhile, the substitution methods of an atomic service and a composite service are proposed. The correctness of the proposed method is proved, and the effectiveness is shown and compared with the state-of-the-art method via an experiment. It can be readily applied to e-commerce service substitution to meet the business automation needs.

  4. Document Clustering Approach for Meta Search Engine

    NASA Astrophysics Data System (ADS)

    Kumar, Naresh, Dr.

    2017-08-01

    The size of the WWW grows exponentially with every change in technology, producing a huge amount of information behind long lists of URLs that no user can visit page by page. If page-ranking algorithms are used properly, the user's search space can be restricted to a few pages of results, but the available literature shows that no single search system provides high-quality results across all domains. This paper addresses the problem by introducing a new meta search engine that determines the relevancy of a query to each web page and clusters the results accordingly. The proposed approach reduces user effort and improves both the quality of results and the performance of the meta search engine.

  5. Dynamic "inline" images: context-sensitive retrieval and integration of images into Web documents.

    PubMed

    Kahn, Charles E

    2008-09-01

    Integrating relevant images into web-based information resources adds value for research and education. This work sought to evaluate the feasibility of using "Web 2.0" technologies to dynamically retrieve and integrate pertinent images into a radiology web site. An online radiology reference of 1,178 textual web documents was selected as the set of target documents. The ARRS GoldMiner image search engine, which incorporated 176,386 images from 228 peer-reviewed journals, retrieved images on demand and integrated them into the documents. At least one image was retrieved in real-time for display as an "inline" image gallery for 87% of the web documents. Each thumbnail image was linked to the full-size image at its original web site. Review of 20 randomly selected Collaborative Hypertext of Radiology documents found that 69 of 72 displayed images (96%) were relevant to the target document. Users could click on the "More" link to search the image collection more comprehensively and, from there, link to the full text of the article. A gallery of relevant radiology images can be inserted easily into web pages on any web server. Indexing by concepts and keywords allows context-aware image retrieval, and searching by document title and subject metadata yields excellent results. These techniques allow web developers to incorporate easily a context-sensitive image gallery into their documents.

  6. Study of parameters of the nearest neighbour shared algorithm on clustering documents

    NASA Astrophysics Data System (ADS)

    Mustika Rukmi, Alvida; Budi Utomo, Daryono; Imro’atus Sholikhah, Neni

    2018-03-01

    Document clustering is one way of automatically managing documents, extracting document topics and rapidly filtering information. Preprocessing for document clustering is done by text mining and consists of keyword extraction using Rapid Automatic Keyphrase Extraction (RAKE) and representation of each document as a concept vector using Latent Semantic Analysis (LSA). Clustering is then performed on this representation so that documents on similar topics fall in the same cluster. The Shared Nearest Neighbour (SNN) algorithm is a clustering method based on the number of "nearest neighbours" shared. Its parameters are k, the number of nearest-neighbour documents; ε, the required number of shared nearest-neighbour documents; and MinT, the minimum number of similar documents that can form a cluster. The SNN algorithm builds on shared-"neighbour" properties: each cluster is formed by keywords shared by its documents, and a cluster can be built on more than one keyword if those keywords appear frequently in the documents. The choice of parameter values affects the clustering results. A higher k increases the number of neighbour documents per document, which lowers the similarity among neighbours and the accuracy of each cluster. A higher ε means each document keeps only neighbours with high similarity when building a cluster, which also leaves more documents unclassified (noise). A higher MinT decreases the number of clusters, since groups of similar documents smaller than MinT cannot form one. The parameters of the SNN algorithm thus determine clustering performance and the amount of noise (unclustered documents). The silhouette coefficient shows almost the same result, above 0.9, across many experiments, which means that the SNN algorithm works well with different parameter values.
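
    A minimal shared-nearest-neighbour grouping with the three parameters studied (k, ε, MinT), assuming scikit-learn for the kNN step; linking documents whose kNN lists overlap by at least ε and discarding components smaller than MinT is one common SNN variant, not necessarily the authors' exact algorithm.

    ```python
    # Minimal SNN clustering sketch (parameters: k, eps, min_t).
    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def snn_clusters(X, k=5, eps=3, min_t=2):
        # k nearest neighbours per document (column 0 is the document itself)
        nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
        knn = [set(row[1:]) for row in nn.kneighbors(X, return_distance=False)]
        n = len(X)
        # link two documents when their kNN lists share at least eps members
        adj = [[j for j in range(n) if j != i and len(knn[i] & knn[j]) >= eps]
               for i in range(n)]
        labels, visited, cur = [-1] * n, [False] * n, 0
        for i in range(n):                  # connected components become clusters
            if visited[i]:
                continue
            stack, comp = [i], [i]
            visited[i] = True
            while stack:
                for j in adj[stack.pop()]:
                    if not visited[j]:
                        visited[j] = True
                        stack.append(j)
                        comp.append(j)
            if len(comp) >= min_t:          # components smaller than MinT = noise
                for j in comp:
                    labels[j] = cur
                cur += 1
        return labels

    X = np.random.default_rng(0).random((40, 10))   # stand-in document vectors
    print(snn_clusters(X))                          # -1 marks unclustered noise
    ```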

  7. The Food Web of Potter Cove (Antarctica): complexity, structure and function

    NASA Astrophysics Data System (ADS)

    Marina, Tomás I.; Salinas, Vanesa; Cordone, Georgina; Campana, Gabriela; Moreira, Eugenia; Deregibus, Dolores; Torre, Luciana; Sahade, Ricardo; Tatián, Marcos; Barrera Oro, Esteban; De Troch, Marleen; Doyle, Santiago; Quartino, María Liliana; Saravia, Leonardo A.; Momo, Fernando R.

    2018-01-01

    Knowledge of food web structure and complexity is central to a better understanding of ecosystem functioning. A food-web approach includes both species and energy flows among them, providing a natural framework for characterizing species' ecological roles and the mechanisms through which biodiversity influences ecosystem dynamics. Here we present for the first time a high-resolution food web for a marine ecosystem at Potter Cove (northern Antarctic Peninsula). Eleven food web properties were analyzed in order to document network complexity, structure and topology. We found a low linkage density (3.4), connectance (0.04) and omnivory percentage (45), as well as a short path length (1.8) and a low clustering coefficient (0.08). Furthermore, relating the structure of the food web to its dynamics, an exponential degree distribution (in- and out-links) was found. This suggests that the Potter Cove food web may be vulnerable if the most connected species became locally extinct. For two of the three most connected functional groups, competition overlap graphs imply high trophic interaction between demersal fish and niche specialization according to feeding strategies in amphipods. On the other hand, the prey overlap graph also shows that multiple energy pathways of carbon flux exist across benthic and pelagic habitats in the Potter Cove ecosystem. Although alternative food sources might add robustness to the web, network properties (low linkage density, connectance and omnivory) suggest fragility and potential trophic cascade effects.

  8. Web service discovery among large service pools utilising semantic similarity and clustering

    NASA Astrophysics Data System (ADS)

    Chen, Fuzan; Li, Minqiang; Wu, Harris; Xie, Lingli

    2017-03-01

    With the rapid development of electronic business, Web services have attracted much attention in recent years. Enterprises can combine individual Web services to provide new value-added services. An emerging challenge is the timely discovery of close matches to service requests among large service pools. In this study, we first define a new semantic similarity measure combining functional similarity and process similarity. We then present a service discovery mechanism that utilises the new semantic similarity measure for service matching. All the published Web services are pre-grouped into functional clusters prior to the matching process. For a user's service request, the discovery mechanism first identifies matching services clusters and then identifies the best matching Web services within these matching clusters. Experimental results show that the proposed semantic discovery mechanism performs better than a conventional lexical similarity-based mechanism.

  9. Analysis of co-occurrence toponyms in web pages based on complex networks

    NASA Astrophysics Data System (ADS)

    Zhong, Xiang; Liu, Jiajun; Gao, Yong; Wu, Lun

    2017-01-01

    A large number of geographical toponyms exist in web pages and other documents, providing abundant geographical resources for GIS. It is very common for toponyms to co-occur in the same documents. To investigate these relations associated with geographic entities, a novel complex network model for co-occurrence toponyms is proposed. Then, 12 toponym co-occurrence networks are constructed from the toponym sets extracted from the People's Daily Paper documents of 2010. It is found that two toponyms have a high co-occurrence probability if they are at the same administrative level or if they possess a part-whole relationship. By applying complex network analysis methods to toponym co-occurrence networks, we find the following characteristics. (1) The navigation vertices of the co-occurrence networks can be found by degree centrality analysis. (2) The networks exhibit strong clustering characteristics, and it takes only a few steps to reach one vertex from another, implying that the networks are small-world graphs. (3) The degree distribution satisfies the power law with an exponent of 1.7, so the networks are scale-free. (4) The networks are disassortative and have similar assortative modes, with assortative exponents of approximately 0.18 and assortative indexes less than 0. (5) The frequency of toponym co-occurrence is weakly negatively correlated with geographic distance, but more strongly negatively correlated with administrative hierarchical distance. Considering the toponym frequencies and co-occurrence relationships, a novel method based on link analysis is presented to extract the core toponyms from web pages. This method is suitable and effective for geographical information retrieval.
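
    Building a toponym co-occurrence network and computing the properties the abstract reports (degree centrality, clustering, assortativity) is straightforward with networkx; the toponym sets below are fabricated stand-ins for extracted document data.

    ```python
    # Toy toponym co-occurrence network with the metrics named in the abstract.
    from itertools import combinations
    import networkx as nx

    docs = [{"Beijing", "China", "Shanghai"},
            {"China", "Shanghai"},
            {"Beijing", "China"},
            {"Paris", "France"}]

    G = nx.Graph()
    for toponyms in docs:
        for a, b in combinations(sorted(toponyms), 2):
            w = G.get_edge_data(a, b, {"weight": 0})["weight"]
            G.add_edge(a, b, weight=w + 1)        # edge weight = co-occurrence count

    print(nx.degree_centrality(G))                # candidate navigation vertices
    print(nx.average_clustering(G))               # small-world indicator
    print(nx.degree_assortativity_coefficient(G)) # disassortative if < 0
    ```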

  10. Web Program for Development of GUIs for Cluster Computers

    NASA Technical Reports Server (NTRS)

    Czikmantory, Akos; Cwik, Thomas; Klimeck, Gerhard; Hua, Hook; Oyafuso, Fabiano; Vinyard, Edward

    2003-01-01

    WIGLAF (a Web Interface Generator and Legacy Application Facade) is a computer program that provides a Web-based, distributed, graphical-user-interface (GUI) framework that can be adapted to any of a broad range of application programs, written in any programming language, that are executed remotely on any cluster computer system. WIGLAF enables the rapid development of a GUI for controlling and monitoring a specific application program running on the cluster and for transferring data to and from the application program. The only prerequisite for the execution of WIGLAF is a Web-browser program on a user's personal computer connected with the cluster via the Internet. WIGLAF has a client/server architecture: The server component is executed on the cluster system, where it controls the application program and serves data to the client component. The client component is an applet that runs in the Web browser. WIGLAF utilizes the Extensible Markup Language to hold all data associated with the application software, Java to enable platform-independent execution on the cluster system and the display of a GUI generator through the browser, and the Java Remote Method Invocation software package to provide simple, effective client/server networking.

  11. Load Balancing in Distributed Web Caching: A Novel Clustering Approach

    NASA Astrophysics Data System (ADS)

    Tiwari, R.; Kumar, K.; Khan, G.

    2010-11-01

    The World Wide Web suffers from scaling and reliability problems due to overloaded and congested proxy servers. Caching at local proxy servers helps, but cannot satisfy more than a third to half of requests; the remaining requests are still sent to the remote origin servers. In this paper we develop an algorithm for a distributed Web cache that incorporates cooperation among the proxy servers of one cluster. The algorithm combines distributed Web-cache concepts with a static hierarchy of geographically based clusters of level-one proxy servers, plus a dynamic mechanism for reassigning proxy servers when a cluster becomes congested. Clustering addresses the congestion and scalability problems, resulting in a higher cache hit ratio and lower latency for requested pages. The algorithm also guarantees data consistency between origin-server objects and proxy-cache objects.

  12. Going, going, still there: using the WebCite service to permanently archive cited web pages.

    PubMed

    Eysenbach, Gunther; Trudel, Mathieu

    2005-12-30

    Scholars are increasingly citing electronic "web references" which are not preserved in libraries or full text archives. WebCite is a new standard for citing web references. To "webcite" a document involves archiving the cited Web page through www.webcitation.org and citing the WebCite permalink instead of (or in addition to) the unstable live Web page. This journal has amended its "instructions for authors" accordingly, asking authors to archive cited Web pages before submitting a manuscript. Almost 200 other journals are already using the system. We discuss the rationale for WebCite, its technology, and how scholars, editors, and publishers can benefit from the service. Citing scholars initiate an archiving process of all cited Web references, ideally before they submit a manuscript. Authors of online documents and websites which are expected to be cited by others can ensure that their work is permanently available by creating an archived copy using WebCite and providing the citation information including the WebCite link on their Web document(s). Editors should ask their authors to cache all cited Web addresses (Uniform Resource Locators, or URLs) "prospectively" before submitting their manuscripts to their journal. Editors and publishers should also instruct their copyeditors to cache cited Web material if the author has not done so already. In addition, WebCite can process publisher-submitted "citing articles" (submitted for example as eXtensible Markup Language [XML] documents) to automatically archive all cited Web pages shortly before or on publication. Finally, WebCite can act as a focussed crawler, retrospectively caching references of already published articles. Copyright issues are addressed by honouring respective Internet standards (robot exclusion files, no-cache and no-archive tags). Long-term preservation is ensured by agreements with libraries and digital preservation organizations. The resulting WebCite Index may also have applications for research assessment exercises, being able to measure the impact of Web services and published Web documents through access and Web citation metrics.

  13. Scientific authorship and collaboration network analysis on malaria research in Benin: papers indexed in the web of science (1996-2016).

    PubMed

    Azondekon, Roseric; Harper, Zachary James; Agossa, Fiacre Rodrigue; Welzig, Charles Michael; McRoy, Susan

    2018-01-01

    To sustain the critical progress made, prioritization and a multidisciplinary approach to malaria research remain important to the national malaria control program in Benin. To document the structure of malaria collaborative research in Benin, we analyze authorship of the scientific documents published on malaria from Benin. We collected bibliographic data from the Web of Science on malaria research in Benin from January 1996 to December 2016. From the collected data, a multigraph co-authorship network was generated, with authors as vertices and an edge drawn between two authors whenever they co-author a paper. We computed vertex degree, betweenness, closeness, and eigenvector centrality, among others, to identify prolific authors; we further assessed the network's weak points and how information flows through it; finally, we performed a hierarchical clustering analysis and Monte-Carlo simulations. Overall, 427 publications were included in this study. The generated network contained 1792 authors and 116,388 parallel edges, which were converted into a weighted graph of 1792 vertices and 95,787 edges. Our results suggest that prolific authors with higher degrees tend to collaborate more. The hierarchical clustering revealed 23 clusters, seven of which form a giant component containing 94% of all the vertices in the network. This giant component has the characteristics of a small-world network, with a small average shortest-path distance of three between pairs, a diameter of 10 and a high clustering coefficient of 0.964. However, Monte-Carlo simulations suggested our observed network is an unusual type of small-world network. Sixteen vertices were identified as weak articulation points within the network. The malaria research collaboration network in Benin is a complex network that displays the characteristics of a small-world network. This research reveals the presence of closed research groups where collaborative research likely happens only between members, while interdisciplinary collaboration tends to occur at higher levels between prolific researchers. Continuous support for, and stabilization of, the identified key brokers and most productive authors in the malaria research collaborative network is an urgent need in Benin: it will foster the malaria research network and ensure the promotion of junior scientists in the field.

  14. Extraction of a group-pair relation: problem-solving relation from web-board documents.

    PubMed

    Pechsiri, Chaveevan; Piriyakul, Rapepun

    2016-01-01

    This paper aims to extract a group-pair relation, a Problem-Solving relation such as a DiseaseSymptom-Treatment relation or a CarProblem-Repair relation, between two event-explanation groups: a problem-concept group (a symptom/CarProblem-concept group) and a solving-concept group (a treatment-concept/repair-concept group), from hospital-web-board and car-repair-guru-web-board documents. The Problem-Solving relation (particularly the Symptom-Treatment relation), together with its graphical representation, benefits non-professionals by supporting basic problem-solving knowledge. The research addresses three problems: how to identify an EDU (an Elementary Discourse Unit, i.e. a simple sentence) carrying the event concept of either a problem or a solution; how to determine the problem-concept and solving-concept EDU boundaries that form the two event-explanation groups; and how to determine the Problem-Solving relation between these two groups. We apply word co-occurrence to identify problem-concept and solving-concept EDUs, machine-learning techniques to determine the problem-concept and solving-concept EDU boundaries, and we propose using k-means and Naïve Bayes with clustering features to determine the Problem-Solving relation between the two event-explanation groups. In contrast to previous works, the proposed approach enables group-pair relation extraction with high accuracy.

  15. ΛGR Centennial: Cosmic Web in Dark Energy Background

    NASA Astrophysics Data System (ADS)

    Chernin, A. D.

    The basic building blocks of the Cosmic Web are groups and clusters of galaxies, super-clusters (pancakes) and filaments embedded in the universal dark energy background. The background produces antigravity, and the antigravity effect is strong in groups, clusters and superclusters. Antigravity is very weak in filaments where matter (dark matter and baryons) produces gravity dominating in the filament internal dynamics. Gravity-antigravity interplay on the large scales is a grandiose phenomenon predicted by ΛGR theory and seen in modern observations of the Cosmic Web.

  16. ICM: a web server for integrated clustering of multi-dimensional biomedical data.

    PubMed

    He, Song; He, Haochen; Xu, Wenjian; Huang, Xin; Jiang, Shuai; Li, Fei; He, Fuchu; Bo, Xiaochen

    2016-07-08

    Large-scale efforts for parallel acquisition of multi-omics profiling continue to generate extensive amounts of multi-dimensional biomedical data. Thus, integrated clustering of multiple types of omics data is essential for developing individual-based treatments and precision medicine. However, while rapid progress has been made, methods for integrated clustering lack an intuitive web interface that supports biomedical researchers without sufficient programming skills. Here, we present a web tool, named Integrated Clustering of Multi-dimensional biomedical data (ICM), that provides an interface from which to fuse, cluster and visualize multi-dimensional biomedical data and knowledge. With ICM, users can explore the heterogeneity of a disease or a biological process by identifying subgroups of patients. The results obtained can then be interactively modified by using an intuitive user interface. Researchers can also exchange the results from ICM with collaborators via a web link containing a Project ID number that will directly pull up the analysis results being shared. ICM also supports incremental clustering, which allows users to add new sample data to the data of a previous study to obtain a clustering result. Currently, the ICM web server is available with no login requirement and at no cost at http://biotech.bmi.ac.cn/icm/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  17. Trust estimation of the semantic web using semantic web clustering

    NASA Astrophysics Data System (ADS)

    Shirgahi, Hossein; Mohsenzadeh, Mehran; Haj Seyyed Javadi, Hamid

    2017-05-01

    The development of the semantic web and social networks is undeniable in today's Internet world, and the sheer breadth of the semantic web makes trust assessment in this field very challenging. Extensive research in recent years has addressed estimating the trust of the semantic web. Since trust in the semantic web is a multidimensional problem, in this paper we use social-network authority, page-link authority and semantic authority as parameters to assess trust. Because the semantic network space is large, we restrict the problem scope to clusters of semantic subnetworks, obtain the trust of each cluster's elements locally, and calculate the trust of outside resources from these local trusts and the clusters' trust in each other. According to the experimental results, the proposed method achieves an F-score above 79%, on average about 11.9% higher than the Eigen, Tidal and centralised trust methods; its mean error is 12.936, on average 9.75% lower than the Eigen and Tidal trust methods.

  18. Agent-based method for distributed clustering of textual information

    DOEpatents

    Potok, Thomas E [Oak Ridge, TN; Reed, Joel W [Knoxville, TN; Elmore, Mark T [Oak Ridge, TN; Treadwell, Jim N [Louisville, TN

    2010-09-28

    A computer method and system for storing, retrieving and displaying information has a multiplexing agent (20) that calculates a new document vector (25) for a new document (21) to be added to the system and transmits the new document vector (25) to master cluster agents (22) and cluster agents (23) for evaluation. These agents (22, 23) perform the evaluation and return values upstream to the multiplexing agent (20) based on the similarity of the document to documents stored under their control. The multiplexing agent (20) then sends the document (21) and the document vector (25) to the master cluster agent (22), which then forwards it to a cluster agent (23) or creates a new cluster agent (23) to manage the document (21). The system also searches for stored documents according to a search query having at least one term and identifying the documents found in the search, and displays the documents in a clustering display (80) of similarity so as to indicate similarity of the documents to each other.
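
    The routing flow the patent describes (score a new document vector against existing clusters, join the best match or spawn a new cluster) can be sketched in a single process; the threshold and vectors below are placeholders, and the agent communication is collapsed into plain method calls.

    ```python
    # Sketch of routing a new document to the most similar cluster or creating
    # a new one — toy stand-in for the patent's multiplexing/cluster-agent flow.
    import numpy as np

    class ClusterStore:
        def __init__(self, threshold=0.3):
            self.threshold = threshold        # minimum similarity to join a cluster
            self.centroids, self.members = [], []

        def add(self, vec, doc):
            # cosine similarity of the new document vector to each centroid
            if self.centroids:
                sims = [v @ vec / (np.linalg.norm(v) * np.linalg.norm(vec))
                        for v in self.centroids]
                best = int(np.argmax(sims))
                if sims[best] >= self.threshold:   # similar enough: join cluster
                    n = len(self.members[best])
                    self.centroids[best] = (self.centroids[best] * n + vec) / (n + 1)
                    self.members[best].append(doc)
                    return best
            self.centroids.append(vec.astype(float))  # otherwise: new cluster
            self.members.append([doc])
            return len(self.centroids) - 1

    store = ClusterStore()
    rng = np.random.default_rng(0)
    for i in range(5):
        print(store.add(rng.random(8), f"doc{i}"))  # cluster id per document
    ```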

  19. The design and implementation of web mining in web sites security

    NASA Astrophysics Data System (ADS)

    Li, Jian; Zhang, Guo-Yin; Gu, Guo-Chang; Li, Jian-Li

    2003-06-01

    Backdoors and information leaks in Web servers can be detected by applying Web mining techniques to abnormal Web log and Web application log data, enhancing server security and avoiding damage from illegal access. First, a system is proposed for discovering patterns of information leakage in CGI scripts from Web log data. Second, these patterns are provided to system administrators so they can modify their code and enhance their Web site security. Two aspects are described. One is to combine the web application log with the web log to extract more information, so that web data mining can discover information that a firewall or intrusion detection system cannot find. The other is an operational module for the web site that further enhances its security. For clustering server sessions, a density-based clustering technique is used to reduce resource cost and obtain better efficiency.

  20. OrthoVenn: a web server for genome wide comparison and annotation of orthologous clusters across multiple species

    USDA-ARS?s Scientific Manuscript database

    Genome wide analysis of orthologous clusters is an important component of comparative genomics studies. Identifying the overlap among orthologous clusters can enable us to elucidate the function and evolution of proteins across multiple species. Here, we report a web platform named OrthoVenn that i...

  1. Query Results Clustering by Extending SPARQL with CLUSTER BY

    NASA Astrophysics Data System (ADS)

    Ławrynowicz, Agnieszka

The task of dynamically clustering search results has proved useful in the Web context, where the user often does not know the granularity of the search results in advance. The goal of this paper is to provide a declarative way of invoking dynamic clustering of the results of queries submitted over Semantic Web data. To achieve this goal, the paper proposes an approach that extends SPARQL with clustering capabilities: it introduces a new statement, CLUSTER BY, into the SPARQL grammar and proposes semantics for this extension.
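
    CLUSTER BY is the paper's proposed grammar extension, not standard SPARQL, so a sketch can only emulate the intended behaviour client-side: a standard SELECT query retrieves the bindings, which are then clustered dynamically. The toy data, the rdflib and scikit-learn tooling, and the choice of k = 2 are all assumptions.

    ```python
    import numpy as np
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF
    from sklearn.cluster import KMeans

    EX = Namespace("http://example.org/")
    g = Graph()
    for i, price in enumerate([9.0, 11.0, 10.0, 98.0, 102.0]):
        g.add((EX[f"item{i}"], RDF.type, EX.Product))
        g.add((EX[f"item{i}"], EX.price, Literal(price)))

    # Standard SPARQL; a query ending in "... CLUSTER BY ?price" would ask the
    # engine itself to return these rows grouped into dynamically derived clusters.
    rows = g.query("""
        PREFIX ex: <http://example.org/>
        SELECT ?item ?price WHERE { ?item a ex:Product ; ex:price ?price . }
    """)
    items, prices = zip(*[(str(r.item), r.price.toPython()) for r in rows])
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(np.array(prices).reshape(-1, 1))
    for item, label in zip(items, labels):
        print(item, "-> cluster", label)
    ```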

  2. Going, Going, Still There: Using the WebCite Service to Permanently Archive Cited Web Pages

    PubMed Central

    Trudel, Mathieu

    2005-01-01

Scholars are increasingly citing electronic “web references” which are not preserved in libraries or full text archives. WebCite is a new standard for citing web references. To “webcite” a document involves archiving the cited Web page through www.webcitation.org and citing the WebCite permalink instead of (or in addition to) the unstable live Web page. This journal has amended its “instructions for authors” accordingly, asking authors to archive cited Web pages before submitting a manuscript. Almost 200 other journals are already using the system. We discuss the rationale for WebCite, its technology, and how scholars, editors, and publishers can benefit from the service. Citing scholars initiate an archiving process of all cited Web references, ideally before they submit a manuscript. Authors of online documents and websites which are expected to be cited by others can ensure that their work is permanently available by creating an archived copy using WebCite and providing the citation information including the WebCite link on their Web document(s). Editors should ask their authors to cache all cited Web addresses (Uniform Resource Locators, or URLs) “prospectively” before submitting their manuscripts to their journal. Editors and publishers should also instruct their copyeditors to cache cited Web material if the author has not done so already. Finally, WebCite can process publisher submitted “citing articles” (submitted for example as eXtensible Markup Language [XML] documents) to automatically archive all cited Web pages shortly before or on publication. WebCite can also act as a focussed crawler, retrospectively caching references of already published articles. Copyright issues are addressed by honouring respective Internet standards (robot exclusion files, no-cache and no-archive tags). Long-term preservation is ensured by agreements with libraries and digital preservation organizations. The resulting WebCite Index may also have applications for research assessment exercises, being able to measure the impact of Web services and published Web documents through access and Web citation metrics. PMID:16403724

  3. Clustering XML Documents Using Frequent Subtrees

    NASA Astrophysics Data System (ADS)

    Kutty, Sangeetha; Tran, Tien; Nayak, Richi; Li, Yuefeng

    This paper presents an experimental study conducted over the INEX 2008 Document Mining Challenge corpus using both the structure and the content of XML documents for clustering them. The concise common substructures known as the closed frequent subtrees are generated using the structural information of the XML documents. The closed frequent subtrees are then used to extract the constrained content from the documents. A matrix containing the term distribution of the documents in the dataset is developed using the extracted constrained content. The k-way clustering algorithm is applied to the matrix to obtain the required clusters. In spite of the large number of documents in the INEX 2008 Wikipedia dataset, the proposed frequent subtree-based clustering approach was successful in clustering the documents. This approach significantly reduces the dimensionality of the terms used for clustering without much loss in accuracy.

  4. RSAT 2018: regulatory sequence analysis tools 20th anniversary.

    PubMed

    Nguyen, Nga Thi Thuy; Contreras-Moreira, Bruno; Castro-Mondragon, Jaime A; Santana-Garcia, Walter; Ossio, Raul; Robles-Espinoza, Carla Daniela; Bahin, Mathieu; Collombet, Samuel; Vincens, Pierre; Thieffry, Denis; van Helden, Jacques; Medina-Rivera, Alejandra; Thomas-Chollier, Morgane

    2018-05-02

    RSAT (Regulatory Sequence Analysis Tools) is a suite of modular tools for the detection and the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, including from genome-wide datasets like ChIP-seq/ATAC-seq, (ii) motif scanning, (iii) motif analysis (quality assessment, comparisons and clustering), (iv) analysis of regulatory variations, (v) comparative genomics. Six public servers jointly support 10 000 genomes from all kingdoms. Six novel or refactored programs have been added since the 2015 NAR Web Software Issue, including updated programs to analyse regulatory variants (retrieve-variation-seq, variation-scan, convert-variations), along with tools to extract sequences from a list of coordinates (retrieve-seq-bed), to select motifs from motif collections (retrieve-matrix), and to extract orthologs based on Ensembl Compara (get-orthologs-compara). Three use cases illustrate the integration of new and refactored tools to the suite. This Anniversary update gives a 20-year perspective on the software suite. RSAT is well-documented and available through Web sites, SOAP/WSDL (Simple Object Access Protocol/Web Services Description Language) web services, virtual machines and stand-alone programs at http://www.rsat.eu/.

  5. Supporting the education evidence portal via text mining

    PubMed Central

    Ananiadou, Sophia; Thompson, Paul; Thomas, James; Mu, Tingting; Oliver, Sandy; Rickinson, Mark; Sasaki, Yutaka; Weissenbacher, Davy; McNaught, John

    2010-01-01

    The UK Education Evidence Portal (eep) provides a single, searchable, point of access to the contents of the websites of 33 organizations relating to education, with the aim of revolutionizing work practices for the education community. Use of the portal alleviates the need to spend time searching multiple resources to find relevant information. However, the combined content of the websites of interest is still very large (over 500 000 documents and growing). This means that searches using the portal can produce very large numbers of hits. As users often have limited time, they would benefit from enhanced methods of performing searches and viewing results, allowing them to drill down to information of interest more efficiently, without having to sift through potentially long lists of irrelevant documents. The Joint Information Systems Committee (JISC)-funded ASSIST project has produced a prototype web interface to demonstrate the applicability of integrating a number of text-mining tools and methods into the eep, to facilitate an enhanced searching, browsing and document-viewing experience. New features include automatic classification of documents according to a taxonomy, automatic clustering of search results according to similar document content, and automatic identification and highlighting of key terms within documents. PMID:20643679

  6. Documenting clinical pharmacist intervention before and after the introduction of a web-based tool.

    PubMed

    Nurgat, Zubeir A; Al-Jazairi, Abdulrazaq S; Abu-Shraie, Nada; Al-Jedai, Ahmed

    2011-04-01

To develop a database for documenting pharmacist intervention through a web-based application. The secondary endpoint was to determine if the new, web-based application provides any benefits with regards to documentation compliance by clinical pharmacists and ease of calculating cost savings compared with our previous method of documenting pharmacist interventions. A tertiary care hospital in Saudi Arabia. The documentation of interventions using a web-based documentation application was retrospectively compared with previous methods of documentation of clinical pharmacists' interventions (multi-user PC software). The number and types of interventions recorded by pharmacists, data mining of archived data, efficiency, cost savings, and the accuracy of the data generated. The number of documented clinical interventions increased from 4,926, using the multi-user PC software, to 6,840 for the web-based application. On average, we observed 653 interventions per clinical pharmacist using the web-based application, an increase compared to an average of 493 interventions using the old multi-user PC software. However, a paired Student's t-test showed no statistically significant difference between the two means (P = 0.201). Using a χ² test, which captured management level and the type of system used, we found a strong effect of management level (P < 2.2 × 10⁻¹⁶) on the number of documented interventions. We also found a moderately significant relationship between educational level and the number of interventions documented (P = 0.045). The mean ± SD time required to document an intervention using the web-based application was 66.55 ± 8.98 s. Using the web-based application, 29.06% of documented interventions resulted in cost-savings, while using the multi-user PC software only 4.75% of interventions did so. The majority of cost savings across both platforms resulted from the discontinuation of unnecessary drugs and a change in dosage regimen. Data collection using the web-based application was consistently more complete when compared to the multi-user PC software. The web-based application is an efficient system for documenting pharmacist interventions. Its flexibility, accessibility and detailed reporting functionality make it a useful tool that will hopefully encourage other primary and secondary care facilities to adopt similar applications.

  7. KernPaeP - a web-based pediatric palliative documentation system for home care.

    PubMed

    Hartz, Tobias; Verst, Hendrik; Ueckert, Frank

    2009-01-01

KernPaeP is a new web-based on- and offline documentation system, which has been developed for pediatric palliative care-teams supporting patient documentation and communication among health care professionals. It provides a reliable system making fast and secure home care documentation possible. KernPaeP is accessible online by registered users using any web-browser. Home care teams use an offline version of KernPaeP running on a netbook for patient documentation on site. Identifying and medical patient data are strictly separated and stored on two database servers. The system offers a stable, enhanced two-way algorithm for synchronization between the offline component and the central database servers. KernPaeP is implemented meeting highest security standards while still maintaining high usability. The web-based documentation system allows ubiquitous and immediate access to patient data. Cumbersome paperwork is replaced by secure and comprehensive electronic documentation. KernPaeP helps save time and improves the quality of documentation. Due to development in close cooperation with pediatric palliative professionals, KernPaeP fulfils the broad needs of home-care documentation. The technique of web-based online and offline documentation is in general applicable for arbitrary home care scenarios.

  8. Webs on surfaces, rings of invariants, and clusters.

    PubMed

    Fomin, Sergey; Pylyavskyy, Pavlo

    2014-07-08

    We construct and study cluster algebra structures in rings of invariants of the special linear group action on collections of 3D vectors, covectors, and matrices. The construction uses Kuperberg's calculus of webs on marked surfaces with boundary.

  9. deepTools2: a next generation web server for deep-sequencing data analysis.

    PubMed

    Ramírez, Fidel; Ryan, Devon P; Grüning, Björn; Bhardwaj, Vivek; Kilpert, Fabian; Richter, Andreas S; Heyne, Steffen; Dündar, Friederike; Manke, Thomas

    2016-07-08

We present an update to our Galaxy-based web server for processing and visualizing deeply sequenced data. Its core tool set, deepTools, allows users to perform complete bioinformatic workflows ranging from quality controls and normalizations of aligned reads to integrative analyses, including clustering and visualization approaches. Since we first described our deepTools Galaxy server in 2014, we have implemented new solutions for many requests from the community and our users. Here, we introduce significant enhancements and new tools to further improve data visualization and interpretation. deepTools continue to be open to all users and freely available as a web service at deeptools.ie-freiburg.mpg.de. The new deepTools2 suite can be easily deployed within any Galaxy framework via the toolshed repository, and we also provide source code for command line usage under Linux and Mac OS X. A public and documented API for access to deepTools functionality is also available. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  10. GeneXplorer: an interactive web application for microarray data visualization and analysis.

    PubMed

    Rees, Christian A; Demeter, Janos; Matese, John C; Botstein, David; Sherlock, Gavin

    2004-10-01

    When publishing large-scale microarray datasets, it is of great value to create supplemental websites where either the full data, or selected subsets corresponding to figures within the paper, can be browsed. We set out to create a CGI application containing many of the features of some of the existing standalone software for the visualization of clustered microarray data. We present GeneXplorer, a web application for interactive microarray data visualization and analysis in a web environment. GeneXplorer allows users to browse a microarray dataset in an intuitive fashion. It provides simple access to microarray data over the Internet and uses only HTML and JavaScript to display graphic and annotation information. It provides radar and zoom views of the data, allows display of the nearest neighbors to a gene expression vector based on their Pearson correlations and provides the ability to search gene annotation fields. The software is released under the permissive MIT Open Source license, and the complete documentation and the entire source code are freely available for download from CPAN http://search.cpan.org/dist/Microarray-GeneXplorer/.

  11. BOWS (bioinformatics open web services) to centralize bioinformatics tools in web services.

    PubMed

    Velloso, Henrique; Vialle, Ricardo A; Ortega, J Miguel

    2015-06-02

Bioinformaticians face a range of difficulties in getting locally-installed tools running and producing results; they would greatly benefit from a system that could centralize most of the tools, using an easy interface for input and output. Web services, due to their universal nature and widely known interface, constitute a very good option to achieve this goal. Bioinformatics open web services (BOWS) is a system based on generic web services produced to allow programmatic access to applications running on high-performance computing (HPC) clusters. BOWS intermediates the access to registered tools by providing front-end and back-end web services. Programmers can install applications in HPC clusters in any programming language and use the back-end service to check for new jobs and their parameters, and then to send the results to BOWS. Programs running on simple computers consume the BOWS front-end service to submit new processes and read results. BOWS compiles Java clients, which encapsulate the front-end web service requests, and automatically creates a web page that makes the registered applications and clients available. Bioinformatics open web services registered applications can be accessed from virtually any programming language through web services, or using standard Java clients. The back-end can run in HPC clusters, allowing bioinformaticians to remotely run high-processing demand applications directly from their machines.

  12. OrthoVenn: a web server for genome wide comparison and annotation of orthologous clusters across multiple species.

    PubMed

    Wang, Yi; Coleman-Derr, Devin; Chen, Guoping; Gu, Yong Q

    2015-07-01

    Genome wide analysis of orthologous clusters is an important component of comparative genomics studies. Identifying the overlap among orthologous clusters can enable us to elucidate the function and evolution of proteins across multiple species. Here, we report a web platform named OrthoVenn that is useful for genome wide comparisons and visualization of orthologous clusters. OrthoVenn provides coverage of vertebrates, metazoa, protists, fungi, plants and bacteria for the comparison of orthologous clusters and also supports uploading of customized protein sequences from user-defined species. An interactive Venn diagram, summary counts, and functional summaries of the disjunction and intersection of clusters shared between species are displayed as part of the OrthoVenn result. OrthoVenn also includes in-depth views of the clusters using various sequence analysis tools. Furthermore, OrthoVenn identifies orthologous clusters of single copy genes and allows for a customized search of clusters of specific genes through key words or BLAST. OrthoVenn is an efficient and user-friendly web server freely accessible at http://probes.pw.usda.gov/OrthoVenn or http://aegilops.wheat.ucdavis.edu/OrthoVenn. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  13. Dynamically Allocated Virtual Clustering Management System Users Guide

    DTIC Science & Technology

    2016-11-01

This report provides usage instructions for the Dynamically Allocated Virtual Clustering (DAVC) version 2.0 web application. The report is separated into the following sections, which detail…

  14. Designing Web-based Telemedicine Training for Military Health Care Providers.

    ERIC Educational Resources Information Center

    Bangert, David; Doktor, Boert; Johnson, Erik

    2001-01-01

    Interviews with 48 military health care professionals identified 20 objectives and 4 learning clusters for a telemedicine training curriculum. From these clusters, web-based modules were developed addressing clinical learning, technology, organizational issues, and introduction to telemedicine. (Contains 19 references.) (SK)

  15. BioTextQuest(+): a knowledge integration platform for literature mining and concept discovery.

    PubMed

    Papanikolaou, Nikolas; Pavlopoulos, Georgios A; Pafilis, Evangelos; Theodosiou, Theodosios; Schneider, Reinhard; Satagopam, Venkata P; Ouzounis, Christos A; Eliopoulos, Aristides G; Promponas, Vasilis J; Iliopoulos, Ioannis

    2014-11-15

    The iterative process of finding relevant information in biomedical literature and performing bioinformatics analyses might result in an endless loop for an inexperienced user, considering the exponential growth of scientific corpora and the plethora of tools designed to mine PubMed(®) and related biological databases. Herein, we describe BioTextQuest(+), a web-based interactive knowledge exploration platform with significant advances to its predecessor (BioTextQuest), aiming to bridge processes such as bioentity recognition, functional annotation, document clustering and data integration towards literature mining and concept discovery. BioTextQuest(+) enables PubMed and OMIM querying, retrieval of abstracts related to a targeted request and optimal detection of genes, proteins, molecular functions, pathways and biological processes within the retrieved documents. The front-end interface facilitates the browsing of document clustering per subject, the analysis of term co-occurrence, the generation of tag clouds containing highly represented terms per cluster and at-a-glance popup windows with information about relevant genes and proteins. Moreover, to support experimental research, BioTextQuest(+) addresses integration of its primary functionality with biological repositories and software tools able to deliver further bioinformatics services. The Google-like interface extends beyond simple use by offering a range of advanced parameterization for expert users. We demonstrate the functionality of BioTextQuest(+) through several exemplary research scenarios including author disambiguation, functional term enrichment, knowledge acquisition and concept discovery linking major human diseases, such as obesity and ageing. The service is accessible at http://bioinformatics.med.uoc.gr/biotextquest. g.pavlopoulos@gmail.com or georgios.pavlopoulos@esat.kuleuven.be Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  16. Free Factories: Unified Infrastructure for Data Intensive Web Services

    PubMed Central

    Zaranek, Alexander Wait; Clegg, Tom; Vandewege, Ward; Church, George M.

    2010-01-01

    We introduce the Free Factory, a platform for deploying data-intensive web services using small clusters of commodity hardware and free software. Independently administered virtual machines called Freegols give application developers the flexibility of a general purpose web server, along with access to distributed batch processing, cache and storage services. Each cluster exploits idle RAM and disk space for cache, and reserves disks in each node for high bandwidth storage. The batch processing service uses a variation of the MapReduce model. Virtualization allows every CPU in the cluster to participate in batch jobs. Each 48-node cluster can achieve 4-8 gigabytes per second of disk I/O. Our intent is to use multiple clusters to process hundreds of simultaneous requests on multi-hundred terabyte data sets. Currently, our applications achieve 1 gigabyte per second of I/O with 123 disks by scheduling batch jobs on two clusters, one of which is located in a remote data center. PMID:20514356

  17. Web Prep: How to Prepare NAS Reports For Publication on the Web

    NASA Technical Reports Server (NTRS)

    Walatka, Pamela; Balakrishnan, Prithika; Clucas, Jean; McCabe, R. Kevin; Felchle, Gail; Brickell, Cristy

    1996-01-01

    This document contains specific advice and requirements for NASA Ames Code IN authors of NAS reports. Much of the information may be of interest to other authors writing for the Web. WebPrep has a graphic Table of Contents in the form of a WebToon, which simulates a discussion between a scientist and a Web publishing consultant. In the WebToon, Frequently Asked Questions about preparing reports for the Web are linked to relevant text in the body of this document. We also provide a text-only Table of Contents. The text for this document is divided into chapters: each chapter corresponds to one frame of the WebToons. The chapter topics are: converting text to HTML, converting 2D graphic images to gif, creating imagemaps and tables, converting movie and audio files to Web formats, supplying 3D interactive data, and (briefly) JAVA capabilities. The last chapter is specifically for NAS staff authors. The Glossary-Index lists web related words and links to topics covered in the main text.

  18. Review of Web-Based Technical Documentation Processes. FY07 NAEP-QA Special Study Report. TR-08-17

    ERIC Educational Resources Information Center

    Gribben, Monica; Wise, Lauress; Becker, D. E.

    2008-01-01

    Beginning with the 2000 and 2001 National Assessment of Educational Progress (NAEP) assessments, the National Center for Education Statistics (NCES) has made technical documentation available on the worldwide web at http://nces.ed.gov/nationsreportcard/tdw/. The web-based documentation is designed to be less dense and more accessible than prior…

  19. The Effects of a Web-Based Nursing Process Documentation Program on Stress and Anxiety of Nursing Students in South Korea.

    PubMed

    Lee, Eunjoo; Noh, Hyun Kyung

    2016-01-01

To examine the effects of a web-based nursing process documentation system on the stress and anxiety of nursing students during their clinical practice. A quasi-experimental design was employed. The experimental group (n = 110) used a web-based nursing process documentation program for their case reports as part of assignments for a clinical practicum, whereas the control group (n = 106) used traditional paper-based case reports. Stress and anxiety levels were measured with a numeric rating scale before, 2 weeks after, and 4 weeks after using the web-based nursing process documentation program during a clinical practicum. The data were analyzed using descriptive statistics, t tests, chi-square tests, and repeated-measures analyses of variance. Nursing students who used the web-based nursing process documentation program showed significantly lower levels of stress and anxiety than the control group. A web-based nursing process documentation program could be used to reduce the stress and anxiety of nursing students during clinical practicum, which ultimately would benefit nursing students by increasing satisfaction with and effectiveness of clinical practicum. © 2015 NANDA International, Inc.

  20. Semantic Clustering of Search Engine Results

    PubMed Central

    Soliman, Sara Saad; El-Sayed, Maged F.; Hassan, Yasser F.

    2015-01-01

This paper presents a novel approach for search engine results clustering that relies on the semantics of the retrieved documents rather than the terms in those documents. The proposed approach takes into consideration both lexical and semantic similarities among documents and applies a spreading activation technique in order to generate semantically meaningful clusters. This approach allows documents that are semantically similar to be clustered together rather than clustering documents based on similar terms. A prototype is implemented and several experiments are conducted to test the proposed solution. The results of the experiments confirmed that the proposed solution achieves remarkable precision. PMID:26933673
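
    A minimal sketch of the activation-spreading step, assuming a hypothetical symmetric document-similarity matrix that already combines lexical and semantic scores; the decay factor and step count are illustrative.

    ```python
    import numpy as np

    # Hypothetical pairwise similarities for four retrieved documents.
    sim = np.array([[0.0, 0.8, 0.1, 0.0],
                    [0.8, 0.0, 0.2, 0.0],
                    [0.1, 0.2, 0.0, 0.7],
                    [0.0, 0.0, 0.7, 0.0]])

    def spread(seed, sim, decay=0.5, steps=3):
        """Activate a seed document and let activation flow along weighted edges."""
        activation = np.zeros(len(sim))
        activation[seed] = 1.0
        for _ in range(steps):
            activation = np.maximum(activation, decay * sim @ activation)
        return activation

    # Documents whose activation stays high belong in the seed's semantic cluster.
    print(np.round(spread(0, sim), 3))
    ```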

  1. Utilizing the Structure and Content Information for XML Document Clustering

    NASA Astrophysics Data System (ADS)

    Tran, Tien; Kutty, Sangeetha; Nayak, Richi

This paper reports on the experiments and results of a clustering approach used in the INEX 2008 document mining challenge. The clustering approach utilizes both the structure and content information of the Wikipedia XML document collection. A latent semantic kernel (LSK) is used to measure the semantic similarity between XML documents based on their content features. The construction of a latent semantic kernel involves computing the singular value decomposition (SVD). On a large feature space matrix, the computation of SVD is very expensive in terms of time and memory requirements. Thus in this clustering approach, the dimension of the document space of a term-document matrix is reduced before performing SVD. The document space reduction is based on the common structural information of the Wikipedia XML document collection. The proposed clustering approach has been shown to be effective on the Wikipedia collection in the INEX 2008 document mining challenge.
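
    A minimal sketch of a latent semantic kernel built with truncated SVD, standing in for the paper's LSK; the toy documents, the component count and the scikit-learn tooling are assumptions, and the structure-based document-space reduction described above is omitted.

    ```python
    import numpy as np
    from sklearn.decomposition import TruncatedSVD
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    docs = ["xml schema tree structure", "tree structure of xml documents",
            "wikipedia article content", "article text content mining"]

    X = CountVectorizer().fit_transform(docs)           # term-document matrix
    svd = TruncatedSVD(n_components=2, random_state=0)  # the SVD step of the kernel
    Z = svd.fit_transform(X)                            # documents in latent space
    print(np.round(cosine_similarity(Z), 2))            # semantic kernel matrix
    ```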

  2. Cosmic web type dependence of halo clustering

    NASA Astrophysics Data System (ADS)

    Fisher, J. D.; Faltenbacher, A.

    2018-01-01

We use the Millennium Simulation to show that halo clustering varies significantly with cosmic web type. Haloes are classified as node, filament, sheet and void haloes based on the eigenvalue decomposition of the velocity shear tensor. The velocity field is sampled by the peculiar velocities of a fixed number of neighbouring haloes, and spatial derivatives are computed using a kernel borrowed from smoothed particle hydrodynamics. The classification scheme is used to examine the clustering of haloes as a function of web type for haloes with masses larger than 10^11 h^-1 M⊙. We find that node haloes show positive bias, filament haloes show negligible bias and void and sheet haloes are antibiased independent of halo mass. Our findings suggest that the mass dependence of halo clustering is rooted in the composition of web types as a function of halo mass. The substantial fraction of node-type haloes for halo masses ≳ 2 × 10^13 h^-1 M⊙ leads to positive bias. Filament-type haloes prevail at intermediate masses, 10^12-10^13 h^-1 M⊙, resulting in unbiased clustering. The large contribution of sheet-type haloes at low halo masses ≲ 10^12 h^-1 M⊙ generates antibiasing.
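
    A minimal sketch of the eigenvalue-based classification described above: the symmetric velocity shear tensor at a halo's position is diagonalized, and the number of eigenvalues above a threshold fixes the web type. The example tensor and the zero threshold are illustrative assumptions.

    ```python
    import numpy as np

    def classify_web_type(shear_tensor, threshold=0.0):
        """0, 1, 2 or 3 eigenvalues above threshold -> void, sheet, filament, node."""
        eigenvalues = np.linalg.eigvalsh(shear_tensor)  # symmetric 3x3 tensor
        n = int(np.sum(eigenvalues > threshold))
        return ["void", "sheet", "filament", "node"][n]

    # Hypothetical shear tensor sampled at one halo's position:
    T = np.array([[0.8, 0.1,  0.0],
                  [0.1, 0.3,  0.0],
                  [0.0, 0.0, -0.2]])
    print(classify_web_type(T))  # "filament": two eigenvalues above the threshold
    ```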

  3. A New MI-Based Visualization Aided Validation Index for Mining Big Longitudinal Web Trial Data

    PubMed Central

    Zhang, Zhaoyang; Fang, Hua; Wang, Honggang

    2016-01-01

Web-delivered clinical trials generate big complex data. To help untangle the heterogeneity of treatment effects, unsupervised learning methods have been widely applied. However, identifying valid patterns is a priority but challenging issue for these methods. This paper, built upon our previous research on multiple imputation (MI)-based fuzzy clustering and validation, proposes a new MI-based Visualization-aided validation index (MIVOOS) to determine the optimal number of clusters for big incomplete longitudinal Web-trial data with inflated zeros. Different from a recently developed fuzzy clustering validation index, MIVOOS uses overlap and separation measures better suited to Web-trial data and, unlike the widely used Xie and Beni (XB) index, does not depend on the choice of fuzzifiers. Through optimizing the view angles of 3-D projections using Sammon mapping, the optimal 2-D projection-guided MIVOOS is obtained to better visualize and verify the patterns in conjunction with trajectory patterns. Compared with XB and VOS, our newly proposed MIVOOS shows its robustness in validating big Web-trial data under different missing data mechanisms using real and simulated Web-trial data. PMID:27482473

  4. Cosmic Web of Galaxies in the COSMOS Field

    NASA Astrophysics Data System (ADS)

    Darvish, Behnam; Martin, Christopher D.; Mobasher, Bahram; Scoville, Nicholas; Sobral, David; COSMOS science Team

    2017-01-01

We use a mass complete sample of galaxies with accurate photometric redshifts in the COSMOS field to estimate the density field and to extract the components of the cosmic web. The cosmic web extraction algorithm relies on the signs and ratios of the eigenvalues of the Hessian matrix and is able to partition the density field into clusters, filaments and the field. We show that at z < 0.8, the median star-formation rate in the cosmic web gradually declines from the field to clusters and this decline is especially sharp for satellite galaxies (~1 dex vs. ~0.4 dex for centrals). However, at z > 0.8, the trend flattens out. For star-forming galaxies only, the median star-formation rate declines by ~ 0.3-0.4 dex from the field to clusters for both satellites and centrals, only at z < 0.5. We argue that for satellite galaxies, the main role of the cosmic web environment is to control their star-forming/quiescent fraction, whereas for centrals, it is mainly to control their overall star-formation rate. Given these results, we suggest that most satellite galaxies experience a rapid quenching mechanism as they fall from the field into clusters through the channel of filaments, whereas for central galaxies, it is mostly due to a slow quenching process. Our preliminary results highlight the importance of the large-scale cosmic web for the evolution of galaxies.

  5. Multipolar moments of weak lensing signal around clusters. Weighing filaments in harmonic space

    NASA Astrophysics Data System (ADS)

    Gouin, C.; Gavazzi, R.; Codis, S.; Pichon, C.; Peirani, S.; Dubois, Y.

    2017-09-01

    Context. Upcoming weak lensing surveys such as Euclid will provide an unprecedented opportunity to quantify the geometry and topology of the cosmic web, in particular in the vicinity of lensing clusters. Aims: Understanding the connectivity of the cosmic web with unbiased mass tracers, such as weak lensing, is of prime importance to probe the underlying cosmology, seek dynamical signatures of dark matter, and quantify environmental effects on galaxy formation. Methods: Mock catalogues of galaxy clusters are extracted from the N-body PLUS simulation. For each cluster, the aperture multipolar moments of the convergence are calculated in two annuli (inside and outside the virial radius). By stacking their modulus, a statistical estimator is built to characterise the angular mass distribution around clusters. The moments are compared to predictions from perturbation theory and spherical collapse. Results: The main weakly chromatic excess of multipolar power on large scales is understood as arising from the contraction of the primordial cosmic web driven by the growing potential well of the cluster. Besides this boost, the quadrupole prevails in the cluster (ellipsoidal) core, while at the outskirts, harmonic distortions are spread on small angular modes, and trace the non-linear sharpening of the filamentary structures. Predictions for the signal amplitude as a function of the cluster-centric distance, mass, and redshift are presented. The prospects of measuring this signal are estimated for current and future lensing data sets. Conclusions: The Euclid mission should provide all the necessary information for studying the cosmic evolution of the connectivity of the cosmic web around lensing clusters using multipolar moments and probing unique signatures of, for example, baryons and warm dark matter.

  6. Cluster outskirts and the missing baryons

    NASA Astrophysics Data System (ADS)

    Eckert, D.

    2016-06-01

    Galaxy clusters are located at the crossroads of intergalactic filaments and are still forming through the continuous merging and accretion of smaller structures from the surrounding cosmic web. Deep, wide-field X-ray studies of the outskirts of the most massive clusters bring us valuable insight into the processes leading to the growth of cosmic structures. In addition, cluster outskirts are privileged sites to search for the missing baryons, which are thought to reside within the filaments of the cosmic web. I will present the XMM cluster outskirts project, a VLP that aims at mapping the outskirts of 13 nearby clusters. Based on the results obtained with this program, I will then explore ideas to exploit the capabilities of XMM during the next decade.

  7. NeAT: a toolbox for the analysis of biological networks, clusters, classes and pathways.

    PubMed

    Brohée, Sylvain; Faust, Karoline; Lima-Mendez, Gipsi; Sand, Olivier; Janky, Rekin's; Vanderstocken, Gilles; Deville, Yves; van Helden, Jacques

    2008-07-01

    The network analysis tools (NeAT) (http://rsat.ulb.ac.be/neat/) provide a user-friendly web access to a collection of modular tools for the analysis of networks (graphs) and clusters (e.g. microarray clusters, functional classes, etc.). A first set of tools supports basic operations on graphs (comparison between two graphs, neighborhood of a set of input nodes, path finding and graph randomization). Another set of programs makes the connection between networks and clusters (graph-based clustering, cliques discovery and mapping of clusters onto a network). The toolbox also includes programs for detecting significant intersections between clusters/classes (e.g. clusters of co-expression versus functional classes of genes). NeAT are designed to cope with large datasets and provide a flexible toolbox for analyzing biological networks stored in various databases (protein interactions, regulation and metabolism) or obtained from high-throughput experiments (two-hybrid, mass-spectrometry and microarrays). The web interface interconnects the programs in predefined analysis flows, enabling to address a series of questions about networks of interest. Each tool can also be used separately by entering custom data for a specific analysis. NeAT can also be used as web services (SOAP/WSDL interface), in order to design programmatic workflows and integrate them with other available resources.

  8. Tools for Material Design and Selection

    NASA Astrophysics Data System (ADS)

    Wehage, Kristopher

The present thesis focuses on applications of numerical methods to create tools for material characterization, design and selection. The tools generated in this work incorporate a variety of programming concepts, from digital image analysis, geometry, optimization, and parallel programming to data-mining, databases and web design. The first portion of the thesis focuses on methods for characterizing clustering in bimodal 5083 Aluminum alloys created by cryomilling and powder metallurgy. The bimodal samples analyzed in the present work contain a mixture of a coarse grain phase, with a grain size on the order of several microns, and an ultra-fine grain phase, with a grain size on the order of 200 nm. The mixing of the two phases is not homogeneous and clustering is observed. To investigate clustering in these bimodal materials, various microstructures were created experimentally by conventional cryomilling, Hot Isostatic Pressing (HIP), Extrusion, Dual-Mode Dynamic Forging (DMDF) and a new 'Gradient' cryomilling process. Two techniques for quantitative clustering analysis are presented, formulated and implemented. The first technique, the Area Disorder function, provides a metric of the quality of coarse grain dispersion in an ultra-fine grain matrix and the second technique, the Two-Point Correlation function, provides a metric of long and short range spatial arrangements of the two phases, as well as an indication of the mean feature size in any direction. The two techniques are implemented on digital images created by Scanning Electron Microscopy (SEM) and Electron Backscatter Diffraction (EBSD) of the microstructures. To investigate structure-property relationships through modeling and simulation, strategies for generating synthetic microstructures are discussed and a computer program that generates randomized microstructures with desired configurations of clustering described by the Area Disorder Function is formulated and presented. In the computer program, two-dimensional microstructures are generated by Random Sequential Adsorption (RSA) of voxelized ellipses representing the coarse grain phase. A simulated annealing algorithm is used to geometrically optimize the placement of the ellipses in the model to achieve varying user-defined configurations of spatial arrangement of the coarse grains. During the simulated annealing process, the ellipses are allowed to overlap up to a specified threshold, allowing triple junctions to form in the model. Once the simulated annealing process is complete, the remaining space is populated by smaller ellipses representing the ultra-fine grain phase. Uniform random orientations are assigned to the grains. The program generates text files that can be imported into Crystal Plasticity Finite Element Analysis Software for stress analysis. Finally, numerical methods and programming are applied to current issues in green engineering and hazard assessment. To understand hazards associated with materials and select safer alternatives, engineers and designers need access to up-to-date hazard information. However, hazard information comes from many disparate sources and aggregating, interpreting and taking action on the wealth of data is not trivial. In light of these challenges, a Framework for Automated Hazard Assessment based on the GreenScreen list translator is presented.
The framework consists of a computer program that automatically extracts data from the GHS-Japan hazard database, loads the data into a machine-readable JSON format, transforms the JSON document into a GreenScreen JSON document using the GreenScreen List Translator v1.2 and performs GreenScreen Benchmark scoring on the material. The GreenScreen JSON documents are then uploaded to a document storage system to allow human operators to search for, modify or add additional hazard information via a web interface.
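
    A heavily hedged sketch of the extract-translate-score pipeline described above; the record fields, the hazard-code mapping and the benchmark rule are illustrative stand-ins, not the GHS-Japan schema or the actual GreenScreen List Translator v1.2 logic.

    ```python
    import json

    def extract_record(raw: str) -> dict:
        """Parse one hazard record into machine-readable form (stand-in for the
        program that extracts data from the GHS-Japan hazard database)."""
        return json.loads(raw)

    def translate(record: dict) -> dict:
        """Map hazard codes to GreenScreen-style endpoint levels (illustrative)."""
        mapping = {"H350": ("Carcinogenicity", "High"),
                   "H319": ("Eye irritation", "Low")}
        hazards = []
        for code in record.get("hazard_codes", []):
            endpoint, level = mapping.get(code, ("Unknown", "Unknown"))
            hazards.append({"endpoint": endpoint, "level": level})
        return {"chemical": record["name"], "hazards": hazards}

    def benchmark(doc: dict) -> int:
        """Toy scoring rule: any 'High' hazard forces Benchmark 1 (avoid)."""
        return 1 if any(h["level"] == "High" for h in doc["hazards"]) else 3

    raw = '{"name": "example substance", "hazard_codes": ["H350"]}'
    doc = translate(extract_record(raw))
    print(json.dumps(doc), "-> Benchmark", benchmark(doc))
    ```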

  9. Food-web structure and network theory: The role of connectance and size

    PubMed Central

    Dunne, Jennifer A.; Williams, Richard J.; Martinez, Neo D.

    2002-01-01

    Networks from a wide range of physical, biological, and social systems have been recently described as “small-world” and “scale-free.” However, studies disagree whether ecological networks called food webs possess the characteristic path lengths, clustering coefficients, and degree distributions required for membership in these classes of networks. Our analysis suggests that the disagreements are based on selective use of relatively few food webs, as well as analytical decisions that obscure important variability in the data. We analyze a broad range of 16 high-quality food webs, with 25–172 nodes, from a variety of aquatic and terrestrial ecosystems. Food webs generally have much higher complexity, measured as connectance (the fraction of all possible links that are realized in a network), and much smaller size than other networks studied, which have important implications for network topology. Our results resolve prior conflicts by demonstrating that although some food webs have small-world and scale-free structure, most do not if they exceed a relatively low level of connectance. Although food-web degree distributions do not display a universal functional form, observed distributions are systematically related to network connectance and size. Also, although food webs often lack small-world structure because of low clustering, we identify a continuum of real-world networks including food webs whose ratios of observed to random clustering coefficients increase as a power–law function of network size over 7 orders of magnitude. Although food webs are generally not small-world, scale-free networks, food-web topology is consistent with patterns found within those classes of networks. PMID:12235364
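
    Connectance, as this abstract defines it, has a direct arithmetic form; the sketch below uses the common directed-graph convention of S^2 possible links and hypothetical counts.

    ```python
    def connectance(links: int, species: int) -> float:
        """Fraction of all possible links that are realized: C = L / S**2."""
        return links / species ** 2

    print(connectance(links=900, species=100))  # 0.09 -> 9% of possible links realized
    ```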

  10. Revealing the Cosmic Web-dependent Halo Bias

    NASA Astrophysics Data System (ADS)

    Yang, Xiaohu; Zhang, Youcai; Lu, Tianhuan; Wang, Huiyuan; Shi, Feng; Tweed, Dylan; Li, Shijie; Luo, Wentao; Lu, Yi; Yang, Lei

    2017-10-01

Halo bias is one of the key ingredients of halo models. At a given redshift, it was shown to depend, to first order, only on halo mass. In this study, four types of cosmic web environments—clusters, filaments, sheets, and voids—are defined within a state-of-the-art high-resolution N-body simulation. Within these environments, we use both halo-dark matter cross correlation and halo-halo autocorrelation functions to probe the clustering properties of halos. The nature of the halo bias differs strongly between the four different cosmic web environments described here. With respect to the overall population, halos in clusters have significantly lower biases in the 10^11.0-10^13.5 h^-1 M⊙ mass range. In other environments, however, halos show extremely enhanced biases, up to a factor of 10 in voids for halos of mass ~10^12.0 h^-1 M⊙. Such a strong cosmic web environment dependence in the halo bias may play an important role in future cosmological and galaxy formation studies. Within this cosmic web framework, the age dependency of halo bias is found to be only significant in clusters and filaments for relatively small halos (≲ 10^12.5 h^-1 M⊙).

  11. Mapping Dark Matter in Simulated Galaxy Clusters

    NASA Astrophysics Data System (ADS)

    Bowyer, Rachel

    2018-01-01

    Galaxy clusters are the most massive bound objects in the Universe with most of their mass being dark matter. Cosmological simulations of structure formation show that clusters are embedded in a cosmic web of dark matter filaments and large scale structure. It is thought that these filaments are found preferentially close to the long axes of clusters. We extract galaxy clusters from the simulations "cosmo-OWLS" in order to study their properties directly and also to infer their properties from weak gravitational lensing signatures. We investigate various stacking procedures to enhance the signal of the filaments and large scale structure surrounding the clusters to better understand how the filaments of the cosmic web connect with galaxy clusters. This project was supported in part by the NSF REU grant AST-1358980 and by the Nantucket Maria Mitchell Association.

  12. Home Page, Sweet Home Page: Creating a Web Presence.

    ERIC Educational Resources Information Center

    Falcigno, Kathleen; Green, Tim

    1995-01-01

    Focuses primarily on design issues and practical concerns involved in creating World Wide Web documents for use within an organization. Concerns for those developing Web home pages are: learning HyperText Markup Language (HTML); defining customer group; allocating staff resources for maintenance of documents; providing feedback mechanism for…

  13. Towards a methodology for cluster searching to provide conceptual and contextual "richness" for systematic reviews of complex interventions: case study (CLUSTER).

    PubMed

    Booth, Andrew; Harris, Janet; Croot, Elizabeth; Springett, Jane; Campbell, Fiona; Wilkins, Emma

    2013-09-28

Systematic review methodologies can be harnessed to help researchers to understand and explain how complex interventions may work. Typically, when reviewing complex interventions, a review team will seek to understand the theories that underpin an intervention and the specific context for that intervention. A single published report from a research project does not typically contain this required level of detail. A review team may find it more useful to examine a "study cluster": a group of related papers that explore and explain various features of a single project and thus supply necessary detail relating to theory and/or context. We sought to conduct a preliminary investigation, from a single case study review, of techniques required to identify a cluster of related research reports, to document the yield from such methods, and to outline a systematic methodology for cluster searching. In a systematic review of community engagement we identified a relevant project - the Gay Men's Task Force. From a single "key pearl citation" we conducted a series of related searches to find contextually or theoretically proximate documents. We followed up Citations, traced Lead authors, identified Unpublished materials, searched Google Scholar, tracked Theories, undertook ancestry searching for Early examples and followed up Related projects (embodied in the CLUSTER mnemonic). Our structured, formalised procedure for cluster searching identified useful reports that are not typically identified from topic-based searches on bibliographic databases. Items previously rejected by an initial sift were subsequently found to inform our understanding of underpinning theory (for example Diffusion of Innovations Theory), context or both. Relevant material included book chapters, a Web-based process evaluation, and peer reviewed reports of projects sharing a common ancestry. We used these reports to understand the context for the intervention and to explore explanations for its relative lack of success. Additional data helped us to challenge simplistic assumptions on the homogeneity of the target population. A single case study suggests the potential utility of cluster searching, particularly for reviews that depend on an understanding of context, e.g. realist synthesis. The methodology is transparent, explicit and reproducible. There is no reason to believe that cluster searching is not generalizable to other review topics. Further research should examine the contribution of the methodology beyond improved yield, to the final synthesis and interpretation, possibly by utilizing qualitative sensitivity analysis.

  14. ICTNET at Web Track 2009 Diversity task

    DTIC Science & Technology

    2009-11-01

On the World Wide Web, there exist many documents which represent several implicit subtopics. We used commercial search engines to gather those documents. In this task, our work can be divided into five steps. First, we collect documents returned by commercial search engines and considered…

  15. The Number of Scholarly Documents on the Public Web

    PubMed Central

    Khabsa, Madian; Giles, C. Lee

    2014-01-01

    The number of scholarly documents available on the web is estimated using capture/recapture methods by studying the coverage of two major academic search engines: Google Scholar and Microsoft Academic Search. Our estimates show that at least 114 million English-language scholarly documents are accessible on the web, of which Google Scholar has nearly 100 million. Of these, we estimate that at least 27 million (24%) are freely available since they do not require a subscription or payment of any kind. In addition, at a finer scale, we also estimate the number of scholarly documents on the web for fifteen fields: Agricultural Science, Arts and Humanities, Biology, Chemistry, Computer Science, Economics and Business, Engineering, Environmental Sciences, Geosciences, Material Science, Mathematics, Medicine, Physics, Social Sciences, and Multidisciplinary, as defined by Microsoft Academic Search. In addition, we show that among these fields the percentage of documents defined as freely available varies significantly, i.e., from 12 to 50%. PMID:24817403
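
    A minimal sketch of the Lincoln-Petersen capture/recapture estimator that underlies this kind of coverage estimate; the engine counts below are hypothetical, chosen only so the toy answer lands near the abstract's 114 million figure, and are not taken from the study.

    ```python
    def lincoln_petersen(n1: int, n2: int, overlap: int) -> float:
        """Estimate total population size from two overlapping samples:
        n1 = documents indexed by engine A, n2 = by engine B, overlap = in both."""
        return n1 * n2 / overlap

    # Hypothetical coverage counts for two academic search engines:
    estimate = lincoln_petersen(n1=80_000_000, n2=50_000_000, overlap=35_000_000)
    print(f"about {estimate / 1e6:.0f} million scholarly documents")
    ```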

  16. The number of scholarly documents on the public web.

    PubMed

    Khabsa, Madian; Giles, C Lee

    2014-01-01

    The number of scholarly documents available on the web is estimated using capture/recapture methods by studying the coverage of two major academic search engines: Google Scholar and Microsoft Academic Search. Our estimates show that at least 114 million English-language scholarly documents are accessible on the web, of which Google Scholar has nearly 100 million. Of these, we estimate that at least 27 million (24%) are freely available since they do not require a subscription or payment of any kind. In addition, at a finer scale, we also estimate the number of scholarly documents on the web for fifteen fields: Agricultural Science, Arts and Humanities, Biology, Chemistry, Computer Science, Economics and Business, Engineering, Environmental Sciences, Geosciences, Material Science, Mathematics, Medicine, Physics, Social Sciences, and Multidisciplinary, as defined by Microsoft Academic Search. In addition, we show that among these fields the percentage of documents defined as freely available varies significantly, i.e., from 12 to 50%.

  17. A coherent graph-based semantic clustering and summarization approach for biomedical literature and a new summarization evaluation method.

    PubMed

    Yoo, Illhoi; Hu, Xiaohua; Song, Il-Yeol

    2007-11-27

A huge amount of biomedical textual information has been produced and collected in MEDLINE for decades. In order to easily utilize biomedical information in the free text, document clustering and text summarization together are used as a solution for the text information overload problem. In this paper, we introduce a coherent graph-based semantic clustering and summarization approach for biomedical literature. Our extensive experimental results show that the approach achieves a 45% improvement in cluster quality and a 72% improvement in clustering reliability, in terms of misclassification index, over Bisecting K-means, a leading document clustering approach. In addition, our approach provides a concise but rich text summary in key concepts and sentences. Our coherent biomedical literature clustering and summarization approach, which takes advantage of ontology-enriched graphical representations, significantly improves the quality of document clusters and the understandability of documents through summaries.

  18. A coherent graph-based semantic clustering and summarization approach for biomedical literature and a new summarization evaluation method

    PubMed Central

    Yoo, Illhoi; Hu, Xiaohua; Song, Il-Yeol

    2007-01-01

Background A huge amount of biomedical textual information has been produced and collected in MEDLINE for decades. In order to easily utilize biomedical information in the free text, document clustering and text summarization together are used as a solution for the text information overload problem. In this paper, we introduce a coherent graph-based semantic clustering and summarization approach for biomedical literature. Results Our extensive experimental results show that the approach achieves a 45% improvement in cluster quality and a 72% improvement in clustering reliability, in terms of misclassification index, over Bisecting K-means, a leading document clustering approach. In addition, our approach provides a concise but rich text summary in key concepts and sentences. Conclusion Our coherent biomedical literature clustering and summarization approach, which takes advantage of ontology-enriched graphical representations, significantly improves the quality of document clusters and the understandability of documents through summaries. PMID:18047705

  19. ESIP Documentation Cluster Session: GCMD Keyword Update

    NASA Technical Reports Server (NTRS)

    Stevens, Tyler

    2018-01-01

    The Global Change Master Directory (GCMD) Keywords are a hierarchical set of controlled Earth Science vocabularies that help ensure Earth science data and services are described in a consistent and comprehensive manner and allow for the precise searching of collection-level metadata and subsequent retrieval of data and services. Initiated over twenty years ago, the GCMD Keywords are periodically analyzed for relevancy and will continue to be refined and expanded in response to user needs. This talk explores the current status of the GCMD keywords, the value and usage that the keywords bring to different tools/agencies as it relates to data discovery, and how the keywords relate to SWEET (Semantic Web for Earth and Environmental Terminology) Ontologies.

  20. Analysis of Basis Weight Uniformity of Microfiber Nonwovens and Its Impact on Permeability and Filtration Properties

    NASA Astrophysics Data System (ADS)

    Amirnasr, Elham

It is widely recognized that nonwoven basis weight non-uniformity affects various properties of nonwovens; however, few studies can be found on this topic. Developing a definition of uniformity and methods to measure it, and studying their impact on web properties such as filtration behaviour and air permeability, would be beneficial both in industrial applications and in academia: such measures can serve as a quality control tool and provide insights into nonwoven behaviors that cannot be explained by average values alone. We therefore pursue the development of an optical analytical tool for quantifying nonwoven web basis weight uniformity. The quadrant method and clustering analysis were utilized in an image analysis scheme to help define "uniformity" and its spatial variation. Implementing the quadrant method in an image analysis system allows the establishment of a uniformity index that quantifies the degree of uniformity. The clustering analysis was modified and verified using uniform and random simulated images with known parameters, and the number of clusters and cluster properties such as size, membership and density were determined. We also used this new measurement method to evaluate the uniformity of nonwovens produced by different processes and investigated the impact of uniformity on filtration and permeability. The results show that the uniformity index computed by the quadrant method spans a useful range for the non-uniformity of nonwoven webs. Clustering analysis was also applied to reference nonwovens of known visual uniformity; cluster size emerged as a promising uniformity parameter, with non-uniform nonwovens producing larger cluster sizes than uniform ones. We then sought a relationship between web properties and the uniformity index as a web characteristic. To this end, the filtration properties, air permeability, solidity and uniformity index of meltblown and spunbond samples were measured. The filtration tests show some deviation between theoretical and experimental filtration efficiency when different types of fiber diameter are considered; this deviation can arise from variation in basis weight non-uniformity, so an appropriate theory is required to predict the variation of filtration efficiency with respect to the non-uniformity of nonwoven filter media. The air permeability tests showed that the uniformity index determined by the quadrant method is related to the measured properties: air permeability decreases as the uniformity index of the nonwoven web increases.
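
    A minimal sketch of a quadrant-style uniformity measure: the basis-weight image is tiled into blocks, and an index of dispersion (variance-to-mean ratio) over the block means is reported, near zero for uniform webs. This is an analogue of, not exactly, the thesis's uniformity index; the block size and the simulated images are assumptions.

    ```python
    import numpy as np

    def uniformity_index(image: np.ndarray, block: int = 32) -> float:
        """Tile the image into block x block cells and compare cell means."""
        h, w = image.shape
        cells = image[:h // block * block, :w // block * block]
        cell_means = cells.reshape(h // block, block, w // block, block).mean(axis=(1, 3))
        return float(cell_means.var() / cell_means.mean())  # index of dispersion

    rng = np.random.default_rng(0)
    uniform_web = rng.normal(50.0, 1.0, size=(256, 256))            # even basis weight
    blotches = (rng.random((8, 8)) > 0.7).repeat(32, 0).repeat(32, 1)
    patchy_web = uniform_web + 10.0 * blotches                      # clustered heavy spots
    print(f"uniform: {uniformity_index(uniform_web):.4f}")          # ~0.0000
    print(f"patchy:  {uniformity_index(patchy_web):.4f}")           # much larger
    ```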

  1. JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing.

    PubMed

    Brown, David K; Penkler, David L; Musyoka, Thommas M; Bishop, Özlem Tastan

    2015-01-01

    Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS.
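
    JMS sits between a web front-end and the cluster's resource manager. As a rough illustration of the submission half of that pattern (not JMS's actual internals), the sketch below writes a batch script and hands it to a PBS/Torque-style qsub; the job name and script body are hypothetical.

    ```python
    import subprocess
    import tempfile

    def submit_job(script_body: str, job_name: str = "demo_job") -> str:
        """Write a shell script and submit it to a PBS/Torque-style
        scheduler via qsub, returning the scheduler's job ID string.
        A web layer like JMS wraps this step and then polls job state."""
        with tempfile.NamedTemporaryFile("w", suffix=".sh", delete=False) as f:
            f.write("#!/bin/bash\n")
            f.write(f"#PBS -N {job_name}\n")
            f.write(script_body + "\n")
            script_path = f.name
        result = subprocess.run(["qsub", script_path],
                                capture_output=True, text=True, check=True)
        return result.stdout.strip()  # e.g. "12345.headnode"

    # job_id = submit_job("echo hello from the cluster")
    ```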

  2. JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing

    PubMed Central

    Brown, David K.; Penkler, David L.; Musyoka, Thommas M.; Bishop, Özlem Tastan

    2015-01-01

    Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS. PMID:26280450

  3. The California Central Coast Research Partnership: Building Relationships, Partnerships, and Paradigms for University-Industry Research Collaboration

    DTIC Science & Technology

    2011-03-28

    particular topic of interest. Paper-based documents require the availability of a physical instance of a document, involving the transport of documents...repository of documents via the World Wide Web and search engines offer support in locating documents that are likely to contain relevant information. The... Web, with news agencies, newspapers, various organizations, and individuals as sources. Clearly the analysis, interpretation, and integration of

  4. Cosmic Web of Galaxies in the COSMOS Field: Public Catalog and Different Quenching for Centrals and Satellites

    NASA Astrophysics Data System (ADS)

    Darvish, Behnam; Mobasher, Bahram; Martin, D. Christopher; Sobral, David; Scoville, Nick; Stroe, Andra; Hemmati, Shoubaneh; Kartaltepe, Jeyhan

    2017-03-01

    We use a mass-complete (log(M/M⊙) ≥ 9.6) sample of galaxies with accurate photometric redshifts in the COSMOS field to construct the density field and the cosmic web to z = 1.2. The cosmic web extraction relies on the density field Hessian matrix and breaks the density field into clusters, filaments, and the field. We provide the density field and cosmic web measures to the community. We show that at z ≲ 0.8, the median star formation rate (SFR) in the cosmic web gradually declines from the field to clusters and this decline is especially sharp for satellites (˜1 dex versus ˜0.5 dex for centrals). However, at z ≳ 0.8, the trend flattens out for the overall galaxy population and satellites. For star-forming (SF) galaxies only, the median SFR is constant at z ≳ 0.5 but declines by ˜0.3-0.4 dex from the field to clusters for satellites and centrals at z ≲ 0.5. We argue that for satellites, the main role of the cosmic web environment is to control their SF fraction, whereas for centrals, it is mainly to control their overall SFR at z ≲ 0.5 and to set their fraction at z ≳ 0.5. We suggest that most satellites experience a rapid quenching mechanism as they fall from the field into clusters through filaments, whereas centrals mostly undergo a slow environmental quenching at z ≲ 0.5 and a fast mechanism at higher redshifts. Our preliminary results highlight the importance of the large-scale cosmic web to galaxy evolution.
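
    As a toy illustration of Hessian-based classification (a 2D sketch only; the paper's smoothing scales, thresholds, and implementation details differ), each pixel of a smoothed density field can be labeled by how many Hessian eigenvalues are negative there:

    ```python
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def classify_cosmic_web_2d(density: np.ndarray, sigma: float = 2.0) -> np.ndarray:
        """Label each pixel of a 2D density field by the signs of the
        Hessian eigenvalues: 2 = cluster-like (both negative),
        1 = filament-like, 0 = field. Zero thresholds are illustrative."""
        d = gaussian_filter(density, sigma)
        dy, dx = np.gradient(d)
        dyy, dyx = np.gradient(dy)
        dxy, dxx = np.gradient(dx)
        labels = np.zeros(d.shape, dtype=int)
        for idx in np.ndindex(d.shape):
            H = np.array([[dxx[idx], dxy[idx]], [dyx[idx], dyy[idx]]])
            eig = np.linalg.eigvalsh((H + H.T) / 2.0)  # symmetrize first
            labels[idx] = int((eig < 0).sum())
        return labels

    rng = np.random.default_rng(0)
    labels = classify_cosmic_web_2d(rng.random((64, 64)))
    print(np.bincount(labels.ravel()))  # counts of field/filament/cluster pixels
    ```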

  5. Architecture and Channel-Belt Clustering in the Fluvial lower Wasatch Formation, Uinta Basin, Utah

    NASA Astrophysics Data System (ADS)

    Pisel, J. R.; Pyles, D. R.; Bracken, B.; Rosenbaum, C. D.

    2013-12-01

    The Eocene lower Wasatch Formation of the Uinta Basin contains exceptional outcrops of low net-sand content (27% sand) fluvial strata. This study quantitatively documents the stratigraphy of a 7 km wide by 300 meter thick strike-oriented outcrop in order to develop a quantitative database that can be used to improve our knowledge of how some fluvial systems evolve over geologic time scales. Data used to document the outcrop are: (1) 550 meters of decimeter- to half-meter-scale resolution stratigraphic columns that document grain size and physical sedimentary structures; (2) detailed photopanels used to document architectural style and lithofacies types in the outcrop; (3) thickness, width, and spatial position for all channel belts in the outcrop; and (4) directional measurements of paleocurrent indicators. Two channel-belt styles are recognized, lateral and downstream accreting channel belts, both of which occur as either single or multi-story. Floodplain strata are well exposed and consist of overbank fines and sand-rich crevasse splay deposits. Key upward and lateral characteristics of the outcrop documented herein are the following. First, the shapes of 243 channel belts are documented; their average width, thickness, and aspect ratio are 110 m, 7 m, and 16:1, respectively. Importantly, the size and shape of channel belts do not change upward through the 300 meter transect. Second, channel belts are shown to cluster spatially: nine clusters are documented using a spatial statistic. Key upward patterns in channel-belt clustering are a marked change from non-amalgamated, isolated channel-belt clusters to amalgamated channel-belt clusters. Critically, stratal surfaces can be correlated from mudstone units within the clusters to time-equivalent floodplain strata adjacent to the clusters, demonstrating that clusters are not confined within fluvial valleys. Finally, the proportions of floodplain and channel-belt elements underlying clusters and channel belts vary with the style of clusters and channel belts laterally and vertically within the outcrop.
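
    The abstract does not name the spatial statistic used to identify the nine clusters. One standard option for testing whether mapped objects such as channel-belt centroids cluster is the Clark-Evans nearest-neighbour index, sketched below (Python with SciPy; R < 1 indicates clustering):

    ```python
    import numpy as np
    from scipy.spatial import cKDTree

    def clark_evans_index(points: np.ndarray, area: float) -> float:
        """Clark-Evans nearest-neighbour index: ratio of the observed mean
        nearest-neighbour distance to that expected under complete spatial
        randomness. R < 1 clustered, R ~ 1 random, R > 1 dispersed."""
        tree = cKDTree(points)
        dists, _ = tree.query(points, k=2)  # k=2: nearest point other than self
        observed = dists[:, 1].mean()
        expected = 0.5 / np.sqrt(len(points) / area)
        return observed / expected

    # Example: 243 channel-belt centroids in a 7000 m x 300 m outcrop panel
    rng = np.random.default_rng(1)
    pts = rng.uniform([0.0, 0.0], [7000.0, 300.0], size=(243, 2))
    print(f"R = {clark_evans_index(pts, 7000.0 * 300.0):.2f}")
    ```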

  6. Designing Web-based telemedicine training for military health care providers.

    PubMed

    Bangert, D; Doktor, R; Johnson, E

    2001-01-01

    The purpose of the study was to ascertain those learning objectives that will initiate increased use of telemedicine by military health care providers. Telemedicine is increasingly moving to the center of the health care industry's service offerings. As this migration occurs, health professionals will require training for proper and effective change management. The United States Department of Defense (DoD) is embracing the use of telemedicine and wishes to use Web-based training as a tool for effective change management to increase use. This article summarizes the findings of an educational needs assessment of military health care providers for the creation of the DoD Web-based telemedicine training curriculum. Forty-eight health care professionals were interviewed and surveyed to capture their opinions on what learning objectives a telemedicine training curriculum should include. Twenty learning objectives were found to be needed in a telemedicine training program. These 20 learning objectives were grouped into four learning clusters that formed the structure for the training program. In order of importance, the learning clusters were clinical, technical, organizational, and introduction to telemedicine. From these clusters, five Web-based modules were created, with two addressing clinical learning needs and one for each of the other learning objective clusters.

  7. Introduction to the JASIST Special Topic Issue on Web Retrieval and Mining: A Machine Learning Perspective.

    ERIC Educational Resources Information Center

    Chen, Hsinchun

    2003-01-01

    Discusses information retrieval techniques used on the World Wide Web. Topics include machine learning in information extraction; relevance feedback; information filtering and recommendation; text classification and text clustering; Web mining, based on data mining techniques; hyperlink structure; and Web size. (LRW)

  8. Script identification from images using cluster-based templates

    DOEpatents

    Hochberg, J.G.; Kelly, P.M.; Thomas, T.R.

    1998-12-01

    A computer-implemented method identifies a script used to create a document. A set of training documents for each script to be identified is scanned into the computer to store a series of exemplary images representing each script. Pixels forming the exemplary images are electronically processed to define a set of textual symbols corresponding to the exemplary images. Each textual symbol is assigned to a cluster of textual symbols that most closely represents the textual symbol. The cluster of textual symbols is processed to form a representative electronic template for each cluster. A document having a script to be identified is scanned into the computer to form one or more document images representing the script to be identified. Pixels forming the document images are electronically processed to define a set of document textual symbols corresponding to the document images. The set of document textual symbols is compared to the electronic templates to identify the script. 17 figs.
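
    The template-building step maps naturally onto off-the-shelf clustering. A compressed sketch of the idea (Python with scikit-learn; symbol extraction, scaling, and minor-cluster pruning are assumed done upstream, and the patent's own clustering procedure need not be k-means):

    ```python
    import numpy as np
    from sklearn.cluster import KMeans

    def build_templates(symbols: np.ndarray, n_clusters: int = 50) -> np.ndarray:
        """Cluster fixed-size, flattened symbol images from one script's
        training documents; the cluster centroids become its templates."""
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(symbols)
        return km.cluster_centers_

    def identify_script(symbols: np.ndarray, templates: dict) -> str:
        """Score each candidate script by the mean distance from every
        document symbol to its nearest template; return the best match."""
        def score(tpl: np.ndarray) -> float:
            d = np.linalg.norm(symbols[:, None, :] - tpl[None, :, :], axis=2)
            return float(d.min(axis=1).mean())
        return min(templates, key=lambda script: score(templates[script]))
    ```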

  9. Script identification from images using cluster-based templates

    DOEpatents

    Hochberg, Judith G.; Kelly, Patrick M.; Thomas, Timothy R.

    1998-01-01

    A computer-implemented method identifies a script used to create a document. A set of training documents for each script to be identified is scanned into the computer to store a series of exemplary images representing each script. Pixels forming the exemplary images are electronically processed to define a set of textual symbols corresponding to the exemplary images. Each textual symbol is assigned to a cluster of textual symbols that most closely represents the textual symbol. The cluster of textual symbols is processed to form a representative electronic template for each cluster. A document having a script to be identified is scanned into the computer to form one or more document images representing the script to be identified. Pixels forming the document images are electronically processed to define a set of document textual symbols corresponding to the document images. The set of document textual symbols is compared to the electronic templates to identify the script.

  10. Architecture of marine food webs: To be or not be a 'small-world'.

    PubMed

    Marina, Tomás Ignacio; Saravia, Leonardo A; Cordone, Georgina; Salinas, Vanesa; Doyle, Santiago R; Momo, Fernando R

    2018-01-01

    The search for general properties in network structure has been a central issue for food web studies in recent years. One such property is the small-world topology, which combines high clustering with a small distance between nodes of the network. This property may increase food web resilience but may also make food webs more sensitive to the extinction of connected species. Food web theory has been developed principally from freshwater and terrestrial ecosystems, largely omitting marine habitats. Whether theory needs to be modified to accommodate observations from marine ecosystems, given major differences in several topological characteristics, is still under debate. Here we investigated whether the small-world topology is a common structural pattern in marine food webs. We developed a novel, simple and statistically rigorous method to examine the largest set of complex marine food webs to date. More than half of the analyzed marine networks exhibited a similar or lower characteristic path length than the random expectation, whereas 39% of the webs presented significantly higher clustering than their random counterparts. Our method showed that 5 out of 28 networks fulfilled both features of the small-world topology: short path length and high clustering. This work represents the first rigorous analysis of the small-world topology and its associated features in high-quality marine networks. We conclude that such topology is a structural pattern that is not maximized in marine food webs; thus it is probably not an effective model to study robustness, stability and feasibility of marine ecosystems.
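
    A minimal version of the comparison the paper formalizes, using NetworkX and an Erdos-Renyi null model (the paper's statistical test is more rigorous than this sketch): compute the web's characteristic path length and clustering coefficient, then compare both to random graphs of the same size and density.

    ```python
    import networkx as nx

    def small_world_metrics(G: nx.Graph, n_random: int = 100, seed: int = 0):
        """Return (L, L_random, C, C_random). A small-world signature is
        L close to the random expectation with C well above it. Assumes
        G is connected (path length is undefined otherwise)."""
        n, m = G.number_of_nodes(), G.number_of_edges()
        L, C = nx.average_shortest_path_length(G), nx.average_clustering(G)
        Ls, Cs = [], []
        for i in range(n_random):
            R = nx.gnm_random_graph(n, m, seed=seed + i)
            if nx.is_connected(R):
                Ls.append(nx.average_shortest_path_length(R))
                Cs.append(nx.average_clustering(R))
        return L, sum(Ls) / len(Ls), C, sum(Cs) / len(Cs)

    # G = nx.read_edgelist("food_web.edges")  # hypothetical input file
    # print(small_world_metrics(G))
    ```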

  11. WebStruct and VisualStruct: Web interfaces and visualization for Structure software implemented in a cluster environment.

    PubMed

    Jayashree, B; Rajgopal, S; Hoisington, D; Prasanth, V P; Chandra, S

    2008-09-24

    Structure is a widely used software tool to investigate population genetic structure with multi-locus genotyping data. The software uses an iterative algorithm to group individuals into "K" clusters, representing possibly K genetically distinct subpopulations. The serial implementation of this programme is processor-intensive even with small datasets. We describe an implementation of the program within a parallel framework. Speedup was achieved by running different replicates and values of K on each node of the cluster. A web-based user-oriented GUI has been implemented in PHP, through which the user can specify input parameters for the programme. The number of processors to be used can be specified in the background command. A web-based visualization tool, "Visualstruct", written in PHP (with HTML and JavaScript embedded), allows for the graphical display of population clusters output from Structure, where each individual may be visualized as a line segment with K colors defining its possible genomic composition with respect to the K genetic subpopulations. The advantage over available programs is the increased number of individuals that can be visualized. The analyses of real datasets indicate a speedup of up to four times when comparing execution on clusters of eight processors with execution on one desktop. The software package is freely available to interested users upon request.
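
    The speedup comes from the fact that replicates and K values are independent runs. A local sketch of that embarrassingly parallel pattern (a process pool standing in for cluster nodes; the structure command-line flags follow the tool's standard usage but should be adapted to your installation and parameter files):

    ```python
    import itertools
    import subprocess
    from multiprocessing import Pool

    def run_structure(args) -> str:
        """Run one Structure replicate for one value of K; assumes the
        structure binary and its mainparams/extraparams are configured."""
        k, rep = args
        out = f"results_K{k}_rep{rep}"
        subprocess.run(["structure", "-K", str(k), "-o", out], check=True)
        return out

    if __name__ == "__main__":
        jobs = list(itertools.product(range(1, 11), range(5)))  # 10 Ks x 5 reps
        with Pool(processes=8) as pool:
            for finished in pool.imap_unordered(run_structure, jobs):
                print("done:", finished)
    ```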

  12. A knowledge-driven approach to biomedical document conceptualization.

    PubMed

    Zheng, Hai-Tao; Borchert, Charles; Jiang, Yong

    2010-06-01

    Biomedical document conceptualization is the process of clustering biomedical documents based on ontology-represented domain knowledge. The result of this process is the representation of the biomedical documents by a set of key concepts and their relationships. Most clustering methods cluster documents based on invariant domain knowledge. The objective of this work is to develop an effective method to cluster biomedical documents based on various user-specified ontologies, so that users can exploit the concept structures of documents more effectively. We develop a flexible framework that allows users to specify the knowledge bases, in the form of ontologies. Based on the user-specified ontologies, we develop a key concept induction algorithm, which uses latent semantic analysis to identify key concepts and cluster documents. A corpus-related ontology generation algorithm is developed to generate the concept structures of documents. Based on two biomedical datasets, we evaluate the proposed method and five other clustering algorithms. The clustering results of the proposed method outperform the five other algorithms in terms of key concept identification. On the first biomedical dataset, our method achieves F-measure values of 0.7294 and 0.5294 based on the MeSH ontology and the Gene Ontology (GO), respectively. On the second biomedical dataset, our method achieves F-measure values of 0.6751 and 0.6746 based on the MeSH ontology and GO, respectively. Both results outperform the five other algorithms in terms of F-measure. Based on the MeSH ontology and GO, the generated corpus-related ontologies show informative conceptual structures. The proposed method enables users to specify the domain knowledge to exploit the conceptual structures of biomedical document collections. In addition, the proposed method is able to extract the key concepts and cluster the documents with relatively high precision. Copyright 2010 Elsevier B.V. All rights reserved.
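
    A bare-bones latent-semantic-analysis pipeline in the spirit of the key concept induction step (scikit-learn; the actual method first maps terms to concepts from the user-specified ontology, e.g. MeSH or GO, a step omitted here for brevity):

    ```python
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.cluster import KMeans

    docs = [
        "p53 mutation in tumor suppressor pathways",
        "gene expression clustering in cancer cells",
        "apoptosis signaling and p53 regulation",
    ]
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(docs)
    # LSA: project documents into a low-dimensional latent concept space
    doc_vecs = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(doc_vecs)
    print(labels)  # cluster assignment per document
    ```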

  13. In Search of a Better Search Engine

    ERIC Educational Resources Information Center

    Kolowich, Steve

    2009-01-01

    Early this decade, the number of Web-based documents stored on the servers of the University of Florida hovered near 300,000. By the end of 2006, that number had leapt to four million. Two years later, the university hosts close to eight million Web documents. Web sites for colleges and universities everywhere have become repositories for data…

  14. Parents on the web: risks for quality management of cough in children.

    PubMed

    Pandolfini, C; Impicciatore, P; Bonati, M

    2000-01-01

    Health information on the Internet, with respect to common, self-limited childhood illnesses, has been found to be unreliable. Therefore, parents navigating the Internet risk finding advice that is incomplete or, more importantly, not evidence-based. The importance of a resource such as the Internet as a source of quality health information for consumers should, however, be taken into consideration. For this reason, studies need to be performed regarding the quality of the material provided. Various strategies have been proposed that would allow parents to distinguish trustworthy web documents from unreliable ones. One of these strategies is the use of a checklist for the appraisal of web pages based on their technical aspects. The purpose of this study was to assess the quality of information present on the Internet regarding the home management of cough in children and to examine the applicability of a checklist strategy that would allow consumers to select more trustworthy web pages. The Internet was searched for web pages regarding the home treatment of cough in children with the use of different search engines. Medline and the Cochrane database were searched for available evidence concerning the management of cough in children. Three checklists were created to assess different aspects of the web documents. The first checklist was designed to allow for a technical appraisal of the web pages and was based on components such as the name of the author and the references used. The second was constructed to examine the completeness of the health information contained in the documents, such as the causes and mechanism of cough, and pharmacological and nonpharmacological treatment. The third checklist assessed the quality of the information by measuring it against a gold standard document. This document was created by combining the policy statement issued by the American Academy of Pediatrics regarding the pharmacological treatment of cough in children with the World Health Organization guide on drugs for children. For each checklist, the web page contents were analyzed and quantitative measurements were assigned. Of the 19 web pages identified, 9 explained the purpose and/or mechanism of cough and 14 the causes. The most frequently mentioned pharmacological treatments were single-ingredient suppressant preparations, followed by single-ingredient expectorants. Dextromethorphan was the most commonly referred-to suppressant and guaifenesin the most common expectorant. No documents discouraged the use of suppressants, although 4 of the 10 web documents that addressed expectorants discouraged their use. Sixteen web pages addressed nonpharmacological treatment, 14 of which suggested exposure to a humid environment and/or extra fluid. In most cases, the criteria in the technical appraisal checklist were not present in the web documents; moreover, 2 web pages did not provide any of the items. Regarding content completeness, 3 web pages satisfied all the requirements considered in the checklist and 2 documents did not meet any of the criteria. Of the 3 web pages that scored highest in technical aspect, 2 also supplied complete information. No relationship was found, however, between technical aspect and content completeness. Concerning the quality of the health information supplied, 10 pages received a negative score because they contained more incorrect than correct information, and 1 web page received a high score. 
This document was 1 of the 2 that also scored high in technical aspect and content completeness. No relationship was found, however, among quality of information, technical aspect, and content completeness. As the results of this study show, a parent navigating the Internet for information on the home management of cough in children will no doubt find incorrect advice among the search results. (ABSTRACT TRUNCATED)

  15. Features: Real-Time Adaptive Feature and Document Learning for Web Search.

    ERIC Educational Resources Information Center

    Chen, Zhixiang; Meng, Xiannong; Fowler, Richard H.; Zhu, Binhai

    2001-01-01

    Describes Features, an intelligent Web search engine that is able to perform real-time adaptive feature (i.e., keyword) and document learning. Explains how Features learns from users' document relevance feedback and automatically extracts and suggests indexing keywords relevant to a search query, and learns from users' keyword relevance feedback…

  16. Semantic Similarity between Web Documents Using Ontology

    NASA Astrophysics Data System (ADS)

    Chahal, Poonam; Singh Tomer, Manjeet; Kumar, Suresh

    2018-06-01

    The World Wide Web is a source of information available in the form of interlinked web pages, but the process of extracting significant information with the assistance of a search engine is highly challenging. This is because web information is written mainly in natural language and is addressed to individual human readers. Several efforts have been made to compute semantic similarity between documents using words, concepts, and concept relationships, but the results still fall short of user requirements. This paper proposes a novel technique for computing semantic similarity between documents that considers not only the concepts present in documents but also the relationships between those concepts. In our approach, documents are processed by building an ontology for each document using a base ontology and a dictionary of concept records, where each record consists of the probable words that represent a given concept. Finally, the document ontologies are compared to determine their semantic similarity, taking the relationships among concepts into account. Relevant concepts and relations between the concepts are identified by capturing author and user intention. The proposed semantic analysis technique provides improved results compared to existing techniques.
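
    The abstract does not give the scoring formula, but a simplified stand-in conveys the idea of combining concept overlap with relation overlap. Below, each document is reduced to a set of concept IDs plus a set of (concept, relation, concept) triples and compared with a weighted Jaccard blend (the weight alpha is a hypothetical parameter):

    ```python
    def ontology_similarity(doc_a, doc_b, alpha: float = 0.5) -> float:
        """Blend concept-set overlap and relation-triple overlap.
        Each document is (concepts, relations); a simplified stand-in
        for the paper's technique, not its exact formula."""
        def jaccard(x: set, y: set) -> float:
            return len(x & y) / len(x | y) if x | y else 0.0
        (ca, ra), (cb, rb) = doc_a, doc_b
        return alpha * jaccard(ca, cb) + (1.0 - alpha) * jaccard(ra, rb)

    a = ({"engine", "fuel"}, {("engine", "consumes", "fuel")})
    b = ({"engine", "oil"}, {("engine", "consumes", "fuel")})
    print(ontology_similarity(a, b))  # 0.5 * 1/3 + 0.5 * 1.0 = 0.667
    ```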

  17. Semantic Similarity between Web Documents Using Ontology

    NASA Astrophysics Data System (ADS)

    Chahal, Poonam; Singh Tomer, Manjeet; Kumar, Suresh

    2018-03-01

    The World Wide Web is a source of information available in the form of interlinked web pages, but the process of extracting significant information with the assistance of a search engine is highly challenging. This is because web information is written mainly in natural language and is addressed to individual human readers. Several efforts have been made to compute semantic similarity between documents using words, concepts, and concept relationships, but the results still fall short of user requirements. This paper proposes a novel technique for computing semantic similarity between documents that considers not only the concepts present in documents but also the relationships between those concepts. In our approach, documents are processed by building an ontology for each document using a base ontology and a dictionary of concept records, where each record consists of the probable words that represent a given concept. Finally, the document ontologies are compared to determine their semantic similarity, taking the relationships among concepts into account. Relevant concepts and relations between the concepts are identified by capturing author and user intention. The proposed semantic analysis technique provides improved results compared to existing techniques.

  18. Basic firefly algorithm for document clustering

    NASA Astrophysics Data System (ADS)

    Mohammed, Athraa Jasim; Yusof, Yuhanis; Husni, Husniza

    2015-12-01

    Document clustering plays a significant role in Information Retrieval (IR), where it organizes documents prior to the retrieval process. To date, various clustering algorithms have been proposed, including K-means and Particle Swarm Optimization. Even though these algorithms have been widely applied in many disciplines due to their simplicity, such approaches tend to become trapped in a local minimum during the search for an optimal solution. To address this shortcoming, this paper proposes a Basic Firefly (Basic FA) algorithm to cluster text documents. The algorithm employs the Average Distance to Document Centroid (ADDC) as the objective function of the search. Experiments utilizing the proposed algorithm were conducted on the 20Newsgroups benchmark dataset. Results demonstrate that the Basic FA generates more robust and compact clusters than those produced by K-means and Particle Swarm Optimization (PSO).
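
    The ADDC objective itself is simple to state: the mean distance of each document vector to the centroid of its assigned cluster. A sketch (NumPy; Euclidean distance is assumed, and in the full algorithm each firefly would encode one candidate centroid set):

    ```python
    import numpy as np

    def addc(docs: np.ndarray, centroids: np.ndarray, labels: np.ndarray) -> float:
        """Average Distance to Document Centroid: the fitness the firefly
        search minimizes (lower ADDC = brighter firefly)."""
        return float(np.mean(np.linalg.norm(docs - centroids[labels], axis=1)))

    rng = np.random.default_rng(0)
    docs = rng.random((100, 20))      # 100 tf-idf-like document vectors
    centroids = rng.random((3, 20))   # one firefly's candidate solution
    labels = np.argmin(
        np.linalg.norm(docs[:, None] - centroids[None], axis=2), axis=1
    )
    print(f"ADDC = {addc(docs, centroids, labels):.3f}")
    ```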

  19. Providing Multi-Page Data Extraction Services with XWRAPComposer

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, Ling; Zhang, Jianjun; Han, Wei

    2008-04-30

    Dynamic Web data sources – sometimes known collectively as the Deep Web – increase the utility of the Web by providing intuitive access to data repositories anywhere that Web access is available. Deep Web services provide access to real-time information, like entertainment event listings, or present a Web interface to large databases or other data repositories. Recent studies suggest that the size and growth rate of the dynamic Web greatly exceed that of the static Web, yet dynamic content is often ignored by existing search engine indexers owing to the technical challenges that arise when attempting to search the Deep Web. To address these challenges, we present DYNABOT, a service-centric crawler for discovering and clustering Deep Web sources offering dynamic content. DYNABOT has three unique characteristics. First, DYNABOT utilizes a service class model of the Web implemented through the construction of service class descriptions (SCDs). Second, DYNABOT employs a modular, self-tuning system architecture for focused crawling of the Deep Web using service class descriptions. Third, DYNABOT incorporates methods and algorithms for efficient probing of the Deep Web and for discovering and clustering Deep Web sources and services through SCD-based service matching analysis. Our experimental results demonstrate the effectiveness of the service class discovery, probing, and matching algorithms and suggest techniques for efficiently managing service discovery in the face of the immense scale of the Deep Web.

  20. Ontology-based structured cosine similarity in document summarization: with applications to mobile audio-based knowledge management.

    PubMed

    Yuan, Soe-Tsyr; Sun, Jerry

    2005-10-01

    Development of algorithms for automated text categorization in massive text document sets is an important research area of data mining and knowledge discovery. Most text-clustering methods are grounded in term-based measures of distance or similarity, ignoring the structure of the documents. In this paper, we present a novel method named structured cosine similarity (SCS) that furnishes document clustering with a new way of modeling document summarization, considering the structure of the documents so as to improve the performance of document clustering in terms of quality, stability, and efficiency. This study was motivated by the problem of clustering speech documents (which lack rich document features) obtained from oral experience sharing conducted wirelessly by the mobile workforce of enterprises, fulfilling audio-based knowledge management. In other words, the goal is to facilitate knowledge acquisition and sharing by speech. The evaluations show fairly promising results for our method of structured cosine similarity.
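
    One plausible reading of "structured" cosine similarity, sketched under the assumption that each document is split into weighted sections (e.g. summary and body) that are compared part-by-part rather than as one bag of words; the paper's exact formulation, tied to its summarization model, may differ:

    ```python
    import numpy as np

    def structured_cosine_similarity(doc_a, doc_b, weights) -> float:
        """Cosine similarity computed section-by-section and combined
        with section weights, instead of one whole-document cosine."""
        def cosine(u: np.ndarray, v: np.ndarray) -> float:
            nu, nv = np.linalg.norm(u), np.linalg.norm(v)
            return float(u @ v / (nu * nv)) if nu and nv else 0.0
        sims = [cosine(a, b) for a, b in zip(doc_a, doc_b)]
        return float(np.average(sims, weights=weights))

    # Two documents, each as [summary_vector, body_vector]
    a = [np.array([1.0, 0.0, 1.0]), np.array([0.2, 0.9, 0.0])]
    b = [np.array([1.0, 1.0, 0.0]), np.array([0.1, 1.0, 0.0])]
    print(structured_cosine_similarity(a, b, weights=[0.6, 0.4]))
    ```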

  1. How Teachers Use and Manage Their Blogs? A Cluster Analysis of Teachers' Blogs in Taiwan

    ERIC Educational Resources Information Center

    Liu, Eric Zhi-Feng; Hou, Huei-Tse

    2013-01-01

    The development of Web 2.0 has ushered in a new set of web-based tools, including blogs. This study focused on how teachers use and manage their blogs. A sample of 165 teachers' blogs in Taiwan was analyzed by factor analysis, cluster analysis and qualitative content analysis. First, the teachers' blogs were analyzed according to six criteria…

  2. Annotating Atomic Components of Papers in Digital Libraries: The Semantic and Social Web Heading towards a Living Document Supporting eSciences

    NASA Astrophysics Data System (ADS)

    García Castro, Alexander; García-Castro, Leyla Jael; Labarga, Alberto; Giraldo, Olga; Montaña, César; O'Neil, Kieran; Bateman, John A.

    Rather than a document that is being constantly re-written as in the wiki approach, the Living Document (LD) is one that acts as a document router, operating by means of structured and organized social tagging and existing ontologies. It offers an environment where users can manage papers and related information, share their knowledge with their peers and discover hidden associations among the shared knowledge. The LD builds upon both the Semantic Web, which values the integration of well-structured data, and the Social Web, which aims to facilitate interaction amongst people by means of user-generated content. In this vein, the LD is similar to a social networking system, with users as central nodes in the network, with the difference that interaction is focused on papers rather than people. Papers, with their ability to represent research interests, expertise, affiliations, and links to web based tools and databanks, represent a central axis for interaction amongst users. To begin to show the potential of this vision, we have implemented a novel web prototype that enables researchers to accomplish three activities central to the Semantic Web vision: organizing, sharing and discovering. Availability: http://www.scientifik.info/

  3. Personalization of Rule-based Web Services.

    PubMed

    Choi, Okkyung; Han, Sang Yong

    2008-04-04

    Nowadays, Web users have clearly expressed their wish to receive personalized services directly. Personalization is the way to tailor services directly to the immediate requirements of the user. However, the current Web Services System does not provide features supporting this, such as personalization of services and intelligent matchmaking. In this research, a flexible, personalized Rule-based Web Services System is proposed to address these problems and to enable efficient search, discovery and construction across general Web documents and Semantic Web documents. The system performs matchmaking among service requesters', service providers' and users' preferences using a Rule-based Search Method, and subsequently ranks search results. A prototype of efficient Web Services search and construction for the suggested system has been developed based on the current work.

  4. Document-Centred Discourse on the Web: A Publishing Tool for Students, Tutors and Researchers.

    ERIC Educational Resources Information Center

    Shum, Simon Buckingham; Sumner, Tamara

    This paper describes how the authors are exploiting the potential of interactive World Wide Web media to support a central part of academic life--the publishing, critiquing, and discussion of documents. The paper begins with an overview of documents in academic life and a discussion of paper-based or "papyrocentric" print and scholarly…

  5. Graph and Network for Model Elicitation (GNOME Phase 2)

    DTIC Science & Technology

    2013-02-01

    3.3 GNOME UI Components for NOEM Web Client... Figure 17: Sampling in Web-client... (the web-client). The server-side service can run and generate data asynchronously, allowing a cluster of servers to run the sampling. Also, a

  6. MCM generator: a Java-based tool for generating medical metadata.

    PubMed

    Munoz, F; Hersh, W

    1998-01-01

    In a previous paper we introduced the need to implement a mechanism to facilitate the discovery of relevant Web medical documents. We maintained that the use of META tags, specifically ones that define the medical subject and resource type of a document, helps towards this goal. We have now developed a tool to facilitate the generation of these tags for the authors of medical documents. Written entirely in Java, this tool makes use of the SAPHIRE server and helps the author identify the Medical Subject Heading terms that most appropriately describe the subject of the document. Furthermore, it allows the author to generate metadata tags for the 15 elements that the Dublin Core considers core elements in the description of a document. This paper describes the use of this tool in the cataloguing of Web and non-Web medical documents, such as image, movie, and sound files.

  7. Focused Crawling of the Deep Web Using Service Class Descriptions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rocco, D; Liu, L; Critchlow, T

    2004-06-21

    Dynamic Web data sources--sometimes known collectively as the Deep Web--increase the utility of the Web by providing intuitive access to data repositories anywhere that Web access is available. Deep Web services provide access to real-time information, like entertainment event listings, or present a Web interface to large databases or other data repositories. Recent studies suggest that the size and growth rate of the dynamic Web greatly exceed that of the static Web, yet dynamic content is often ignored by existing search engine indexers owing to the technical challenges that arise when attempting to search the Deep Web. To address these challenges, we present DynaBot, a service-centric crawler for discovering and clustering Deep Web sources offering dynamic content. DynaBot has three unique characteristics. First, DynaBot utilizes a service class model of the Web implemented through the construction of service class descriptions (SCDs). Second, DynaBot employs a modular, self-tuning system architecture for focused crawling of the Deep Web using service class descriptions. Third, DynaBot incorporates methods and algorithms for efficient probing of the Deep Web and for discovering and clustering Deep Web sources and services through SCD-based service matching analysis. Our experimental results demonstrate the effectiveness of the service class discovery, probing, and matching algorithms and suggest techniques for efficiently managing service discovery in the face of the immense scale of the Deep Web.

  8. Web-based X-ray quality control documentation.

    PubMed

    David, George; Burnett, Lou Ann; Schenkel, Robert

    2003-01-01

    The department of radiology at the Medical College of Georgia Hospital and Clinics has developed an equipment quality control web site. Our goal is to provide immediate access to virtually all medical physics survey data. The web site is designed to assist equipment engineers, department management and technologists. By improving communications and access to equipment documentation, we believe productivity is enhanced. The creation of the quality control web site was accomplished in three distinct steps. First, survey data had to be placed in a computer format. The second step was to convert these various computer files to a format supported by commercial web browsers. Third, a comprehensive home page had to be designed to provide convenient access to the multitude of surveys done in the various x-ray rooms. Because we had spent years previously fine-tuning the computerization of the medical physics quality control program, most survey documentation was already in spreadsheet or database format. A major technical decision was the method of conversion of survey spreadsheet and database files into documentation appropriate for the web. After an unsatisfactory experience with a HyperText Markup Language (HTML) converter (packaged with spreadsheet and database software), we tried creating Portable Document Format (PDF) files using Adobe Acrobat software. This process preserves the original formatting of the document and takes no longer than conventional printing; therefore, it has been very successful. Although the PDF file generated by Adobe Acrobat is a proprietary format, it can be displayed through a conventional web browser using the freely distributed Adobe Acrobat Reader program that is available for virtually all platforms. Once a user installs the software, it is automatically invoked by the web browser whenever the user follows a link to a file with a PDF extension. Although no confidential patient information is available on the web site, our legal department recommended that we secure the site in order to keep out those wishing to make mischief. Our interim solution has been not to password-protect the page, which we feared would hinder access for occasional legitimate users, but also not to provide links to it from other hospital and department pages. Utility and productivity were improved and time and money were saved by making radiological equipment quality control documentation instantly available on-line.

  9. Swarm Intelligence in Text Document Clustering

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cui, Xiaohui; Potok, Thomas E

    2008-01-01

    Social animals or insects in nature often exhibit a form of emergent collective behavior. The research field that attempts to design algorithms or distributed problem-solving devices inspired by the collective behavior of social insect colonies is called Swarm Intelligence. Compared to traditional algorithms, swarm algorithms are usually flexible, robust, decentralized and self-organized. These characteristics make swarm algorithms suitable for solving complex problems, such as document collection clustering. The major challenge of today's information society is that users are overwhelmed with information on any topic they search for. Fast and high-quality document clustering algorithms play an important role in helping users effectively navigate, summarize, and organize this overwhelming information. In this chapter, we introduce three nature-inspired swarm intelligence clustering approaches for document clustering analysis. These clustering algorithms use stochastic and heuristic principles discovered from observing bird flocks, fish schools and ant food foraging.

  10. Analysis of Documentation Speed Using Web-Based Medical Speech Recognition Technology: Randomized Controlled Trial.

    PubMed

    Vogel, Markus; Kaisers, Wolfgang; Wassmuth, Ralf; Mayatepek, Ertan

    2015-11-03

    Clinical documentation has undergone a change due to the usage of electronic health records. The core element is to capture clinical findings and document therapy electronically. Health care personnel spend a significant portion of their time on the computer. Alternatives to self-typing, such as speech recognition, are currently believed to increase documentation efficiency and quality, as well as satisfaction of health professionals while accomplishing clinical documentation, but few studies in this area have been published to date. This study describes the effects of using a Web-based medical speech recognition system for clinical documentation in a university hospital on (1) documentation speed, (2) document length, and (3) physician satisfaction. Reports of 28 physicians were randomized to be created with (intervention) or without (control) the assistance of a Web-based system of medical automatic speech recognition (ASR) in the German language. The documentation was entered into a browser's text area and the time to complete the documentation including all necessary corrections, correction effort, number of characters, and mood of participant were stored in a database. The underlying time comprised text entering, text correction, and finalization of the documentation event. Participants self-assessed their moods on a scale of 1-3 (1=good, 2=moderate, 3=bad). Statistical analysis was done using permutation tests. The number of clinical reports eligible for further analysis stood at 1455. Out of 1455 reports, 718 (49.35%) were assisted by ASR and 737 (50.65%) were not assisted by ASR. Average documentation speed without ASR was 173 (SD 101) characters per minute, while it was 217 (SD 120) characters per minute using ASR. The overall increase in documentation speed through Web-based ASR assistance was 26% (P=.04). Participants documented an average of 356 (SD 388) characters per report when not assisted by ASR and 649 (SD 561) characters per report when assisted by ASR. Participants' average mood rating was 1.3 (SD 0.6) using ASR assistance compared to 1.6 (SD 0.7) without ASR assistance (P<.001). We conclude that medical documentation with the assistance of Web-based speech recognition leads to an increase in documentation speed, document length, and participant mood when compared to self-typing. Speech recognition is a meaningful and effective tool for the clinical documentation process.
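
    The permutation test the study relies on is easy to reproduce in outline (NumPy; the draws below only simulate data matching the reported means and SDs, they are not the study's data):

    ```python
    import numpy as np

    def permutation_test(a: np.ndarray, b: np.ndarray,
                         n_perm: int = 10000, seed: int = 0) -> float:
        """Two-sided permutation test on the difference in group means."""
        rng = np.random.default_rng(seed)
        pooled = np.concatenate([a, b])
        observed = abs(a.mean() - b.mean())
        hits = 0
        for _ in range(n_perm):
            rng.shuffle(pooled)
            diff = abs(pooled[:len(a)].mean() - pooled[len(a):].mean())
            hits += diff >= observed
        return hits / n_perm

    rng = np.random.default_rng(1)
    asr = rng.normal(217, 120, 718)    # simulated ASR-assisted speeds (chars/min)
    typed = rng.normal(173, 101, 737)  # simulated self-typed speeds (chars/min)
    print(f"p = {permutation_test(asr, typed):.4f}")
    ```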

  11. Clustering More than Two Million Biomedical Publications: Comparing the Accuracies of Nine Text-Based Similarity Approaches

    PubMed Central

    Boyack, Kevin W.; Newman, David; Duhon, Russell J.; Klavans, Richard; Patek, Michael; Biberstine, Joseph R.; Schijvenaars, Bob; Skupin, André; Ma, Nianli; Börner, Katy

    2011-01-01

    Background We investigate the accuracy of different similarity approaches for clustering over two million biomedical documents. Clustering large sets of text documents is important for a variety of information needs and applications such as collection management and navigation, summary and analysis. The few comparisons of clustering results from different similarity approaches have focused on small literature sets and have given conflicting results. Our study was designed to seek a robust answer to the question of which similarity approach would generate the most coherent clusters of a biomedical literature set of over two million documents. Methodology We used a corpus of 2.15 million recent (2004-2008) records from MEDLINE, and generated nine different document-document similarity matrices from information extracted from their bibliographic records, including titles, abstracts and subject headings. The nine approaches were comprised of five different analytical techniques with two data sources. The five analytical techniques are cosine similarity using term frequency-inverse document frequency vectors (tf-idf cosine), latent semantic analysis (LSA), topic modeling, and two Poisson-based language models – BM25 and PMRA (PubMed Related Articles). The two data sources were a) MeSH subject headings, and b) words from titles and abstracts. Each similarity matrix was filtered to keep the top-n highest similarities per document and then clustered using a combination of graph layout and average-link clustering. Cluster results from the nine similarity approaches were compared using (1) within-cluster textual coherence based on the Jensen-Shannon divergence, and (2) two concentration measures based on grant-to-article linkages indexed in MEDLINE. Conclusions PubMed's own related article approach (PMRA) generated the most coherent and most concentrated cluster solution of the nine text-based similarity approaches tested, followed closely by the BM25 approach using titles and abstracts. Approaches using only MeSH subject headings were not competitive with those based on titles and abstracts. PMID:21437291
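
    The simplest of the nine approaches, tf-idf cosine with top-n filtering, is compact enough to sketch (scikit-learn; n = 15 is an arbitrary illustration, not the study's setting):

    ```python
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    def topn_similarity_matrix(texts, n: int = 15) -> np.ndarray:
        """tf-idf cosine similarities, keeping only each document's n
        strongest neighbours; the sparsified matrix then feeds graph
        layout and average-link clustering."""
        sim = cosine_similarity(
            TfidfVectorizer(stop_words="english").fit_transform(texts))
        np.fill_diagonal(sim, 0.0)
        filtered = np.zeros_like(sim)
        for i, row in enumerate(sim):
            keep = np.argsort(row)[-n:]  # indices of the n highest similarities
            filtered[i, keep] = row[keep]
        return filtered
    ```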

  12. The case for electron re-acceleration at galaxy cluster shocks

    NASA Astrophysics Data System (ADS)

    van Weeren, Reinout J.; Andrade-Santos, Felipe; Dawson, William A.; Golovich, Nathan; Lal, Dharam V.; Kang, Hyesung; Ryu, Dongsu; Brüggen, Marcus; Ogrean, Georgiana A.; Forman, William R.; Jones, Christine; Placco, Vinicius M.; Santucci, Rafael M.; Wittman, David; Jee, M. James; Kraft, Ralph P.; Sobral, David; Stroe, Andra; Fogarty, Kevin

    2017-01-01

    On the largest scales, the Universe consists of voids and filaments making up the cosmic web. Galaxy clusters are located at the knots in this web, at the intersection of filaments. Clusters grow through accretion from these large-scale filaments and by mergers with other clusters and groups. In a growing number of galaxy clusters, elongated Mpc-sized radio sources have been found [1,2]. Also known as radio relics, these regions of diffuse radio emission are thought to trace relativistic electrons in the intracluster plasma accelerated by low-Mach-number shocks generated by cluster-cluster merger events [3]. A long-standing problem is how low-Mach-number shocks can accelerate electrons so efficiently to explain the observed radio relics. Here, we report the discovery of a direct connection between a radio relic and a radio galaxy in the merging galaxy cluster Abell 3411-3412 by combining radio, X-ray and optical observations. This discovery indicates that fossil relativistic electrons from active galactic nuclei are re-accelerated at cluster shocks. It also implies that radio galaxies play an important role in governing the non-thermal component of the intracluster medium in merging clusters.

  13. Web portal for dynamic creation and publication of teaching materials in multiple formats from a single source representation

    NASA Astrophysics Data System (ADS)

    Roganov, E. A.; Roganova, N. A.; Aleksandrov, A. I.; Ukolova, A. V.

    2017-01-01

    We implement a web portal which dynamically creates documents in more than 30 different formats, including HTML, PDF and DOCX, from a single original source. It is built using free software such as Markdown (a markup language), Pandoc (a document converter), MathJax (a library for displaying mathematical notation in web browsers), and the Ruby on Rails framework. The portal enables the creation of documents with high-quality visualization of mathematical formulas, is compatible with mobile devices, and allows one to search documents by text or formula fragments. Moreover, it gives professors the ability to develop up-to-date educational materials without the assistance of qualified technicians, thus improving the quality of the whole educational process.
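
    The single-source, many-formats step is essentially one Pandoc invocation per requested format; a minimal sketch (the file names are hypothetical, and Pandoc infers the output format from the target extension):

    ```python
    import subprocess

    def convert(source_md: str, target: str) -> None:
        """Convert one Markdown source into another format via Pandoc;
        for HTML output, --mathjax keeps formulas rendering in browsers."""
        args = ["pandoc", source_md, "-o", target]
        if target.endswith(".html"):
            args += ["--standalone", "--mathjax"]
        subprocess.run(args, check=True)

    # One source, many outputs, generated server-side on request:
    # convert("lecture.md", "lecture.html")
    # convert("lecture.md", "lecture.pdf")
    # convert("lecture.md", "lecture.docx")
    ```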

  14. Globe Teachers Guide and Photographic Data on the Web

    NASA Technical Reports Server (NTRS)

    Kowal, Dan

    2004-01-01

    The task of managing the GLOBE Online Teacher's Guide during this time period focused on transforming the technology behind the delivery system of this document. The web application was transformed from a flat-file retrieval system to a dynamic database access approach. The new methodology utilizes Java Server Pages (JSP) on the front end and an Oracle relational database on the back end. This new approach allows users of the web site, mainly teachers, to access content efficiently by grade level and/or by investigation or educational concept area. Moreover, teachers can gain easier access to data sheets and lab and field guides. The new online guide also included updated content for all GLOBE protocols. The GLOBE web management team was given documentation for maintaining the new application. Instructions for modifying the JSP templates and managing database content were included in this document. It was delivered to the team by the end of October 2003. The National Geophysical Data Center (NGDC) continued to manage the school study site photos on the GLOBE website. During this same time period, 333 study site photo images were added to the GLOBE database and posted on the web for 64 schools. Documentation for processing study site photos was also delivered to the new GLOBE web management team. Lastly, assistance was provided in transferring reference applications such as the Cloud and LandSat quizzes and the Earth Systems Online Poster from NGDC servers to GLOBE servers, along with documentation for maintaining these applications.

  15. Case Studies in Describing Scientific Research Efforts as Linked Data

    NASA Astrophysics Data System (ADS)

    Gandara, A.; Villanueva-Rosales, N.; Gates, A.

    2013-12-01

    The Web is growing with numerous scientific resources, prompting increased efforts in information management to consider integration and exchange of scientific resources. Scientists have many options to share scientific resources on the Web; however, existing options provide limited support to scientists in annotating and relating research resources resulting from a scientific research effort. Moreover, there is no systematic approach to documenting scientific research and sharing it on the Web. This research proposes the Collect-Annotate-Refine-Publish (CARP) Methodology as an approach for guiding documentation of scientific research on the Semantic Web as scientific collections. Scientific collections are structured descriptions about scientific research that make scientific results accessible based on context. In addition, scientific collections enhance the Linked Data data space and can be queried by machines. Three case studies were conducted on research efforts at the Cyber-ShARE Research Center of Excellence in order to assess the effectiveness of the methodology to create scientific collections. The case studies exposed the challenges and benefits of leveraging the Semantic Web and Linked Data data space to facilitate access, integration and processing of Web-accessible scientific resources and research documentation. As such, we present the case study findings and lessons learned in documenting scientific research using CARP.

  16. Web page sorting algorithm based on query keyword distance relation

    NASA Astrophysics Data System (ADS)

    Yang, Han; Cui, Hong Gang; Tang, Hao

    2017-08-01

    In order to optimize page sorting, we propose clustering query keywords according to the distance relationships among the search keywords within a web page, and converting these relationships into a degree of aggregation of the search keywords in the page. Based on the PageRank algorithm, a clustering-degree factor for the query keywords is added so that it can participate in the quantitative calculation. This paper thus proposes an improved PageRank algorithm based on the distance relation between search keywords. The experimental results show the feasibility and effectiveness of the method.
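
    The abstract leaves the exact combination rule open; one plausible sketch blends the keyword-aggregation degree into a personalized teleportation vector of the standard PageRank iteration (NumPy; beta and the tiny link matrix are illustrative assumptions, not the paper's values):

    ```python
    import numpy as np

    def pagerank_with_keyword_factor(M: np.ndarray, agg: np.ndarray,
                                     beta: float = 0.3, d: float = 0.85,
                                     iters: int = 100) -> np.ndarray:
        """PageRank where agg[i], the aggregation degree of the query
        keywords on page i (higher = keywords closer together), biases
        the teleportation vector with weight beta."""
        n = M.shape[0]
        base = np.ones(n) / n
        v = (1.0 - beta) * base + beta * agg / agg.sum()
        r = base.copy()
        for _ in range(iters):
            r = d * M @ r + (1.0 - d) * v
        return r

    # Tiny 3-page web; page 2 has the most tightly clustered keywords
    M = np.array([[0.0, 0.0, 1.0], [0.5, 0.0, 0.0], [0.5, 1.0, 0.0]])
    print(pagerank_with_keyword_factor(M, np.array([0.1, 0.2, 0.7])))
    ```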

  17. ESTminer: a Web interface for mining EST contig and cluster databases.

    PubMed

    Huang, Yecheng; Pumphrey, Janie; Gingle, Alan R

    2005-03-01

    ESTminer is a Web application and database schema for interactive mining of expressed sequence tag (EST) contig and cluster datasets. The Web interface contains a query frame that allows the selection of contigs/clusters with a specific cDNA library makeup or a threshold number of members. The results are displayed as color-coded tree nodes, where the color indicates the fractional size of each cDNA library component. The nodes are expandable, revealing library statistics as well as EST or contig members, with links to sequence data, GenBank records or user-configurable links. Also, the interface allows 'queries within queries', where the result set of a query is further filtered by the subsequent query. ESTminer is implemented in Java/JSP and the package, including MySQL and Oracle schema creation scripts, is available from http://cggc.agtec.uga.edu/Data/download.asp (contact: agingle@uga.edu).

  18. UW Inventory of Freight Emissions (WIFE3) heavy duty diesel vehicle web calculator methodology.

    DOT National Transportation Integrated Search

    2013-09-01

    This document serves as an overview and technical documentation for the University of Wisconsin Inventory of Freight Emissions (WIFE3) calculator. The WIFE3 web calculator rapidly estimates future heavy duty diesel vehicle (HDDV) roadway emission...

  19. SeMPI: a genome-based secondary metabolite prediction and identification web server.

    PubMed

    Zierep, Paul F; Padilla, Natàlia; Yonchev, Dimitar G; Telukunta, Kiran K; Klementz, Dennis; Günther, Stefan

    2017-07-03

    The secondary metabolism of bacteria, fungi and plants yields a vast number of bioactive substances. The constantly increasing amount of published genomic data provides the opportunity for efficient identification of gene clusters by genome mining. Conversely, for many natural products with resolved structures, the encoding gene clusters have not been identified yet. Even though genome mining tools have become significantly more efficient in the identification of biosynthetic gene clusters, structural elucidation of the actual secondary metabolite is still challenging, especially due to as yet unpredictable post-modifications. Here, we introduce SeMPI, a web server providing a prediction and identification pipeline for natural products synthesized by type I modular polyketide synthases. In order to limit the possible structures of PKS products and to include putative tailoring reactions, a structural comparison with annotated natural products was introduced. Furthermore, a benchmark was designed based on 40 gene clusters with annotated PKS products. The web server of the pipeline (SeMPI) is freely available at: http://www.pharmaceutical-bioinformatics.de/sempi. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. a Web-Based Interactive Platform for Co-Clustering Spatio-Temporal Data

    NASA Astrophysics Data System (ADS)

    Wu, X.; Poorthuis, A.; Zurita-Milla, R.; Kraak, M.-J.

    2017-09-01

    Since current studies on clustering analysis mainly focus on exploring spatial or temporal patterns separately, a co-clustering algorithm is utilized in this study to enable the concurrent analysis of spatio-temporal patterns. To allow users to adopt and adapt the algorithm for their own analysis, it is integrated within the server side of an interactive web-based platform. The client side of the platform, running within any modern browser, is a graphical user interface (GUI) with multiple linked visualizations that facilitates the understanding, exploration and interpretation of the raw dataset and co-clustering results. Users can also upload their own datasets and adjust clustering parameters within the platform. To illustrate the use of this platform, an annual temperature dataset from 28 weather stations over 20 years in the Netherlands is used. After the dataset is loaded, it is visualized in a set of linked visualizations: a geographical map, a timeline and a heatmap. This aids the user in understanding the nature of their dataset and the appropriate selection of co-clustering parameters. Once the dataset is processed by the co-clustering algorithm, the results are visualized as small multiples, a heatmap and a timeline to provide various views for better understanding and further interpretation. Since the visualization and analysis are integrated in a seamless platform, the user can explore different sets of co-clustering parameters and instantly view the results in order to do iterative, exploratory data analysis. As such, this interactive web-based platform allows users to analyze spatio-temporal data using the co-clustering method and also helps the understanding of the results using multiple linked visualizations.
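
    As a stand-in for the platform's server-side step (scikit-learn's spectral co-clustering, which is not necessarily the algorithm the platform integrates), a station-by-year matrix can be co-clustered in a few lines:

    ```python
    import numpy as np
    from sklearn.cluster import SpectralCoclustering

    rng = np.random.default_rng(0)
    temps = rng.normal(10.0, 2.0, size=(28, 20))  # 28 stations x 20 years
    temps[:14, :10] += 3.0                        # implant a warm co-cluster

    model = SpectralCoclustering(n_clusters=2, random_state=0).fit(temps)
    print("station clusters:", model.row_labels_)
    print("year clusters:   ", model.column_labels_)
    ```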

  1. Automatic script identification from images using cluster-based templates

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hochberg, J.; Kerns, L.; Kelly, P.

    We have developed a technique for automatically identifying the script used to generate a document that is stored electronically in bit image form. Our approach differs from previous work in that the distinctions among scripts are discovered by an automatic learning procedure, without any hands-on analysis. We first develop a set of representative symbols (templates) for each script in our database (Cyrillic, Roman, etc.). We do this by identifying all textual symbols in a set of training documents, scaling each symbol to a fixed size, clustering similar symbols, pruning minor clusters, and finding each cluster's centroid. To identify a new document's script, we identify and scale a subset of symbols from the document and compare them to the templates for each script. We choose the script whose templates provide the best match. Our current system distinguishes among the Armenian, Burmese, Chinese, Cyrillic, Ethiopic, Greek, Hebrew, Japanese, Korean, Roman, and Thai scripts with over 90% accuracy.
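
    As a concrete illustration of the template pipeline described above (scale symbols, cluster, prune minor clusters, keep centroids, match a new document against each script's templates), here is a hedged sketch. K-means stands in for the paper's unspecified clustering strategy, and all function and parameter names are illustrative assumptions.

    ```python
    # Hypothetical sketch of cluster-based script templates; k-means stands in
    # for the paper's unspecified clustering strategy.
    import numpy as np
    from sklearn.cluster import KMeans

    def build_templates(symbols, n_clusters=50, min_cluster_size=5):
        """Cluster scaled symbol images (n_symbols x n_pixels) and keep the
        centroids of the non-minor clusters as templates for one script."""
        km = KMeans(n_clusters=n_clusters, n_init=10).fit(symbols)
        counts = np.bincount(km.labels_, minlength=n_clusters)
        return km.cluster_centers_[counts >= min_cluster_size]  # prune minor clusters

    def identify_script(doc_symbols, templates_by_script):
        """Choose the script whose templates best match the document's symbols."""
        def score(templates):
            # mean distance from each symbol to its nearest template (lower = better)
            d = np.linalg.norm(doc_symbols[:, None, :] - templates[None, :, :], axis=2)
            return d.min(axis=1).mean()
        return min(templates_by_script, key=lambda s: score(templates_by_script[s]))
    ```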

  2. How Japanese students characterize information from web-sites.

    PubMed

    Iwahara, A; Yamada, M; Hatta, T; Kawakami, A; Okamoto, M

    2000-12-01

    How 352 Japanese university students regard web-site information was investigated in two surveys. Application of correspondence analysis and cluster analysis to the questionnaire responses to the web-site advertisement showed that students regarded a web-site as a new, alien medium which differs from current media. Students regarded web-sites as complicated, intellectual, and impermanent or not memorable. Students got precise information from web-sites but did not use it in making decisions to purchase goods.

  3. Pragmatic service development and customisation with the CEDA OGC Web Services framework

    NASA Astrophysics Data System (ADS)

    Pascoe, Stephen; Stephens, Ag; Lowe, Dominic

    2010-05-01

    The CEDA OGC Web Services framework (COWS) emphasises rapid service development by providing a lightweight layer of OGC web service logic on top of Pylons, a mature web application framework for the Python language. This approach gives developers a flexible web service development environment without compromising access to the full range of web application tools and patterns: Model-View-Controller paradigm, XML templating, Object-Relational-Mapper integration and authentication/authorization. We have found this approach useful for exploring evolving standards and implementing protocol extensions to meet the requirements of operational deployments. This paper outlines how COWS is being used to implement customised WMS, WCS, WFS and WPS services in a variety of web applications from experimental prototypes to load-balanced cluster deployments serving 10-100 simultaneous users. In particular we will cover 1) The use of Climate Science Modeling Language (CSML) in complex-feature aware WMS, WCS and WFS services, 2) Extending WMS to support applications with features specific to earth system science and 3) A cluster-enabled Web Processing Service (WPS) supporting asynchronous data processing. The COWS WPS underpins all backend services in the UK Climate Projections User Interface where users can extract, plot and further process outputs from a multi-dimensional probabilistic climate model dataset. The COWS WPS supports cluster job execution, result caching, execution time estimation and user management. The COWS WMS and WCS components drive the project-specific NCEO and QESDI portals developed by the British Atmospheric Data Centre. These portals use CSML as a backend description format and implement features such as multiple WMS layer dimensions and climatology axes that are beyond the scope of general purpose GIS tools and yet vital for atmospheric science applications.

  4. Web Mining for Web Image Retrieval.

    ERIC Educational Resources Information Center

    Chen, Zheng; Wenyin, Liu; Zhang, Feng; Li, Mingjing; Zhang, Hongjiang

    2001-01-01

    Presents a prototype system for image retrieval from the Internet using Web mining. Discusses the architecture of the Web image retrieval prototype; document space modeling; user log mining; and image retrieval experiments to evaluate the proposed system. (AEF)

  5. Comparing Web, Group and Telehealth Formats of a Military Parenting Program

    DTIC Science & Technology

    2017-06-01

    AWARD NUMBER: W81XWH-14-1-0143. Report for a study comparing web, group, and telehealth delivery formats of a military parenting program. The remainder of the record is standard report-documentation boilerplate, including the disclaimer that the contents should not be construed as an official Department of the Army position, policy or decision unless so designated by other documentation.

  6. Scale-free characteristics of random networks: the topology of the world-wide web

    NASA Astrophysics Data System (ADS)

    Barabási, Albert-László; Albert, Réka; Jeong, Hawoong

    2000-06-01

    The world-wide web forms a large directed graph, whose vertices are documents and edges are links pointing from one document to another. Here we demonstrate that despite its apparent random character, the topology of this graph has a number of universal scale-free characteristics. We introduce a model that leads to a scale-free network, capturing in a minimal fashion the self-organization processes governing the world-wide web.
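
    The scale-free model referred to above grows the graph by preferential attachment. The following is a minimal sketch of that mechanism; the seed handling (each initial node gets one attachment slot) is a simplification of the exact formulation, so treat it as an approximation rather than the authors' model.

    ```python
    # Minimal preferential-attachment sketch of scale-free network growth.
    import random

    def preferential_attachment(n, m=2, m0=None, seed=0):
        """Each new node links to m existing nodes chosen with probability
        proportional to their current degree; returns the edge list."""
        rng = random.Random(seed)
        m0 = m0 or m
        edges, slots = [], list(range(m0))      # node i appears once per degree unit
        for new in range(m0, n):
            targets = set()
            while len(targets) < m:
                targets.add(rng.choice(slots))  # degree-proportional choice
            for t in targets:
                edges.append((new, t))
                slots += [new, t]               # update the degree multiset
        return edges

    edges = preferential_attachment(10000)      # degree distribution ~ power law
    ```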

  7. Assessing the Cost-Effectiveness of Modernizing the KC-10 to Meet Global Air Traffic Management Mandates

    DTIC Science & Technology

    2009-01-01

    This RAND Project AIR FORCE (PAF) report is provided for non-commercial use only; unauthorized posting of RAND documents to a non-RAND Web site is prohibited, and RAND documents are protected under copyright law. PAF's research areas include Force Modernization and Employment; Manpower, Personnel, and Training; Resource Management; and Strategy and Doctrine. Additional information about PAF is available on the RAND Web site.

  8. Using the web to validate document recognition results: experiments with business cards

    NASA Astrophysics Data System (ADS)

    Oertel, Clemens; O'Shea, Shauna; Bodnar, Adam; Blostein, Dorothea

    2004-12-01

    The World Wide Web is a vast information resource which can be useful for validating the results produced by document recognizers. Three computational steps are involved, all of them challenging: (1) use the recognition results in a Web search to retrieve Web pages that contain information similar to that in the document, (2) identify the relevant portions of the retrieved Web pages, and (3) analyze these relevant portions to determine what corrections (if any) should be made to the recognition result. We have conducted exploratory implementations of steps (1) and (2) in the business-card domain: we use fields of the business card to retrieve Web pages and identify the most relevant portions of those Web pages. In some cases, this information appears suitable for correcting OCR errors in the business card fields. In other cases, the approach fails due to stale information: when business cards are several years old and the business-card holder has changed jobs, then websites (such as the home page or company website) no longer contain information matching that on the business card. Our exploratory results indicate that in some domains it may be possible to develop effective means of querying the Web with recognition results, and to use this information to correct the recognition results and/or detect that the information is stale.

  9. Using the web to validate document recognition results: experiments with business cards

    NASA Astrophysics Data System (ADS)

    Oertel, Clemens; O'Shea, Shauna; Bodnar, Adam; Blostein, Dorothea

    2005-01-01

    The World Wide Web is a vast information resource which can be useful for validating the results produced by document recognizers. Three computational steps are involved, all of them challenging: (1) use the recognition results in a Web search to retrieve Web pages that contain information similar to that in the document, (2) identify the relevant portions of the retrieved Web pages, and (3) analyze these relevant portions to determine what corrections (if any) should be made to the recognition result. We have conducted exploratory implementations of steps (1) and (2) in the business-card domain: we use fields of the business card to retrieve Web pages and identify the most relevant portions of those Web pages. In some cases, this information appears suitable for correcting OCR errors in the business card fields. In other cases, the approach fails due to stale information: when business cards are several years old and the business-card holder has changed jobs, then websites (such as the home page or company website) no longer contain information matching that on the business card. Our exploratory results indicate that in some domains it may be possible to develop effective means of querying the Web with recognition results, and to use this information to correct the recognition results and/or detect that the information is stale.

  10. The case for electron re-acceleration at galaxy cluster shocks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    van Weeren, Reinout J.; Andrade-Santos, Felipe; Dawson, William A.

    On the largest scales, the Universe consists of voids and filaments making up the cosmic web. Galaxy clusters are located at the knots in this web, at the intersection of filaments. Clusters grow through accretion from these large-scale filaments and by mergers with other clusters and groups. In a growing number of galaxy clusters, elongated Mpc-sized radio sources have been found. Also known as radio relics, these regions of diffuse radio emission are thought to trace relativistic electrons in the intracluster plasma accelerated by low-Mach-number shocks generated by cluster–cluster merger events. A long-standing problem is how low-Mach-number shocks can accelerate electrons so efficiently to explain the observed radio relics. Here, we report the discovery of a direct connection between a radio relic and a radio galaxy in the merging galaxy cluster Abell 3411–3412 by combining radio, X-ray and optical observations. This discovery indicates that fossil relativistic electrons from active galactic nuclei are re-accelerated at cluster shocks. It also implies that radio galaxies play an important role in governing the non-thermal component of the intracluster medium in merging clusters.

  11. The case for electron re-acceleration at galaxy cluster shocks

    DOE PAGES

    van Weeren, Reinout J.; Andrade-Santos, Felipe; Dawson, William A.; ...

    2017-01-04

    On the largest scales, the Universe consists of voids and filaments making up the cosmic web. Galaxy clusters are located at the knots in this web, at the intersection of filaments. Clusters grow through accretion from these large-scale filaments and by mergers with other clusters and groups. In a growing number of galaxy clusters, elongated Mpc-sized radio sources have been found. Also known as radio relics, these regions of diffuse radio emission are thought to trace relativistic electrons in the intracluster plasma accelerated by low-Mach-number shocks generated by cluster–cluster merger events. A long-standing problem is how low-Mach-number shocks can accelerate electrons so efficiently to explain the observed radio relics. Here, we report the discovery of a direct connection between a radio relic and a radio galaxy in the merging galaxy cluster Abell 3411–3412 by combining radio, X-ray and optical observations. This discovery indicates that fossil relativistic electrons from active galactic nuclei are re-accelerated at cluster shocks. It also implies that radio galaxies play an important role in governing the non-thermal component of the intracluster medium in merging clusters.

  12. Automatic generation of Web mining environments

    NASA Astrophysics Data System (ADS)

    Cibelli, Maurizio; Costagliola, Gennaro

    1999-02-01

    The main problem related to the retrieval of information from the world wide web is the enormous number of unstructured documents and resources, i.e., the difficulty of locating and tracking appropriate sources. This paper presents a web mining environment (WME), which is capable of finding, extracting and structuring information related to a particular domain from web documents, using general purpose indices. The WME architecture includes a web engine filter (WEF), to sort and reduce the answer set returned by a web engine, a data source pre-processor (DSP), which processes html layout cues in order to collect and qualify page segments, and a heuristic-based information extraction system (HIES), to finally retrieve the required data. Furthermore, we present a web mining environment generator, WMEG, that allows naive users to generate a WME specific to a given domain by providing a set of specifications.

  13. Poster — Thur Eve — 52: A Web-based Platform for Collaborative Document Management in Radiotherapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kildea, J.; Joseph, A.

    We describe DepDocs, a web-based platform that we have developed to manage the committee meetings, policies, procedures and other documents within our otherwise paperless radiotherapy clinic. DepDocs is essentially a document management system based on the popular Drupal content management software. For security and confidentiality, it is hosted on a Linux server internal to our hospital network such that documents are never sent to the cloud or outside of the hospital firewall. We used Drupal's built-in role-based user rights management system to assign a role, and associated document editing rights, to each user. Documents are accessed for viewing using either a simple Google-like search or by generating a list of related documents from a taxonomy of categorization terms. Our system provides document revision tracking and a document review and approval mechanism for all official policies and procedures. Committee meeting schedules, agendas and minutes are maintained by committee chairs and are restricted to committee members. DepDocs has been operational within our department for over six months and already has 45 unique users and an archive of over 1000 documents, mostly policies and procedures. Documents are easily retrievable from the system using any web browser within our hospital's network.

  14. Multiple Imputation based Clustering Validation (MIV) for Big Longitudinal Trial Data with Missing Values in eHealth.

    PubMed

    Zhang, Zhaoyang; Fang, Hua; Wang, Honggang

    2016-06-01

    Web-delivered trials are an important component in eHealth services. These trials, mostly behavior-based, generate big heterogeneous data that are longitudinal, high dimensional with missing values. Unsupervised learning methods have been widely applied in this area; however, validating the optimal number of clusters has been challenging. Built upon our multiple imputation (MI) based fuzzy clustering, MIfuzzy, we proposed a new multiple imputation based validation (MIV) framework and corresponding MIV algorithms for clustering big longitudinal eHealth data with missing values, more generally for fuzzy-logic based clustering methods. Specifically, we detect the optimal number of clusters by auto-searching and -synthesizing a suite of MI-based validation methods and indices, including conventional (bootstrap or cross-validation based) and emerging (modularity-based) validation indices for general clustering methods as well as the specific one (Xie and Beni) for fuzzy clustering. The MIV performance was demonstrated on a big longitudinal dataset from a real web-delivered trial and using simulation. The results indicate the MI-based Xie and Beni index for fuzzy clustering is more appropriate for detecting the optimal number of clusters for such complex data. The MIV concept and algorithms could be easily adapted to different types of clustering that could process big incomplete longitudinal trial data in eHealth services.
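
    For reference, the Xie and Beni index singled out in the abstract is the ratio of fuzzy within-cluster compactness to minimum center separation (lower is better). The sketch below assumes a membership matrix u of shape (clusters, points) and is illustrative only; in an MI setting the index would be computed per imputed dataset and pooled, and the candidate cluster number minimizing the pooled index would be selected.

    ```python
    # Sketch of the Xie-Beni validity index for fuzzy clustering.
    import numpy as np

    def xie_beni(X, u, v, m=2.0):
        """X: (n, d) data; u: (c, n) fuzzy memberships; v: (c, d) centers."""
        d2 = ((X[None, :, :] - v[:, None, :]) ** 2).sum(axis=2)   # (c, n) squared distances
        compactness = (u ** m * d2).sum()                         # fuzzy within-cluster scatter
        sep = ((v[:, None, :] - v[None, :, :]) ** 2).sum(axis=2)  # center-to-center distances
        np.fill_diagonal(sep, np.inf)                             # ignore self-distances
        return compactness / (X.shape[0] * sep.min())
    ```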

  15. Multiple Imputation based Clustering Validation (MIV) for Big Longitudinal Trial Data with Missing Values in eHealth

    PubMed Central

    Zhang, Zhaoyang; Wang, Honggang

    2016-01-01

    Web-delivered trials are an important component in eHealth services. These trials, mostly behavior-based, generate big heterogeneous data that are longitudinal, high dimensional with missing values. Unsupervised learning methods have been widely applied in this area; however, validating the optimal number of clusters has been challenging. Built upon our multiple imputation (MI) based fuzzy clustering, MIfuzzy, we proposed a new multiple imputation based validation (MIV) framework and corresponding MIV algorithms for clustering big longitudinal eHealth data with missing values, more generally for fuzzy-logic based clustering methods. Specifically, we detect the optimal number of clusters by auto-searching and -synthesizing a suite of MI-based validation methods and indices, including conventional (bootstrap or cross-validation based) and emerging (modularity-based) validation indices for general clustering methods as well as the specific one (Xie and Beni) for fuzzy clustering. The MIV performance was demonstrated on a big longitudinal dataset from a real web-delivered trial and using simulation. The results indicate the MI-based Xie and Beni index for fuzzy clustering is more appropriate for detecting the optimal number of clusters for such complex data. The MIV concept and algorithms could be easily adapted to different types of clustering that could process big incomplete longitudinal trial data in eHealth services. PMID:27126063

  16. JPL, NASA and the Historical Record: Key Events/Documents in Lunar and Mars Exploration

    NASA Technical Reports Server (NTRS)

    Hooks, Michael Q.

    1999-01-01

    This document represents a presentation about the Jet Propulsion Laboratory (JPL) historical archives in the area of lunar and Martian exploration. The JPL archives document the history of JPL's flight projects, research and development activities, and administrative operations. The archives are in a variety of formats. The presentation reviews the information available through the JPL archives web site, information available through the Regional Planetary Image Facility web site, and the information on past missions available through the web sites. The presentation also reviews the NASA historical resources at the NASA History Office and the National Archives and Records Administration.

  17. Going, going, still there: using the WebCite service to permanently archive cited Web pages.

    PubMed

    Eysenbach, Gunther

    2006-01-01

    Scholars are increasingly citing electronic "web references" which are not preserved in libraries or full text archives. WebCite is a new standard for citing web references. To "webcite" a document involves archiving the cited Web page through www.webcitation.org and citing the WebCite permalink instead of (or in addition to) the unstable live Web page.

  18. 78 FR 68100 - Luminant Generation Company, LLC

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-11-13

    ... following methods: Federal Rulemaking Web site: Go to http://www.regulations.gov and search for Docket ID.../adams.html . To begin the search, select ``ADAMS Public Documents'' and then select ``Begin Web- based ADAMS Search.'' For problems with ADAMS, please contact the NRC's Public Document Room (PDR) reference...

  19. Biotea: RDFizing PubMed Central in support for the paper as an interface to the Web of Data

    PubMed Central

    2013-01-01

    Background The World Wide Web has become a dissemination platform for scientific and non-scientific publications. However, most of the information remains locked up in discrete documents that are not always interconnected or machine-readable. The connectivity tissue provided by RDF technology has not yet been widely used to support the generation of self-describing, machine-readable documents. Results In this paper, we present our approach to the generation of self-describing machine-readable scholarly documents. We understand the scientific document as an entry point and interface to the Web of Data. We have semantically processed the full-text, open-access subset of PubMed Central. Our RDF model and resulting dataset make extensive use of existing ontologies and semantic enrichment services. We expose our model, services, prototype, and datasets at http://biotea.idiginfo.org/ Conclusions The semantic processing of biomedical literature presented in this paper embeds documents within the Web of Data and facilitates the execution of concept-based queries against the entire digital library. Our approach delivers a flexible and adaptable set of tools for metadata enrichment and semantic processing of biomedical documents. Our model delivers a semantically rich and highly interconnected dataset with self-describing content so that software can make effective use of it. PMID:23734622

  20. Documenting pharmacist interventions on an intranet.

    PubMed

    Simonian, Armen I

    2003-01-15

    The process of developing and implementing an intranet Web site for clinical intervention documentation is described. An inpatient pharmacy department initiated an organizationwide effort to improve documentation of interventions by pharmacists at its seven hospitals to achieve real-time capture of meaningful benchmarking data. Standardization of intervention types would allow the health system to contrast and compare medication use, process improvement, and patient care initiatives among its hospitals. After completing a needs assessment and reviewing current methodologies, a computerized tracking tool was developed in-house and integrated with the organization's intranet. Representatives from all hospitals agreed on content and functionality requirements for the Web site. The site was completed and activated in February 2002. Before this Web site was established, the most documented intervention types were Renal Adjustment and Clarify Dose, with a daily average of four and three, respectively. After site activation, daily averages for Renal Adjustment remained unchanged, but Clarify Dose is now documented nine times per day. Drug Information and i.v.-to-p.o. intervention types, which previously averaged less than one intervention per day, are now documented an average of four times daily. Approximately 91% of staff pharmacists are using this site. Future plans for this site include enhanced accessibility to the site with wireless personal digital assistants. The design and implementation of an intranet Web site to document pharmacists' interventions doubled the rate of intervention documentation and standardized the intervention types among hospitals in the health system.

  1. The topology of the cosmic web in terms of persistent Betti numbers

    NASA Astrophysics Data System (ADS)

    Pranav, Pratyush; Edelsbrunner, Herbert; van de Weygaert, Rien; Vegter, Gert; Kerber, Michael; Jones, Bernard J. T.; Wintraecken, Mathijs

    2017-03-01

    We introduce a multiscale topological description of the Megaparsec web-like cosmic matter distribution. Betti numbers and topological persistence offer a powerful means of describing the rich connectivity structure of the cosmic web and of its multiscale arrangement of matter and galaxies. Emanating from algebraic topology and Morse theory, Betti numbers and persistence diagrams represent an extension and deepening of the cosmologically familiar topological genus measure and the related geometric Minkowski functionals. In addition to a description of the mathematical background, this study presents the computational procedure for computing Betti numbers and persistence diagrams for density field filtrations. The field may be computed starting from a discrete spatial distribution of galaxies or simulation particles. The main emphasis of this study concerns an extensive and systematic exploration of the imprint of different web-like morphologies and different levels of multiscale clustering in the corresponding computed Betti numbers and persistence diagrams. To this end, we use Voronoi clustering models as templates for a rich variety of web-like configurations and the fractal-like Soneira-Peebles models exemplify a range of multiscale configurations. We have identified the clear imprint of cluster nodes, filaments, walls, and voids in persistence diagrams, along with that of the nested hierarchy of structures in multiscale point distributions. We conclude by outlining the potential of persistent topology for understanding the connectivity structure of the cosmic web, in large simulations of cosmic structure formation and in the challenging context of the observed galaxy distribution in large galaxy surveys.

  2. Handwritten text line segmentation by spectral clustering

    NASA Astrophysics Data System (ADS)

    Han, Xuecheng; Yao, Hui; Zhong, Guoqiang

    2017-02-01

    Since handwritten text lines are generally skewed and not obviously separated, text line segmentation of handwritten document images is still a challenging problem. In this paper, we propose a novel text line segmentation algorithm based on spectral clustering. Given a handwritten document image, we first convert it to a binary image and then compute the adjacency matrix of the pixel points. We apply spectral clustering to this similarity matrix and use the orthogonal k-means clustering algorithm to group the text lines. Experiments on the Chinese handwritten documents database (HIT-MW) demonstrate the effectiveness of the proposed method.
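
    A hedged sketch of that pipeline (binarize, build a pixel similarity matrix, spectral clustering, k-means grouping) follows. The RBF affinity on pixel coordinates and all parameters are assumptions, since the paper's exact adjacency construction is not given here, and foreground pixels should be subsampled for large pages.

    ```python
    # Illustrative spectral clustering of foreground pixels into text lines.
    import numpy as np
    from sklearn.cluster import SpectralClustering

    def segment_lines(binary_img, n_lines):
        ys, xs = np.nonzero(binary_img)               # foreground pixel coordinates
        pts = np.column_stack([ys, xs]).astype(float)
        labels = SpectralClustering(n_clusters=n_lines, affinity="rbf",
                                    gamma=0.01,       # RBF affinity is an assumption
                                    assign_labels="kmeans").fit_predict(pts)
        return pts, labels                            # label = text line of each pixel
    ```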

  3. Dark Web 101

    DTIC Science & Technology

    2016-07-21

    Today's internet has multiple webs. The surface web is what Google and other search engines index and pull based on links. Essentially, the surface...financial records, research and development), and personal data (medical records or legal documents). These are all deep web. Standard search engines don't

  4. Creating Polyphony with Exploratory Web Documentation in Singapore

    ERIC Educational Resources Information Center

    Lim, Sirene; Hoo, Lum Chee

    2012-01-01

    We introduce and reflect on "Images of Teaching", an ongoing web documentation research project on preschool teaching in Singapore. This paper discusses the project's purpose, methodological process, and our learning points as researchers who aim to contribute towards inquiry-based professional learning. The website offers a window into…

  5. E-Texts, Mobile Browsing, and Rich Internet Applications

    ERIC Educational Resources Information Center

    Godwin-Jones, Robert

    2007-01-01

    Online reading is evolving beyond the perusal of static documents with Web pages inviting readers to become commentators, collaborators, and critics. The much-ballyhooed Web 2.0 is essentially a transition from online consumer to consumer/producer/participant. An online document may well include embedded multimedia or contain other forms of…

  6. Mantis BT Cluster Support

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Riot, V.

    2009-06-05

    The software is a modification to the Mantis BT V1.5 open source application provided by the Mantis BT group to support clustered web servers. It also provides various cosmetic modifications used at LLNL.

  7. Results from a Web Impact Factor Crawler.

    ERIC Educational Resources Information Center

    Thelwall, Mike

    2001-01-01

    Discusses Web impact factors (WIFs), Web versions of the impact factors for journals, and how they can be calculated by using search engines. Highlights include HTML and document indexing; Web page links; a Web crawler designed for calculating WIFs; and WIFs for United Kingdom universities that measured research profiles or capability. (Author/LRW)

  8. PuReD-MCL: a graph-based PubMed document clustering methodology.

    PubMed

    Theodosiou, T; Darzentas, N; Angelis, L; Ouzounis, C A

    2008-09-01

    Biomedical literature is the principal repository of biomedical knowledge, with PubMed being the most complete database collecting, organizing and analyzing such textual knowledge. There are numerous efforts that attempt to exploit this information by using text mining and machine learning techniques. We developed a novel approach, called PuReD-MCL (PubMed Related Documents-MCL), which is based on the graph clustering algorithm MCL and relevant resources from PubMed. PuReD-MCL avoids using natural language processing (NLP) techniques directly; instead, it takes advantage of existing resources available from PubMed. PuReD-MCL then clusters documents efficiently using the MCL graph clustering algorithm, which is based on graph flow simulation. This process allows users to analyse the results by highlighting important clues, and finally to visualize the clusters and all relevant information using an interactive graph layout algorithm, for instance BioLayout Express 3D. The methodology was applied to two different datasets, previously used for the validation of the document clustering tool TextQuest. The first dataset involves the organisms Escherichia coli and yeast, whereas the second is related to Drosophila development. PuReD-MCL successfully reproduces the annotated results obtained from TextQuest, while at the same time providing additional insights into the clusters and the corresponding documents. Source code in Perl and R is available from http://tartara.csd.auth.gr/~theodos/
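
    The MCL flow simulation at the core of PuReD-MCL alternates expansion (matrix powers) and inflation (elementwise powers with renormalization) on a column-stochastic matrix until the flow converges. The numpy sketch below shows that loop with illustrative default parameters; it is not the tool's actual implementation.

    ```python
    # Compact numpy sketch of the MCL expansion/inflation loop.
    import numpy as np

    def mcl(adj, expansion=2, inflation=2.0, iters=100, tol=1e-6):
        M = adj + np.eye(len(adj))                    # self-loops stabilize the flow
        M = M / M.sum(axis=0)                         # column-stochastic matrix
        for _ in range(iters):
            prev = M
            M = np.linalg.matrix_power(M, expansion)  # expansion: flow spreads
            M = M ** inflation                        # inflation: strong flows win
            M = M / M.sum(axis=0)
            if np.abs(M - prev).max() < tol:
                break
        clusters = {}                                 # nodes sharing an attractor row
        for node, attractor in enumerate(M.argmax(axis=0)):
            clusters.setdefault(attractor, []).append(node)
        return list(clusters.values())
    ```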

  9. WordCluster: detecting clusters of DNA words and genomic elements

    PubMed Central

    2011-01-01

    Background Many k-mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well-established examples are genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds. Results We introduce here an algorithm to detect clusters of DNA words (k-mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method into a web server connected to a MySQL backend, which also determines the co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting the clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between the inside and outside of the clusters. As another example, we used WordCluster to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome. Conclusions WordCluster seems to predict biologically meaningful clusters of DNA words (k-mers) and genomic entities. The implementation of the method into a web server is available at http://bioinfo2.ugr.es/wordCluster/wordCluster.php including additional features like the detection of co-localization with gene regions or the annotation enrichment tool for functional analysis of overlapped genes. PMID:21261981
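
    The underlying idea, calling a cluster wherever consecutive copies of a word lie closer than expected, can be sketched as follows. The uniform-spacing null and the alpha-scaled cutoff are simplifications of the paper's statistics, and all names are illustrative.

    ```python
    # Illustrative distance-based cluster caller for k-mer occurrences.
    def kmer_positions(seq, kmer):
        pos, i = [], seq.find(kmer)
        while i != -1:
            pos.append(i)
            i = seq.find(kmer, i + 1)
        return pos

    def call_clusters(positions, genome_len, alpha=0.05, min_copies=3):
        if not positions:
            return []
        cutoff = alpha * genome_len / len(positions)  # mean gap under uniformity, scaled
        clusters, current = [], [positions[0]]
        for prev, nxt in zip(positions, positions[1:]):
            if nxt - prev <= cutoff:
                current.append(nxt)                   # extend the running cluster
            else:
                if len(current) >= min_copies:
                    clusters.append((current[0], current[-1], len(current)))
                current = [nxt]
        if len(current) >= min_copies:
            clusters.append((current[0], current[-1], len(current)))
        return clusters                               # (start, end, copy count) triples
    ```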

  10. An Educational Tool for Browsing the Semantic Web

    ERIC Educational Resources Information Center

    Yoo, Sujin; Kim, Younghwan; Park, Seongbin

    2013-01-01

    The Semantic Web is an extension of the current Web where information is represented in a machine processable way. It is not separate from the current Web and one of the confusions that novice users might have is where the Semantic Web is. In fact, users can easily encounter RDF documents that are components of the Semantic Web while they navigate…

  11. Research on dissociative seizures: A bibliometric analysis and visualization of the scientific landscape.

    PubMed

    Popkirov, Stoyan; Jungilligens, Johannes; Schlegel, Uwe; Wellmer, Jörg

    2018-06-01

    Dissociative seizures are a common and often elusive differential diagnosis in epilepsy centers. Considering their high prevalence, long diagnostic delays, and disappointing rates of treatment response, scientific research dedicated to dissociative seizures is surprisingly scarce. In order to chart the scientific landscape of dissociative seizures and to visualize thematic clusters and trends in research, a comprehensive bibliometric analysis was performed. The Web of Science database was examined to identify relevant English language documents from the last half-century. A total of 1751 documents with titles referring to dissociative seizures were identified. Automated textual analysis of all titles and abstracts revealed that research clusters around three major topics: differential diagnosis in epilepsy centers, management and treatment, and psychopathology. Time analysis of term networks revealed that the focus of clinical research has moved from diagnostic procedures to treatment approaches. Furthermore, interest within etiological research is shifting from an emphasis on early life trauma and personality traits to the role of anxiety and emotion regulation. With respect to individual contributing authors, a relatively small network of prolific scientists with a remarkable degree of collaboration emerges. By mapping relevant publications, it becomes evident that dissociative seizures still represent a subject mostly within the realm of neurology and epileptology, with a tendency to settle in the latter domain. This analysis sheds light on an important niche subject and highlights trends in research focus and output. Copyright © 2018 Elsevier Inc. All rights reserved.

  12. Subject and Citation Indexing. Part I: The Clustering Structure of Composite Representations in the Cystic Fibrosis Document Collection. Part II: The Optimal, Cluster-Based Retrieval Performance of Composite Representations.

    ERIC Educational Resources Information Center

    Shaw, W. M., Jr.

    1991-01-01

    Two articles discuss the clustering of composite representations in the Cystic Fibrosis Document Collection from the National Library of Medicine's MEDLINE file. Clustering is evaluated as a function of the exhaustivity of composite representations based on Medical Subject Headings (MeSH) and citation indexes, and evaluation of retrieval…

  13. Indexing and Retrieval for the Web.

    ERIC Educational Resources Information Center

    Rasmussen, Edie M.

    2003-01-01

    Explores current research on indexing and ranking as retrieval functions of search engines on the Web. Highlights include measuring search engine stability; evaluation of Web indexing and retrieval; Web crawlers; hyperlinks for indexing and ranking; ranking for metasearch; document structure; citation indexing; relevance; query evaluation;…

  14. 75 FR 78725 - Recreational Boating Safety Projects, Programs and Activities Funded Under Provisions of the...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-12-16

    ... defect. ($17,335). Web-based Document Management System: Funding was provided to continue to provide a web-based document management system to better enable the handling of thousands of recreational... program strategy support to the nation-wide RBS effort. The goal is to coordinate the RBS outreach...

  15. 12 CFR 611.1216 - Public availability of documents related to the termination.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... termination. 611.1216 Section 611.1216 Banks and Banking FARM CREDIT ADMINISTRATION FARM CREDIT SYSTEM ORGANIZATION Termination of System Institution Status § 611.1216 Public availability of documents related to the termination. (a) We may post on our Web site, or require you to post on your Web site: (1) Results...

  16. World-Wide Web: The Information Universe.

    ERIC Educational Resources Information Center

    Berners-Lee, Tim; And Others

    1992-01-01

    Describes the World-Wide Web (W3) project, which is designed to create a global information universe using techniques of hypertext, information retrieval, and wide area networking. Discussion covers the W3 data model, W3 architecture, the document naming scheme, protocols, document formats, comparison with other systems, experience with the W3…

  17. Experimenting with semantic web services to understand the role of NLP technologies in healthcare.

    PubMed

    Jagannathan, V

    2006-01-01

    NLP technologies can play a significant role in healthcare where a predominant segment of the clinical documentation is in text form. In a graduate course focused on understanding semantic web services at West Virginia University, a class project was designed with the purpose of exploring potential use for NLP-based abstraction of clinical documentation. The role of NLP-technology was simulated using human abstractors and various workflows were investigated using public domain workflow and semantic web service technologies. This poster explores the potential use of NLP and the role of workflow and semantic web technologies in developing healthcare IT environments.

  18. Semantic annotation of Web data applied to risk in food.

    PubMed

    Hignette, Gaëlle; Buche, Patrice; Couvert, Olivier; Dibie-Barthélemy, Juliette; Doussot, David; Haemmerlé, Ollivier; Mettler, Eric; Soler, Lydie

    2008-11-30

    A preliminary step in the assessment of risk in food is the gathering of experimental data. In the framework of the Sym'Previus project (http://www.symprevius.org), a complete data integration system has been designed, grouping data provided by industrial partners and data extracted from papers published in the main scientific journals of the domain. Those data have been classified by means of a predefined vocabulary, called an ontology. Our aim is to complement the database with data extracted from the Web. In the framework of the WebContent project (www.webcontent.fr), we have designed a semi-automatic acquisition tool, called @WEB, which retrieves scientific documents from the Web. During the @WEB process, data tables are extracted from the documents and then annotated with the ontology. We focus on the data tables as they contain, in general, a synthesis of data published in the documents. In this paper, we explain how the columns of the data tables are automatically annotated with data types of the ontology and how the relations represented by the table are recognised. We also give the results of our experimentation to assess the quality of such an annotation.

  19. Polar Domain Discovery with Sparkler

    NASA Astrophysics Data System (ADS)

    Duerr, R.; Khalsa, S. J. S.; Mattmann, C. A.; Ottilingam, N. K.; Singh, K.; Lopez, L. A.

    2017-12-01

    The scientific web is vast and ever growing. It encompasses millions of textual, scientific and multimedia documents describing research in a multitude of scientific streams. Most of these documents are hidden behind forms which require user action to retrieve, and thus they can't be directly accessed by content crawlers. These documents are hosted on web servers across the world, most often on outdated hardware and network infrastructure. Hence it is difficult and time-consuming to aggregate documents from the scientific web, especially those relevant to a specific domain, and generating meaningful domain-specific insights is correspondingly difficult. We present an automated discovery system (Figure 1) using Sparkler, an open-source, extensible, horizontally scalable crawler which facilitates high-throughput, focused crawling of documents pertinent to a particular domain, such as information about polar regions. With this set of highly domain-relevant documents, we show that it is possible to answer analytical questions about that domain. Our domain discovery algorithm leverages prior domain knowledge to reach out to commercial and scientific search engines to generate seed URLs. Subject matter experts then annotate these seed URLs manually on a scale from highly relevant to irrelevant. We leverage this annotated dataset to train a machine learning model which predicts the 'domain relevance' of a given document. We extend Sparkler with this model to focus crawling on documents relevant to that domain. Sparkler avoids disruption of service by 1) partitioning URLs by hostname, such that every node gets a different host to crawl, and 2) inserting delays between subsequent requests. Using Wrangler, an NSF-funded supercomputer, we scaled our domain discovery pipeline to crawl about 200k polar-specific documents from the scientific web within a day.

  20. The ABCs of PDFs.

    ERIC Educational Resources Information Center

    Adler, Steve

    2000-01-01

    Explains the use of Adobe Acrobat's Portable Document Format (PDF) for school Web sites and Intranets. Explains the PDF workflow; components for Web-based PDF delivery, including the Web server, preparing content of the PDF files, and the browser; incorporating PDFs into the Web site; incorporating multimedia; and software. (LRW)

  1. Evolution of the cosmic web

    NASA Astrophysics Data System (ADS)

    Cautun, Marius; van de Weygaert, Rien; Jones, Bernard J. T.; Frenk, Carlos S.

    2014-07-01

    The cosmic web is the largest scale manifestation of the anisotropic gravitational collapse of matter. It represents the transitional stage between linear and non-linear structures and contains easily accessible information about the early phases of structure formation processes. Here we investigate the characteristics and the time evolution of morphological components. Our analysis involves the application of the NEXUS Multiscale Morphology Filter technique, predominantly its NEXUS+ version, to high resolution and large volume cosmological simulations. We quantify the cosmic web components in terms of their mass and volume content, their density distribution and halo populations. We employ new analysis techniques to determine the spatial extent of filaments and sheets, like their total length and local width. This analysis identifies clusters and filaments as the most prominent components of the web. In contrast, while voids and sheets take most of the volume, they correspond to underdense environments and are devoid of group-sized and more massive haloes. At early times the cosmos is dominated by tenuous filaments and sheets, which, during subsequent evolution, merge together, such that the present-day web is dominated by fewer, but much more massive, structures. The analysis of the mass transport between environments clearly shows how matter flows from voids into walls, and then via filaments into cluster regions, which form the nodes of the cosmic web. We also study the properties of individual filamentary branches, to find long, almost straight, filaments extending to distances larger than 100 h-1 Mpc. These constitute the bridges between massive clusters, which seem to form along approximately straight lines.

  2. Using Cluster Analysis for Data Mining in Educational Technology Research

    ERIC Educational Resources Information Center

    Antonenko, Pavlo D.; Toy, Serkan; Niederhauser, Dale S.

    2012-01-01

    Cluster analysis is a group of statistical methods that has great potential for analyzing the vast amounts of web server-log data to understand student learning from hyperlinked information resources. In this methodological paper we provide an introduction to cluster analysis for educational technology researchers and illustrate its use through…

  3. Clustering header categories extracted from web tables

    NASA Astrophysics Data System (ADS)

    Nagy, George; Embley, David W.; Krishnamoorthy, Mukkai; Seth, Sharad

    2015-01-01

    Revealing related content among heterogeneous web tables is part of our long term objective of formulating queries over multiple sources of information. Two hundred HTML tables from institutional web sites are segmented and each table cell is classified according to the fundamental indexing property of row and column headers. The categories that correspond to the multi-dimensional data cube view of a table are extracted by factoring the (often multi-row/column) headers. To reveal commonalities between tables from diverse sources, the Jaccard distances between pairs of category headers (and also table titles) are computed. We show how about one third of our heterogeneous collection can be clustered into a dozen groups that exhibit table-title and header similarities that can be exploited for queries.
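
    The pairwise comparison step reduces to Jaccard distances between term sets. A minimal sketch, assuming category headers have already been normalized into sets of terms:

    ```python
    # Jaccard distance between two header-category term sets.
    def jaccard_distance(a: set, b: set) -> float:
        if not a and not b:
            return 0.0
        return 1.0 - len(a & b) / len(a | b)

    t1 = {"year", "enrollment", "tuition"}
    t2 = {"year", "enrollment", "fees"}
    print(jaccard_distance(t1, t2))  # 0.5 -> the tables share half their header terms
    ```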

  4. Bibliometrics of the World Wide Web: An Exploratory Analysis of the Intellectual Structure of Cyberspace.

    ERIC Educational Resources Information Center

    Larson, Ray R.

    1996-01-01

    Examines the bibliometrics of the World Wide Web based on analysis of Web pages collected by the Inktomi "Web Crawler" and on the use of the DEC AltaVista search engine for cocitation analysis of a set of Earth Science related Web sites. Looks at the statistical characteristics of Web documents and their hypertext links, and the…

  5. The Implementation of Cosine Similarity to Calculate Text Relevance between Two Documents

    NASA Astrophysics Data System (ADS)

    Gunawan, D.; Sembiring, C. A.; Budiman, M. A.

    2018-03-01

    The rapidly increasing number of web pages and documents calls for topic-specific filtering so that relevant documents can be found efficiently. This preliminary research uses cosine similarity to compute text relevance in order to find topic-specific documents. The research is divided into three parts. The first part is text preprocessing: punctuation is removed from a document, the document is converted to lower case, stop words are removed, and root words are extracted using the Porter stemming algorithm. The second part is keyword weighting, whose output is used by the third part, the text relevance calculation. The text relevance calculation yields a value between 0 and 1; the closer the value is to 1, the more related the two documents are, and vice versa.
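
    A compact sketch of the three parts described above, using NLTK's Porter stemmer and raw term frequencies as the keyword weights; the stop-word list is abbreviated and the weighting scheme is an assumption, since the record does not fix one.

    ```python
    # Sketch: preprocessing, term weighting, cosine similarity.
    import math, re
    from collections import Counter
    from nltk.stem import PorterStemmer

    STOP = {"the", "a", "an", "is", "are", "of", "to", "and", "in"}  # abbreviated list
    stem = PorterStemmer().stem

    def preprocess(text):
        words = re.sub(r"[^\w\s]", " ", text).lower().split()  # strip punctuation, lowercase
        return [stem(w) for w in words if w not in STOP]       # stop-word removal + stemming

    def cosine(doc1, doc2):
        w1, w2 = Counter(preprocess(doc1)), Counter(preprocess(doc2))
        dot = sum(w1[t] * w2[t] for t in w1)
        norm = math.sqrt(sum(v * v for v in w1.values())) \
             * math.sqrt(sum(v * v for v in w2.values()))
        return dot / norm if norm else 0.0  # 1 = identical term profiles, 0 = unrelated
    ```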

  6. The Use of Supporting Documentation for Information Architecture by Australian Libraries

    ERIC Educational Resources Information Center

    Hider, Philip; Burford, Sally; Ferguson, Stuart

    2009-01-01

    This article reports the results of an online survey that examined the development of information architecture of Australian library Web sites with reference to documented methods and guidelines. A broad sample of library Web managers responded from across the academic, public, and special sectors. A majority of libraries used either in-house or…

  7. Publishing Accessible Materials on the Web and CD-ROM.

    ERIC Educational Resources Information Center

    Federal Resource Center for Special Education, Washington, DC.

    While it is generally simple to make electronic content accessible, it is also easy inadvertently to make it inaccessible. This guide covers the many formats of electronic documents and points out what to keep in mind and what procedures to follow to make documents accessible to all when disseminating information via the World Wide Web and on…

  8. 77 FR 50541 - STP Nuclear Operating Company, South Texas Project, Units 1 and 2; Application for Amendment to...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-08-21

    ... any of the following methods: Federal Rulemaking Web Site: Go to http://www.regulations.gov and search.../reading-rm/adams.html . To begin the search, select ``ADAMS Public Documents'' and then select ``Begin Web- based ADAMS Search.'' For problems with ADAMS, please contact the NRC's Public Document Room (PDR...

  9. eCDRweb User Guide–Primary Support

    EPA Pesticide Factsheets

    This document presents the user guide for the Office of Pollution Prevention and Toxics’ (OPPT) e-CDR web tool. E-CDRweb is the electronic, web-based tool provided by the Environmental Protection Agency (EPA) for the submission of Chemical Data Reporting (CDR) information. This document is the user guide for the Primary Support user of the e-CDRweb tool.

  10. 75 FR 816 - Identification of Additional Classes of Facilities for Development of Financial Responsibility...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-01-06

    ... protected through www.regulations.gov or e-mail. The www.regulations.gov Web site is an ``anonymous access... Can I Get Copies of This Document and Other Related Information? This Federal Register notice and.... EPA-HQ-SFUND-2009-0834. All documents in the docket are listed on the http://www.regulations.gov Web...

  11. 77 FR 22847 - National Emission Standards for Hazardous Air Pollutants for Polyvinyl Chloride and Copolymers...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-04-17

    .... EPA-HQ-OAR-2002-0037. All documents in the docket are listed on the http://www.regulations.gov Web... voluntary consensus standards VOC volatile organic compound WWW World Wide Web Organization of This Document. The following outline is provided to aid in locating information in this preamble. I. General...

  12. 78 FR 3470 - DTE Electric Company (Formerly the Detroit Edison Company), Notice of Availability of Final...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-01-16

    ...--specific Web page at http://www.nrc.gov/reactors/new-reactors/col/fermi.html . The Ellis Library and... possesses and are publicly-available, using any of the following methods: Federal Rulemaking Web site: Go to... Documents Access and Management System (ADAMS): You may access publicly-available documents online in the...

  13. eCDRweb User Guide–Secondary Support

    EPA Pesticide Factsheets

    This document presents the user guide for the Office of Pollution Prevention and Toxics’ (OPPT) e-CDR web tool. E-CDRweb is the electronic, web-based tool provided by the Environmental Protection Agency (EPA) for the submission of Chemical Data Reporting (CDR) information. This document is the user guide for the Secondary Support user of the e-CDRweb tool.

  14. EPA Web Training Classes

    EPA Pesticide Factsheets

    Scheduled webinars can help you better manage EPA web content. Class topics include Drupal basics, creating different types of pages in the WebCMS such as document pages and forms, using Google Analytics, and best practices for metadata and accessibility.

  15. Code AI Personal Web Pages

    NASA Technical Reports Server (NTRS)

    Garcia, Joseph A.; Smith, Charles A. (Technical Monitor)

    1998-01-01

    The document consists of a publicly available web site (george.arc.nasa.gov) for Joseph A. Garcia's personal web pages in the AI division. Only general information will be posted and no technical material. All the information is unclassified.

  16. What Constitutes Adoption of the Web: A Methodological Problem in Assessing Adoption of the World Wide Web for Electronic Commerce.

    ERIC Educational Resources Information Center

    White, Marilyn Domas; Abels, Eileen G.; Gordon-Murnane, Laura

    1998-01-01

    Reports on methodological developments in a project to assess the adoption of the Web by publishers of business information for electronic commerce. Describes the approach used on a sample of 20 business publishers to identify five clusters of publishers ranging from traditionalist to innovator. Distinguishes between adopters and nonadopters of…

  17. An Efficient Approach for Web Indexing of Big Data through Hyperlinks in Web Crawling.

    PubMed

    Devi, R Suganya; Manjula, D; Siddharth, R K

    2015-01-01

    Web crawling has acquired tremendous significance in recent times, and it is aptly associated with the substantial development of the World Wide Web. Web search engines face new challenges due to the availability of vast amounts of web documents, which makes the retrieved results less applicable to analysers. Recently, however, web crawling has focused solely on obtaining the links of the corresponding documents. Today, there exist various algorithms and software which are used to crawl links from the web; these links then have to be further processed for future use, thereby increasing the overload on the analyser. This paper concentrates on crawling the links and retrieving all information associated with them to facilitate easy processing for other uses. In this paper, the links are first crawled from the specified uniform resource locator (URL) using a modified version of the depth first search algorithm, which allows for complete hierarchical scanning of corresponding web links. The links are then accessed via the source code and their metadata such as title, keywords, and description are extracted. This content is essential for any type of analyser work to be carried out on the Big Data obtained as a result of web crawling.
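
    A hedged sketch of the two steps described above: depth-first crawling of links from a seed URL, then extraction of title, keywords, and description from each page's source. It uses requests and BeautifulSoup for illustration and omits the politeness controls (robots.txt, rate limiting) that a real crawler needs.

    ```python
    # Depth-first crawl collecting title and meta keywords/description.
    import requests
    from bs4 import BeautifulSoup
    from urllib.parse import urljoin

    def crawl(url, max_depth=2, visited=None, records=None):
        visited = visited if visited is not None else set()
        records = records if records is not None else []
        if max_depth < 0 or url in visited:
            return records
        visited.add(url)
        try:
            soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
        except requests.RequestException:
            return records
        meta = lambda n: (soup.find("meta", attrs={"name": n}) or {}).get("content", "")
        records.append({"url": url,
                        "title": (soup.title.string or "") if soup.title else "",
                        "keywords": meta("keywords"),
                        "description": meta("description")})
        for a in soup.find_all("a", href=True):   # depth-first descent into links
            crawl(urljoin(url, a["href"]), max_depth - 1, visited, records)
        return records
    ```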

  18. An Efficient Approach for Web Indexing of Big Data through Hyperlinks in Web Crawling

    PubMed Central

    Devi, R. Suganya; Manjula, D.; Siddharth, R. K.

    2015-01-01

    Web crawling has acquired tremendous significance in recent times, and it is aptly associated with the substantial development of the World Wide Web. Web search engines face new challenges due to the availability of vast amounts of web documents, which makes the retrieved results less applicable to analysers. Recently, however, web crawling has focused solely on obtaining the links of the corresponding documents. Today, there exist various algorithms and software which are used to crawl links from the web; these links then have to be further processed for future use, thereby increasing the overload on the analyser. This paper concentrates on crawling the links and retrieving all information associated with them to facilitate easy processing for other uses. In this paper, the links are first crawled from the specified uniform resource locator (URL) using a modified version of the depth first search algorithm, which allows for complete hierarchical scanning of corresponding web links. The links are then accessed via the source code and their metadata such as title, keywords, and description are extracted. This content is essential for any type of analyser work to be carried out on the Big Data obtained as a result of web crawling. PMID:26137592

  19. Towards semantically sensitive text clustering: a feature space modeling technology based on dimension extension.

    PubMed

    Liu, Yuanchao; Liu, Ming; Wang, Xin

    2015-01-01

    The objective of text clustering is to divide document collections into clusters based on the similarity between documents. In this paper, an extension-based feature modeling approach towards semantically sensitive text clustering is proposed along with the corresponding feature space construction and similarity computation method. By combining the similarity in traditional feature space and that in extension space, the adverse effects of the complexity and diversity of natural language can be addressed and clustering semantic sensitivity can be improved correspondingly. The generated clusters can be organized using different granularities. The experimental evaluations on well-known clustering algorithms and datasets have verified the effectiveness of our approach.
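
    The key computation, combining similarity in the traditional feature space with similarity in the extension space, can be sketched as a weighted sum of two cosine similarities. The weight lam and the origin of the extension vectors (e.g., thesaurus-expanded terms) are assumptions, not the authors' exact formulation.

    ```python
    # Schematic combined similarity over traditional and extension spaces.
    import numpy as np

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

    def combined_similarity(trad1, trad2, ext1, ext2, lam=0.7):
        # lam weights the traditional term space; (1 - lam) the extension space
        return lam * cosine(trad1, trad2) + (1 - lam) * cosine(ext1, ext2)
    ```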

  20. Towards Semantically Sensitive Text Clustering: A Feature Space Modeling Technology Based on Dimension Extension

    PubMed Central

    Liu, Yuanchao; Liu, Ming; Wang, Xin

    2015-01-01

    The objective of text clustering is to divide document collections into clusters based on the similarity between documents. In this paper, an extension-based feature modeling approach towards semantically sensitive text clustering is proposed along with the corresponding feature space construction and similarity computation method. By combining the similarity in traditional feature space and that in extension space, the adverse effects of the complexity and diversity of natural language can be addressed and clustering semantic sensitivity can be improved correspondingly. The generated clusters can be organized using different granularities. The experimental evaluations on well-known clustering algorithms and datasets have verified the effectiveness of our approach. PMID:25794172

  1. Autocorrelation and Regularization of Query-Based Information Retrieval Scores

    DTIC Science & Technology

    2008-02-01

    of the most general information retrieval models [Salton, 1968]. By treating a query as a very short document, documents and queries can be rep... [Salton, 1971]. In the context of single link hierarchical clustering, Jardine and van Rijsbergen showed that ranking all k clusters and retrieving a... a document about “dogs”, then the system will always miss this document when a user queries “dog”. Salton recognized that a document’s representation
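
    The idea attributed to Salton in these fragments, treating a query as a very short document and ranking documents by their similarity to it, can be sketched with TF-IDF vectors; the toy corpus below is illustrative.

```python
# Salton's query-as-short-document idea in a minimal modern sketch.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = ["dogs are loyal pets", "cats sleep all day", "training dogs takes time"]
vectorizer = TfidfVectorizer()
doc_vecs = vectorizer.fit_transform(docs)
query_vec = vectorizer.transform(["dog training"])   # the query as a short document
scores = cosine_similarity(query_vec, doc_vecs).ravel()
ranking = scores.argsort()[::-1]                     # best-matching documents first
# note: without stemming, "dog" does not match "dogs" -- the very failure mode
# the fragment above describes.
```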

  2. Characteristics of Food Industry Web Sites and "Advergames" Targeting Children

    ERIC Educational Resources Information Center

    Culp, Jennifer; Bell, Robert A.; Cassady, Diana

    2010-01-01

    Objective: To assess the content of food industry Web sites targeting children by describing strategies used to prolong their visits and foster brand loyalty; and to document health-promoting messages on these Web sites. Design: A content analysis was conducted of Web sites advertised on 2 children's networks, Cartoon Network and Nickelodeon. A…

  3. Environmental Models as a Service: Enabling Interoperability ...

    EPA Pesticide Factsheets

    Achieving interoperability in environmental modeling has evolved as software technology has progressed. The recent rise of cloud computing and proliferation of web services initiated a new stage for creating interoperable systems. Scientific programmers increasingly take advantage of streamlined deployment processes and affordable cloud access to move algorithms and data to the web for discoverability and consumption. In these deployments, environmental models can become available to end users through RESTful web services and consistent application program interfaces (APIs) that consume, manipulate, and store modeling data. RESTful modeling APIs also promote discoverability and guide usability through self-documentation. Embracing the RESTful paradigm allows models to be accessible via a web standard, and the resulting endpoints are platform- and implementation-agnostic while simultaneously presenting significant computational capabilities for spatial and temporal scaling. RESTful APIs present data in a simple verb-noun web request interface: the verb dictates how a resource is consumed using HTTP methods (e.g., GET, POST, and PUT) and the noun represents the URL reference of the resource on which the verb will act. The RESTful API can self-document in both the HTTP response and an interactive web page using the Open API standard. This lets models function as an interoperable service that promotes sharing, documentation, and discoverability. Here, we discuss the
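
    The verb-noun request pattern described here can be sketched in a few lines; the endpoint URL, payload, and response shape below are hypothetical, not an actual EPA service.

```python
# A sketch of the RESTful verb-noun pattern with the requests library.
# BASE, the payload, and the {"id": ...} response shape are assumptions.
import requests

BASE = "https://example.org/api/models/runoff"      # hypothetical resource (noun)

run = requests.post(BASE, json={"precip_mm": 12.5}) # POST: create a model run
run_id = run.json()["id"]                           # assumes the service returns an id
state = requests.get(f"{BASE}/{run_id}")            # GET: read the stored result
print(state.json())
```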

  4. Determining the trophic guilds of fishes and macroinvertebrates in a seagrass food web

    USGS Publications Warehouse

    Luczkovich, J.J.; Ward, G.P.; Johnson, J.C.; Christian, R.R.; Baird, D.; Neckles, H.; Rizzo, W.M.

    2002-01-01

    We established trophic guilds of macroinvertebrate and fish taxa using correspondence analysis and a hierarchical clustering strategy for a seagrass food web in winter in the northeastern Gulf of Mexico. To create the diet matrix, we characterized the trophic linkages of macroinvertebrate and fish taxa present in Halodule wrightii seagrass habitat areas within the St. Marks National Wildlife Refuge (Florida) using binary data, combining dietary links obtained from relevant literature for macroinvertebrates with stomach analysis of common fishes collected during January and February of 1994. Hierarchical average-linkage cluster analysis of the 73 taxa of fishes and macroinvertebrates in the diet matrix yielded 14 clusters with diet similarity ≥ 0.60. We then used correspondence analysis with three factors to jointly plot the coordinates of the consumers (identified by cluster membership) and of the 33 food sources. Correspondence analysis served as a visualization tool for assigning each taxon to one of eight trophic guilds: herbivores, detritivores, suspension feeders, omnivores, molluscivores, meiobenthos consumers, macrobenthos consumers, and piscivores. These trophic groups, cross-classified with major taxonomic groups, were further used to develop consumer compartments in a network analysis model of carbon flow in this seagrass ecosystem. The method presented here should greatly improve the development of future network models of food webs by providing an objective procedure for aggregating trophic groups.
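
    A sketch of the clustering step, assuming a binary taxon-by-food-source matrix: average-linkage clustering on Jaccard distances, cut where diet similarity ≥ 0.60 (distance ≤ 0.40). The random matrix below stands in for the study's 73-taxon diet matrix.

```python
# Average-linkage clustering of a binary diet matrix, cut at Jaccard
# distance 0.40 (diet similarity >= 0.60). Data are random stand-ins.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
diet = rng.integers(0, 2, size=(73, 33)).astype(bool)  # taxa x food sources
dist = pdist(diet, metric="jaccard")                   # 1 - Jaccard similarity
Z = linkage(dist, method="average")
guilds = fcluster(Z, t=0.40, criterion="distance")     # similarity >= 0.60
```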

  5. 37 CFR 2.190 - Addresses for trademark correspondence with the United States Patent and Trademark Office.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... filed through the Office's web site, at http://www.uspto.gov. Paper documents and cover sheets to be... trademark documents can be ordered through the Office's web site at www.uspto.gov. Paper requests for...: Madrid Processing Unit, 600 Dulany Street, MDE-7B87, Alexandria, VA 22314-5793. [68 FR 48289, Aug. 13...

  6. e-CDRweb User Guide – Secondary Authorized Official

    EPA Pesticide Factsheets

    This document presents the user guide for the Office of Pollution Prevention and Toxics’ (OPPT) e-CDRweb tool. E-CDRweb is the electronic, web-based tool provided by the Environmental Protection Agency (EPA) for the submission of Chemical Data Reporting (CDR) information. This document is the user guide for the Secondary Authorized Official (AO) user of the e-CDR web tool.

  7. 77 FR 70449 - Medical Device User Fee and Modernization Act; Notice to Public of Web Site Location of Fiscal...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-11-26

    ... of guidance documents that the Center for Devices and Radiological Health (CDRH) is intending to... notice announces the Web site location of the two lists of guidance documents which CDRH is intending to... list. FDA and CDRH priorities are subject to change at any time. Topics on this and past guidance...

  8. 76 FR 61367 - Medical Device User Fee and Modernization Act; Notice to Public of Web Site Location of Fiscal...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-10-04

    ... the Agency will post a list of guidance documents the Center for Devices and Radiological Health (CDRH... guidance documents that CDRH is considering for development and providing stakeholders an opportunity to.... This notice announces the Web site location of the list of guidances on which CDRH is intending to work...

  9. Mental health first aid guidelines: an evaluation of impact following download from the World Wide Web.

    PubMed

    Hart, Laura M; Jorm, Anthony F; Paxton, Susan J; Cvetkovski, Stefan

    2012-11-01

    Mental health first aid guidelines provide the public with consensus-based information about how to assist someone who is developing a mental illness or experiencing a mental health crisis. The aim of the current study was to evaluate the usefulness and impact of the guidelines on web users who download them. Web users who downloaded the documents were invited to respond to an initial demographic questionnaire, then a follow up about how the documents had been used, their perceived usefulness, whether first-aid situations had been encountered and if these were influenced by the documents. Over 9.8 months, 706 web users responded to the initial questionnaire and 154 responded to the second. A majority reported downloading the document because their job involved contact with people with mental illness. Sixty-three web users reported providing first aid, 44 of whom reported that the person they were assisting had sought professional care as a result of their suggestion. Twenty-three web users reported seeking care themselves. A majority of those who provided first aid reported feeling that they had been successful in helping the person, that they had been able to assist in a way that was more knowledgeable, skilful and supportive, and that the guidelines had contributed to these outcomes. Information made freely available on the Internet, about how to provide mental health first aid to someone who is developing a mental health problem or experiencing a mental health crisis, is associated with more positive, empathic and successful helping behaviours. © 2012 Wiley Publishing Asia Pty Ltd.

  10. Web Annotation and Threaded Forum: How Did Learners Use the Two Environments in an Online Discussion?

    ERIC Educational Resources Information Center

    Sun, Yanyan; Gao, Fei

    2014-01-01

    Web annotation is a Web 2.0 technology that allows learners to work collaboratively on web pages or electronic documents. This study explored the use of Web annotation as an online discussion tool by comparing it to a traditional threaded discussion forum. Ten graduate students participated in the study. Participants had access to both a Web…

  11. Development of Innovative Design Processor

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Park, Y.S.; Park, C.O.

    2004-07-01

    The nuclear design analysis requires time-consuming and error-prone model-input preparation, code run, output analysis and quality assurance process. To reduce human effort and improve design quality and productivity, Innovative Design Processor (IDP) is being developed. Two basic principles of IDP are the document-oriented design and the web-based design. The document-oriented design is that, if the designer writes a design document called active document and feeds it to a special program, the final document with complete analysis, table and plots is made automatically. The active documents can be written with ordinary HTML editors or created automatically on the web, which is another framework of IDP. Using the proper mix-up of server side and client side programming under the LAMP (Linux/Apache/MySQL/PHP) environment, the design process on the web is modeled as a design wizard style so that even a novice designer makes the design document easily. This automation using the IDP is now being implemented for all the reload design of Korea Standard Nuclear Power Plant (KSNP) type PWRs. The introduction of this process will allow large reduction in all reload design efforts of KSNP and provide a platform for design and R and D tasks of KNFC. (authors)

  12. T-RMSD: a web server for automated fine-grained protein structural classification.

    PubMed

    Magis, Cedrik; Di Tommaso, Paolo; Notredame, Cedric

    2013-07-01

    This article introduces the T-RMSD web server (tree-based on root-mean-square deviation), a service allowing the online computation of structure-based protein classification. It has been developed to address the relation between structural and functional similarity in proteins, and it allows a fine-grained structural clustering of a given protein family or group of structurally related proteins using distance RMSD (dRMSD) variations. These distances are computed between all pairs of equivalent residues, as defined by the ungapped columns within a given multiple sequence alignment. Using these generated distance matrices (one per equivalent position), T-RMSD produces a structural tree with support values for each cluster node, reminiscent of bootstrap values. These values, associated with the tree topology, allow a quantitative estimate of structural distances between proteins or group of proteins defined by the tree topology. The clusters thus defined have been shown to be structurally and functionally informative. The T-RMSD web server is a free website open to all users and available at http://tcoffee.crg.cat/apps/tcoffee/do:trmsd.

  13. T-RMSD: a web server for automated fine-grained protein structural classification

    PubMed Central

    Magis, Cedrik; Di Tommaso, Paolo; Notredame, Cedric

    2013-01-01

    This article introduces the T-RMSD web server (tree-based on root-mean-square deviation), a service allowing the online computation of structure-based protein classification. It has been developed to address the relation between structural and functional similarity in proteins, and it allows a fine-grained structural clustering of a given protein family or group of structurally related proteins using distance RMSD (dRMSD) variations. These distances are computed between all pairs of equivalent residues, as defined by the ungapped columns within a given multiple sequence alignment. Using these generated distance matrices (one per equivalent position), T-RMSD produces a structural tree with support values for each cluster node, reminiscent of bootstrap values. These values, associated with the tree topology, allow a quantitative estimate of structural distances between proteins or group of proteins defined by the tree topology. The clusters thus defined have been shown to be structurally and functionally informative. The T-RMSD web server is a free website open to all users and available at http://tcoffee.crg.cat/apps/tcoffee/do:trmsd. PMID:23716642
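
    The distance underlying T-RMSD can be illustrated with a small dRMSD function that compares the intramolecular distance matrices of two aligned structures. The coordinate arrays are stand-ins, and T-RMSD's full per-position matrix construction is not reproduced here.

```python
# A sketch of distance RMSD (dRMSD): compare all intramolecular pairwise
# distances over equivalent residues of two structures.
import numpy as np

def drmsd(xyz_a, xyz_b):
    """dRMSD over all residue pairs of two (n, 3) coordinate arrays."""
    da = np.linalg.norm(xyz_a[:, None] - xyz_a[None, :], axis=-1)
    db = np.linalg.norm(xyz_b[:, None] - xyz_b[None, :], axis=-1)
    iu = np.triu_indices(len(xyz_a), k=1)      # each pair counted once
    return float(np.sqrt(np.mean((da[iu] - db[iu]) ** 2)))
```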

  14. Eliciting end-user expectations to guide the implementation process of a new electronic health record: A case study using concept mapping.

    PubMed

    Joukes, Erik; Cornet, Ronald; de Bruijne, Martine C; de Keizer, Nicolette F

    2016-03-01

    To evaluate the usability of concept mapping to elicit the expectations of healthcare professionals regarding the implementation of a new electronic health record (EHR). These expectations need to be taken into account during the implementation process to maximize the chance of success of the EHR. Two university hospitals in Amsterdam, The Netherlands, in the preparation phase of jointly implementing a new EHR. During this study the hospitals had different methods of documenting patient information (legacy EHR vs. paper-based records). Concept mapping was used to determine and classify the expectations of healthcare professionals regarding the implementation of a new EHR. A multidisciplinary group of 46 healthcare professionals from both university hospitals participated in this study. Expectations were elicited in focus groups, and their relevance and feasibility were assessed through a web questionnaire. Nonmetric multidimensional scaling and clustering methods were used to identify clusters of expectations. We found nine clusters of expectations, each covering an important topic to enable the healthcare professionals to work properly with the new EHR once implemented: usability, data use and reuse, facility conditions, data registration, support, training, internal communication, patients, and collaboration. Average importance and feasibility of each of the clusters was high. Concept mapping is an effective method to find topics that, according to healthcare professionals, are important to consider during the implementation of a new EHR. The method helps to combine the input of a large group of stakeholders with limited effort. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
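
    The analysis pipeline reported here, nonmetric multidimensional scaling followed by clustering, can be sketched with scikit-learn. The random dissimilarity matrix and the choice of k-means with nine clusters are illustrative stand-ins for the study's concept-mapping data.

```python
# Nonmetric MDS on a precomputed dissimilarity matrix, then clustering the
# resulting 2-D coordinates into nine groups. Data are random stand-ins.
import numpy as np
from sklearn.manifold import MDS
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
d = rng.random((40, 40))
d = (d + d.T) / 2                 # symmetrize
np.fill_diagonal(d, 0)            # zero self-dissimilarity
coords = MDS(n_components=2, metric=False, dissimilarity="precomputed",
             random_state=1).fit_transform(d)
labels = KMeans(n_clusters=9, n_init=10, random_state=1).fit_predict(coords)
```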

  15. Effective electron-density map improvement and structure validation on a Linux multi-CPU web cluster: The TB Structural Genomics Consortium Bias Removal Web Service.

    PubMed

    Reddy, Vinod; Swanson, Stanley M; Segelke, Brent; Kantardjieff, Katherine A; Sacchettini, James C; Rupp, Bernhard

    2003-12-01

    Anticipating a continuing increase in the number of structures solved by molecular replacement in high-throughput crystallography and drug-discovery programs, a user-friendly web service for automated molecular replacement, map improvement, bias removal and real-space correlation structure validation has been implemented. The service is based on an efficient bias-removal protocol, Shake&wARP, and implemented using EPMR and the CCP4 suite of programs, combined with various shell scripts and Fortran90 routines. The service returns improved maps, converted data files and real-space correlation and B-factor plots. User data are uploaded through a web interface and the CPU-intensive iteration cycles are executed on a low-cost Linux multi-CPU cluster using the Condor job-queuing package. Examples of map improvement at various resolutions are provided and include model completion and reconstruction of absent parts, sequence correction, and ligand validation in drug-target structures.

  16. Web-based routing assistance tool to reduce pavement damage by overweight and oversize vehicles.

    DOT National Transportation Integrated Search

    2016-10-30

    This report documents the results of a completed project titled Web-Based Routing Assistance Tool to Reduce Pavement Damage by Overweight and Oversize Vehicles. The tasks involved developing a Web-based GIS routing assistance tool and evaluate ...

  17. 36 CFR 219.54 - Filing an objection.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... or regulation. (2) Forest Service Directive System documents and land management plans or other... the objection process. (b) Including documents by reference is not allowed, except for the following... relevant section of the cited document. All other documents or Web links to those documents, or both must...

  18. 36 CFR 219.54 - Filing an objection.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... or regulation. (2) Forest Service Directive System documents and land management plans or other... the objection process. (b) Including documents by reference is not allowed, except for the following... relevant section of the cited document. All other documents or Web links to those documents, or both must...

  19. Human rights abuses, transparency, impunity and the Web.

    PubMed

    Miles, Steven H

    2007-01-01

    This paper reviews how human rights advocates during the "war-on-terror" have found new ways to use the World Wide Web (Web) to combat human rights abuses. These include posting of human rights reports; creating large, open-access and updated archives of government documents and other data, tracking CIA rendition flights and maintaining blogs, e-zines, list-serves and news services that rapidly distribute information between journalists, scholars and human rights advocates. The Web is a powerful communication tool for human rights advocates. It is international, instantaneous, and accessible for uploading, archiving, locating and downloading information. For its human rights potential to be fully realized, international law must be strengthened to promote the declassification of government documents, as is done by various freedom of information acts. It is too early to assess the final impact of the Web on human rights abuses in the "war-on-terror". Wide dissemination of government documents and human rights advocates' reports has put the United States government on the defensive and some of its policies have changed in response to public pressure. Even so, the essential elements of secret prisons, detention without charges or trials, and illegal rendition remain intact.

  20. 32 CFR 701.119 - Privacy and the web.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 32 National Defense 5 2010-07-01 2010-07-01 false Privacy and the web. 701.119 Section 701.119... THE NAVY DOCUMENTS AFFECTING THE PUBLIC DON Privacy Program § 701.119 Privacy and the web. DON activities shall consult SECNAVINST 5720.47B for guidance on what may be posted on a Navy Web site. ...

  1. 75 FR 44020 - Biweekly Notice; Applications and Amendments to Facility Operating Licenses Involving No...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-07-27

    ... representative, already holds an NRC- issued digital ID certificate). Based upon this information, the Secretary... online, Web-based submission form. In order to serve documents through EIE, users will be required to install a Web browser plug-in from the NRC Web site. Further information on the Web- based submission form...

  2. User-Friendly Interface Developed for a Web-Based Service for SpaceCAL Emulations

    NASA Technical Reports Server (NTRS)

    Liszka, Kathy J.; Holtz, Allen P.

    2004-01-01

    A team at the NASA Glenn Research Center is developing a Space Communications Architecture Laboratory (SpaceCAL) for protocol development activities for coordinated satellite missions. SpaceCAL will provide a multiuser, distributed system to emulate space-based Internet architectures, backbone networks, formation clusters, and constellations. As part of a new effort in 2003, building blocks are being defined for an open distributed system to make the satellite emulation test bed accessible through an Internet connection. The first step in creating a Web-based service to control the emulation remotely is providing a user-friendly interface for encoding the data into a well-formed and complete Extensible Markup Language (XML) document. XML provides coding that allows data to be transferred between dissimilar systems. Scenario specifications include control parameters, network routes, interface bandwidths, delay, and bit error rate. Specifications for all satellites, instruments, and ground stations in a given scenario are also included in the XML document. For the SpaceCAL emulation, the XML document can be created using XForms, a Web-based forms language for data collection. In contrast to older forms technology, the interactive user interface makes the science prevalent, not the data representation. Required versus optional input fields, default values, automatic calculations, data validation, and reuse will help researchers quickly and accurately define missions. XForms can apply any XML schema defined for the test mission to validate data before forwarding it to the emulation facility. New instrument definitions, facilities, and mission types can be added to the existing schema. The first prototype user interface incorporates components for interactive input and form processing. Internet address, data rate, and the location of the facility are implemented with basic form controls with default values provided for convenience and efficiency using basic XForms operations. Because different emulation scenarios will vary widely in their component structure, more complex operations are used to add and delete facilities.
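
    The kind of XML scenario document the XForms interface produces can be sketched with the standard library; the element names and values below are assumptions, not the actual SpaceCAL schema.

```python
# A sketch of a scenario document; element names are hypothetical.
import xml.etree.ElementTree as ET

scenario = ET.Element("scenario", name="demo-constellation")
sat = ET.SubElement(scenario, "satellite", id="sat-1")
ET.SubElement(sat, "bandwidth_mbps").text = "10"
ET.SubElement(sat, "delay_ms").text = "250"
ET.SubElement(sat, "bit_error_rate").text = "1e-7"
ET.SubElement(scenario, "ground_station", id="gs-1")
print(ET.tostring(scenario, encoding="unicode"))
```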

  3. Comparing Web, Group and Telehealth Formats of a Military Parenting Program

    DTIC Science & Technology

    2017-06-01

    directed approaches. Comparative effectiveness will be tested by specifying a non-equivalence hypothesis for group-based and web-facilitated relative...Comparative effectiveness will be tested by specifying a non-equivalence hypothesis for group-based and individualized facilitated relative to self-directed...documents for review and approval. 1a. Finalize human subjects protocol and consent documents for pilot group (N=5 families), and randomized controlled

  4. Measuring the Success of the Academic Library Website Using Banner Advertisements and Web Conversion Rates: A Case Study

    ERIC Educational Resources Information Center

    Whang, Michael

    2007-01-01

    Measuring website success is critical not only to the web development process but also to demonstrate the value of library services to the institution. This article documents one library's approach to the measurement of website success. LibQUAL+[TM] results and strategic-planning documents indicated a need for a new type of measurement. The…

  5. Documentation systems for educators seeking academic promotion in U.S. medical schools.

    PubMed

    Simpson, Deborah; Hafler, Janet; Brown, Diane; Wilkerson, LuAnn

    2004-08-01

    To explore the state and use of teaching portfolios in promotion and tenure in U.S. medical schools. A two-phase qualitative study using a Web-based search procedure and telephone interviews was conducted. The first phase assessed the penetration of teaching portfolio-like systems in U.S. medical schools using a keyword search of medical school Web sites. The second phase examined the current use of teaching portfolios in 16 U.S. medical schools that reported their use in a survey in 1992. The individual designated as having primary responsibility for faculty appointments/promotions was contacted to participate in a 30-60 minute interview. The Phase 1 search of U.S. medical schools' Web sites revealed that 76 medical schools have Web-based access to information on documenting educational activities for promotion. A total of 16 of 17 medical schools responded to Phase 2. All 16 continued to use a portfolio-like system in 2003. Two documentation categories, honors/awards and philosophy/personal statement regarding education, were included by six more of these schools than used these categories in 1992. Dissemination of work to colleagues is now a key inclusion at 15 of the Phase 2 schools. The most common type of evidence used to document education was learner and/or peer ratings with infrequent use of outcome measures and internal/external review. The number of medical schools whose promotion packets include portfolio-like documentation associated with a faculty member's excellence in education has increased by more than 400% in just over ten years. Among early-responder schools the types of documentation categories have increased, but students' ratings of teaching remain the primary evidence used to document the quality or outcomes of the educational efforts reported.

  6. A novel architecture for information retrieval system based on semantic web

    NASA Astrophysics Data System (ADS)

    Zhang, Hui

    2011-12-01

    Nowadays, the web has enabled an explosive growth of information sharing (there are currently over 4 billion pages covering most areas of human endeavor), so that the web now faces a new challenge of information overload. The challenge before us is not only to help people locate relevant information precisely but also to access and aggregate a variety of information from different resources automatically. Current web documents are in human-oriented formats suitable for presentation, but machines cannot understand the meaning of the documents. To address this issue, Berners-Lee proposed the concept of the semantic web. With semantic web technology, web information can be understood and processed by machines, which provides new possibilities for automatic web information processing. A main problem of semantic web information retrieval is that when there is not enough knowledge in the retrieval system, the system returns a large number of irrelevant results to users. In this paper, we present the architecture of an information retrieval system based on the semantic web. In addition, our system employs an inference engine to check whether a query should be posed to the keyword-based search engine or to the semantic search engine.
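
    The routing decision at the heart of this architecture can be reduced to a toy sketch in which a simple lookup stands in for the inference engine; the ontology terms below are assumptions.

```python
# A toy sketch of the routing step: queries covered by the ontology go to the
# semantic engine, everything else to keyword search. Terms are illustrative.
ONTOLOGY_TERMS = {"protein", "gene", "pathway"}

def route(query: str) -> str:
    terms = set(query.lower().split())
    return "semantic" if terms & ONTOLOGY_TERMS else "keyword"

assert route("protein folding") == "semantic"
assert route("cheap flights") == "keyword"
```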

  7. Chemical markup, XML and the World-Wide Web. 3. Toward a signed semantic chemical web of trust.

    PubMed

    Gkoutos, G V; Murray-Rust, P; Rzepa, H S; Wright, M

    2001-01-01

    We describe how a collection of documents expressed in XML-conforming languages such as CML and XHTML can be authenticated and validated against digital signatures which make use of established X.509 certificate technology. These can be associated either with specific nodes in the XML document or with the entire document. We illustrate this with two examples. An entire journal article expressed in XML has its individual components digitally signed by separate authors, and the collection is placed in an envelope and again signed. The second example involves using a software robot agent to acquire a collection of documents from a specified URL, to perform various operations and transformations on the content, including expressing molecules in CML, and to automatically sign the various components and deposit the result in a repository. We argue that these operations can be used as components for building what we term an authenticated and semantic chemical web of trust.
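
    The signing step can be illustrated with a detached signature over document bytes using the cryptography library; a real deployment, as the record describes, would sign canonicalized XML nodes and attach X.509 certificates rather than raw bytes.

```python
# A minimal sketch of signing document bytes with an RSA key, standing in for
# the X.509-based XML signatures the paper uses.
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes

key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
doc = b"<molecule id='m1'>...</molecule>"
signature = key.sign(doc, padding.PKCS1v15(), hashes.SHA256())
# verification raises InvalidSignature if the document was tampered with
key.public_key().verify(signature, doc, padding.PKCS1v15(), hashes.SHA256())
```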

  8. 78 FR 5838 - NRC Enforcement Policy

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-01-28

    ... submit comments by any of the following methods: Federal Rulemaking Web site: Go to http://www... of the following methods: Federal Rulemaking Web site: Go to http://www.regulations.gov and search... the search, select ``ADAMS Public Documents'' and then select ``Begin Web-based ADAMS Search.'' For...

  9. A Web-Based Multidrug-Resistant Organisms Surveillance and Outbreak Detection System with Rule-Based Classification and Clustering

    PubMed Central

    Tseng, Yi-Ju; Wu, Jung-Hsuan; Ping, Xiao-Ou; Lin, Hui-Chi; Chen, Ying-Yu; Shang, Rung-Ji; Chen, Ming-Yuan; Lai, Feipei

    2012-01-01

    Background The emergence and spread of multidrug-resistant organisms (MDROs) are causing a global crisis. Combating antimicrobial resistance requires prevention of transmission of resistant organisms and improved use of antimicrobials. Objectives To develop a Web-based information system for automatic integration, analysis, and interpretation of the antimicrobial susceptibility of all clinical isolates that incorporates rule-based classification and cluster analysis of MDROs and implements control chart analysis to facilitate outbreak detection. Methods Electronic microbiological data from a 2200-bed teaching hospital in Taiwan were classified according to predefined criteria of MDROs. The numbers of organisms, patients, and incident patients in each MDRO pattern were presented graphically to describe spatial and time information in a Web-based user interface. Hierarchical clustering with 7 upper control limits (UCL) was used to detect suspicious outbreaks. The system’s performance in outbreak detection was evaluated based on vancomycin-resistant enterococcal outbreaks determined by a hospital-wide prospective active surveillance database compiled by infection control personnel. Results The optimal UCL for MDRO outbreak detection was the upper 90% confidence interval (CI) using germ criterion with clustering (area under ROC curve (AUC) 0.93, 95% CI 0.91 to 0.95), upper 85% CI using patient criterion (AUC 0.87, 95% CI 0.80 to 0.93), and one standard deviation using incident patient criterion (AUC 0.84, 95% CI 0.75 to 0.92). The performance indicators of each UCL were statistically significantly higher with clustering than those without clustering in germ criterion (P < .001), patient criterion (P = .04), and incident patient criterion (P < .001). Conclusion This system automatically identifies MDROs and accurately detects suspicious outbreaks of MDROs based on the antimicrobial susceptibility of all clinical isolates. PMID:23195868
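
    The control-chart logic can be sketched in a few lines: weekly counts are flagged when they exceed an upper control limit derived from the mean and standard deviation. The counts and the z value below are illustrative, not the study's data.

```python
# Control-chart outbreak flagging in miniature: flag weeks whose count
# exceeds mean + z * standard deviation. Values are illustrative.
import numpy as np

counts = np.array([3, 2, 4, 3, 2, 3, 9, 4])        # weekly isolate counts
z = 1.645                                          # roughly a one-sided 90% limit
ucl = counts.mean() + z * counts.std(ddof=1)       # upper control limit
alerts = np.where(counts > ucl)[0]                 # weeks flagged as suspicious
```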

  10. An open annotation ontology for science on web 3.0

    PubMed Central

    2011-01-01

    Background There is currently a gap between the rich and expressive collection of published biomedical ontologies, and the natural language expression of biomedical papers consumed on a daily basis by scientific researchers. The purpose of this paper is to provide an open, shareable structure for dynamic integration of biomedical domain ontologies with the scientific document, in the form of an Annotation Ontology (AO), thus closing this gap and enabling application of formal biomedical ontologies directly to the literature as it emerges. Methods Initial requirements for AO were elicited by analysis of integration needs between biomedical web communities, and of needs for representing and integrating results of biomedical text mining. Analysis of strengths and weaknesses of previous efforts in this area was also performed. A series of increasingly refined annotation tools were then developed along with a metadata model in OWL, and deployed to users at a major pharmaceutical company and a major academic center for feedback and additional requirements on the ontology. Further requirements and critiques of the model were also elicited through discussions with many colleagues and incorporated into the work. Results This paper presents Annotation Ontology (AO), an open ontology in OWL-DL for annotating scientific documents on the web. AO supports both human and algorithmic content annotation. It enables “stand-off” or independent metadata anchored to specific positions in a web document by any one of several methods. In AO, the document may be annotated but is not required to be under update control of the annotator. AO contains a provenance model to support versioning, and a set model for specifying groups and containers of annotation. AO is freely available under open source license at http://purl.org/ao/, and extensive documentation including screencasts is available on AO’s Google Code page: http://code.google.com/p/annotation-ontology/. Conclusions The Annotation Ontology meets critical requirements for an open, freely shareable model in OWL, of annotation metadata created against scientific documents on the Web. We believe AO can become a very useful common model for annotation metadata on Web documents, and will enable biomedical domain ontologies to be used quite widely to annotate the scientific literature. Potential collaborators and those with new relevant use cases are invited to contact the authors. PMID:21624159

  11. An open annotation ontology for science on web 3.0.

    PubMed

    Ciccarese, Paolo; Ocana, Marco; Garcia Castro, Leyla Jael; Das, Sudeshna; Clark, Tim

    2011-05-17

    There is currently a gap between the rich and expressive collection of published biomedical ontologies, and the natural language expression of biomedical papers consumed on a daily basis by scientific researchers. The purpose of this paper is to provide an open, shareable structure for dynamic integration of biomedical domain ontologies with the scientific document, in the form of an Annotation Ontology (AO), thus closing this gap and enabling application of formal biomedical ontologies directly to the literature as it emerges. Initial requirements for AO were elicited by analysis of integration needs between biomedical web communities, and of needs for representing and integrating results of biomedical text mining. Analysis of strengths and weaknesses of previous efforts in this area was also performed. A series of increasingly refined annotation tools were then developed along with a metadata model in OWL, and deployed to users at a major pharmaceutical company and a major academic center for feedback and additional requirements on the ontology. Further requirements and critiques of the model were also elicited through discussions with many colleagues and incorporated into the work. This paper presents Annotation Ontology (AO), an open ontology in OWL-DL for annotating scientific documents on the web. AO supports both human and algorithmic content annotation. It enables "stand-off" or independent metadata anchored to specific positions in a web document by any one of several methods. In AO, the document may be annotated but is not required to be under update control of the annotator. AO contains a provenance model to support versioning, and a set model for specifying groups and containers of annotation. AO is freely available under open source license at http://purl.org/ao/, and extensive documentation including screencasts is available on AO's Google Code page: http://code.google.com/p/annotation-ontology/. The Annotation Ontology meets critical requirements for an open, freely shareable model in OWL, of annotation metadata created against scientific documents on the Web. We believe AO can become a very useful common model for annotation metadata on Web documents, and will enable biomedical domain ontologies to be used quite widely to annotate the scientific literature. Potential collaborators and those with new relevant use cases are invited to contact the authors.
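
    The "stand-off" anchoring AO describes, metadata kept apart from the document and tied to positions within it, can be sketched as a small data structure; the field names below are illustrative, not AO's actual OWL terms.

```python
# A sketch of stand-off annotation: the annotation lives outside the document
# and anchors to it by position, so the target need not be under the
# annotator's update control. Field names are illustrative.
from dataclasses import dataclass

@dataclass
class StandoffAnnotation:
    document_url: str      # annotated document, left untouched
    start: int             # character offsets anchoring the selection
    end: int
    ontology_term: str     # e.g. a URI from a biomedical ontology
    annotator: str         # provenance: human or algorithm

ann = StandoffAnnotation("https://example.org/paper.html", 120, 127,
                         "http://purl.obolibrary.org/obo/PR_000000001",
                         "curator-1")
```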

  12. Hierarchic Agglomerative Clustering Methods for Automatic Document Classification.

    ERIC Educational Resources Information Center

    Griffiths, Alan; And Others

    1984-01-01

    Considers classifications produced by application of single linkage, complete linkage, group average, and word clustering methods to Keen and Cranfield document test collections, and studies structure of hierarchies produced, extent to which methods distort input similarity matrices during classification generation, and retrieval effectiveness…

  13. The Document Management Alliance.

    ERIC Educational Resources Information Center

    Fay, Chuck

    1998-01-01

    Describes the Document Management Alliance, a standards effort for document management systems that manages and tracks changes to electronic documents created and used by collaborative teams, provides secure access, and facilitates online information retrieval via the Internet and World Wide Web. Future directions are also discussed. (LRW)

  14. Monotone Increasing Binary Similarity and Its Application to Automatic Document-Acquisition of a Category

    NASA Astrophysics Data System (ADS)

    Suzuki, Izumi; Mikami, Yoshiki; Ohsato, Ario

    A technique that acquires documents in the same category as a given short text is introduced. Regarding the given text as a training document, the system marks up the most similar document, or sufficiently similar documents, from the document domain (or the entire Web). The system then adds the marked documents to the training set and learns from it, and this process is repeated until no more documents are marked. Imposing a monotone increasing property on the similarity as the system learns enables it to 1) detect the correct timing at which no more documents remain to be marked and 2) decide the threshold value that the classifier uses. In addition, under the condition that normalization is limited to dividing term weights by a p-norm of the weights, the linear classifier in which training documents are indexed in a binary manner is the only instance that satisfies the monotone increasing property. The feasibility of the proposed technique was confirmed through an examination of binary similarity using English and German documents randomly selected from the Web.
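
    The similarity family singled out by this result, binary-indexed term vectors normalized by a p-norm and compared by inner product, is easy to sketch; p is left as a free parameter here.

```python
# Inner product of p-norm-normalized binary term vectors; at p=2 this reduces
# to cosine similarity for binary vectors.
import numpy as np

def binary_pnorm_similarity(a, b, p=2.0):
    """Similarity of two binary term vectors under p-norm normalization."""
    a = a.astype(float) / (np.linalg.norm(a, ord=p) + 1e-12)
    b = b.astype(float) / (np.linalg.norm(b, ord=p) + 1e-12)
    return float(a @ b)

x = np.array([1, 0, 1, 1])
y = np.array([1, 1, 1, 0])
print(binary_pnorm_similarity(x, y))
```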

  15. SEEDisCs: How Clusters Form and Galaxies Transform in the Cosmic Web

    NASA Astrophysics Data System (ADS)

    Jablonka, P.

    2017-08-01

    This presentation introduces a new survey, the Spatial Extended EDisCS Survey (SEEDisCS), which aims at understanding how clusters assemble and the level at which galaxies are preprocessed before falling on the cluster cores. I focus on the changes in galaxy properties in the cluster large scale environments, and how we can get constraints on the timescale of star formation quenching. I also discuss new ALMA CO observations, which trace the fate of the galaxy cold gas content along the infalling paths towards the cluster cores.

  16. 12 CFR 611.1216 - Public availability of documents related to the termination.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... the termination. (a) We may post on our Web site, or require you to post on your Web site: (1) Results... related transactions. (b) We will not post confidential information on our Web site and will not require you to post it on your Web site. (c) You may request that we treat specific information as...

  17. 12 CFR 611.1216 - Public availability of documents related to the termination.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... the termination. (a) We may post on our Web site, or require you to post on your Web site: (1) Results... related transactions. (b) We will not post confidential information on our Web site and will not require you to post it on your Web site. (c) You may request that we treat specific information as...

  18. 12 CFR 611.1216 - Public availability of documents related to the termination.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... the termination. (a) We may post on our Web site, or require you to post on your Web site: (1) Results... related transactions. (b) We will not post confidential information on our Web site and will not require you to post it on your Web site. (c) You may request that we treat specific information as...

  19. 12 CFR 611.1216 - Public availability of documents related to the termination.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... the termination. (a) We may post on our Web site, or require you to post on your Web site: (1) Results... related transactions. (b) We will not post confidential information on our Web site and will not require you to post it on your Web site. (c) You may request that we treat specific information as...

  20. XML Content Finally Arrives on the Web!

    ERIC Educational Resources Information Center

    Funke, Susan

    1998-01-01

    Explains extensible markup language (XML) and how it differs from hypertext markup language (HTML) and standard generalized markup language (SGML). Highlights include features of XML, including better formatting of documents, better searching capabilities, multiple uses for hyperlinking, and an increase in Web applications; Web browsers; and what…

  1. Automating Information Discovery Within the Invisible Web

    NASA Astrophysics Data System (ADS)

    Sweeney, Edwina; Curran, Kevin; Xie, Ermai

    A Web crawler or spider crawls through the Web looking for pages to index, and when it locates a new page it passes the page on to an indexer. The indexer identifies links, keywords, and other content and stores these within its database. This database is searched by entering keywords through an interface and suitable Web pages are returned in a results page in the form of hyperlinks accompanied by short descriptions. The Web, however, is increasingly moving away from being a collection of documents to a multidimensional repository for sounds, images, audio, and other formats. This is leading to a situation where certain parts of the Web are invisible or hidden. The term known as the "Deep Web" has emerged to refer to the mass of information that can be accessed via the Web but cannot be indexed by conventional search engines. The concept of the Deep Web makes searches quite complex for search engines. Google states that the claim that conventional search engines cannot find such documents as PDFs, Word, PowerPoint, Excel, or any non-HTML page is not fully accurate and steps have been taken to address this problem by implementing procedures to search items such as academic publications, news, blogs, videos, books, and real-time information. However, Google still only provides access to a fraction of the Deep Web. This chapter explores the Deep Web and the current tools available in accessing it.

  2. Illinois Occupational Skill Standards: Telecommunications Technician Cluster.

    ERIC Educational Resources Information Center

    Illinois Occupational Skill Standards and Credentialing Council, Carbondale.

    This document, which is intended as a guide for workforce preparation program providers, details the Illinois Occupational Skill Standards for programs preparing students for employment in the telecommunications technician occupational cluster. The document begins with a brief overview of the Illinois perspective on occupational skills standards…

  3. MODPATH-LGR; documentation of a computer program for particle tracking in shared-node locally refined grids by using MODFLOW-LGR

    USGS Publications Warehouse

    Dickinson, Jesse; Hanson, R.T.; Mehl, Steffen W.; Hill, Mary C.

    2011-01-01

    The computer program described in this report, MODPATH-LGR, is designed to allow simulation of particle tracking in locally refined grids. The locally refined grids are simulated by using MODFLOW-LGR, which is based on MODFLOW-2005, the three-dimensional groundwater-flow model published by the U.S. Geological Survey. The documentation includes brief descriptions of the methods used and detailed descriptions of the required input files and how the output files are typically used. The code for this model is available for downloading from the World Wide Web from a U.S. Geological Survey software repository. The repository is accessible from the U.S. Geological Survey Water Resources Information Web page at http://water.usgs.gov/software/ground_water.html. The performance of the MODPATH-LGR program has been tested in a variety of applications. Future applications, however, might reveal errors that were not detected in the test simulations. Users are requested to notify the U.S. Geological Survey of any errors found in this document or the computer program by using the email address available on the Web site. Updates might occasionally be made to this document and to the MODPATH-LGR program, and users should check the Web site periodically.

  4. A Linear Algebra Measure of Cluster Quality.

    ERIC Educational Resources Information Center

    Mather, Laura A.

    2000-01-01

    Discussion of models for information retrieval focuses on an application of linear algebra to text clustering, namely, a metric for measuring cluster quality based on the theory that cluster quality is proportional to the number of terms that are disjoint across the clusters. Explains term-document matrices and clustering algorithms. (Author/LRW)
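
    The proposed metric can be sketched as the fraction of terms confined to a single cluster, computed from a binary term-document matrix and cluster labels. The implementation details below are assumptions based on this summary, not the paper's exact formulation.

```python
# A sketch of the disjoint-terms quality idea: quality rises with the share
# of terms that occur in exactly one cluster.
import numpy as np

def disjoint_term_fraction(td, labels):
    """td: binary term-document matrix (terms x docs); labels: doc cluster ids."""
    clusters = np.unique(labels)
    # per cluster, which terms occur at least once -> (clusters x terms)
    presence = np.stack([td[:, labels == c].any(axis=1) for c in clusters])
    counts = presence.sum(axis=0)          # clusters touched by each term
    return float((counts == 1).mean())

td = np.array([[1, 0, 1, 0],
               [0, 1, 0, 1],
               [1, 1, 0, 0]])              # 3 terms x 4 documents
labels = np.array([0, 1, 0, 1])            # cluster id per document
print(disjoint_term_fraction(td, labels))  # term 2 spans both clusters -> 2/3
```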

  5. Determining the trophic guilds of fishes and macroinvertebrates in a seagrass food web

    USGS Publications Warehouse

    Luczkovich, J.J.; Ward, G.P.; Johnson, J.C.; Christian, R.R.; Baird, D.; Neckles, H.; Rizzo, W.M.

    2002-01-01

    We established trophic guilds of macroinvertebrate and fish taxa using correspondence analysis and a hierarchical clustering strategy for a seagrass food web in winter in the northeastern Gulf of Mexico. To create the diet matrix, we characterized the trophic linkages of macroinvertebrate and fish taxa. present in Hatodule wrightii seagrass habitat areas within the St. Marks National Wildlife Refuge (Florida) using binary data, combining dietary links obtained from relevant literature for macroinvertebrates with stomach analysis of common fishes collected during January and February of 1994. Heirarchical average-linkage cluster analysis of the 73 taxa of fishes and macroinvertebrates in the diet matrix yielded 14 clusters with diet similarity greater than or equal to 0.60. We then used correspondence analysis with three factors to jointly plot the coordinates of the consumers (identified by cluster membership) and of the 33 food sources. Correspondence analysis served as a visualization tool for assigning each taxon to one of eight trophic guilds: herbivores, detritivores, suspension feeders, omnivores, molluscivores, meiobenthos consumers, macrobenthos consumers, and piscivores. These trophic groups, cross-classified with major taxonomic groups, were further used to develop consumer compartments in a network analysis model of carbon flow in this seagrass ecosystem. The method presented here should greatly improve the development of future network models of food webs by providing an objective procedure for aggregating trophic groups.

  6. Text Summarization Model based on Facility Location Problem

    NASA Astrophysics Data System (ADS)

    Takamura, Hiroya; Okumura, Manabu

    We propose a novel multi-document generic summarization model based on the budgeted median problem, which is a facility location problem. The summarization method based on our model is an extractive method, which selects sentences from the given document cluster and generates a summary. Each sentence in the document cluster will be assigned to one of the selected sentences, where the former sentence is supposed to be represented by the latter. Our method selects sentences to generate a summary that yields a good sentence assignment and hence covers the whole content of the document cluster. An advantage of this method is that it can incorporate asymmetric relations between sentences such as textual entailment. Through experiments, we showed that the proposed method yields good summaries on the dataset of DUC'04.
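
    A greedy sketch of the budgeted selection follows, under the assumption of a precomputed sentence-similarity matrix: sentences are added while the budget allows, each chosen to maximize the per-word gain in how well the whole cluster is represented. The paper's exact optimization is not reproduced.

```python
# Greedy coverage under a word budget: f(S) = sum over sentences j of the best
# similarity between j and any selected sentence. Inputs are assumed given.
import numpy as np

def greedy_summary(sim, lengths, budget):
    """sim: (n, n) sentence similarities; lengths: word counts; budget: max words."""
    n = len(lengths)
    selected, covered = [], np.zeros(n)    # best similarity to the summary so far
    while True:
        gains = [(np.maximum(covered, sim[i]).sum() - covered.sum()) / lengths[i]
                 if i not in selected
                 and lengths[i] + sum(lengths[j] for j in selected) <= budget
                 else -1.0
                 for i in range(n)]
        best = int(np.argmax(gains))
        if gains[best] <= 0:               # nothing feasible or no improvement
            break
        selected.append(best)
        covered = np.maximum(covered, sim[best])
    return selected

sim = np.array([[1.0, 0.6, 0.1, 0.2],
                [0.6, 1.0, 0.2, 0.3],
                [0.1, 0.2, 1.0, 0.7],
                [0.2, 0.3, 0.7, 1.0]])
print(greedy_summary(sim, lengths=[5, 6, 4, 7], budget=12))   # -> [2, 0]
```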

  7. InCHlib - interactive cluster heatmap for web applications.

    PubMed

    Skuta, Ctibor; Bartůněk, Petr; Svozil, Daniel

    2014-12-01

    Hierarchical clustering is an exploratory data analysis method that reveals the groups (clusters) of similar objects. The result of the hierarchical clustering is a tree structure called a dendrogram that shows the arrangement of individual clusters. To investigate the row/column hierarchical cluster structure of a data matrix, a visualization tool called 'cluster heatmap' is commonly employed. In the cluster heatmap, the data matrix is displayed as a heatmap, a 2-dimensional array in which the colour of each element corresponds to its value. The rows/columns of the matrix are ordered such that similar rows/columns are near each other. The ordering is given by the dendrogram which is displayed on the side of the heatmap. We developed InCHlib (Interactive Cluster Heatmap Library), a highly interactive and lightweight JavaScript library for cluster heatmap visualization and exploration. InCHlib enables the user to select individual or clustered heatmap rows, to zoom in and out of clusters or to flexibly modify heatmap appearance. The cluster heatmap can be augmented with additional metadata displayed in a different colour scale. In addition, to further enhance the visualization, the cluster heatmap can be interconnected with external data sources or analysis tools. Data clustering and the preparation of the input file for InCHlib is facilitated by the Python utility script inchlib_clust. The cluster heatmap is one of the most popular visualizations of large chemical and biomedical data sets originating, e.g., in high-throughput screening, genomics or transcriptomics experiments. The presented JavaScript library InCHlib is a client-side solution for cluster heatmap exploration. InCHlib can be easily deployed into any modern web application and configured to cooperate with external tools and data sources. Though InCHlib is primarily intended for the analysis of chemical or biological data, it is a versatile tool whose application domain is not limited to the life sciences.
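
    The preprocessing a cluster heatmap needs, hierarchical clustering of the data matrix and the row order implied by the dendrogram, can be sketched with SciPy; this stands in for the inchlib_clust script, whose exact interface is not shown in this record.

```python
# Hierarchical clustering and the heatmap row order implied by the dendrogram,
# sketched with SciPy (not the inchlib_clust API). Data are random stand-ins.
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

data = np.random.default_rng(2).random((20, 8))    # rows to display in the heatmap
Z = linkage(data, method="ward")
row_order = dendrogram(Z, no_plot=True)["leaves"]  # similar rows end up adjacent
heatmap_ready = data[row_order]
```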

  8. WikiHyperGlossary (WHG): an information literacy technology for chemistry documents.

    PubMed

    Bauer, Michael A; Berleant, Daniel; Cornell, Andrew P; Belford, Robert E

    2015-01-01

    The WikiHyperGlossary is an information literacy technology that was created to enhance reading comprehension of documents by connecting them to socially generated multimedia definitions as well as semantically relevant data. The WikiHyperGlossary enhances reading comprehension by using the lexicon of a discipline to generate dynamic links in a document to external resources that can provide implicit information the document did not explicitly provide. Currently, the most common method to acquire additional information when reading a document is to access a search engine and browse the web. This may lead to skimming of multiple documents with the novice actually never returning to the original document of interest. The WikiHyperGlossary automatically brings information to the user within the current document they are reading, enhancing the potential for deeper document understanding. The WikiHyperGlossary allows users to submit a web URL or text to be processed against a chosen lexicon, returning the document with tagged terms. The selection of a tagged term results in the appearance of the WikiHyperGlossary Portlet containing a definition, and depending on the type of word, tabs to additional information and resources. Current types of content include multimedia enhanced definitions, ChemSpider query results, 3D molecular structures, and 2D editable structures connected to ChemSpider queries. Existing glossaries can be bulk uploaded, locked for editing and associated with multiple socially generated definitions. The WikiHyperGlossary leverages both social and semantic web technologies to bring relevant information to a document. This not only aids reading comprehension but also increases the users' ability to obtain additional information within the document. We have demonstrated a molecular editor enabled knowledge framework that can result in a semantic web inductive reasoning process, and integration of the WikiHyperGlossary into other software technologies, like the Jikitou Biomedical Question and Answer system. Although this work was developed in the chemical sciences and took advantage of open science resources and initiatives, the technology is extensible to other knowledge domains. Through the DeepLit (Deeper Literacy: Connecting Documents to Data and Discourse) startup, we seek to extend WikiHyperGlossary technologies to other knowledge domains, and integrate them into other knowledge acquisition workflows.
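
    The tagging step can be sketched as a lexicon-driven markup pass over submitted text; the lexicon, tag format, and regex approach below are illustrative assumptions, not the WikiHyperGlossary internals.

```python
# A sketch of lexicon-driven term tagging: wrap each lexicon term found in the
# text in a span carrying a term identifier. Lexicon and markup are assumptions.
import re

LEXICON = {"benzene": "term-001", "ethanol": "term-002"}

def tag_terms(text: str) -> str:
    for term, term_id in LEXICON.items():
        text = re.sub(
            rf"\b{re.escape(term)}\b",
            lambda m, t=term_id: f'<span class="whg" data-term="{t}">{m.group(0)}</span>',
            text, flags=re.IGNORECASE)
    return text

print(tag_terms("Benzene dissolves in ethanol."))
```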

  9. Visualization of usability and functionality of a professional website through web-mining.

    PubMed

    Jones, Josette F; Mahoui, Malika; Gopa, Venkata Devi Pragna

    2007-10-11

    Functional interface design requires understanding of the information system structure and the user. Web logs record user interactions with the interface, and thus provide some insight into user search behavior and efficiency of the search process. The present study uses a data-mining approach with techniques such as association rules, clustering and classification, to visualize the usability and functionality of a digital library through in depth analyses of web logs.

  10. Cluster Analysis of Adolescent Blogs

    ERIC Educational Resources Information Center

    Liu, Eric Zhi-Feng; Lin, Chun-Hung; Chen, Feng-Yi; Peng, Ping-Chuan

    2012-01-01

    Emerging web applications and networking systems such as blogs have become popular, and they offer unique opportunities and environments for learners, especially for adolescent learners. This study attempts to explore the writing styles and genres used by adolescents in their blogs by employing content, factor, and cluster analyses. Factor…

  11. 39 CFR 3001.12 - Service of documents.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... or presiding officer has determined is unable to receive service through the Commission's Web site... presiding officer has determined is unable to receive service through the Commission Web site shall be by... service list for each current proceeding will be available on the Commission's Web site http://www.prc.gov...

  12. 32 CFR 701.119 - Privacy and the web.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 32 National Defense 5 2013-07-01 2013-07-01 false Privacy and the web. 701.119 Section 701.119 National Defense Department of Defense (Continued) DEPARTMENT OF THE NAVY UNITED STATES NAVY REGULATIONS... THE NAVY DOCUMENTS AFFECTING THE PUBLIC DON Privacy Program § 701.119 Privacy and the web. DON...

  13. 32 CFR 701.119 - Privacy and the web.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 32 National Defense 5 2011-07-01 2011-07-01 false Privacy and the web. 701.119 Section 701.119 National Defense Department of Defense (Continued) DEPARTMENT OF THE NAVY UNITED STATES NAVY REGULATIONS... THE NAVY DOCUMENTS AFFECTING THE PUBLIC DON Privacy Program § 701.119 Privacy and the web. DON...

  14. 32 CFR 701.119 - Privacy and the web.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 32 National Defense 5 2012-07-01 2012-07-01 false Privacy and the web. 701.119 Section 701.119 National Defense Department of Defense (Continued) DEPARTMENT OF THE NAVY UNITED STATES NAVY REGULATIONS... THE NAVY DOCUMENTS AFFECTING THE PUBLIC DON Privacy Program § 701.119 Privacy and the web. DON...

  15. 32 CFR 701.119 - Privacy and the web.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 32 National Defense 5 2014-07-01 2014-07-01 false Privacy and the web. 701.119 Section 701.119 National Defense Department of Defense (Continued) DEPARTMENT OF THE NAVY UNITED STATES NAVY REGULATIONS... THE NAVY DOCUMENTS AFFECTING THE PUBLIC DON Privacy Program § 701.119 Privacy and the web. DON...

  16. Avoiding Pornography Landmines while Traveling the Information Superhighway.

    ERIC Educational Resources Information Center

    Lehmann, Kay

    2002-01-01

    Discusses how to avoid pornographic sites when using the Internet in classrooms. Highlights include re-setting the Internet home page; putting appropriate links in a Word document; creating a Web page with appropriate links; downloading the content of a Web site; educating the students; and re-checking all Web addresses. (LRW)

  17. ICCE/ICCAI 2000 Full & Short Papers (Web-Based Learning).

    ERIC Educational Resources Information Center

    2000

    This document contains full and short papers on World Wide Web-based learning from ICCE/ICCAI 2000 (International Conference on Computers in Education/International Conference on Computer-Assisted Instruction). Topics covered include: design and development of CAL (Computer Assisted Learning) systems; design and development of WBI (Web-Based…

  18. Social Networking on the Semantic Web

    ERIC Educational Resources Information Center

    Finin, Tim; Ding, Li; Zhou, Lina; Joshi, Anupam

    2005-01-01

    Purpose: Aims to investigate the way that the semantic web is being used to represent and process social network information. Design/methodology/approach: The Swoogle semantic web search engine was used to construct several large data sets of Resource Description Framework (RDF) documents with social network information that were encoded using the…

  19. Illinois Occupational Skill Standards: Automotive Technician Cluster.

    ERIC Educational Resources Information Center

    Illinois Occupational Skill Standards and Credentialing Council, Carbondale.

    This document, which is intended as a guide for work force preparation program providers, details the Illinois occupational skill standards for programs preparing students for employment in occupations in the automotive technician cluster. The document begins with overviews of the Illinois perspective on occupational skill standards and…

  20. Illinois Occupational Skill Standards. Beef Production Cluster.

    ERIC Educational Resources Information Center

    Illinois Occupational Skill Standards and Credentialing Council, Carbondale.

    This document, which is intended as a guide for workforce preparation program providers, details the Illinois occupational skill standards for programs preparing students for employment in occupations in the beef production cluster. The document begins with a brief overview of the Illinois perspective on occupational skill standards and…

  1. What Can Pictures Tell Us About Web Pages? Improving Document Search Using Images.

    PubMed

    Rodriguez-Vaamonde, Sergio; Torresani, Lorenzo; Fitzgibbon, Andrew W

    2015-06-01

    Traditional Web search engines do not use the images in the HTML pages to find relevant documents for a given query. Instead, they typically operate by computing a measure of agreement between the keywords provided by the user and only the text portion of each page. In this paper we study whether the content of the pictures appearing in a Web page can be used to enrich the semantic description of an HTML document and consequently boost the performance of a keyword-based search engine. We present a Web-scalable system that exploits a pure text-based search engine to find an initial set of candidate documents for a given query. The candidate set is then reranked using visual information extracted from the images contained in the pages. The resulting system retains the computational efficiency of traditional text-based search engines with only a small additional storage cost needed to encode the visual information. We test our approach on one of the TREC Million Query Track benchmarks, where we show that the exploitation of visual content yields improved accuracy for two distinct text-based search engines, including the system with the best reported performance on this benchmark. We further validate our approach by collecting document relevance judgements on our search results using Amazon Mechanical Turk. The results of this experiment confirm the improvement in accuracy produced by our image-based reranker over a pure text-based system.
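
    The reranking scheme described above can be illustrated with a short, hedged sketch: a text engine supplies candidate scores, a precomputed visual-agreement score is blended in, and the candidates are reordered. The scores, the weight alpha, and the linear blending rule are invented for illustration and are not the paper's actual model:

        import numpy as np

        candidates = ["docA", "docB", "docC"]
        text_score = np.array([0.92, 0.80, 0.78])    # from the text-based engine
        visual_score = np.array([0.10, 0.75, 0.40])  # agreement of page images with the query

        alpha = 0.7  # weight on the text score; tuning on held-out queries is assumed
        final = alpha * text_score + (1 - alpha) * visual_score
        reranked = [candidates[i] for i in np.argsort(-final)]
        print(reranked)  # docB overtakes docA because its images match the query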

  2. MPEG-7 audio-visual indexing test-bed for video retrieval

    NASA Astrophysics Data System (ADS)

    Gagnon, Langis; Foucher, Samuel; Gouaillier, Valerie; Brun, Christelle; Brousseau, Julie; Boulianne, Gilles; Osterrath, Frederic; Chapdelaine, Claude; Dutrisac, Julie; St-Onge, Francis; Champagne, Benoit; Lu, Xiaojian

    2003-12-01

    This paper reports on the development status of a Multimedia Asset Management (MAM) test-bed for content-based indexing and retrieval of audio-visual documents within the MPEG-7 standard. The project, called "MPEG-7 Audio-Visual Document Indexing System" (MADIS), specifically targets the indexing and retrieval of video shots and key frames from documentary film archives, based on audio-visual content like face recognition, motion activity, speech recognition and semantic clustering. The MPEG-7/XML encoding of the film database is done off-line. The description decomposition is based on a temporal decomposition into visual segments (shots), key frames and audio/speech sub-segments. The visible outcome will be a web site that allows video retrieval using a proprietary XQuery-based search engine and is accessible to members at the Canadian National Film Board (NFB) Cineroute site. For example, end-users will be able to ask for movie shots in the database that were produced in a specific year, that contain the face of a specific actor saying a specific word, and in which there is no motion activity. Video streaming is performed over the high-bandwidth CA*net network deployed by CANARIE, a public Canadian Internet development organization.

  3. Semantic Metadata for Heterogeneous Spatial Planning Documents

    NASA Astrophysics Data System (ADS)

    Iwaniak, A.; Kaczmarek, I.; Łukowicz, J.; Strzelecki, M.; Coetzee, S.; Paluszyński, W.

    2016-09-01

    Spatial planning documents contain information about the principles and rights of land use in different zones of a local authority. They are the basis for administrative decision making in support of sustainable development. In Poland these documents are published on the Web according to a prescribed non-extendable XML schema, designed for optimum presentation to humans in HTML web pages. There is no document standard, and limited functionality exists for adding references to external resources. The text in these documents is discoverable and searchable by general-purpose web search engines, but the semantics of the content cannot be discovered or queried. The spatial information in these documents is geographically referenced but not machine-readable. Major manual efforts are required to integrate such heterogeneous spatial planning documents from various local authorities for analysis, scenario planning and decision support. This article presents results of an implementation using machine-readable semantic metadata to identify relationships among regulations in the text, spatial objects in the drawings and links to external resources. A spatial planning ontology was used to annotate different sections of spatial planning documents with semantic metadata in the Resource Description Framework in Attributes (RDFa). The semantic interpretation of the content, links between document elements and links to external resources were embedded in XHTML pages. An example and use case from the spatial planning domain in Poland is presented to evaluate its efficiency and applicability. The solution enables the automated integration of spatial planning documents from multiple local authorities to assist decision makers with understanding and interpreting spatial planning information. The approach is equally applicable to legal documents from other countries and domains, such as cultural heritage and environmental management.

  4. Croatian Medical Journal citation score in Web of Science, Scopus, and Google Scholar.

    PubMed

    Sember, Marijan; Utrobicić, Ana; Petrak, Jelka

    2010-04-01

    To analyze the 2007 citation count of articles published by the Croatian Medical Journal in 2005-2006, based on data from Web of Science, Scopus, and Google Scholar. Web of Science and Scopus were searched for the articles published in 2005-2006. As all articles returned by Scopus were included in Web of Science, the latter list was the sample for further analysis. Total citation counts for each article on the list were retrieved from Web of Science, Scopus, and Google Scholar. The overlapping and unique citations were compared and analyzed. Proportions were compared using the χ²-test. Google Scholar returned the greatest proportion of articles with citations (45%), followed by Scopus (42%) and Web of Science (38%). Almost half (49%) of the articles had no citations, and 11% had an equal number of identical citations in all 3 databases. The greatest overlap was found between Web of Science and Scopus (54%), followed by Scopus and Google Scholar (51%), and Web of Science and Google Scholar (44%). The greatest number of unique citations was found by Google Scholar (n=86). The majority of these citations (64%) came from journals, followed by books and PhD theses. Approximately 55% of all citing documents were full-text resources in open access. The language of the citing documents was mostly English, but as many as 25 citing documents (29%) were in Chinese. Google Scholar shares a total of 42% of the citations returned by the two other, more influential bibliographic resources. The list of unique citations in Google Scholar is predominantly journal based, but these journals are mainly of local character. Citations received by internationally recognized medical journals are crucial for increasing the visibility of small medical journals, but Google Scholar may serve as an alternative bibliometric tool for an orientational citation insight.
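
    As a small worked example of the overlap arithmetic reported above, the following sketch compares per-article citation sets from two databases, with overlap measured relative to the union of the two sets (one plausible definition; the paper's exact denominator is not restated here). The citation identifiers are invented:

        wos = {"c1", "c2", "c3", "c4"}            # citations found in Web of Science
        scholar = {"c2", "c3", "c5", "c6", "c7"}  # citations found in Google Scholar

        overlap = wos & scholar
        print(f"overlap: {len(overlap) / len(wos | scholar):.0%}")  # 2 of 7 -> 29%
        print("unique to Google Scholar:", sorted(scholar - wos))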

  5. Development and evaluation of a web-based application for digital findings and documentation in physiotherapy education.

    PubMed

    Spieler, Bernadette; Burgsteiner, Harald; Messer-Misak, Karin; Gödl-Purrer, Barbara; Salchinger, Beate

    2015-01-01

    Findings in physiotherapy follow standardized approaches to treatment, but there is also a significant margin of difference in how these standards are implemented. Clinical decisions require experience and continuous learning processes to consolidate personal values and opinions, and studies suggest that lecturers can influence students positively. Recently, the study course of Physiotherapy at the University of Applied Sciences in Graz has offered a paper-based findings document. This document supported decisions through the adaptation of the clinical reasoning process, and it was the starting point for our learning application called "EasyAssess", a Java-based web application for digital findings documentation. A central point of our work was to ensure the efficiency, effectiveness, and usability of the web application through usability tests conducted with both students and lecturers. Results show that our application fulfills the previously defined requirements and can be used efficiently in daily routine, largely because of its simple user interface and modest design. Due to the close cooperation with the Physiotherapy study course, the application has incorporated the various needs of the target audiences, confirming its usefulness.

  6. Biotool2Web: creating simple Web interfaces for bioinformatics applications.

    PubMed

    Shahid, Mohammad; Alam, Intikhab; Fuellen, Georg

    2006-01-01

    Currently there are many bioinformatics applications being developed, but there is no easy way to publish them on the World Wide Web. We have developed a Perl script, called Biotool2Web, which makes the task of creating web interfaces for simple ('home-made') bioinformatics applications quick and easy. Biotool2Web uses an XML document containing the parameters to run the tool on the Web, and generates the corresponding HTML and common gateway interface (CGI) files ready to be published on a web server. This tool is available for download at http://www.uni-muenster.de/Bioinformatics/services/biotool2web/. Contact: Georg Fuellen (fuellen@alum.mit.edu).
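
    Biotool2Web itself is a Perl script; as a hedged re-sketch of its core idea in Python, the fragment below reads an XML description of a tool's parameters and emits an HTML form for a CGI front end. The XML schema, tool name, and field names are invented and are not Biotool2Web's actual format:

        import xml.etree.ElementTree as ET

        tool_xml = """
        <tool name="revcomp" command="revcomp.cgi">
          <param name="sequence" label="DNA sequence"/>
          <param name="strand" label="Strand"/>
        </tool>
        """

        tool = ET.fromstring(tool_xml)
        fields = "\n".join(
            f'  <label>{p.get("label")}: <input name="{p.get("name")}"/></label><br/>'
            for p in tool.findall("param")
        )
        form = (f'<form action="/cgi-bin/{tool.get("command")}" method="post">\n'
                f'{fields}\n  <input type="submit"/>\n</form>')
        print(form)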

  7. Cluster-lensing: A Python Package for Galaxy Clusters and Miscentering

    NASA Astrophysics Data System (ADS)

    Ford, Jes; VanderPlas, Jake

    2016-12-01

    We describe a new open source package for calculating properties of galaxy clusters, including Navarro, Frenk, and White halo profiles with and without the effects of cluster miscentering. This pure-Python package, cluster-lensing, provides well-documented and easy-to-use classes and functions for calculating cluster scaling relations, including mass-richness and mass-concentration relations from the literature, as well as the surface mass density Σ(R) and differential surface mass density ΔΣ(R) profiles, probed by weak lensing magnification and shear. Galaxy cluster miscentering is especially a concern for stacked weak lensing shear studies of galaxy clusters, where offsets between the assumed and the true underlying matter distribution can lead to a significant bias in the mass estimates if not accounted for. This software has been developed and released in a public GitHub repository, and is licensed under the permissive MIT license. The cluster-lensing package is archived on Zenodo. Full documentation, source code, and installation instructions are available at http://jesford.github.io/cluster-lensing/.
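
    The mass-richness relations mentioned above are typically power laws. As a hedged illustration of what such a relation computes (with placeholder parameters, not the package's actual fitted values or API):

        import numpy as np

        M0, lambda0, beta = 2.3e14, 30.0, 1.1  # placeholder normalization, pivot, slope
        richness = np.array([15, 30, 60])
        m200 = M0 * (richness / lambda0) ** beta  # halo masses in solar masses
        print(m200)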

  8. Developer Network

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2012-08-21

    NREL's Developer Network, developer.nrel.gov, provides data that users can access for their own analyses and for mobile and web applications. Developers retrieve the data through a Web services API (application programming interface). The Developer Network handles the overhead of serving web services, such as key management, authentication, analytics, reporting, documentation standards, and throttling, in a common architecture, while allowing web services and APIs to be maintained and managed independently.
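
    A hedged sketch of calling a Developer Network web service with an API key follows. The endpoint and the DEMO_KEY placeholder reflect the public documentation as best recalled; treat both as assumptions and check developer.nrel.gov before relying on them:

        import requests

        resp = requests.get(
            "https://developer.nrel.gov/api/alt-fuel-stations/v1.json",
            params={"api_key": "DEMO_KEY", "limit": 2},  # keys are issued per developer
            timeout=30,
        )
        resp.raise_for_status()
        print(resp.json().get("total_results"))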

  9. 78 FR 66746 - Medical Device User Fee and Modernization Act; Notice to Public of Web Site Location of Fiscal...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-11-06

    ...] Medical Device User Fee and Modernization Act; Notice to Public of Web Site Location of Fiscal Year 2014... and Drug Administration (FDA or the Agency) is announcing the Web site location where the Agency will... documents, FDA has committed to updating its Web site in a timely manner to reflect the Agency's review of...

  10. A Query Integrator and Manager for the Query Web

    PubMed Central

    Brinkley, James F.; Detwiler, Landon T.

    2012-01-01

    We introduce two concepts: the Query Web as a layer of interconnected queries over the document web and the semantic web, and a Query Web Integrator and Manager (QI) that enables the Query Web to evolve. QI permits users to write, save and reuse queries over any web accessible source, including other queries saved in other installations of QI. The saved queries may be in any language (e.g. SPARQL, XQuery); the only condition for interconnection is that the queries return their results in some form of XML. This condition allows queries to chain off each other, and to be written in whatever language is appropriate for the task. We illustrate the potential use of QI for several biomedical use cases, including ontology view generation using a combination of graph-based and logical approaches, value set generation for clinical data management, image annotation using terminology obtained from an ontology web service, ontology-driven brain imaging data integration, small-scale clinical data integration, and wider-scale clinical data integration. Such use cases illustrate the current range of applications of QI and lead us to speculate about the potential evolution from smaller groups of interconnected queries into a larger query network that layers over the document and semantic web. The resulting Query Web could greatly aid researchers and others who now have to manually navigate through multiple information sources in order to answer specific questions. PMID:22531831

  11. Web Application Software for Ground Operations Planning Database (GOPDb) Management

    NASA Technical Reports Server (NTRS)

    Lanham, Clifton; Kallner, Shawn; Gernand, Jeffrey

    2013-01-01

    A Web application facilitates collaborative development of the ground operations planning document. This will reduce costs and development time for new programs by incorporating the data governance, access control, and revision tracking of the ground operations planning data. Ground Operations Planning requires the creation and maintenance of detailed timelines and documentation. The GOPDb Web application was created using state-of-the-art Web 2.0 technologies, and was deployed as SaaS (Software as a Service), with an emphasis on data governance and security needs. Application access is managed using two-factor authentication, with data write permissions tied to user roles and responsibilities. Multiple instances of the application can be deployed on a Web server to meet the robust needs for multiple, future programs with minimal additional cost. This innovation features high availability and scalability, with no additional software that needs to be bought or installed. For data governance and security (data quality, management, business process management, and risk management for data handling), the software uses NAMS. No local copy/cloning of data is permitted. Data change log/tracking is addressed, as well as collaboration, work flow, and process standardization. The software provides on-line documentation and detailed Web-based help. There are multiple ways that this software can be deployed on a Web server to meet ground operations planning needs for future programs. The software could be used to support commercial crew ground operations planning, as well as commercial payload/satellite ground operations planning. The application source code and database schema are owned by NASA.

  12. Illinois Occupational Skill Standards: Mechanical Drafting Cluster.

    ERIC Educational Resources Information Center

    Illinois Occupational Skill Standards and Credentialing Council, Carbondale.

    This document, which is intended as a guide for work force preparation program providers, details the Illinois occupational skill standards for programs preparing students for employment in occupations in the mechanical drafting cluster. The document begins with a brief overview of the Illinois perspective on occupational skill standards and…

  13. Illinois Occupational Skill Standards: Architectural Drafting Cluster.

    ERIC Educational Resources Information Center

    Illinois Occupational Skill Standards and Credentialing Council, Carbondale.

    This document, which is intended as a guide for work force preparation program providers, details the Illinois occupational skill standards for programs preparing students for employment in occupations in the architectural drafting cluster. The document begins with a brief overview of the Illinois perspective on occupational skill standards and…

  14. Illinois Occupational Skill Standards: In-Store Retailing Cluster.

    ERIC Educational Resources Information Center

    Illinois Occupational Skill Standards and Credentialing Council, Carbondale.

    This document, which is intended to serve as a guide for work force preparation program providers, details the Illinois occupational skill standards for programs preparing students for employment in occupations in the in-store retailing cluster. The document begins with a brief overview of the Illinois perspective on occupational skill standards…

  15. Illinois Occupational Skill Standards: Finishing and Distribution Cluster.

    ERIC Educational Resources Information Center

    Illinois Occupational Skill Standards and Credentialing Council, Carbondale.

    This document, which is intended as a guide for work force preparation program providers, details the Illinois occupational skill standards for programs preparing students for employment in occupations in the finishing and distribution cluster. The document begins with a brief overview of the Illinois perspective on occupational skill standards…

  16. Illinois Occupational Skill Standards: Imaging/Pre-Press Cluster.

    ERIC Educational Resources Information Center

    Illinois Occupational Skill Standards and Credentialing Council, Carbondale.

    This document, which is intended as a guide for work force preparation program providers, details the Illinois occupational skill standards for programs preparing students for employment in occupations in the imaging/pre-press cluster. The document begins with a brief overview of the Illinois perspective on occupational skill standards and…

  17. Intranet-based quality improvement documentation at the Veterans Affairs Maryland Health Care System.

    PubMed

    Borkowski, A; Lee, D H; Sydnor, D L; Johnson, R J; Rabinovitch, A; Moore, G W

    2001-01-01

    The Pathology and Laboratory Medicine Service of the Veterans Affairs Maryland Health Care System is inspected biannually by the College of American Pathologists (CAP). As of the year 2000, all documentation in the Anatomic Pathology Section is available to all staff through the VA Intranet. Signed, supporting paper documents are on file in the office of the department chair. For the year 2000 CAP inspection, inspectors conducted their document review by use of these Web-based documents, in which each CAP question had a hyperlink to the corresponding section of the procedure manual. Thus inspectors were able to locate the documents relevant to each question quickly and efficiently. The procedure manuals consist of 87 procedures for surgical pathology, 52 procedures for cytopathology, and 25 procedures for autopsy pathology. Each CAP question requiring documentation had from one to three hyperlinks to the corresponding section of the procedure manual. Intranet documentation allows for easier sharing among decentralized institutions and for centralized updates of the laboratory documentation. These documents can be upgraded to allow for multimedia presentations, including text search for key words, hyperlinks to other documents, and images, audio, and video. Use of Web-based documents can improve the efficiency of the inspection process.

  18. Observations of a nearby filament of galaxy clusters with the Sardinia Radio Telescope

    NASA Astrophysics Data System (ADS)

    Vacca, Valentina; Murgia, M.; Loi, F.; Govoni, F.; Vazza, F.; Finoguenov, A.; Carretti, E.; Feretti, L.; Giovannini, G.; Concu, R.; Melis, A.; Gheller, C.; Paladino, R.; Poppi, S.; Valente, G.; Bernardi, G.; Boschin, W.; Brienza, M.; Clarke, T. E.; Colafrancesco, S.; Enßlin, T.; Ferrari, C.; de Gasperin, F.; Gastaldello, F.; Girardi, M.; Gregorini, L.; Johnston-Hollitt, M.; Junklewitz, H.; Orrù, E.; Parma, P.; Perley, R.; Taylor, G. B.

    2018-05-01

    We report the detection of diffuse radio emission which might be connected to a large-scale filament of the cosmic web covering an 8° × 8° area in the sky, likely associated with a z≈0.1 over-density traced by nine massive galaxy clusters. In this work, we present radio observations of this region taken with the Sardinia Radio Telescope. Two of the clusters in the field host a powerful radio halo sustained by violent ongoing mergers and provide direct proof of intra-cluster magnetic fields. In order to investigate the presence of large-scale diffuse radio synchrotron emission in and beyond the galaxy clusters in this complex system, we combined the data taken at 1.4 GHz with the Sardinia Radio Telescope with higher resolution data taken with the NRAO VLA Sky Survey. We found 28 candidate new sources, larger in size and fainter in X-ray emission than known diffuse large-scale synchrotron cluster sources of a given radio power. This new population is potentially the tip of the iceberg of a class of diffuse large-scale synchrotron sources associated with the filaments of the cosmic web. In addition, we found in the field a candidate new giant radio galaxy.

  19. ADASS Web Database XML Project

    NASA Astrophysics Data System (ADS)

    Barg, M. I.; Stobie, E. B.; Ferro, A. J.; O'Neil, E. J.

    In the spring of 2000, at the request of the ADASS Program Organizing Committee (POC), we began organizing information from previous ADASS conferences in an effort to create a centralized database. This database originated from data (invited speakers, participants, papers, etc.) extracted from HyperText Markup Language (HTML) documents from past ADASS host sites. Unfortunately, not all HTML documents are well formed, and parsing them proved to be an iterative process. It was evident from the beginning that if these Web documents were organized in a standardized way, such as XML (Extensible Markup Language), the processing of this information across the Web could be automated, more efficient, and less error prone. This paper will briefly review the many programming tools available for processing XML, including Java, Perl and Python, and will explore the mapping of relational data from our MySQL database to XML.
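
    The relational-to-XML mapping discussed above can be sketched briefly: rows fetched from the database are serialized as XML elements. The table layout (papers with title and author columns) is an assumed stand-in for the actual ADASS schema:

        import xml.etree.ElementTree as ET

        rows = [  # stand-in for a MySQL cursor's fetchall() results
            {"title": "ADASS Web Database XML Project", "author": "Barg, M. I."},
            {"title": "Another Conference Paper", "author": "Stobie, E. B."},
        ]

        root = ET.Element("papers")
        for row in rows:
            paper = ET.SubElement(root, "paper")
            for column, value in row.items():
                ET.SubElement(paper, column).text = value

        print(ET.tostring(root, encoding="unicode"))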

  20. VRprofile: gene-cluster-detection-based profiling of virulence and antibiotic resistance traits encoded within genome sequences of pathogenic bacteria.

    PubMed

    Li, Jun; Tai, Cui; Deng, Zixin; Zhong, Weihong; He, Yongqun; Ou, Hong-Yu

    2017-01-10

    VRprofile is a Web server that facilitates rapid investigation of virulence and antibiotic resistance genes, as well as the transfer-related genetic contexts of these traits, in newly sequenced pathogenic bacterial genomes. The backend database, MobilomeDB, was first built from sets of known gene cluster loci of bacterial type III/IV/VI/VII secretion systems and mobile genetic elements, including integrative and conjugative elements, prophages, class I integrons, IS elements, and pathogenicity/antibiotic resistance islands. VRprofile is thus able to co-localize the homologs of these conserved gene clusters using HMMer or BLASTp searches. By integrating the homologous gene cluster search module with a sequence composition module, VRprofile exhibits better performance for island-like region prediction than other widely used methods. In addition, VRprofile provides an integrated Web interface for aligning and visualizing identified gene clusters against MobilomeDB-archived gene clusters or a variety of bacterial genomes. VRprofile might help meet the increasing demand for re-annotation of bacterial variable regions, and aid in the real-time definition of disease-relevant gene clusters in pathogenic bacteria of interest. VRprofile is freely available at http://bioinfo-mml.sjtu.edu.cn/VRprofile. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  1. Content Recognition and Context Modeling for Document Analysis and Retrieval

    ERIC Educational Resources Information Center

    Zhu, Guangyu

    2009-01-01

    The nature and scope of available documents are changing significantly in many areas of document analysis and retrieval as complex, heterogeneous collections become accessible to virtually everyone via the web. The increasing level of diversity presents a great challenge for document image content categorization, indexing, and retrieval.…

  2. Comprehensive cluster analysis with Transitivity Clustering.

    PubMed

    Wittkop, Tobias; Emig, Dorothea; Truss, Anke; Albrecht, Mario; Böcker, Sebastian; Baumbach, Jan

    2011-03-01

    Transitivity Clustering is a method for partitioning biological data into groups of similar objects, such as genes. It provides integrated access to various functions addressing each step of a typical cluster analysis. To facilitate this, Transitivity Clustering is accessible online and offers three user-friendly interfaces: a powerful stand-alone version, a web interface, and a collection of Cytoscape plug-ins. In this paper, we describe three major workflows: (i) protein (super)family detection with Cytoscape, (ii) protein homology detection with incomplete gold standards and (iii) clustering of gene expression data. This protocol guides the user through the most important features of Transitivity Clustering and takes ∼1 h to complete.
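
    Transitivity Clustering solves a weighted cluster-editing problem; as a deliberately simplified stand-in for the idea of partitioning objects by pairwise similarity, the sketch below thresholds similarities and takes connected components. The similarity values and threshold are invented, and connected components are a much weaker criterion than the tool's actual optimization:

        import networkx as nx

        sim = {("g1", "g2"): 0.9, ("g2", "g3"): 0.8,
               ("g3", "g4"): 0.2, ("g4", "g5"): 0.95}
        THRESHOLD = 0.5  # keep only pairs at least this similar

        g = nx.Graph()
        g.add_nodes_from({n for pair in sim for n in pair})
        g.add_edges_from(pair for pair, s in sim.items() if s >= THRESHOLD)
        print(list(nx.connected_components(g)))  # [{'g1','g2','g3'}, {'g4','g5'}]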

  3. An automatically updateable web publishing solution: taking document sharing and conversion to enterprise level

    NASA Astrophysics Data System (ADS)

    Rahman, Fuad; Tarnikova, Yuliya; Hartono, Rachmat; Alam, Hassan

    2006-01-01

    This paper presents a novel automatic web publishing solution, PageView (R), a complete working solution for document processing and management. The principal aim of this tool is to allow workgroups to share, access, and publish documents online on a regular basis. Consider a person working on some documents: the user will, in some fashion, organize the work either in a local directory or on a shared network drive. Extending that concept to a workgroup, several users work together on documents and save them in a directory structure somewhere in a document repository. The next stage of this reasoning is a workgroup that works on documents and wants to publish them routinely online. The members may be using different editing tools, different software, and different graphics tools, so the resulting documents may be in PDF, Microsoft Office (R), HTML, or WordPerfect format, to name a few. In general, the documents must be processed into HTML format, and a web designer must then work on that collection to make it available online. PageView (R) takes care of this whole process automatically, making the document workflow clean and easy to follow. The PageView (R) Server publishes documents, complete with their directory structure, for online use. The documents are automatically converted to HTML and PDF so that users can view the content without downloading the original files or browser plug-ins. Once published, other users can access the documents as if from their local folders. The paper describes the complete working system and discusses possible applications within document management research.

  4. DMINDA: an integrated web server for DNA motif identification and analyses

    PubMed Central

    Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

    2014-01-01

    DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important for elucidating the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning for instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. PMID:24753419
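
    Of the functions listed above, motif scanning is the simplest to illustrate: slide a position weight matrix over a sequence and report high-scoring windows. A minimal sketch follows; the matrix, sequence, and cutoff are invented and unrelated to DMINDA's internals:

        import numpy as np

        BASES = "ACGT"
        pwm = np.log2(np.array([      # log-odds scores; columns are motif positions
            [0.80, 0.10, 0.10],       # A
            [0.05, 0.10, 0.70],       # C
            [0.10, 0.70, 0.10],       # G
            [0.05, 0.10, 0.10],       # T
        ]) / 0.25)                    # uniform background assumed

        seq = "TTAGCAGCTT"
        width = pwm.shape[1]
        for i in range(len(seq) - width + 1):
            window = seq[i:i + width]
            score = sum(pwm[BASES.index(b), j] for j, b in enumerate(window))
            if score > 3.0:           # arbitrary cutoff for illustration
                print(i, window, round(score, 2))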

  5. Web-based document image processing

    NASA Astrophysics Data System (ADS)

    Walker, Frank L.; Thoma, George R.

    1999-12-01

    Increasing numbers of research libraries are turning to the Internet for electronic interlibrary loan and for document delivery to patrons. This has been made possible through the widespread adoption of software such as Ariel and DocView. Ariel, a product of the Research Libraries Group, converts paper-based documents to monochrome bitmapped images and delivers them over the Internet. The National Library of Medicine's DocView is primarily designed for library patrons. Although libraries and their patrons are beginning to reap the benefits of this new technology, barriers exist, e.g., differences in image file format, that lead to difficulties in the use of library document information. To research how to overcome such barriers, the Communications Engineering Branch of the Lister Hill National Center for Biomedical Communications, an R and D division of NLM, has developed a web site called the DocMorph Server. This is part of an ongoing intramural R and D program in document imaging that has spanned many aspects of electronic document conversion and preservation, Internet document transmission and document usage. The DocMorph Server web site is designed to fill two roles. First, in a role that will benefit both libraries and their patrons, it allows Internet users to upload scanned image files for conversion to alternative formats, thereby enabling wider delivery and easier usage of library document information. Second, the DocMorph Server provides the design team an active test bed for evaluating the effectiveness and utility of new document image processing algorithms and functions, which may then be considered for inclusion in other image processing software products being developed at NLM or elsewhere. This paper describes the design of the prototype DocMorph Server and the image processing functions being implemented on it.
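
    As a hedged illustration of the kind of conversion DocMorph performs server-side (scanned image in, alternative format out), the snippet below converts a TIFF to PDF with Pillow. Pillow is used purely for illustration; it is not the software NLM describes, and the file names are placeholders:

        from PIL import Image

        with Image.open("scanned_page.tif") as img:  # placeholder input path
            img.convert("RGB").save("scanned_page.pdf", "PDF", resolution=300.0)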

  6. The Next Linear Collider Program

    Science.gov Websites

    Posted to the new SLAC ILC web site http://www-project.slac.stanford.edu/ilc/. The NLC web site will remain accessible as an archive of important work done on the many NLC systems.

  7. Implementing a Dynamic Database-Driven Course Using LAMP

    ERIC Educational Resources Information Center

    Laverty, Joseph Packy; Wood, David; Turchek, John

    2011-01-01

    This paper documents the formulation of a database driven open source architecture web development course. The design of a web-based curriculum faces many challenges: a) relative emphasis of client and server-side technologies, b) choice of a server-side language, and c) the cost and efficient delivery of a dynamic web development, database-driven…

  8. Google Wave: Collaboration Reworked

    ERIC Educational Resources Information Center

    Rethlefsen, Melissa L.

    2010-01-01

    Over the past several years, Internet users have become accustomed to Web 2.0 and cloud computing-style applications. It's commonplace and even intuitive to drag and drop gadgets on personalized start pages, to comment on a Facebook post without reloading the page, and to compose and save documents through a web browser. The web paradigm has…

  9. 77 FR 36583 - NRC Form 5, Occupational Dose Record for a Monitoring Period

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-06-19

    ... methods: Federal Rulemaking Web site: Go to http://www.regulations.gov and search for Docket ID NRC-2012... following methods: Federal Rulemaking Web Site: Go to http://www.regulations.gov and search for Docket ID... begin the search, select "ADAMS Public Documents" and then select "Begin Web-based ADAMS Search...

  10. BioCatalogue: a universal catalogue of web services for the life sciences

    PubMed Central

    Bhagat, Jiten; Tanoh, Franck; Nzuobontane, Eric; Laurent, Thomas; Orlowski, Jerzy; Roos, Marco; Wolstencroft, Katy; Aleksejevs, Sergejs; Stevens, Robert; Pettifer, Steve; Lopez, Rodrigo; Goble, Carole A.

    2010-01-01

    The use of Web Services to enable programmatic access to on-line bioinformatics is becoming increasingly important in the Life Sciences. However, their number, distribution and the variable quality of their documentation can make their discovery and subsequent use difficult. A Web Services registry with information on available services will help to bring together service providers and their users. The BioCatalogue (http://www.biocatalogue.org/) provides a common interface for registering, browsing and annotating Web Services to the Life Science community. Services in the BioCatalogue can be described and searched in multiple ways based upon their technical types, bioinformatics categories, user tags, service providers or data inputs and outputs. They are also subject to constant monitoring, allowing the identification of service problems and changes and the filtering-out of unavailable or unreliable resources. The system is accessible via a human-readable ‘Web 2.0’-style interface and a programmatic Web Service interface. The BioCatalogue follows a community approach in which all services can be registered, browsed and incrementally documented with annotations by any member of the scientific community. PMID:20484378

  11. BioCatalogue: a universal catalogue of web services for the life sciences.

    PubMed

    Bhagat, Jiten; Tanoh, Franck; Nzuobontane, Eric; Laurent, Thomas; Orlowski, Jerzy; Roos, Marco; Wolstencroft, Katy; Aleksejevs, Sergejs; Stevens, Robert; Pettifer, Steve; Lopez, Rodrigo; Goble, Carole A

    2010-07-01

    The use of Web Services to enable programmatic access to on-line bioinformatics is becoming increasingly important in the Life Sciences. However, their number, distribution and the variable quality of their documentation can make their discovery and subsequent use difficult. A Web Services registry with information on available services will help to bring together service providers and their users. The BioCatalogue (http://www.biocatalogue.org/) provides a common interface for registering, browsing and annotating Web Services to the Life Science community. Services in the BioCatalogue can be described and searched in multiple ways based upon their technical types, bioinformatics categories, user tags, service providers or data inputs and outputs. They are also subject to constant monitoring, allowing the identification of service problems and changes and the filtering-out of unavailable or unreliable resources. The system is accessible via a human-readable 'Web 2.0'-style interface and a programmatic Web Service interface. The BioCatalogue follows a community approach in which all services can be registered, browsed and incrementally documented with annotations by any member of the scientific community.

  12. A new information architecture, website and services for the CMS experiment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Taylor, Lucas; Rusack, Eleanor; Zemleris, Vidmantas

    2012-01-01

    The age and size of the CMS collaboration at the LHC means it now has many hundreds of inhomogeneous web sites and services, and hundreds of thousands of documents. We describe a major initiative to create a single coherent CMS internal and public web site. This uses the Drupal web Content Management System (now supported by CERN/IT) on top of a standard LAMP stack (Linux, Apache, MySQL, and php/perl). The new navigation, content and search services are coherently integrated with numerous existing CERN services (CDS, EDMS, Indico, phonebook, Twiki) as well as many CMS internal Web services. We describe the information architecture, the system design, implementation and monitoring, the document and content database, security aspects, and our deployment strategy, which ensured continual smooth operation of all systems at all times.

  13. A new Information Architecture, Website and Services for the CMS Experiment

    NASA Astrophysics Data System (ADS)

    Taylor, Lucas; Rusack, Eleanor; Zemleris, Vidmantas

    2012-12-01

    The age and size of the CMS collaboration at the LHC means it now has many hundreds of inhomogeneous web sites and services, and hundreds of thousands of documents. We describe a major initiative to create a single coherent CMS internal and public web site. This uses the Drupal web Content Management System (now supported by CERN/IT) on top of a standard LAMP stack (Linux, Apache, MySQL, and php/perl). The new navigation, content and search services are coherently integrated with numerous existing CERN services (CDS, EDMS, Indico, phonebook, Twiki) as well as many CMS internal Web services. We describe the information architecture; the system design, implementation and monitoring; the document and content database; security aspects; and our deployment strategy, which ensured continual smooth operation of all systems at all times.

  14. Web service module for access to g-Lite

    NASA Astrophysics Data System (ADS)

    Goranova, R.; Goranov, G.

    2012-10-01

    G-Lite is a lightweight middleware for grid computing installed on all clusters of the European Grid Infrastructure (EGI). The middleware is partially service-oriented and does not provide well-defined Web services for job management. The existing Web services in the environment cannot be directly used by grid users for building service compositions in the EGI. In this article we present a module of well-defined Web services for job management in the EGI. We describe the architecture of the module and the design of the developed Web services. The presented Web services are composable and can participate in service compositions (workflows). An example of the module's use with tools for service compositions in g-Lite is shown.

  15. Crossroads 2000 proceedings [table of contents hyperlinked to documents

    DOT National Transportation Integrated Search

    1998-08-19

    This document's table of contents hyperlinks to the 76 papers presented at the Crossroads 2000 Conference. The documents are housed at the web site for Iowa State University Center for Transportation Research and Education. A selection of 14 individu...

  16. Supporting online learning with games

    NASA Astrophysics Data System (ADS)

    Yao, JingTao; Kim, DongWon; Herbert, Joseph P.

    2007-04-01

    This paper presents a study of a Web-based learning support system enhanced with two major subsystems: a Web-based learning game and a learning-oriented Web search. The Internet and the Web may be considered a first resource for students seeking information and help. However, much of the information available online is not related to the course content or, in the worst case, is wrong. The search subsystem aims to provide students with precise, relevant, and adaptable documents about particular courses or classes, so students do not have to spend time verifying the relevance of documents to the class. The learning game subsystem stimulates students to study and enables them to review their work and perform self-evaluation through a Web-based learning game such as a treasure hunt. Through this challenging and entertaining learning and evaluation process, it is hoped that students will come to understand and master the course concepts easily. The goal of developing such a system is to provide students with an efficient and effective learning environment.

  17. Tech-Prep Competency Profiles within the Engineering Technologies Cluster.

    ERIC Educational Resources Information Center

    Ohio State Univ., Columbus. Center on Education and Training for Employment.

    This document contains 12 competency profiles for tech prep courses within the engineering technologies cluster. The document consists of the following sections: (1) systemic curriculum reform philosophy--Ohio's vision of tech prep and its six critical components; (2) an explanation of the process of developing the tech prep competencies; (3) a…

  18. A usability evaluation exploring the design of American Nurses Association state web sites.

    PubMed

    Alexander, Gregory L; Wakefield, Bonnie J; Anbari, Allison B; Lyons, Vanessa; Prentice, Donna; Shepherd, Marilyn; Strecker, E Bradley; Weston, Marla J

    2014-08-01

    National leaders are calling for opportunities to facilitate the Future of Nursing. Such opportunities can be encouraged through state nurses association Web sites, part of the American Nurses Association, that are well designed, carry appropriate content, and use language professional nurses understand. The American Nurses Association and constituent state nurses associations provide information about nursing practice, ethics, credentialing, and health on their Web sites. We conducted usability evaluations to determine compliance with heuristic and ethical principles for Web site design. We purposefully sampled 27 nursing association Web sites and used 68 heuristic and ethical criteria to perform systematic usability assessments. Web site analysis involved seven double experts, all RNs trained in usability analysis. The extent to which heuristic and ethical criteria were met ranged widely, from one state that met 0% of the criteria for "help and documentation" to states that met greater than 92% of the criteria for "visibility of system status" and "aesthetic and minimalist design." Suggested improvements are simple yet affect a first-time visitor's impression of the Web site. For example, adding internal navigation and tracking features and providing more details about the application process through help and frequently asked question documentation would facilitate better use. Improved usability will improve effectiveness, efficiency, and consumer satisfaction with these Web sites.

  19. The CloudBoard Research Platform: an interactive whiteboard for corporate users

    NASA Astrophysics Data System (ADS)

    Barrus, John; Schwartz, Edward L.

    2013-03-01

    Over one million interactive whiteboards (IWBs) are sold annually worldwide, predominantly for classroom use, with few sales for corporate use. Unmet needs for corporate IWB use were investigated, and the CloudBoard Research Platform (CBRP) was developed to test technology for meeting these needs. The CBRP supports audio conferencing with shared remote drawing activity; casual capture of whiteboard activity for long-term storage and retrieval; use of standard formats such as PDF for easy import of documents via the web and email; and easy export of documents. Company RFID badges and key fobs provide secure access to documents at the board, and automatic logout occurs after a period of inactivity. Users manage their documents with a web browser. Analytics and remote device management are provided for administrators. The IWB hardware consists of off-the-shelf components (a Hitachi UST projector, SMART Technologies, Inc. IWB hardware, a Mac Mini, a Polycom speakerphone, etc.) and a custom occupancy sensor. The three back-end servers provide the web interface, document storage, and stroke and audio streaming. Ease of use, security, and robustness sufficient for internal adoption were achieved. Five of the 10 boards installed at various Ricoh sites have been in daily or weekly use for the past year, and total system downtime was less than an hour in 2012. Since the CBRP was installed, 65 registered users, 9 of whom use the system regularly, have created over 2600 documents.

  20. PC-based web authoring: How to learn as little unix as possible while getting on the Web

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gennari, L.T.; Breaux, M.; Minton, S.

    1996-09-01

    This document is a general guide for creating Web pages, using commonly available word processing and file transfer applications. It is not a full guide to HTML, nor does it provide an introduction to the many WYSIWYG HTML editors available. The viability of the authoring method it describes will not be affected by changes in the HTML specification or the rapid release-and-obsolescence cycles of commercial WYSIWYG HTML editors. This document provides a gentle introduction to HTML for the beginner, and as the user gains confidence and experience, encourages greater familiarity with HTML through continued exposure to and hands-on usage of HTML code.

  1. Mac-based Web authoring: How to learn as little Unix as possible while getting on the Web.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gennari, L.T.

    1996-06-01

    This document is a general guide for creating Web pages, using commonly available word processing and file transfer applications. It is not a full guide to HTML, nor does it provide an introduction to the many WYSIWYG HTML editors available. The viability of the authoring method it describes will not be affected by changes in the HTML specification or the rapid release-and-obsolescence cycles of commercial WYSIWYG HTML editors. This document provides a gentle introduction to HTML for the beginner, and as the user gains confidence and experience, encourages greater familiarity with HTML through continued exposure to and hands-on usage of HTML code.

  2. Semantic Document Model to Enhance Data and Knowledge Interoperability

    NASA Astrophysics Data System (ADS)

    Nešić, Saša

    To enable document data and knowledge to be efficiently shared and reused across application, enterprise, and community boundaries, desktop documents should be completely open and queryable resources, whose data and knowledge are represented in a form understandable to both humans and machines. At the same time, these are the requirements that desktop documents need to satisfy in order to contribute to the visions of the Semantic Web. With the aim of achieving this goal, we have developed the Semantic Document Model (SDM), which turns desktop documents into Semantic Documents as uniquely identified and semantically annotated composite resources, that can be instantiated into human-readable (HR) and machine-processable (MP) forms. In this paper, we present the SDM along with an RDF and ontology-based solution for the MP document instance. Moreover, on top of the proposed model, we have built the Semantic Document Management System (SDMS), which provides a set of services that exploit the model. As an application example that takes advantage of SDMS services, we have extended MS Office with a set of tools that enables users to transform MS Office documents (e.g., MS Word and MS PowerPoint) into Semantic Documents, and to search local and distant semantic document repositories for document content units (CUs) over Semantic Web protocols.
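
    A hedged sketch of the machine-processable instance the model describes might annotate a document content unit with RDF, as below using rdflib. The namespace URI and ontology terms are invented, not the SDM's actual vocabulary:

        from rdflib import Graph, Literal, Namespace, URIRef
        from rdflib.namespace import DCTERMS, RDF

        SDM = Namespace("http://example.org/sdm#")  # placeholder ontology namespace
        g = Graph()
        cu = URIRef("http://example.org/docs/report42#section-3")  # a content unit

        g.add((cu, RDF.type, SDM.ContentUnit))
        g.add((cu, DCTERMS.subject, SDM.QuarterlyRevenue))
        g.add((cu, DCTERMS.title, Literal("Q3 revenue summary")))
        print(g.serialize(format="turtle"))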

  3. Embedding the shapes of regions of interest into a Clinical Document Architecture document.

    PubMed

    Minh, Nguyen Hai; Yi, Byoung-Kee; Kim, Il Kon; Song, Joon Hyun; Binh, Pham Viet

    2015-03-01

    Sharing a medical image visually annotated with a region of interest with a remotely located specialist for consultation is good practice. It may, however, require a special-purpose (and most likely expensive) system to send and view the images, which is an infeasible solution in developing countries such as Vietnam. In this study, we design and implement interoperable methods based on the HL7 Clinical Document Architecture and the eXtensible Stylesheet Language for Transformation standards to seamlessly exchange and visually present the shapes of regions of interest using web browsers. We also propose a new integration architecture for a Clinical Document Architecture generator that enables embedding of regions of interest and simultaneous auto-generation of corresponding style sheets. Using the Clinical Document Architecture document and style sheet, a sender can transmit clinical documents and medical images, together with the coordinate values of regions of interest, to recipients. Recipients can easily view the documents and display the embedded regions of interest by rendering them in the web browser of their choice. © The Author(s) 2014.
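
    CDA does define a RegionOfInterest act whose coordinates are carried as alternating x,y value elements; the sketch below emits a simplified polygon in that spirit. The exact element layout should be checked against the CDA schema rather than taken as conformant output:

        import xml.etree.ElementTree as ET

        roi = ET.Element("regionOfInterest", {"classCode": "ROIOVL"})
        ET.SubElement(roi, "code", {"code": "POLY"})  # polygon-shaped region
        for x, y in [(120, 80), (200, 80), (200, 160), (120, 160)]:
            ET.SubElement(roi, "value", {"value": str(x)})
            ET.SubElement(roi, "value", {"value": str(y)})

        print(ET.tostring(roi, encoding="unicode"))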

  4. Clustering of Farsi sub-word images for whole-book recognition

    NASA Astrophysics Data System (ADS)

    Soheili, Mohammad Reza; Kabir, Ehsanollah; Stricker, Didier

    2015-01-01

    Redundancy of word and sub-word occurrences in large documents can be effectively utilized in an OCR system to improve recognition results. Most OCR systems employ language modeling techniques as a post-processing step; however, these techniques do not use the important pictorial information that exists in the text image. In the case of large-scale recognition of degraded documents, this information is even more valuable. In our previous work, we proposed a sub-word image clustering method for applications dealing with large printed documents. In our clustering method, the ideal case is when all equivalent sub-word images lie in one cluster. To overcome the issues of low print quality, the clustering method uses an image matching algorithm for measuring the distance between two sub-word images. The measured distance, together with a set of simple shape features, was used to cluster all sub-word images. In this paper, we analyze the effects of adding more shape features on processing time, purity of clustering, and the final recognition rate. Previously published experiments have shown the efficiency of our method on a book. Here we present extended experimental results and evaluate our method on another book with a totally different typeface. We also show that the number of newly created clusters on a page can be used as a criterion for assessing print quality and evaluating preprocessing phases.

  5. A web portal for hydrodynamical, cosmological simulations

    NASA Astrophysics Data System (ADS)

    Ragagnin, A.; Dolag, K.; Biffi, V.; Cadolle Bel, M.; Hammer, N. J.; Krukau, A.; Petkova, M.; Steinborn, D.

    2017-07-01

    This article describes a data centre hosting a web portal for accessing and sharing the output of large cosmological, hydro-dynamical simulations with a broad scientific community. It also allows users to receive related scientific data products by directly processing the raw simulation data on a remote computing cluster. The data centre has a multi-layer structure: a web portal, a job control layer, a computing cluster, and an HPC storage system. The outer layer enables users to choose an object from the simulations. Objects can be selected by visually inspecting 2D maps of the simulation data, by performing highly compound and elaborate queries, or graphically by plotting arbitrary combinations of properties. The user can then run analysis tools on the chosen object; these tools operate directly on the raw simulation data. The job control layer is responsible for handling and performing the analysis jobs, which are executed on a computing cluster. The innermost layer is an HPC storage system which hosts the large, raw simulation data. The following services are available to users: (I) CLUSTERINSPECT visualizes properties of member galaxies of a selected galaxy cluster; (II) SIMCUT returns the raw data of a sub-volume around a selected object from a simulation, containing all the original hydro-dynamical quantities; (III) SMAC creates idealized 2D maps of various physical quantities and observables of a selected object; (IV) PHOX generates virtual X-ray observations with specifications of various current and upcoming instruments.

  6. Dealing with Multiple Documents on the WWW: The Role of Metacognition in the Formation of Documents Models

    ERIC Educational Resources Information Center

    Stadtler, Marc; Bromme, Rainer

    2007-01-01

    Drawing on the theory of documents representation (Perfetti et al., Toward a theory of documents representation. In: H. v. Oostendorp & S. R. Goldman (Eds.), "The construction of mental representations during reading." Mahwah, NJ: Erlbaum, 1999), we argue that successfully dealing with multiple documents on the World Wide Web requires readers to…

  7. Lessons learned from a practice-based, multi-site intervention study with nurse participants

    PubMed Central

    Friese, Christopher R.; Mendelsohn-Victor, Kari; Ginex, Pamela; McMahon, Carol M.; Fauer, Alex J.; McCullagh, Marjorie C.

    2016-01-01

    Purpose: To identify challenges and solutions to the efficient conduct of a multi-site, practice-based randomized controlled trial to improve nurses' adherence to personal protective equipment use in ambulatory oncology settings. Design: The Drug Exposure Feedback and Education for Nurses' Safety (DEFENS) study is a clustered, randomized, controlled trial. Participating sites are randomized to web-based feedback on hazardous drug exposures in the sites plus tailored messages to address barriers, versus a control intervention of a web-based continuing education video. Approach: The study principal investigator, the study coordinator, and two site leaders identified challenges to study implementation and potential solutions, plus potential methods to prevent logistical challenges in future studies. Findings: Noteworthy challenges included variation in human subjects protection policies, grants and contracts budgeting, infrastructure for nursing-led research, and information technology variation. Successful strategies included scheduled web conferences, site-based study champions, site visits by the principal investigator, and centrally-based document preparation. Strategies to improve efficiency in future studies include early and continued engagement with contract personnel at sites, and proposed changes to the common rule concerning human subjects. The DEFENS study successfully recruited 393 nurses across 12 sites. To date, 369 have completed surveys and 174 nurses have viewed educational materials. Conclusions: Multi-site studies of nursing personnel are rare and challenging to existing infrastructure. These barriers can be overcome with strong engagement and planning. Clinical Relevance: Leadership engagement, onsite staff support, and continuous communication can facilitate successful recruitment to a workplace-based randomized, controlled behavioral trial. PMID:28098951

  8. Using SVD on Clusters to Improve Precision of Interdocument Similarity Measure.

    PubMed

    Zhang, Wen; Xiao, Fan; Li, Bin; Zhang, Siguang

    2016-01-01

    Recently, LSI (Latent Semantic Indexing) based on SVD (Singular Value Decomposition) has been proposed to overcome the problems of polysemy and homonymy in traditional lexical matching. However, it is often criticized for its low discriminative power in representing documents, although it has been validated as having good representative quality. In this paper, SVD on clusters is proposed to improve the discriminative power of LSI. The contribution of this paper is threefold. Firstly, we survey existing linear algebra methods for LSI, including both SVD-based and non-SVD-based methods. Secondly, we propose SVD on clusters for LSI and theoretically explain that dimension expansion of document vectors and dimension projection using SVD are the two manipulations involved in SVD on clusters. Moreover, we develop updating processes to fold new documents and terms into a matrix decomposed by SVD on clusters. Thirdly, two corpora, a Chinese corpus and an English corpus, are used to evaluate the performance of the proposed methods. Experiments demonstrate that, to some extent, SVD on clusters can improve the precision of the interdocument similarity measure in comparison with other SVD-based LSI methods.
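    For readers unfamiliar with the baseline, plain LSI projects a term-document matrix into a low-rank latent space via truncated SVD and measures interdocument similarity there; the paper's method repeats this per cluster. A minimal sketch with scikit-learn (toy documents, illustrative parameters; not the authors' code):

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.decomposition import TruncatedSVD
      from sklearn.metrics.pairwise import cosine_similarity

      docs = ["latent semantic indexing handles polysemy",
              "singular value decomposition of term document matrices",
              "lexical matching fails on homonyms"]

      X = TfidfVectorizer().fit_transform(docs)          # term-document weights
      Z = TruncatedSVD(n_components=2).fit_transform(X)  # project to latent space
      print(cosine_similarity(Z))                        # interdocument similarity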

  9. Using SVD on Clusters to Improve Precision of Interdocument Similarity Measure

    PubMed Central

    Xiao, Fan; Li, Bin; Zhang, Siguang

    2016-01-01

    Recently, LSI (Latent Semantic Indexing) based on SVD (Singular Value Decomposition) has been proposed to overcome the problems of polysemy and homonymy in traditional lexical matching. However, it is often criticized for its low discriminative power in representing documents, although it has been validated as having good representative quality. In this paper, SVD on clusters is proposed to improve the discriminative power of LSI. The contribution of this paper is threefold. Firstly, we survey existing linear algebra methods for LSI, including both SVD-based and non-SVD-based methods. Secondly, we propose SVD on clusters for LSI and theoretically explain that dimension expansion of document vectors and dimension projection using SVD are the two manipulations involved in SVD on clusters. Moreover, we develop updating processes to fold new documents and terms into a matrix decomposed by SVD on clusters. Thirdly, two corpora, a Chinese corpus and an English corpus, are used to evaluate the performance of the proposed methods. Experiments demonstrate that, to some extent, SVD on clusters can improve the precision of the interdocument similarity measure in comparison with other SVD-based LSI methods. PMID:27579031

  10. GRAMM-X public web server for protein–protein docking

    PubMed Central

    Tovchigrechko, Andrey; Vakser, Ilya A.

    2006-01-01

    The protein docking software GRAMM-X and its web interface extend the original GRAMM Fast Fourier Transformation methodology by employing smoothed potentials, a refinement stage, and knowledge-based scoring. The web server frees users from the complex installation of database-dependent parallel software and from maintaining the large hardware resources needed for protein docking simulations. Docking problems submitted to the GRAMM-X server are processed by a 320-processor Linux cluster. The server was extensively tested through benchmarking, several months of public use, and participation in the CAPRI server track. PMID:16845016

  11. View of Arabella, one of two Skylab spiders and her web

    NASA Technical Reports Server (NTRS)

    1973-01-01

    A close-up view of Arabella, one of the two Skylab 3 common cross spiders (Araneus diadematus), and the web it had spun in the zero gravity of space aboard the Skylab space station cluster in Earth orbit. During the 59-day Skylab 3 mission, the two spiders, Arabella and Anita, were housed in an enclosure onto which a motion picture camera and a still camera were attached to record the spiders' attempts to build a web in the weightless environment.

  12. Clustering document fragments using background color and texture information

    NASA Astrophysics Data System (ADS)

    Chanda, Sukalpa; Franke, Katrin; Pal, Umapada

    2012-01-01

    Forensic analysis of questioned documents can be extremely data intensive. A forensic expert may need to analyze a heap of document fragments, and in such cases, to ensure reliability, he or she should focus only on the relevant evidence hidden in those fragments. Retrieving relevant documents requires finding similar document fragments, and one way to obtain them is through the fragments' physical characteristics, such as color and texture. In this article we propose an automatic scheme to retrieve similar document fragments based on the visual appearance of the document paper and its texture. Multispectral color characteristics are captured with biologically inspired color differentiation techniques by projecting the document colors into the Lab color space, and Gabor filter-based texture analysis is used to characterize document texture. The underlying assumption is that document fragments from the same source will have similar color and texture. To cluster similar document fragments in our test dataset we use a Self-Organizing Map (SOM) of dimension 5×5, with the document color and texture information as features. We obtained an encouraging accuracy of 97.17% on 1063 test images.
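    A sketch of the described feature pipeline, assuming scikit-image and the minisom package are available; the Gabor frequency, map size, and summary statistics here are illustrative stand-ins, not the paper's settings:

      import numpy as np
      from skimage import data
      from skimage.color import rgb2lab, rgb2gray
      from skimage.filters import gabor
      from minisom import MiniSom

      def fragment_features(rgb):
          lab = rgb2lab(rgb)                             # paper colour in Lab space
          real, _ = gabor(rgb2gray(rgb), frequency=0.6)  # Gabor texture response
          return np.array([lab[..., 0].mean(), lab[..., 1].mean(),
                           lab[..., 2].mean(), real.mean(), real.std()])

      fragments = [data.astronaut()[:100, :100], data.astronaut()[100:200, :100]]
      feats = np.array([fragment_features(f) for f in fragments])

      som = MiniSom(5, 5, feats.shape[1])   # 5x5 map, as in the paper
      som.train_random(feats, 100)
      print([som.winner(f) for f in feats]) # map cell = cluster assignment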

  13. A Clustering Methodology of Web Log Data for Learning Management Systems

    ERIC Educational Resources Information Center

    Valsamidis, Stavros; Kontogiannis, Sotirios; Kazanidis, Ioannis; Theodosiou, Theodosios; Karakos, Alexandros

    2012-01-01

    Learning Management Systems (LMS) collect large amounts of data. Data mining techniques can be applied to analyse their web data log files. The instructors may use this data for assessing and measuring their courses. In this respect, we have proposed a methodology for analysing LMS courses and students' activity. This methodology uses a Markov…

  14. 47 CFR 73.8000 - Incorporation by reference.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... Engineering and Technology (OET) Web site: http://www.fcc.gov/oet/info/documents/bulletins/. (1) OET Bulletin...., Suite 1200, Washington, DC 20006, or at the ATSC Web site: http://www.atsc.org/standards.html. (1) ATSC... Standards Institute (ANSI), 25 West 43rd Street, 4th Floor, New York, NY 10036 or at the ANSI Web site: http...

  15. Ontology-Based Approaches to Improve RDF Triple Store

    ERIC Educational Resources Information Center

    Albahli, Saleh M.

    2016-01-01

    The World Wide Web enables an easy, instant access to a huge quantity of information. Over the last few decades, a number of improvements have been achieved that helped the web reach its current state. However, the current Internet links documents together without understanding them, and thus, makes the content of web only human-readable rather…

  16. Methodology for Localized and Accessible Image Formation and Elucidation

    ERIC Educational Resources Information Center

    Patil, Sandeep R.; Katiyar, Manish

    2009-01-01

    Accessibility is one of the key checkpoints in all software products, applications, and Web sites. Accessibility with digital images has always been a major challenge for the industry. Images form an integral part of certain type of documents and most Web 2.0-compliant Web sites. Individuals challenged with blindness and many dyslexics only make…

  17. 78 FR 6142 - Vogtle Electric Generating Plant, Units 3 and 4; Application and Amendment to Combined Licenses...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-01-29

    ... NRC-2008- 0252. You may submit comments by any of the following methods: Federal Rulemaking Web site... publicly available, by any of the following methods: Federal Rulemaking Web site: Go to http://www... ``Begin Web- based ADAMS Search.'' For problems with ADAMS, please contact the NRC's Public Document Room...

  18. QCS : a system for querying, clustering, and summarizing documents.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dunlavy, Daniel M.

    2006-08-01

    Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present a novel hybrid information retrieval system--the Query, Cluster, Summarize (QCS) system--which is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components. Most importantly, the combination of the three types of components in the QCS design improves retrievals by providing users more focused information organized by topic. We demonstrate the improved performance by a series of experiments using standard test sets from the Document Understanding Conferences (DUC) along with the best known automatic metric for summarization system evaluation, ROUGE. Although the DUC data and evaluations were originally designed to test multidocument summarization, we developed a framework to extend them to the task of evaluating each of the three components: query, clustering, and summarization. Under this framework, we then demonstrate that the QCS system (end-to-end) achieves performance as good as or better than the best summarization engines. Given a query, QCS retrieves relevant documents, separates the retrieved documents into topic clusters, and creates a single summary for each cluster. In the current implementation, Latent Semantic Indexing is used for retrieval, generalized spherical k-means is used for the document clustering, and a method coupling sentence 'trimming' and a hidden Markov model, followed by a pivoted QR decomposition, is used to create a single extract summary for each cluster. The user interface is designed to provide access to detailed information in a compact and useful format. Our system demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of the design, and the value of this particular combination of modules.
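    The clustering step named above, spherical k-means, can be approximated by L2-normalizing TF-IDF vectors and running ordinary k-means, since cosine and Euclidean distances agree in ordering on unit vectors. A toy sketch under that assumption, not the QCS implementation:

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.preprocessing import normalize
      from sklearn.cluster import KMeans

      docs = ["query about tropical storms", "hurricane damage report",
              "election results announced", "votes counted in the election"]
      X = normalize(TfidfVectorizer().fit_transform(docs))   # unit-length rows
      print(KMeans(n_clusters=2, n_init=10).fit_predict(X))  # one topic cluster each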

  19. QCS: a system for querying, clustering and summarizing documents.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dunlavy, Daniel M.; Schlesinger, Judith D.; O'Leary, Dianne P.

    2006-10-01

    Information retrieval systems consist of many complicated components. Research and development of such systems is often hampered by the difficulty in evaluating how each particular component would behave across multiple systems. We present a novel hybrid information retrieval system--the Query, Cluster, Summarize (QCS) system--which is portable, modular, and permits experimentation with different instantiations of each of the constituent text analysis components. Most importantly, the combination of the three types of components in the QCS design improves retrievals by providing users more focused information organized by topic. We demonstrate the improved performance by a series of experiments using standard test sets from the Document Understanding Conferences (DUC) along with the best known automatic metric for summarization system evaluation, ROUGE. Although the DUC data and evaluations were originally designed to test multidocument summarization, we developed a framework to extend them to the task of evaluating each of the three components: query, clustering, and summarization. Under this framework, we then demonstrate that the QCS system (end-to-end) achieves performance as good as or better than the best summarization engines. Given a query, QCS retrieves relevant documents, separates the retrieved documents into topic clusters, and creates a single summary for each cluster. In the current implementation, Latent Semantic Indexing is used for retrieval, generalized spherical k-means is used for the document clustering, and a method coupling sentence 'trimming' and a hidden Markov model, followed by a pivoted QR decomposition, is used to create a single extract summary for each cluster. The user interface is designed to provide access to detailed information in a compact and useful format. Our system demonstrates the feasibility of assembling an effective IR system from existing software libraries, the usefulness of the modularity of the design, and the value of this particular combination of modules.

  20. Relational Learning via Collective Matrix Factorization

    DTIC Science & Technology

    2008-06-01

    well-known example of such a schema is pLSI-pHITS [13], which models document-word counts and document-document citations: E1 = words and E2 = E3...relational co-clustering include pLSI, pLSI-pHITS, the symmetric block models of Long et al. [23, 24, 25], and Bregman tensor clustering [5] (which can...to pLSI-pHITS. In this section we provide an example where the additional flexibility of collective matrix factorization leads to better results; and

  1. Improving clustering with metabolic pathway data.

    PubMed

    Milone, Diego H; Stegmayer, Georgina; López, Mariana; Kamenetzky, Laura; Carrari, Fernando

    2014-04-10

    It is a common practice in bioinformatics to validate each group returned by a clustering algorithm through manual analysis, according to a priori biological knowledge. This procedure helps find functionally related patterns in order to propose hypotheses for their behavior and the biological processes involved. However, this knowledge is used only as a second step, after the data have been clustered purely according to their expression patterns. It could therefore be very useful to improve the clustering of biological data by incorporating prior knowledge into the cluster formation itself, in order to enhance the biological value of the clusters. A novel training algorithm for clustering is presented, which evaluates the biological internal connections of the data points while the clusters are being formed. Within this training algorithm, the calculation of distances between data points and neuron centroids includes a new term based on information from well-known metabolic pathways. Standard self-organizing map (SOM) training and the biologically inspired SOM (bSOM) training were tested on two real data sets of transcripts and metabolites from the species Solanum lycopersicum and Arabidopsis thaliana. Classical data mining validation measures were used to evaluate the clustering solutions obtained by both algorithms. Moreover, a new measure that takes into account the biological connectivity of the clusters was applied. The results for bSOM show important improvements in convergence and performance over standard SOM training, in particular from the application point of view. Analyses of the clusters obtained with bSOM indicate that including biological information during training can certainly increase the biological value of the clusters found with the proposed method. It is worth highlighting that this effectively improves the results, which can simplify their further analysis. The algorithm is available as a web demo at http://fich.unl.edu.ar/sinc/web-demo/bsom-lite/. The source code and the data sets supporting the results of this article are available at http://sourceforge.net/projects/sourcesinc/files/bsom.
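    A minimal sketch of the modified-distance idea, assuming a precomputed pathway-connectivity score conn in [0, 1] between a data point and a neuron's cluster; the penalty weight lam is a hypothetical parameter, not the paper's exact formulation:

      import numpy as np

      def bsom_distance(x, w, conn, lam=0.5):
          # Euclidean distance to the neuron centroid, plus a penalty that
          # grows as the biological (pathway) connectivity weakens.
          return np.linalg.norm(x - w) + lam * (1.0 - conn)

      x = np.array([0.2, 0.9])              # expression pattern of a data point
      w = np.array([0.3, 0.8])              # neuron centroid
      print(bsom_distance(x, w, conn=0.7))  # smaller when pathways link them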

  2. Informatics in radiology: A prototype Web-based reporting system for onsite-offsite clinician communication.

    PubMed

    Arnold, Corey W; Bui, Alex A T; Morioka, Craig; El-Saden, Suzie; Kangarloo, Hooshang

    2007-01-01

    The communication of imaging findings to a referring physician is an important role of the radiologist. However, communication between onsite and offsite physicians is a time-consuming process that can obstruct work flow and frequently involves no exchange of visual information, which is especially problematic given the importance of radiologic images for diagnosis and treatment. A prototype World Wide Web-based image documentation and reporting system was developed to support a "communication loop" based on the concept of a classic "wet-read" system. The proposed system represents an attempt to address many of the problems seen in current communication work flows by implementing a well-documented and easily accessible communication loop that is adaptable to different types of imaging study evaluation. Images are displayed in the native DICOM (Digital Imaging and Communications in Medicine) format with a Java applet, which allows accurate presentation along with use of various image manipulation tools. The Web-based infrastructure consists of a server that stores imaging studies and reports, with Web browsers that download and install necessary client software on demand. Application logic consists of a set of PHP (Hypertext Preprocessor) modules that are accessible with an application programming interface. The system may be adapted to any clinician-specialist communication loop and, because it integrates radiologic standards with Web-based technologies, can more effectively communicate and document imaging data. RSNA, 2007

  3. Factors associated with choice of web or print intervention materials in the healthy directions 2 study.

    PubMed

    Greaney, Mary L; Puleo, Elaine; Bennett, Gary G; Haines, Jess; Viswanath, K; Gillman, Matthew W; Sprunck-Harrild, Kim; Coeling, Molly; Rusinak, Donna; Emmons, Karen M

    2014-02-01

    Many U.S. adults have multiple behavioral risk factors, and effective, scalable interventions are needed to promote population-level health. In the health care setting, interventions are often provided in print; although accessible to nearly everyone, print materials are brief (e.g., pamphlets), are not interactive, and can require some logistics around distribution. Web-based interventions offer more interactivity but may not be accessible to all. Healthy Directions 2 was a primary care-based cluster randomized controlled trial designed to improve five behavioral cancer risk factors among a diverse sample of adults (n = 2,440) in metropolitan Boston. Intervention materials were available via print or the web. Purpose: To (a) describe the Healthy Directions 2 study design and (b) identify baseline factors associated with whether participants opted for print or web-based materials. Hierarchical regression models corrected for clustering by physician were built to examine factors associated with choice of intervention modality. At baseline, just 4.0% of participants met all behavioral recommendations. Nearly equivalent numbers of intervention participants opted for print and web-based materials (44.6% vs. 55.4%). Participants choosing web-based materials were younger and reported better financial status, better perceived health, greater computer comfort, and more frequent Internet use (p < .05) than those opting for print. In addition, Whites were more likely than Black participants to choose web-based materials. Interventions addressing multiple behaviors are needed in the primary care setting, but they should be available in both web and print formats, as nearly equal numbers of participants chose each option and there are significant differences in the population groups using each modality.

  4. Spider-web amphiphiles as artificial lipid clusters: design, synthesis, and accommodation of lipid components at the air-water interface.

    PubMed

    Ariga, Katsuhiko; Urakawa, Toshihiro; Michiue, Atsuo; Kikuchi, Jun-ichi

    2004-08-03

    As a novel category of two-dimensional lipid clusters, dendrimers having an amphiphilic structure in every unit were synthesized and labeled "spider-web amphiphiles". Amphiphilic units based on a Lys-Lys-Glu tripeptide with hydrophobic tails at the C-terminal and a polar head at the N-terminal are dendrically connected through stepwise peptide coupling. This structural design allowed us to separately introduce the polar head and hydrophobic tails. Accordingly, we demonstrated the synthesis of the spider-web amphiphile series in three combinations: acetyl head/C16 chain, acetyl head/C18 chain, and ammonium head/C16 chain. All the spider-web amphiphiles were synthesized in satisfactory yields, and characterized by 1H NMR, MALDI-TOFMS, GPC, and elemental analyses. Surface pressure (π)-molecular area (A) isotherms showed the formation of expanded monolayers except for the C18-chain amphiphile at 10 °C, for which the molecular area in the condensed phase is consistent with the cross-sectional area assigned for all the alkyl chains. In all the spider-web amphiphiles, the molecular areas at a given pressure in the expanded phase increased in proportion to the number of units, indicating that alkyl chains freely fill the inner space of the dendritic core. The mixing of octadecanoic acid with the spider-web amphiphiles at the air-water interface induced condensation of the molecular area. From the molecular area analysis, the inclusion of the octadecanoic acid bears a stoichiometric characteristic; i.e., the number of captured octadecanoic acids in the spider-web amphiphile roughly agrees with the number of branching points in the spider-web amphiphile.

  5. Using Web 2.0 for health promotion and social marketing efforts: lessons learned from Web 2.0 experts.

    PubMed

    Dooley, Jennifer Allyson; Jones, Sandra C; Iverson, Don

    2014-01-01

    Web 2.0 experts working in social marketing participated in qualitative in-depth interviews. The research aimed to document the current state of Web 2.0 practice. Perceived strengths (such as the viral nature of Web 2.0) and weaknesses (such as the time-consuming effort required to learn new Web 2.0 platforms) existed when using Web 2.0 platforms for campaigns. Lessons learned were identified--namely, suggestions for engaging in specific types of content creation strategies (such as plain language and transparent communication practices). The findings offer originality and value to practitioners working in social marketing who want to use Web 2.0 effectively.

  6. Health on the Net Foundation: assessing the quality of health web pages all over the world.

    PubMed

    Boyer, Célia; Gaudinat, Arnaud; Baujard, Vincent; Geissbühler, Antoine

    2007-01-01

    The Internet provides a great amount of information and has become one of the most widely used communication media [1]. However, the problem is no longer finding information but assessing the credibility of the publishers as well as the relevance and accuracy of the documents retrieved from the web. This problem is particularly relevant in the medical area, which has a direct impact on the well-being of citizens. In this paper, we assume that the quality of web pages can be controlled, even when a huge number of documents has to be reviewed; this must, however, be supported by both specific automatic tools and human expertise. In this context, we present various initiatives of the Health on the Net Foundation for informing citizens about the reliability of medical content on the web.

  7. Navy Controls for Invoice, Receipt, Acceptance, and Property Transfer System Need Improvement

    DTIC Science & Technology

    2016-02-25

    iRAPT as a web-based system to electronically invoice, receipt, and accept services and products from its contractors and vendors. The iRAPT system...electronically shares documents between DoD and its contractors and vendors to eliminate redundant data entry, increase data accuracy, and reduce...The iRAPT system allows contractors to submit and track invoices and receipt and acceptance documents over the web and allows government personnel to

  8. Application of microarray analysis on computer cluster and cloud platforms.

    PubMed

    Bernau, C; Boulesteix, A-L; Knaus, J

    2013-01-01

    Analysis of recent high-dimensional biological data tends to be computationally intensive as many common approaches such as resampling or permutation tests require the basic statistical analysis to be repeated many times. A crucial advantage of these methods is that they can be easily parallelized due to the computational independence of the resampling or permutation iterations, which has induced many statistics departments to establish their own computer clusters. An alternative is to rent computing resources in the cloud, e.g. at Amazon Web Services. In this article we analyze whether a selection of statistical projects, recently implemented at our department, can be efficiently realized on these cloud resources. Moreover, we illustrate an opportunity to combine computer cluster and cloud resources. In order to compare the efficiency of computer cluster and cloud implementations and their respective parallelizations we use microarray analysis procedures and compare their runtimes on the different platforms. Amazon Web Services provide various instance types which meet the particular needs of the different statistical projects we analyzed in this paper. Moreover, the network capacity is sufficient and the parallelization is comparable in efficiency to standard computer cluster implementations. Our results suggest that many statistical projects can be efficiently realized on cloud resources. It is important to mention, however, that workflows can change substantially as a result of a shift from computer cluster to cloud computing.
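    The parallelization argument is easy to see in code: each permutation is an independent task, so the same function can be mapped over local cores, a cluster scheduler, or cloud instances. A minimal local sketch (illustrative, not the authors' pipeline):

      import numpy as np
      from multiprocessing import Pool

      rng = np.random.default_rng(0)
      a, b = rng.normal(0.5, 1, 50), rng.normal(0.0, 1, 50)
      observed = a.mean() - b.mean()
      pooled = np.concatenate([a, b])

      def one_permutation(seed):
          # each iteration is independent: ideal for cluster or cloud workers
          perm = np.random.default_rng(seed).permutation(pooled)
          return perm[:50].mean() - perm[50:].mean()

      if __name__ == "__main__":
          with Pool() as pool:  # swap in a cluster/cloud backend here
              null = pool.map(one_permutation, range(10000))
          print("p =", np.mean(np.abs(null) >= abs(observed)))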

  9. Impact of a cancer clinical trials web site on discussions about trial participation: a cluster randomized trial.

    PubMed

    Dear, R F; Barratt, A L; Askie, L M; Butow, P N; McGeechan, K; Crossing, S; Currow, D C; Tattersall, M H N

    2012-07-01

    Cancer patients want access to reliable information about currently recruiting clinical trials. Oncologists and their patients were randomly assigned to access a consumer-friendly cancer clinical trials web site [Australian Cancer Trials (ACT), www.australiancancertrials.gov.au] or to usual care in a cluster randomized controlled trial. The primary outcome, measured from audio recordings of oncologist-patient consultations, was the proportion of patients with whom participation in any clinical trial was discussed. Analysis was by intention-to-treat accounting for clustering and stratification. Thirty medical oncologists and 493 patients were recruited. Overall, 46% of consultations in the intervention group compared with 34% in the control group contained a discussion about clinical trials (P=0.08). The mean consultation length in both groups was 29 min (P=0.69). The proportion consenting to a trial was 10% in both groups (P=0.65). Patients' knowledge about randomized trials was lower in the intervention than the control group (mean score 3.0 versus 3.3, P=0.03) but decisional conflict scores were similar (mean score 42 versus 43, P=0.83). Good communication between patients and physicians is essential. Within this context, a web site such as Australian Cancer Trials may be an important tool to encourage discussion about clinical trial participation.

  10. New Interfaces to Web Documents and Services

    NASA Technical Reports Server (NTRS)

    Carlisle, W. H.

    1996-01-01

    This paper reports on investigations into how to extend the capabilities of the Virtual Research Center (VRC) for NASA's Advanced Concepts Office. The work was performed as part of NASA's 1996 Summer Faculty Fellowship program and involved research into, and prototype development of, software components that provide documents and services for the World Wide Web (WWW). The WWW has become a de facto standard for sharing resources over the internet, primarily because web browsers are freely available for the most common hardware platforms and their operating systems. As a consequence of the popularity of the internet, tools and techniques associated with web browsers are changing rapidly. New capabilities are offered by companies that support web browsers in order to achieve or remain a dominant participant in internet services. Because a goal of the VRC is to build an environment for NASA centers, universities, and industrial partners to share information associated with Advanced Concepts Office activities, the VRC tracks new techniques and services associated with the web in order to determine their usefulness for distributed and collaborative engineering research activities. Most recently, Java has emerged as a new tool for providing internet services. Because the major web browser providers have decided to include Java in their software, investigations into Java were conducted this summer.

  11. Hospice palliative care article publications: An analysis of the Web of Science database from 1993 to 2013.

    PubMed

    Chang, Hsiao-Ting; Lin, Ming-Hwai; Chen, Chun-Ku; Hwang, Shinn-Jang; Hwang, I-Hsuan; Chen, Yu-Chun

    2016-01-01

    Academic publications are important for developing a medical specialty or discipline and for improving quality of care. As hospice palliative care medicine is a rapidly growing medical specialty in Taiwan, this study aimed to analyze hospice palliative care-related publications from 1993 through 2013, both worldwide and in Taiwan, using the Web of Science database. Academic articles published on topics including "hospice", "palliative care", "end of life care", and "terminal care" were retrieved and analyzed from the Web of Science database, which includes documents published in Science Citation Index-Expanded and Social Science Citation Index journals from 1993 to 2013. Compound annual growth rates (CAGRs) were calculated to evaluate the trends in publications. There were a total of 27,788 documents published worldwide during the years 1993 to 2013. The top five most prolific countries/areas were the United States (11,419 documents, 41.09%), England (3620 documents, 13.03%), Canada (2428 documents, 8.74%), Germany (1598 documents, 5.75%), and Australia (1580 documents, 5.69%). Three hundred and ten documents (1.12%) were published from Taiwan, which ranks second among Asian countries (after Japan, with 594 documents, 2.14%) and 16th in the world. During this 21-year period, the number of hospice palliative care-related article publications increased rapidly. The worldwide CAGR for hospice palliative care publications during 1993 through 2013 was 12.9%. As for Taiwan, the CAGR for publications during 1999 through 2013 was 19.4%. The majority of these documents were submitted from universities or hospitals affiliated with universities. The number of hospice palliative care-related publications increased rapidly from 1993 to 2013 in the world and in Taiwan; however, the number of publications from Taiwan is still far below those published in several other countries. Further research is needed to identify and try to reduce the barriers to hospice palliative care research and publication in Taiwan. Copyright © 2015. Published by Elsevier Taiwan LLC.
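    For reference, the CAGR reported here is the annualized growth rate between the first and last yearly counts; a one-line check with hypothetical document counts (not the study's actual totals):

      def cagr(first, last, years):
          # compound annual growth rate over `years` periods
          return (last / first) ** (1.0 / years) - 1.0

      # hypothetical counts growing at 12.9%/yr over the 20-year span
      print(f"{cagr(500, 500 * 1.129 ** 20, 20):.1%}")  # -> 12.9%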

  12. Compliance Options Diagrams for the Paper and Other Web Coating National Emission Standards for Hazardous Air Pollutants (NESHAP)

    EPA Pesticide Factsheets

    This January 2004 document contains 14 diagrams illustrating the different compliance options available for those facilities that fall under the Paper and Other Web Coating Maximum Achievable Control Technology (MACT) standard.

  13. LCS Content Document Application

    NASA Technical Reports Server (NTRS)

    Hochstadt, Jake

    2011-01-01

    My project at KSC during my spring 2011 internship was to develop a Ruby on Rails application to manage Content Documents. A Content Document is a collection of documents and information that describes what software is installed on a Launch Control System computer. It's important for us to make sure the tools we use every day are secure, up-to-date, and properly licensed. Previously, keeping track of this information was done with Excel and Word files passed between different personnel. The goal of the new application is to manage and access the Content Documents through a single database-backed web application. Our LCS team will benefit greatly from this app. Admins will be able to log in securely to track and update the software installed on each computer in a timely manner. We also included exportability, such as attaching additional documents that can be downloaded from the web application. The finished application will ease the process of managing Content Documents while streamlining the procedure. Ruby on Rails is a very powerful web framework, and I am grateful to have the opportunity to build this application.

  14. Croatian Medical Journal Citation Score in Web of Science, Scopus, and Google Scholar

    PubMed Central

    Šember, Marijan; Utrobičić, Ana; Petrak, Jelka

    2010-01-01

    Aim: To analyze the 2007 citation count of articles published by the Croatian Medical Journal in 2005-2006, based on data from the Web of Science, Scopus, and Google Scholar. Methods: Web of Science and Scopus were searched for the articles published in 2005-2006. As all articles returned by Scopus were included in Web of Science, the latter list was the sample for further analysis. Total citation counts for each article on the list were retrieved from Web of Science, Scopus, and Google Scholar. The overlap and unique citations were compared and analyzed. Proportions were compared using the χ2-test. Results: Google Scholar returned the greatest proportion of articles with citations (45%), followed by Scopus (42%) and Web of Science (38%). Almost half (49%) of the articles had no citations, and 11% had an equal number of identical citations in all 3 databases. The greatest overlap was found between Web of Science and Scopus (54%), followed by Scopus and Google Scholar (51%), and Web of Science and Google Scholar (44%). The greatest number of unique citations was found by Google Scholar (n = 86). The majority of these citations (64%) came from journals, followed by books and PhD theses. Approximately 55% of all citing documents were full-text resources in open access. The language of the citing documents was mostly English, but as many as 25 citing documents (29%) were in Chinese. Conclusion: Google Scholar shares a total of 42% of the citations returned by the two other, more influential bibliographic resources. The list of unique citations in Google Scholar is predominantly journal based, but these journals are mainly of local character. Citations received by internationally recognized medical journals are crucial for increasing the visibility of small medical journals, but Google Scholar may serve as an alternative bibliometric tool for a rough insight into citations. PMID:20401951

  15. Capitalizing on Web 2.0 in the Social Studies Context

    ERIC Educational Resources Information Center

    Holcomb, Lori B.; Beal, Candy M.

    2010-01-01

    This paper focuses primarily on the integration of Web 2.0 technologies into social studies education. It documents how various Web 2.0 tools can be utilized in the social studies context to support and enhance teaching and learning. For the purposes of focusing on one specific topic, global connections at the middle school level will be the…

  16. Automated MeSH indexing of the World-Wide Web.

    PubMed Central

    Fowler, J.; Kouramajian, V.; Maram, S.; Devadhar, V.

    1995-01-01

    To facilitate networked discovery and information retrieval in the biomedical domain, we have designed a system for automatic assignment of Medical Subject Headings to documents retrieved from the World-Wide Web. Our prototype implementations show significant promise. We describe our methods and discuss the further development of a completely automated indexing tool called the "Web-MeSH Medibot." PMID:8563421

  17. 75 FR 76401 - Pilot Program for Extended Time Period To Reply to a Notice To File Missing Parts of...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-12-08

    ... filing system, EFS-Web, and selecting the document description of ``Certification and Request for Missing... Filing System Web (EFS-Web), 74 FR 55200 (Oct. 27, 2009), 1348 Off. Gaz. Pat. Office 394 (Nov. 24, 2009... parts notice, including increased use of the eighteen-month publication system, more time for applicants...

  18. 78 FR 44603 - Byron Nuclear Station, Units 1 and 2, and Braidwood Nuclear Station, Units 1 and 2; Exelon...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-07-24

    ... rule, the participant must file the document using the NRC's online, Web-based submission form. In... form, including the installation of the Web browser plug-in, is available on the NRC's public Web site... 61010, and near Braidwood at the Fossil Ridge (Braidwood) Public Library, 386 W. Kennedy Road, Braidwood...

  19. Panning for Gold: Utility of the World Wide Web for Metadata and Authority Control in Special Collections.

    ERIC Educational Resources Information Center

    Ellero, Nadine P.

    2002-01-01

    Describes the use of the World Wide Web as a name authority resource and tool for special collections' analytic-level cataloging, based on experiences at The Claude Moore Health Sciences Library. Highlights include primary documents and metadata; authority control and the Web as authority source information; and future possibilities. (Author/LRW)

  20. Network dynamics: The World Wide Web

    NASA Astrophysics Data System (ADS)

    Adamic, Lada Ariana

    Despite its rapidly growing and dynamic nature, the Web displays a number of strong regularities which can be understood by drawing on methods of statistical physics. This thesis finds power-law distributions in website sizes, traffic, and links, and more importantly, develops a stochastic theory which explains them. Power-law link distributions are shown to lead to network characteristics which are especially suitable for scalable localized search. It is also demonstrated that the Web is a "small world": to reach one site from any other takes an average of only 4 hops, while most related sites cluster together. Additional dynamical properties of the Web graph are extracted from diffusion processes.
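    The power-law claims above are usually quantified by a maximum-likelihood exponent fit (the Hill estimator); a self-contained sketch on synthetic data (illustrative, not the thesis code):

      import numpy as np

      def powerlaw_alpha(x, xmin=1.0):
          # maximum-likelihood (Hill) estimate of the power-law exponent
          x = x[x >= xmin]
          return 1.0 + len(x) / np.sum(np.log(x / xmin))

      rng = np.random.default_rng(1)
      u = rng.random(100_000)
      samples = (1.0 - u) ** (-1.0 / (2.1 - 1.0))  # Pareto draws, alpha = 2.1
      print(powerlaw_alpha(samples))               # recovers roughly 2.1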

  1. Bibliometric analysis of nutrition and dietetics research activity in Arab countries using ISI Web of Science database.

    PubMed

    Sweileh, Waleed M; Al-Jabi, Samah W; Sawalha, Ansam F; Zyoud, Sa'ed H

    2014-01-01

    Reducing nutrition-related health problems in Arab countries requires an understanding of the performance of Arab countries in the field of nutrition and dietetics research. Assessment of research activity from a particular country or region can be achieved through bibliometric analysis. This study was carried out to investigate research activity in "nutrition and dietetics" in Arab countries. Original and review articles published from Arab countries in the "nutrition and dietetics" Web of Science category up until 2012 were retrieved and analyzed using the ISI Web of Science database. The total number of documents published in the "nutrition and dietetics" category from Arab countries was 2062. This constitutes 1% of worldwide research activity in the field. Annual research productivity showed a significant increase after 2005. Approximately 60% of published documents originated from three Arab countries, namely Egypt, the Kingdom of Saudi Arabia, and Tunisia. However, Kuwait has the highest research productivity per million inhabitants. The main research areas of published documents were "Food Science/Technology" and "Chemistry", which together accounted for 75% of published documents, compared with 25% of worldwide documents in nutrition and dietetics. A total of 329 documents (15.96%) related to nutrition and diabetes, obesity, or cancer were published from Arab countries, compared with 21% of worldwide published documents. Interest in nutrition and dietetics research is relatively recent in Arab countries. Nutrition research focuses mainly on food technology and chemistry, with less activity in nutrition-related health research. International cooperation in nutrition research will definitely help Arab researchers in implementing nutrition research that will lead to better national policies regarding nutrition.

  2. Fast segmentation of satellite images using SLIC, WebGL and Google Earth Engine

    NASA Astrophysics Data System (ADS)

    Donchyts, Gennadii; Baart, Fedor; Gorelick, Noel; Eisemann, Elmar; van de Giesen, Nick

    2017-04-01

    Google Earth Engine (GEE) is a parallel geospatial processing platform, which harmonizes access to petabytes of freely available satellite images. It provides a very rich API, allowing development of dedicated algorithms to extract useful geospatial information from these images. At the same time, modern GPUs provide thousands of computing cores, which are mostly not utilized in this context. In the last years, WebGL became a popular and well-supported API, allowing fast image processing directly in web browsers. In this work, we will evaluate the applicability of WebGL to enable fast segmentation of satellite images. A new implementation of a Simple Linear Iterative Clustering (SLIC) algorithm using GPU shaders will be presented. SLIC is a simple and efficient method to decompose an image in visually homogeneous regions. It adapts a k-means clustering approach to generate superpixels efficiently. While this approach will be hard to scale, due to a significant amount of data to be transferred to the client, it should significantly improve exploratory possibilities and simplify development of dedicated algorithms for geoscience applications. Our prototype implementation will be used to improve surface water detection of the reservoirs using multispectral satellite imagery.
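    As a CPU baseline for such a port, SLIC is available off the shelf in scikit-image; a WebGL/GPU implementation can be validated against it (parameters here are illustrative, not the authors' settings):

      from skimage import data
      from skimage.segmentation import slic

      img = data.astronaut()                    # stand-in for a satellite tile
      segments = slic(img, n_segments=250, compactness=10.0)
      print(segments.max() + 1, "superpixels")  # visually homogeneous regions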

  3. BAGEL4: a user-friendly web server to thoroughly mine RiPPs and bacteriocins.

    PubMed

    van Heel, Auke J; de Jong, Anne; Song, Chunxu; Viel, Jakob H; Kok, Jan; Kuipers, Oscar P

    2018-05-21

    Interest in secondary metabolites such as RiPPs (ribosomally synthesized and posttranslationally modified peptides) is increasing worldwide. To facilitate the research in this field we have updated our mining web server. BAGEL4 is faster than its predecessor and is now fully independent from ORF-calling. Gene clusters of interest are discovered using the core-peptide database and/or through HMM motifs that are present in associated context genes. The databases used for mining have been updated and extended with literature references and links to UniProt and NCBI. Additionally, we have included automated promoter and terminator prediction and the option to upload RNA expression data, which can be displayed along with the identified clusters. Further improvements include the annotation of the context genes, which is now based on a fast blast against the prokaryote part of the UniRef90 database, and the improved web-BLAST feature that dynamically loads structural data such as internal cross-linking from UniProt. Overall BAGEL4 provides the user with more information through a user-friendly web-interface which simplifies data evaluation. BAGEL4 is freely accessible at http://bagel4.molgenrug.nl.

  4. DMINDA: an integrated web server for DNA motif identification and analyses.

    PubMed

    Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying

    2014-07-01

    DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  5. Implementation of a scalable, web-based, automated clinical decision support risk-prediction tool for chronic kidney disease using C-CDA and application programming interfaces.

    PubMed

    Samal, Lipika; D'Amore, John D; Bates, David W; Wright, Adam

    2017-11-01

    Clinical decision support tools for risk prediction are readily available, but they typically require workflow interruptions and manual data entry, so they are rarely used. Due to new data interoperability standards for electronic health records (EHRs), other options are available. As a clinical case study, we sought to build a scalable, web-based system that would automate calculation of kidney failure risk and display clinical decision support to users in primary care practices. We developed a single-page application, web server, database, and application programming interface to calculate and display kidney failure risk. Data were extracted from the EHR using the Consolidated Clinical Document Architecture interoperability standard for Continuity of Care Documents (CCDs). EHR users were presented with a noninterruptive alert on the patient's summary screen and a hyperlink to details and recommendations provided through a web application. Clinic schedules and CCDs were retrieved using existing application programming interfaces to the EHR, and we provided a clinical decision support hyperlink to the EHR as a service. We resolved a series of terminology and technical issues. The application was validated with data from 255 patients and subsequently deployed to 10 primary care clinics where, over the course of 1 year, 569,533 CCD documents were processed. We validated the use of interoperable documents and open-source components to develop a low-cost tool for automated clinical decision support. Since Consolidated Clinical Document Architecture-based data extraction extends to any certified EHR, this demonstrates a successful modular approach to clinical decision support. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association.
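    A sketch of the architectural pattern only (Flask is assumed; the field names, coefficients, and URL are hypothetical placeholders, not a validated kidney-failure-risk equation or the authors' code): a service accepts fields extracted from a CCD and returns a score plus a hyperlink suitable for a noninterruptive alert.

      from flask import Flask, jsonify, request

      app = Flask(__name__)

      @app.post("/risk")
      def risk():
          f = request.get_json()  # fields previously extracted from the CCD
          # hypothetical linear score, NOT a validated risk equation
          score = 0.02 * f["age"] + 0.5 * f["log_acr"] - 0.03 * f["egfr"]
          return jsonify(
              risk_score=score,
              detail_url=f"https://cds.example.org/ckd?patient={f['patient_id']}",
          )

      if __name__ == "__main__":
          app.run(port=5000)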

  6. Subject Indexing and Citation Indexing--Part I: Clustering Structure in the Cystic Fibrosis Document Collection [and] Part II: An Evaluation and Comparison.

    ERIC Educational Resources Information Center

    Shaw, W. M., Jr.

    1990-01-01

    These two articles discuss clustering structure in the Cystic Fibrosis Document Collection, which is derived from the National Library of Medicine's MEDLINE file. The exhaustivity of four subject representations and two citation representations is examined, and descriptor-weight thresholds and similarity thresholds are used to compute…

  7. 29 CFR 2200.8 - Filing.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... shall be in the manner specified by the Commission's Web site (http://www.OSHRC.gov). (2) A document...: (i) If Social Security numbers must be included in a document, only the last four digits of that...

  8. Comparing cosmic web classifiers using information theory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Leclercq, Florent; Lavaux, Guilhem; Wandelt, Benjamin

    We introduce a decision scheme for optimally choosing a classifier, which segments the cosmic web into different structure types (voids, sheets, filaments, and clusters). Our framework, based on information theory, accounts for the design aims of different classes of possible applications: (i) parameter inference, (ii) model selection, and (iii) prediction of new observations. As an illustration, we use cosmographic maps of web-types in the Sloan Digital Sky Survey to assess the relative performance of the classifiers T-WEB, DIVA and ORIGAMI for: (i) analyzing the morphology of the cosmic web, (ii) discriminating dark energy models, and (iii) predicting galaxy colors. Our study substantiates a data-supported connection between cosmic web analysis and information theory, and paves the way towards principled design of analysis procedures for the next generation of galaxy surveys. We have made the cosmic web maps, galaxy catalog, and analysis scripts used in this work publicly available.

  9. Topic Models for Link Prediction in Document Networks

    ERIC Educational Resources Information Center

    Kataria, Saurabh

    2012-01-01

    Recent explosive growth of interconnected document collections such as citation networks, network of web pages, content generated by crowd-sourcing in collaborative environments, etc., has posed several challenging problems for data mining and machine learning community. One central problem in the domain of document networks is that of "link…

  10. Paper and Other Web Coating: National Emission Standards for Hazardous Air Pollutants (NESHAP)

    EPA Pesticide Factsheets

    Find information on the NESHAP for paper and other web coatings. Read the rule summary, history and supporting documents including fact sheets, responses to public comments, related rules, and compliance and applicability information for this regulation.

  11. An experiment with content distribution methods in touchscreen mobile devices.

    PubMed

    Garcia-Lopez, Eva; Garcia-Cabot, Antonio; de-Marcos, Luis

    2015-09-01

    This paper compares the usability of three different content distribution methods (scrolling, paging and internal links) in touchscreen mobile devices as means to display web documents. Usability is operationalized in terms of effectiveness, efficiency and user satisfaction. These dimensions are then measured in an experiment (N = 23) in which users are required to find words in regular-length web documents. Results suggest that scrolling is statistically better in terms of efficiency and user satisfaction. It is also found to be more effective but results were not significant. Our findings are also compared with existing literature to propose the following guideline: "try to use vertical scrolling in web pages for mobile devices instead of paging or internal links, except when the content is too large, then paging is recommended". With an ever increasing number of touchscreen web-enabled mobile devices, this new guideline can be relevant for content developers targeting the mobile web as well as institutions trying to improve the usability of their content for mobile platforms. Copyright © 2015 Elsevier Ltd and The Ergonomics Society. All rights reserved.

  12. Baryons at the edge of the X-ray-brightest galaxy cluster.

    PubMed

    Simionescu, Aurora; Allen, Steven W; Mantz, Adam; Werner, Norbert; Takei, Yoh; Morris, R Glenn; Fabian, Andrew C; Sanders, Jeremy S; Nulsen, Paul E J; George, Matthew R; Taylor, Gregory B

    2011-03-25

    Studies of the diffuse x-ray-emitting gas in galaxy clusters have provided powerful constraints on cosmological parameters and insights into plasma astrophysics. However, measurements of the faint cluster outskirts have become possible only recently. Using data from the Suzaku x-ray telescope, we determined an accurate, spatially resolved census of the gas, metals, and dark matter out to the edge of the Perseus Cluster. Contrary to previous results, our measurements of the cluster baryon fraction are consistent with the expected universal value at half of the virial radius. The apparent baryon fraction exceeds the cosmic mean at larger radii, suggesting a clumpy distribution of the gas, which is important for understanding the ongoing growth of clusters from the surrounding cosmic web.

  13. Sub-word image clustering in Farsi printed books

    NASA Astrophysics Data System (ADS)

    Soheili, Mohammad Reza; Kabir, Ehsanollah; Stricker, Didier

    2015-02-01

    Most OCR systems are designed for the recognition of a single page. With unfamiliar font faces, low-quality paper, and degraded prints, the performance of these products drops sharply. However, an OCR system can exploit the redundancy of word occurrences in large documents to improve recognition results. In this paper, we propose a sub-word image clustering method for applications dealing with large printed documents. We assume that the whole document is printed in a single unknown font with low print quality. Our proposed method finds clusters of equivalent sub-word images with an incremental algorithm. To cope with the low print quality, we propose an image matching algorithm for measuring the distance between two sub-word images, based on the Hamming distance and the ratio of the area to the perimeter of the connected components. We built a ground-truth dataset of more than 111000 sub-word images to evaluate our method. All of these images were extracted from an old Farsi book. We cluster all of these sub-words, including isolated letters and even punctuation marks. Then all centers of the created clusters are labeled manually. We show that all sub-words of the book can be recognized with more than 99.7% accuracy by assigning the label of each cluster center to all of its members.
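    The matching rule lends itself to a short sketch (thresholds are illustrative, and the paper's exact combination of the two measures may differ): binarized sub-word images of equal size are compared by Hamming distance, gated by agreement of the area-to-perimeter ratio of their connected components.

      import numpy as np
      from skimage.measure import label, regionprops

      def area_perimeter_ratio(binary):
          regions = regionprops(label(binary))
          perim = sum(r.perimeter for r in regions)
          return sum(r.area for r in regions) / max(perim, 1.0)

      def similar(img_a, img_b, hamming_tol=0.1, ratio_tol=0.2):
          hamming = np.mean(img_a != img_b)  # fraction of differing pixels
          gap = abs(area_perimeter_ratio(img_a) - area_perimeter_ratio(img_b))
          return hamming < hamming_tol and gap < ratio_tol

      a = np.zeros((16, 48), dtype=bool); a[4:12, 8:40] = True
      print(similar(a, a))                   # identical images match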

  14. Text grouping in patent analysis using adaptive K-means clustering algorithm

    NASA Astrophysics Data System (ADS)

    Shanie, Tiara; Suprijadi, Jadi; Zulhanif

    2017-03-01

    Patents are one form of intellectual property. Analyzing patents is essential for understanding the development of technology in each country and in the world today. This study uses patent documents about green tea obtained from the Espacenet server. Patent documents related to tea technology are numerous and scattered, making information retrieval (IR) difficult for users. It is therefore necessary to categorize the documents into groups according to the related terms they contain. This study applies statistical text mining to the green tea patent title data in two stages: data preparation and data analysis. The data preparation stage uses text mining methods, and the data analysis stage is carried out statistically, using a cluster analysis algorithm, the adaptive K-means clustering algorithm. The results show that, based on the maximum silhouette value, the method generates 87 clusters with fifteen associated terms that can be utilized for information retrieval.
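    The adaptive choice of k can be sketched as a silhouette sweep over candidate cluster counts (toy titles stand in for the Espacenet green-tea corpus; not the authors' code):

      from sklearn.feature_extraction.text import TfidfVectorizer
      from sklearn.cluster import KMeans
      from sklearn.metrics import silhouette_score

      titles = ["green tea extraction process", "tea polyphenol beverage",
                "apparatus for brewing green tea", "catechin-rich tea composition",
                "tea leaf drying machine", "packaging for tea bags"]
      X = TfidfVectorizer().fit_transform(titles)

      # keep the k whose clustering maximizes the mean silhouette value
      scores = {k: silhouette_score(X, KMeans(n_clusters=k, n_init=10).fit_predict(X))
                for k in range(2, 5)}
      print("best k:", max(scores, key=scores.get))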

  15. Emergency Response Capability Baseline Needs Assessment - Requirements Document

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sharry, John A.

    This document was prepared by John A. Sharry, LLNL Fire Marshal and LLNL Division Leader for Fire Protection and reviewed by LLNL Emergency Management Department Head James Colson. The document follows and expands upon the format and contents of the DOE Model Fire Protection Baseline Capabilities Assessment document contained on the DOE Fire Protection Web Site, but only addresses emergency response.

  16. Warm-hot baryons comprise 5-10 per cent of filaments in the cosmic web.

    PubMed

    Eckert, Dominique; Jauzac, Mathilde; Shan, HuanYuan; Kneib, Jean-Paul; Erben, Thomas; Israel, Holger; Jullo, Eric; Klein, Matthias; Massey, Richard; Richard, Johan; Tchernin, Céline

    2015-12-03

    Observations of the cosmic microwave background indicate that baryons account for 5 per cent of the Universe's total energy content. In the local Universe, the census of all observed baryons falls short of this estimate by a factor of two. Cosmological simulations indicate that the missing baryons have not condensed into virialized haloes, but reside throughout the filaments of the cosmic web (where matter density is larger than average) as a low-density plasma at temperatures of 10^5-10^7 kelvin, known as the warm-hot intergalactic medium. There have been previous claims of the detection of warm-hot baryons along the line of sight to distant blazars and of hot gas between interacting clusters. These observations were, however, unable to trace the large-scale filamentary structure, or to estimate the total amount of warm-hot baryons in a representative volume of the Universe. Here we report X-ray observations of filamentary structures of gas at 10^7 kelvin associated with the galaxy cluster Abell 2744. Previous observations of this cluster were unable to resolve and remove coincidental X-ray point sources. After subtracting these, we find hot gas structures that are coherent over scales of 8 megaparsecs. The filaments coincide with over-densities of galaxies and dark matter, with 5-10 per cent of their mass in baryonic gas. This gas has been heated up by the cluster's gravitational pull and is now feeding its core. Our findings strengthen evidence for a picture of the Universe in which a large fraction of the missing baryons reside in the filaments of the cosmic web.

  17. New atlas of open star clusters

    NASA Astrophysics Data System (ADS)

    Seleznev, Anton F.; Avvakumova, Ekaterina; Kulesh, Maxim; Filina, Julia; Tsaregorodtseva, Polina; Kvashnina, Alvira

    2017-11-01

    Due to numerous new discoveries of open star clusters in the last two decades, astronomers need an easy-to-use resource to get visual information on the relative position of clusters in the sky. Therefore we propose a new atlas of open star clusters. It is based on a table compiled from the largest modern cluster catalogues. The atlas shows the positions and sizes of 3291 clusters and associations, and consists of two parts. The first contains 108 maps of 12 by 12 degrees with an overlapping of 2 degrees in three strips along the Galactic equator. The second one is an online web application, which shows a square field of an arbitrary size, either in equatorial coordinates or in galactic coordinates by request. The atlas is proposed for the sampling of clusters and cluster stars for further investigation. Another use is the identification of clusters among overdensities in stellar density maps or among stellar groups in images of the sky.

  18. 31 CFR Appendix A to Part 560 - Persons Determined to be the Government of Iran, as defined in § 560.304 of This Part

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ...), available on OFAC's Web site. New names of persons determined to be the Government of Iran and changes to...'s Web site. Appendix A to Part 560 will be republished annually. This document and additional information concerning OFAC are available from OFAC's Web site (http://www.treas.gov/ofac). Certain general...

  19. 31 CFR Appendix A to Part 560 - Persons Determined To Be the Government of Iran, as Defined in § 560.304 of This Part

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... through the following page on OFAC's Web site: http://www.treasury.gov/sdn. Additional information.... This document and additional information concerning OFAC are available from OFAC's Web site: http://www... via facsimile through a 24-hour fax-on-demand service, tel.: 202/622-0077. Please consult OFAC's Web...

  20. Unveiling the Synchrotron Cosmic Web: Pilot Study

    NASA Astrophysics Data System (ADS)

    Brown, Shea; Rudnick, Lawrence; Pfrommer, Christoph; Jones, Thomas

    2011-10-01

    The overall goal of this project is to challenge our current theoretical understanding of the relativistic particle populations in the inter-galactic medium (IGM) through deep 1.4 GHz observations of 13 massive, high-redshift clusters of galaxies. Designed to complement and extend the GMRT radio halo survey (Venturi et al. 2007), these observations will attempt to detect the peaks of the purported synchrotron cosmic web, and place serious limits on models of CR acceleration and magnetic field amplification during large-scale structure formation. The primary goals of this survey are: 1) Confirm the bi-modal nature of the radio halo population, which favors turbulent re-acceleration of cosmic-ray electrons (CRe) during cluster mergers as the source of the diffuse radio emission; 2) Directly test hadronic secondary models which predict the presence of cosmic-ray protons (CRp) in the cores of massive X-ray clusters; 3) Search in polarization for shock structures, a potential source of CR acceleration in the IGM.

  1. CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats.

    PubMed

    Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine

    2007-07-01

    Clustered regularly interspaced short palindromic repeats (CRISPRs) constitute a particular family of tandem repeats found in a wide range of prokaryotic genomes (half of eubacteria and almost all archaea). They consist of a succession of highly conserved direct repeats (DRs) varying in size from 23 to 47 bp, separated by similarly sized unique sequences (spacers) of usually viral origin. A CRISPR cluster is flanked on one side by an AT-rich sequence called the leader, assumed to be a transcriptional promoter. Recent studies suggest that this structure represents a putative RNA-interference-based immune system. Here we describe CRISPRFinder, a web service offering tools to (i) detect CRISPRs, including the shortest ones (one or two motifs); (ii) define DRs and extract spacers; (iii) get the flanking sequences to determine the leader; (iv) BLAST spacers against the GenBank database and (v) check whether the DR is found elsewhere in sequenced prokaryotic genomes. CRISPRFinder is freely accessible at http://crispr.u-psud.fr/Server/CRISPRfinder.php.
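
    A toy sketch of the core idea behind CRISPR array detection, not CRISPRFinder's actual algorithm: look for a short motif that recurs with similarly sized unique spacers between consecutive occurrences. The sequence, repeat, and thresholds are all invented for illustration.

```python
import random
from collections import defaultdict

def find_candidate_arrays(seq, repeat_len=10, min_copies=3,
                          min_spacer=15, max_spacer=60):
    positions = defaultdict(list)            # index every k-mer by position
    for i in range(len(seq) - repeat_len + 1):
        positions[seq[i:i + repeat_len]].append(i)
    arrays = []
    for motif, pos in positions.items():
        if len(pos) < min_copies:
            continue
        gaps = [b - a - repeat_len for a, b in zip(pos, pos[1:])]
        # CRISPR-like: similarly sized spacers separate the repeats
        if all(min_spacer <= g <= max_spacer for g in gaps):
            arrays.append((motif, pos))
    return arrays

random.seed(1)
def rand_spacer():
    return "".join(random.choice("ACGT") for _ in range(20))

# four copies of a repeat separated by three unique 20-bp spacers
seq = "GTTTTAGAGC".join(["", rand_spacer(), rand_spacer(), rand_spacer(), ""])
for motif, pos in find_candidate_arrays(seq):
    print(motif, pos)
```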

  2. 49 CFR 40.45 - What form is used to document a DOT urine collection?

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... view this form on the Department's web site (http://www.dot.gov/ost/dapc) or the HHS web site (http... employee (other than a social security number (SSN) or other employee identification (ID) number) to a...

  3. 49 CFR 40.45 - What form is used to document a DOT urine collection?

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... view this form on the Department's web site (http://www.dot.gov/ost/dapc) or the HHS web site (http... employee (other than a social security number (SSN) or other employee identification (ID) number) to a...

  4. Contingency Contractor Optimization Phase 3 Sustainment Third-Party Software List - Contingency Contractor Optimization Tool - Prototype

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Durfee, Justin David; Frazier, Christopher Rawls; Bandlow, Alisa

    2016-05-01

    The Contingency Contractor Optimization Tool - Prototype (CCOT-P) requires several third-party software packages. These are documented below for each of the CCOT-P elements: client, web server, database server, solver, web application and polling application.

  5. 77 FR 71711 - Commission's Rules Regarding the Office of Managing Director and the Office of Inspector General

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-12-04

    ... Transition Order is also available on the Internet at the Commission's Electronic Filing System Web Page at... you may contact BCPI at its Web site: http://www.BCPIWEB.com . When ordering documents from BCPI...

  6. Paper and Other Web Coating National Emission Standards for Hazardous Air Pollutants (NESHAP) Questions and Answers

    EPA Pesticide Factsheets

    This May 2003 document contains questions and answers on the Paper and Other Web Coating National Emission Standards for Hazardous Air Pollutants (NESHAP) regulation. The questions cover topics such as compliance, applicability, and initial notification.

  7. 70 FR 69562 - Office of Environmental Information; Request for Comment and Request for Information on System...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2005-11-16

    ... Reference System (TRS) [see http://www.epa.gov/trs ] in order to better support future semantic Web needs... creation of glossaries for Web pages and documents, a common vocabulary for search engines, and in the...

  8. Web-Education Systems in Europe. ZIFF Papiere.

    ERIC Educational Resources Information Center

    Paulsen, Morten; Keegan, Desmond; Dias, Ana; Dias, Paulo; Pimenta, Pedro; Fritsch, Helmut; Follmer, Holger; Micincova, Maria; Olsen, Gro-Anett

    This document contains the following papers on Web-based education systems in Europe: (1) "European Experiences with Learning Management Systems" (Morten Flate Paulsen and Desmond Keegan); (2) "Online Education Systems: Definition of Terms" (Morten Flate Paulsen); (3) "Learning Management Systems (LMS) Used in Southern…

  9. The Sargassum Early Advisory System (SEAS)

    NASA Astrophysics Data System (ADS)

    Armstrong, D.; Gallegos, S. C.

    2016-02-01

    The Sargassum Early Advisory System (SEAS) web app was designed to automatically detect Sargassum at sea, forecast movement of the seaweed, and alert users of potential landings. Inspired by the need to address the economic hardships caused by large landings of Sargassum, the web app automates and enhances the manual tasks conducted by the SEAS group of Texas A&M University at Galveston. The SEAS web app is a modular, mobile-friendly tool that automates the entire workflow from data acquisition to user management. The modules include: 1) an Imagery Retrieval Module to automatically download Landsat-8 Operational Land Imager (OLI) imagery from the United States Geological Survey (USGS); 2) a Processing Module for automatic detection of Sargassum in the OLI imagery and subsequent mapping of these patches onto the HYCOM grid, producing maps that show Sargassum clusters; 3) a Forecasting engine fed by HYbrid Coordinate Ocean Model (HYCOM) currents and winds from weather buoys; and 4) a mobile-phone-optimized geospatial user interface. The user can view the last known position of Sargassum clusters, plus trajectory and location projections for the next 24, 72 and 168 hours. Users can also subscribe to alerts generated for particular areas. Currently, the SEAS web app produces advisories for Texas beaches. The forecasted Sargassum landing locations are validated by reports from Texas beach managers. However, the SEAS web app was designed to easily expand to other areas, and future plans call for extending it to Mexico and the Caribbean islands. SEAS web app development is led by NASA, with participation by ASRC Federal/Computer Sciences Corporation and the Naval Research Laboratory, all at Stennis Space Center, and Texas A&M University at Galveston.
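
    A minimal sketch of the forecasting step described above: advect a detected Sargassum cluster position using current and wind velocities over each forecast horizon. The velocities and the windage factor are invented placeholders, not HYCOM or buoy output.

```python
import math

def advect(lon, lat, u_cur, v_cur, u_wind, v_wind, hours, windage=0.03):
    """Advance a position (degrees) using velocities in m/s."""
    dt = hours * 3600.0
    u = u_cur + windage * u_wind            # effective eastward velocity
    v = v_cur + windage * v_wind            # effective northward velocity
    # crude metres-to-degrees conversion near the given latitude
    dlat = v * dt / 111_000.0
    dlon = u * dt / (111_000.0 * math.cos(math.radians(lat)))
    return lon + dlon, lat + dlat

for h in (24, 72, 168):                     # the horizons the abstract lists
    print(h, advect(-94.8, 29.3, 0.2, 0.1, 3.0, 1.0, h))
```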

  10. Processing ARM VAP data on an AWS cluster

    NASA Astrophysics Data System (ADS)

    Martin, T.; Macduff, M.; Shippert, T.

    2017-12-01

    The Atmospheric Radiation Measurement (ARM) Data Management Facility (DMF) manages over 18,000 processes and 1.3 TB of data each day. This includes many Value Added Products (VAPs) that make use of multiple instruments to produce derived products that are scientifically relevant. A thermodynamic and cloud profile VAP is being developed to provide input to the ARM Large-Eddy Simulation (LES) ARM Symbiotic Simulation and Observation (LASSO) project (https://www.arm.gov/capabilities/vaps/lasso-122). This algorithm is CPU intensive, and its processing requirements exceeded the available DMF computing capacity. Amazon Web Services (AWS) along with CfnCluster was investigated to see how it would perform. This cluster environment is cost effective and scales dynamically based on demand. We were able to take advantage of autoscaling, which allowed the cluster to grow and shrink based on the size of the processing queue. We were also able to take advantage of the AWS spot market to further reduce cost. Our test was very successful, and we found that cloud resources can be used to efficiently and effectively process time-series data. This poster will present the resources and methodology used to successfully run the algorithm.

  11. Tags Extraction from Spatial Documents in Search Engines

    NASA Astrophysics Data System (ADS)

    Borhaninejad, S.; Hakimpour, F.; Hamzei, E.

    2015-12-01

    Nowadays, selective access to information on the Web is provided by search engines, but in cases where the data include spatial information, the search task becomes more complex and search engines require special capabilities. The purpose of this study is to extract the information which lies in spatial documents. To that end, we implement and evaluate information extraction from GML documents together with a retrieval method in an integrated approach. Our proposed system consists of three components: crawler, database and user interface. In the crawler component, GML documents are discovered and their text is parsed for information extraction. The database component is responsible for indexing the information collected by the crawler. Finally, the user interface component mediates interaction between the system and the user. We implemented the system as a pilot on an application server simulating the Web. As a spatial search engine, our system provides search capability across GML documents, an important step toward improving the efficiency of search engines.
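
    A minimal sketch of the crawler's extraction step, assuming a simple GML feature layout: parse a GML document and pull out feature names and coordinates as rows ready for the database component to index. The snippet and tag paths are illustrative, not the study's schema.

```python
import xml.etree.ElementTree as ET

GML = """<gml:FeatureCollection xmlns:gml="http://www.opengis.net/gml">
  <gml:featureMember>
    <gml:name>City Park</gml:name>
    <gml:Point><gml:pos>51.389 35.689</gml:pos></gml:Point>
  </gml:featureMember>
</gml:FeatureCollection>"""

ns = {"gml": "http://www.opengis.net/gml"}
root = ET.fromstring(GML)
records = []
for member in root.findall("gml:featureMember", ns):
    name = member.findtext("gml:name", default="", namespaces=ns)
    pos = member.findtext(".//gml:pos", default="", namespaces=ns)
    records.append({"name": name, "coords": pos.split()})
print(records)        # rows the database component would index
```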

  12. How many scientific papers are mentioned in policy-related documents? An empirical investigation using Web of Science and Altmetric data.

    PubMed

    Haunschild, Robin; Bornmann, Lutz

    2017-01-01

    In this short communication, we provide an overview of a relatively new source of altmetrics data which could possibly be used for societal impact measurements in scientometrics. Recently, Altmetric, a start-up providing publication-level metrics, started to make data available for publications which have been mentioned in policy-related documents. Using data from Altmetric, we study how many papers indexed in the Web of Science (WoS) are mentioned in policy-related documents. We find that less than 0.5% of the papers published in different subject categories are mentioned at least once in policy-related documents. Based on our results, we recommend that the analysis of WoS publications with at least one policy-related mention be repeated regularly (annually) in order to check the usefulness of the data. Mentions in policy-related documents should not be used for impact measurement until new policy-related sites are tracked.

  13. Symmetric nonnegative matrix factorization: algorithms and applications to probabilistic clustering.

    PubMed

    He, Zhaoshui; Xie, Shengli; Zdunek, Rafal; Zhou, Guoxu; Cichocki, Andrzej

    2011-12-01

    Nonnegative matrix factorization (NMF) is an unsupervised learning method useful in various applications including image processing and semantic analysis of documents. This paper focuses on symmetric NMF (SNMF), which is a special case of NMF decomposition. Three parallel multiplicative update algorithms using level 3 basic linear algebra subprograms directly are developed for this problem. First, by minimizing the Euclidean distance, a multiplicative update algorithm is proposed, and its convergence under mild conditions is proved. Based on it, we further propose another two fast parallel methods: the α-SNMF and β-SNMF algorithms. All of them are easy to implement. These algorithms are applied to probabilistic clustering. We demonstrate their effectiveness for facial image clustering, document categorization, and pattern clustering in gene expression.
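
    A minimal sketch of a multiplicative update for symmetric NMF, approximating a nonnegative similarity matrix A by HH^T. This is a generic damped Euclidean-distance update, not necessarily the exact α-SNMF or β-SNMF variants of the paper.

```python
import numpy as np

def snmf(A, k, iters=500, eps=1e-9, seed=0):
    rng = np.random.default_rng(seed)
    H = rng.random((A.shape[0], k))
    for _ in range(iters):
        numer = A @ H
        denom = H @ (H.T @ H) + eps
        H *= 0.5 * (1.0 + numer / denom)    # damped multiplicative step
    return H

# tiny example: a similarity matrix with two obvious blocks
A = np.array([[1.0, 0.9, 0.1, 0.0],
              [0.9, 1.0, 0.0, 0.1],
              [0.1, 0.0, 1.0, 0.9],
              [0.0, 0.1, 0.9, 1.0]])
H = snmf(A, k=2)
print(H.argmax(axis=1))                     # cluster assignment per row
```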

  14. CoNekT: an open-source framework for comparative genomic and transcriptomic network analyses.

    PubMed

    Proost, Sebastian; Mutwil, Marek

    2018-05-01

    The recent accumulation of gene expression data in the form of RNA sequencing creates unprecedented opportunities to study gene regulation and function. Furthermore, comparative analysis of the expression data from multiple species can elucidate which functional gene modules are conserved across species, allowing the study of the evolution of these modules. However, performing such comparative analyses on raw data is not feasible for many biologists. Here, we present CoNekT (Co-expression Network Toolkit), an open source web server, that contains user-friendly tools and interactive visualizations for comparative analyses of gene expression data and co-expression networks. These tools allow analysis and cross-species comparison of (i) gene expression profiles; (ii) co-expression networks; (iii) co-expressed clusters involved in specific biological processes; (iv) tissue-specific gene expression; and (v) expression profiles of gene families. To demonstrate these features, we constructed CoNekT-Plants for green alga, seed plants and flowering plants (Picea abies, Chlamydomonas reinhardtii, Vitis vinifera, Arabidopsis thaliana, Oryza sativa, Zea mays and Solanum lycopersicum) and thus provide a web-tool with the broadest available collection of plant phyla. CoNekT-Plants is freely available from http://conekt.plant.tools, while the CoNekT source code and documentation can be found at https://github.molgen.mpg.de/proost/CoNekT/.

  15. 76 FR 22735 - Shaw AREVA MOX Services, Mixed Oxide Fuel Fabrication Facility; License Amendment Request, Notice...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-04-22

    ... NUCLEAR REGULATORY COMMISSION [Docket No. 70-3098; NRC-2011-0081] Shaw AREVA MOX Services, Mixed... following methods: Federal Rulemaking Web site: Go to http://www.regulations.gov and search for documents... publicly available documents related to this notice using the following methods: NRC's Public Document Room...

  16. Document Concurrence System

    NASA Technical Reports Server (NTRS)

    Muhsin, Mansour; Walters, Ian

    2004-01-01

    The Document Concurrence System is a combination of software modules for routing users' expressions of concurrence with documents. The system makes it possible to determine the current status of concurrences and eliminates the prior practice of manually delivering paper documents to all persons whose approvals were required. The system runs on a server, and participants gain access via personal computers equipped with web-browser and electronic-mail software. A user begins a concurrence routing process by logging onto an administration module, naming the approvers, stating the sequence for routing among them, and attaching documents. The server then sends a message to the first person on the list. Upon concurrence by the first person, the system sends a message to the second person, and so forth. A person on the list indicates approval, places the documents on hold, or indicates disapproval via a web-based module. When the last person on the list has concurred, a message is sent to the initiator, who can then finalize the process through the administration module. A background process running on the server identifies concurrence processes that are overdue and sends reminders to the appropriate persons.
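
    A minimal sketch of the routing logic described above: a document moves through an ordered approver list, a hold or disapproval pauses the chain, and the initiator is notified once the last approver concurs. Notification is mocked with print instead of e-mail; the class and method names are invented.

```python
class ConcurrenceRoute:
    def __init__(self, initiator, approvers):
        self.initiator = initiator
        self.approvers = list(approvers)
        self.index = 0                      # whose turn it is
        self.notify(self.approvers[0])

    def notify(self, who):
        print(f"notify {who}")              # stands in for an e-mail message

    def respond(self, decision):            # 'concur' | 'hold' | 'disapprove'
        if decision != "concur":
            print(f"{self.approvers[self.index]} -> {decision}; routing paused")
            return
        self.index += 1
        if self.index == len(self.approvers):
            self.notify(self.initiator)     # done: initiator finalizes
        else:
            self.notify(self.approvers[self.index])

route = ConcurrenceRoute("alice", ["bob", "carol"])
route.respond("concur")                     # bob concurs -> carol notified
route.respond("concur")                     # carol concurs -> alice finalizes
```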

  17. 75 FR 4449 - Requested Administrative Waiver of the Coastwise Trade Laws

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-01-27

    ... electronic version of this document and all documents entered into this docket is available on the World Wide Web at http://www.regulations.gov . FOR FURTHER INFORMATION CONTACT: Joann Spittle, U.S. Department of...

  18. 2016 eCDRweb User Guide–Primary Support

    EPA Pesticide Factsheets

    This document presents the user guide for the Office of Pollution Prevention and Toxics’ (OPPT) 2016 e-CDRweb tool. This document is the user guide for the Primary Support user of the 2016 e-CDRweb tool.

  19. WIWS: a protein structure bioinformatics Web service collection.

    PubMed

    Hekkelman, M L; Te Beek, T A H; Pettifer, S R; Thorne, D; Attwood, T K; Vriend, G

    2010-07-01

    The WHAT IF molecular-modelling and drug-design program is widely distributed in the world of protein structure bioinformatics. Although originally designed as an interactive application, its highly modular design and inbuilt control language have recently enabled its deployment as a collection of programmatically accessible web services. We report here a collection of WHAT IF-based protein structure bioinformatics web services relating to structure quality, the use of symmetry in crystal structures, structure correction and optimization, the addition of hydrogens and optimization of hydrogen bonds, and a series of geometric calculations. The freely accessible web services are based on the industry-standard WS-I profile and the EMBRACE technical guidelines, and are available via both REST and SOAP paradigms. The web services run on a dedicated computational cluster; their function and availability are monitored daily.

  20. Online interactive analysis of protein structure ensembles with Bio3D-web.

    PubMed

    Skjærven, Lars; Jariwala, Shashank; Yao, Xin-Qiu; Grant, Barry J

    2016-11-15

    Bio3D-web is an online application for analyzing the sequence, structure and conformational heterogeneity of protein families. Major functionality is provided for identifying protein structure sets for analysis, their alignment and refined structure superposition, sequence and structure conservation analysis, mapping and clustering of conformations, and the quantitative comparison of their predicted structural dynamics. Bio3D-web is based on the Bio3D and Shiny R packages. All major browsers are supported and full source code is available under a GPL2 license from http://thegrantlab.org/bio3d-web. CONTACT: bjgrant@umich.edu or lars.skjarven@uib.no. © The Author 2016. Published by Oxford University Press.

  1. 33 CFR 148.207 - How and where may I view docketed documents?

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Docket Management System Web site at http://www.dot.dms.gov. The projects are also listed by name and the assigned docket number at the G-PSO-5 Web site: http://www.uscg.mil/hq/g-m/mso/mso5.htm. ...

  2. 78 FR 69710 - Luminant Generation Company, LLC

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-11-20

    ... methods: Federal Rulemaking Web site: Go to http://www.regulations.gov and search for Docket ID NRC-2008... . To begin the search, select ``ADAMS Public Documents'' and then select ``Begin Web-based ADAMS Search.'' For problems with ADAMS, please contact the NRC's Public...

  3. 77 FR 2677 - National Emission Standards for Hazardous Air Pollutants: Primary Aluminum Reduction Plants...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-01-19

    ... be charged for copying. World Wide Web. The EPA Web site for this rulemaking is at: http://www.epa... period will end on February 1, 2012, rather than January 20, 2012. How can I get copies of this document...

  4. Graduate and Inservice Education. [SITE 2002 Section].

    ERIC Educational Resources Information Center

    Crawford, Caroline M., Ed.

    This document contains the papers on graduate and inservice education from the SITE (Society for Information Technology & Teacher Education) 2002 conference. Topics covered include: Geographic Information Systems in teacher education; re-certification and accreditation; construction of a Web site by graduate teacher education students; Web-based…

  5. A Java viewer to publish Digital Imaging and Communications in Medicine (DICOM) radiologic images on the World Wide Web.

    PubMed

    Setti, E; Musumeci, R

    2001-06-01

    The World Wide Web is an exciting service that allows one to publish electronic documents made of text and images on the Internet. Client software called a web browser can access these documents, display them, and print them. The most popular browsers are currently Microsoft Internet Explorer (Microsoft, Redmond, WA) and Netscape Communicator (Netscape Communications, Mountain View, CA). These browsers can display text in hypertext markup language (HTML) format and images in Joint Photographic Experts Group (JPEG) and Graphics Interchange Format (GIF) formats. Currently, neither browser can display radiologic images in the native Digital Imaging and Communications in Medicine (DICOM) format. With the aim of publishing radiologic images on the Internet, we wrote a dedicated Java applet. Our software can display radiologic and histologic images in DICOM, JPEG, and GIF formats, and provides a number of functions such as windowing and a magnification lens. The applet is compatible with several web browsers, including older versions. The software is free and available from the author.
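
    A minimal sketch of the windowing function such a viewer provides: the standard linear window/level transform that maps raw pixel values to 8-bit display intensities given a window center and width. This is the textbook transform, not the applet's actual Java code; the pixel values are invented.

```python
import numpy as np

def apply_window(pixels, center, width):
    lo = center - width / 2.0
    scaled = (pixels.astype(float) - lo) / width     # 0..1 inside the window
    return (np.clip(scaled, 0.0, 1.0) * 255).astype(np.uint8)

ct_slice = np.array([[-1000, 40, 80],
                     [400, 1200, 3000]])             # Hounsfield-like values
print(apply_window(ct_slice, center=40, width=400))  # soft-tissue window
```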

  6. flexCloud: Deployment of the FLEXPART Atmospheric Transport Model as a Cloud SaaS Environment

    NASA Astrophysics Data System (ADS)

    Morton, Don; Arnold, Dèlia

    2014-05-01

    FLEXPART (FLEXible PARTicle dispersion model) is a Lagrangian transport and dispersion model used by a growing international community. We have used it to simulate and forecast the atmospheric transport of wildfire smoke, volcanic ash and radionuclides. Additionally, FLEXPART may be run in backwards mode to provide information for the determination of emission sources such as nuclear emissions and greenhouse gases. This open source software is distributed in source code form, and has several compiler and library dependencies that users need to address. Although well documented, getting it compiled, set up, running, and post-processed is often tedious, making it difficult for the inexperienced user. Our interest is in moving scientific modeling and simulation activities from site-specific clusters and supercomputers to a cloud model-as-a-service paradigm. Choosing FLEXPART for our prototyping, our vision is to construct customised IaaS images containing fully compiled and configured FLEXPART codes, including pre-processing, execution and post-processing components. In addition, with the inclusion of a small web server in the image, we introduce a web-accessible graphical user interface that drives the system. A further initiative being pursued is the deployment of multiple, simultaneous FLEXPART ensembles in the cloud. A single front-end web interface is used to define the ensemble members, and separate cloud instances are launched, on demand, to run the individual models and to conglomerate the outputs into a unified display. The outcome of this work is a Software as a Service (SaaS) deployment whereby the details of the underlying modeling systems are hidden, allowing modelers to perform their science activities without the burden of considering implementation details.

  7. The use of fingerprints available on the web in false identity documents: Analysis from a forensic intelligence perspective.

    PubMed

    Girelli, Carlos Magno Alves

    2016-05-01

    Fingerprints present in false identity documents were found on the web. In some cases, laterally reversed (mirrored) images of the same fingerprint were observed in different documents. In the present work, 100 fingerprint images downloaded from the web, as well as their reversals obtained by image editing, were compared among themselves and against the database of the Brazilian Federal Police AFIS, in order to better understand trends in this kind of forgery in Brazil. Several image-editing effects were observed in the analyzed fingerprints: addition of artifacts (such as watermarks), image rotation, image stylization, lateral reversal and tonal reversal. A discussion of lateral-reversal detection is presented in this article, together with a suggestion to reduce errors due to missed HIT decisions between reversed fingerprints. The present work aims to highlight the importance of fingerprint analysis when performing document examination, especially when only copies of documents are available, something very common in Brazil. Besides the intrinsic features of the fingermarks considered in three levels of detail by the ACE-V methodology, some visual features of the fingerprint images can help identify sources of forgeries and modus operandi, such as: limits and image contours, failures in the friction ridges caused by excess or lack of inking, and the presence of watermarks and artifacts arising from the background. Based on the agreement of such features in fingerprints present in different identity documents, and also on analysis of the time and location where the documents were seized, it is possible to highlight potential links between apparently unconnected crimes. Fingerprints therefore have the potential to reduce linkage blindness, and the present work suggests the analysis of fingerprints when profiling false identity documents, as well as the inclusion of fingerprint features in the profile of the documents. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
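
    A minimal sketch of the lateral-reversal check discussed above: compare a query fingerprint image against both a reference and its mirror image, using normalized cross-correlation as a crude similarity. Real AFIS matching relies on minutiae rather than raw pixels; the random arrays here stand in for grayscale images.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized images."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float((a * b).mean())

rng = np.random.default_rng(0)
reference = rng.random((64, 64))
query = np.fliplr(reference)        # a laterally reversed copy

score_direct = ncc(query, reference)
score_mirror = ncc(query, np.fliplr(reference))
if score_mirror > score_direct:
    print("query matches the laterally reversed reference")
```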

  8. What Are the Usage Conditions of Web 2.0 Tools by Faculty of Education Students?

    ERIC Educational Resources Information Center

    Agir, Ahmet

    2014-01-01

    With advances in technology and the spread of Internet use into every part of life, the web, which provides access to documents such as pictures, audio, animations, and text, came into wide use. At first, the web consisted only of static visual and text pages that allowed no user interaction. However, it is seen that not…

  9. Cloud Computing Trace Characterization and Synthetic Workload Generation

    DTIC Science & Technology

    2013-03-01

    measurements [44]. Olio is primarily for learning Web 2.0 technologies, evaluating the three implementations (PHP, Java EE, and RubyOnRails (ROR))... Olio is well documented, but assumes prerequisite knowledge of the setup and operation of Apache web servers and MySQL databases. Olio...Faban supports numerous servers such as Apache httpd, Sun Java System Web, Portal and Mail Servers, Oracle RDBMS, memcached, and others [18]. Perhaps

  10. 78 FR 14689 - Medicare Program; Extension of the Payment Adjustment for Low-volume Hospitals and the Medicare...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-03-07

    ... use of a Web-based mapping tool, such as MapQuest, as part of documenting that the hospital meets the... only through the Internet on the CMS Web site at http://www.cms.hhs.gov/AcuteInpatientPPS/01_overview...)'' hospitals with claims in the March 2012 update of the FY 2011 MedPAR file, is also available on the CMS Web...

  11. Knowledge representation and management: benefits and challenges of the semantic web for the fields of KRM and NLP.

    PubMed

    Rassinoux, A-M

    2011-01-01

    To summarize excellent current research in the field of knowledge representation and management (KRM). A synopsis of the articles selected for the IMIA Yearbook 2011 is provided, and an attempt is made to sketch the current trends in the field. Over the last decade, with the extension of the text-based web towards a semantically structured web, NLP techniques have experienced renewed interest for knowledge extraction. This trend is corroborated by the five papers selected for the KRM section of the Yearbook 2011. They all depict outstanding studies that exploit NLP technologies wherever possible in order to accurately extract meaningful information from various biomedical textual sources. Bringing semantic structure to the meaningful content of textual web pages provides the user with cooperative sharing and intelligent finding of electronic data. As exemplified by the best paper selection, more and more advanced biomedical applications aim at exploiting the meaningful richness of free-text documents in order to generate semantic metadata and, recently, to learn and populate domain ontologies. These latter are becoming a key piece, as they portray the semantics of Semantic Web content. Maintaining their consistency with the documents and semantic annotations that refer to them is a crucial challenge of the Semantic Web for the coming years.

  12. Information extraction for enhanced access to disease outbreak reports.

    PubMed

    Grishman, Ralph; Huttunen, Silja; Yangarber, Roman

    2002-08-01

    Document search is generally based on individual terms in the document. However, for collections within limited domains it is possible to provide more powerful access tools. This paper describes a system designed for collections of reports of infectious disease outbreaks. The system, Proteus-BIO, automatically creates a table of outbreaks, with each table entry linked to the document describing that outbreak; this makes it possible to use database operations such as selection and sorting to find relevant documents. Proteus-BIO consists of a Web crawler which gathers relevant documents; an information extraction engine which converts the individual outbreak events to a tabular database; and a database browser which provides access to the events and, through them, to the documents. The information extraction engine uses sets of patterns and word classes to extract the information about each event. Preparing these patterns and word classes has been a time-consuming manual operation in the past, but automated discovery tools now make this task significantly easier. A small study comparing the effectiveness of the tabular index with conventional Web search tools demonstrated that users can find substantially more documents in a given time period with Proteus-BIO.
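
    A minimal sketch of pattern-based event extraction in the spirit of Proteus-BIO: a regular-expression pattern pulls (count, disease, location) tuples from report sentences into a table. The pattern and sentences are invented; the real system uses much richer pattern sets and word classes.

```python
import re

PATTERN = re.compile(
    r"(?P<count>\d+)\s+cases? of\s+(?P<disease>[a-z ]+?)\s+"
    r"(?:were\s+)?reported in\s+(?P<location>[A-Z][a-z]+)")

reports = [
    "12 cases of cholera were reported in Dhaka this week.",
    "3 cases of avian influenza reported in Hanoi on Monday.",
]

table = [m.groupdict() for r in reports if (m := PATTERN.search(r))]
for row in table:
    print(row)    # each row would link back to its source document
```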

  13. BioServices: a common Python package to access biological Web Services programmatically.

    PubMed

    Cokelaer, Thomas; Pultz, Dennis; Harder, Lea M; Serra-Musach, Jordi; Saez-Rodriguez, Julio

    2013-12-15

    Web interfaces provide access to numerous biological databases, and many can be accessed programmatically thanks to Web Services. Building applications that combine several of them would benefit from a single framework. BioServices is a comprehensive Python framework that provides programmatic access to major bioinformatics Web Services (e.g. KEGG, UniProt, BioModels, ChEMBLdb). Wrapping additional Web Services, based either on Representational State Transfer or on Simple Object Access Protocol/Web Services Description Language technologies, is eased by the use of object-oriented programming. BioServices releases and documentation are available at http://pypi.python.org/pypi/bioservices under a GPL-v3 license.
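
    A minimal usage sketch of the kind of programmatic access BioServices provides; it needs network access, and the method names (KEGG.get, KEGG.parse) follow the package's published tutorials, so consult the current documentation, as web-service wrappers evolve.

```python
from bioservices import KEGG

kegg = KEGG()
entry = kegg.get("hsa:7535")    # fetch the KEGG flat-file record for human ZAP70
record = kegg.parse(entry)      # parse the flat-file text into a dictionary
print(record["NAME"])
```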

  14. Is nursing ready for WebQuests?

    PubMed

    Lahaie, Ulysses David

    2008-12-01

    Based on an inquiry-oriented framework, WebQuests facilitate the construction of effective learning activities. Developed by Bernie Dodge and Tom March in 1995 at San Diego State University, WebQuests have gained worldwide popularity among educators in the kindergarten-through-grade-12 sector. However, their application at the college and university levels is not well documented. WebQuests enhance and promote higher-order thinking skills, are consistent with Bloom's Taxonomy, and reflect a learner-centered instructional methodology (constructivism). They are based on solid theoretical foundations and promote critical thinking, inquiry, and problem solving. There is a role for WebQuests in nursing education. A WebQuest example is described in this article.

  15. 2016 eCDRweb User Guide–Primary Authorized Official

    EPA Pesticide Factsheets

    This document presents the user guide for the Office of Pollution Prevention and Toxics’ (OPPT) 2016 e-CDRweb tool. This document is the user guide for the Primary Authorized Official (AO) user of the 2016 e-CDRweb tool.

  16. 32 CFR 21.330 - How are the DoDGARs published and maintained?

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ..., and sections, to parallel the CFR publication. Cross references within the DoD document are stated as... document on the World Wide Web at http://www.dtic.mil/whs/directives. (c) A standing working group...

  17. 32 CFR 21.330 - How are the DoDGARs published and maintained?

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ..., and sections, to parallel the CFR publication. Cross references within the DoD document are stated as... document on the World Wide Web at http://www.dtic.mil/whs/directives. (c) A standing working group...

  18. 32 CFR 21.330 - How are the DoDGARs published and maintained?

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ..., and sections, to parallel the CFR publication. Cross references within the DoD document are stated as... document on the World Wide Web at http://www.dtic.mil/whs/directives. (c) A standing working group...

  19. A Multiple-Label Guided Clustering Algorithm for Historical Document Dating and Localization.

    PubMed

    He, Sheng; Samara, Petros; Burgers, Jan; Schomaker, Lambert

    2016-11-01

    It is of essential importance for historians to know the date and place of origin of the documents they study. It would be a huge advance for historical scholars if it were possible to automatically estimate the geographical and temporal provenance of a handwritten document by inferring them from its handwriting style. We propose a multiple-label guided clustering algorithm to discover the correlations between concrete low-level visual elements in historical documents and abstract labels, such as date and location. First, a novel descriptor, called the histogram of orientations of handwritten strokes, is proposed to extract and describe the visual elements; it is built on a scale-invariant polar-feature space. In addition, the multi-label self-organizing map (MLSOM) is proposed to discover the correlations between the low-level visual elements and their labels in a single framework. The proposed MLSOM can be used to predict the labels directly. Moreover, the MLSOM can also be considered a pre-structured clustering method for building a codebook, which contains more discriminative information on date and geography. Experimental results on the Medieval Paleographic Scale data set demonstrate that our method achieves state-of-the-art results.
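
    A minimal sketch of an orientation-histogram descriptor in the spirit of the paper's histogram of orientations of handwritten strokes: bin the gradient orientations of a grayscale page image, weighted by gradient magnitude. This omits the paper's scale-invariant polar-feature space and the MLSOM itself; the "page" is a synthetic array.

```python
import numpy as np

def orientation_histogram(img, bins=16):
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % np.pi        # orientation folded into 0..pi
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-12)      # normalized descriptor

page = np.zeros((64, 64))
page[20:44, 30:32] = 1.0                    # a single vertical "stroke"
h = orientation_histogram(page)
print(h.argmax())                           # bin of the dominant orientation
```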

  20. 7 CFR 3430.55 - Technical reporting.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... the Current Research Information System (CRIS). (b) Initial Documentation in the CRIS Database... identification of equipment purchased with any Federal funds under the award and any subsequent use of such equipment. (e) CRIS Web Site Via Internet. The CRIS database is available to the public on the worldwide web...

  1. 49 CFR 571.5 - Matter incorporated by reference.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ...), Superintendent of Documents, U.S. Government Printing Office, Washington, DC 20402 Illuminating Engineering... Services, Hyattsville, MD 20782. Phone: 1-800-232-4636; Web: http://www.cdc.gov/nchs National Highway..., Warrendale, Pennsylvania 15096. Phone: 1-724-776-4841; Web: http://www.sae.org Society of Automotive...

  2. The Quebec National Library on the Web.

    ERIC Educational Resources Information Center

    Kieran, Shirley; Sauve, Diane

    1997-01-01

    Provides an overview of the Quebec National Library (Bibliotheque Nationale du Quebec, or BNQ) Web site. Highlights include issues related to content, design, and technology; IRIS, the BNQ online public access catalog; development of the multimedia catalog; software; digitization of documents; links to bibliographic records; and future…

  3. Hydrogen Financial Analysis Scenario Tool (H2FAST) Documentation

    Science.gov Websites

    Documentation is provided for the web and spreadsheet versions of H2FAST: the H2FAST Web Tool User's Manual and the H2FAST Spreadsheet Tool User's Manual (draft). Questions or feedback about H2FAST may be sent to H2FAST@nrel.gov.

  4. Assessing Greek Public Hospitals' Websites.

    PubMed

    Tsirintani, Maria; Binioris, Spyros

    2015-01-01

    Following a previous (2011) survey, this study assesses the web pages of Greek public hospitals according to specific criteria included in the same web-page evaluation model. Our purpose is to demonstrate the evolution of hospitals' web pages and document trends in e-health applications. Using descriptive methods, we found that public hospitals have made significant steps towards establishing and improving their web presence, but much work remains before they can take full advantage of new technologies in the e-health ecosystem.

  5. Self-aggregation in scaled principal component space

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ding, Chris H.Q.; He, Xiaofeng; Zha, Hongyuan

    2001-10-05

    Automatic grouping of voluminous data into meaningful structures is a challenging task frequently encountered in broad areas of science, engineering and information processing. These data clustering tasks are frequently performed in Euclidean space or a subspace chosen from principal component analysis (PCA). Here we describe a space obtained by a nonlinear scaling of PCA in which data objects self-aggregate automatically into clusters. Projection into this space gives sharp distinctions among clusters. Gene expression profiles of cancer tissue subtypes, Web hyperlink structure and Internet newsgroups are analyzed to illustrate interesting properties of the space.
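
    A minimal sketch of the self-aggregation idea: embed objects using the leading eigenvectors of a degree-scaled similarity matrix and observe that points from the same cluster land nearly on top of each other. The D^(-1/2) W D^(-1/2) scaling used here is the standard normalized form, a stand-in for the paper's own nonlinear scaling of PCA.

```python
import numpy as np

rng = np.random.default_rng(0)
# two Gaussian blobs as toy "data objects"
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])

W = np.exp(-np.square(X[:, None] - X[None, :]).sum(-1))   # similarities
d = W.sum(axis=1)
S = W / np.sqrt(np.outer(d, d))                           # degree scaling

vals, vecs = np.linalg.eigh(S)
embedding = vecs[:, -2:]            # top two scaled components
print(np.round(embedding[:3], 3))   # first blob: rows nearly coincide
print(np.round(embedding[-3:], 3))  # second blob: likewise
```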

  6. ClustVis: a web tool for visualizing clustering of multivariate data using Principal Component Analysis and heatmap

    PubMed Central

    Metsalu, Tauno; Vilo, Jaak

    2015-01-01

    Principal Component Analysis (PCA) is a widely used method for reducing the dimensionality of high-dimensional data, often followed by visualizing two of the components in a scatterplot. Although widely used, the method lacks an easy-to-use web interface that scientists with little programming experience could use to make plots of their own data. The same applies to creating heatmaps: it is possible to add conditional formatting to Excel cells to show colored heatmaps, but for more advanced features such as clustering and experimental annotations, more sophisticated analysis tools have to be used. We present a web tool called ClustVis that aims to have an intuitive user interface. Users can upload data from a simple delimited text file that can be created in a spreadsheet program. It is possible to modify data processing methods and the final appearance of the PCA and heatmap plots by using drop-down menus, text boxes, sliders, etc. Appropriate defaults are given to reduce the time needed by the user to specify input parameters. As output, users can download the PCA plot and heatmap in one of the preferred file formats. This web server is freely available at http://biit.cs.ut.ee/clustvis/. PMID:25969447
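
    A minimal sketch of the two views ClustVis produces, a PCA scatter of samples and a hierarchically clustered heatmap, built here with numpy, scipy, and matplotlib rather than the tool's R backend. The data are random blobs.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, leaves_list

rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0, 1, (10, 6)), rng.normal(3, 1, (10, 6))])

centered = data - data.mean(axis=0)         # PCA via SVD
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
pcs = centered @ Vt[:2].T

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.scatter(pcs[:, 0], pcs[:, 1]); ax1.set_title("PCA")
order = leaves_list(linkage(data, method="average"))  # row clustering
ax2.imshow(data[order], aspect="auto"); ax2.set_title("clustered heatmap")
plt.tight_layout(); plt.savefig("clustvis_sketch.png")
```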

  7. Septic Systems

    EPA Pesticide Factsheets

    The web site provides guidance and technical assistance for homeowners, government officials, industry professionals, and EPA partners about how to properly develop and manage individual onsite and community cluster systems that treat domestic wastewater.

  8. Astrophysical data mining with GPU. A case study: Genetic classification of globular clusters

    NASA Astrophysics Data System (ADS)

    Cavuoti, S.; Garofalo, M.; Brescia, M.; Paolillo, M.; Pescape', A.; Longo, G.; Ventre, G.

    2014-01-01

    We present a multi-purpose genetic algorithm, designed and implemented with GPGPU/CUDA parallel computing technology. The model was derived from our serial CPU implementation, named GAME (Genetic Algorithm Model Experiment). It was successfully tested and validated on the detection of candidate globular clusters in deep, wide-field, single-band HST images. The GPU version of GAME will be made available to the community by integrating it into the web application DAMEWARE (DAta Mining Web Application REsource, http://dame.dsf.unina.it/beta_info.html), a public data mining service specialized in massive astrophysical data. Since genetic algorithms are inherently parallel, the GPGPU computing paradigm yields a 200× speedup in the training phase with respect to the CPU-based version.
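
    A minimal serial genetic algorithm sketching the kind of model GAME implements; the real code adds CUDA parallelism and a fitness function tailored to globular-cluster classification, while this toy simply maximizes the number of 1-bits in a chromosome.

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(pop):                      # toy objective: count of 1-bits
    return pop.sum(axis=1)

pop = rng.integers(0, 2, (30, 16))     # 30 binary chromosomes
for generation in range(50):
    f = fitness(pop)
    parents = pop[np.argsort(f)[-10:]]            # truncation selection
    i, j = rng.integers(0, 10, (2, 30))           # random parent pairs
    mask = rng.random((30, 16)) < 0.5             # uniform crossover
    pop = np.where(mask, parents[i], parents[j])
    pop ^= (rng.random((30, 16)) < 0.01).astype(pop.dtype)  # mutation
print("best fitness:", fitness(pop).max())
```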

  9. 4 CFR 201.3 - Publicly available documents and electronic reading room.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 4 Accounts 1 2011-01-01 2011-01-01 false Publicly available documents and electronic reading room. 201.3 Section 201.3 Accounts RECOVERY ACCOUNTABILITY AND TRANSPARENCY BOARD PUBLIC INFORMATION AND REQUESTS § 201.3 Publicly available documents and electronic reading room. (a) Many Board records are available electronically at the Board's Web sit...

  10. 4 CFR 201.3 - Publicly available documents and electronic reading room.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 4 Accounts 1 2012-01-01 2012-01-01 false Publicly available documents and electronic reading room. 201.3 Section 201.3 Accounts RECOVERY ACCOUNTABILITY AND TRANSPARENCY BOARD PUBLIC INFORMATION AND REQUESTS § 201.3 Publicly available documents and electronic reading room. (a) Many Board records are available electronically at the Board's Web sit...

  11. 4 CFR 201.3 - Publicly available documents and electronic reading room.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 4 Accounts 1 2013-01-01 2013-01-01 false Publicly available documents and electronic reading room. 201.3 Section 201.3 Accounts RECOVERY ACCOUNTABILITY AND TRANSPARENCY BOARD PUBLIC INFORMATION AND REQUESTS § 201.3 Publicly available documents and electronic reading room. (a) Many Board records are available electronically at the Board's Web sit...

  12. 4 CFR 201.3 - Publicly available documents and electronic reading room.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 4 Accounts 1 2014-01-01 2013-01-01 true Publicly available documents and electronic reading room. 201.3 Section 201.3 Accounts RECOVERY ACCOUNTABILITY AND TRANSPARENCY BOARD PUBLIC INFORMATION AND REQUESTS § 201.3 Publicly available documents and electronic reading room. (a) Many Board records are available electronically at the Board's Web site...

  13. The Profile-Query Relationship.

    ERIC Educational Resources Information Center

    Shepherd, Michael A.; Phillips, W. J.

    1986-01-01

    Defines relationship between user profile and user query in terms of relationship between clusters of documents retrieved by each, and explores the expression of cluster similarity and cluster overlap as linear functions of similarity existing between original pairs of profiles and queries, given the desired retrieval threshold. (23 references)…

  14. Grid Computing Application for Brain Magnetic Resonance Image Processing

    NASA Astrophysics Data System (ADS)

    Valdivia, F.; Crépeault, B.; Duchesne, S.

    2012-02-01

    This work emphasizes the use of grid computing and web technology for automatic post-processing of brain magnetic resonance images (MRI) in the context of neuropsychiatric (Alzheimer's disease) research. Post-acquisition image processing is achieved through the interconnection of several individual processes into pipelines. Each process has input and output data ports, options and execution parameters, and performs single tasks such as: a) extracting individual image attributes (e.g. dimensions, orientation, center of mass), b) performing image transformations (e.g. scaling, rotation, skewing, intensity standardization, linear and non-linear registration), c) performing image statistical analyses, and d) producing the necessary quality control images and/or files for user review. The pipelines are built to perform specific sequences of tasks on the alphanumeric data and MRIs contained in our database. The web application is coded in PHP and allows the creation of scripts to create, store and execute pipelines and their instances either on our local cluster or on high-performance computing platforms. To run an instance on an external cluster, the web application opens a communication tunnel through which it copies the necessary files, submits the execution commands and collects the results. We present results of system tests for the processing of a set of 821 brain MRIs from the Alzheimer's Disease Neuroimaging Initiative study via a nonlinear registration pipeline composed of 10 processes. Our results show successful execution on both local and external clusters, and a 4-fold increase in performance when using the external cluster. However, the latter's performance does not scale linearly, as queue waiting times and execution overhead increase with the number of tasks to be executed.
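
    A minimal sketch of the pipeline pattern described above: each process is a single-task function whose output feeds the next process, and a pipeline is just an ordered chain of such steps. The step names are illustrative, not the study's actual ten-process registration pipeline.

```python
def extract_attributes(image):
    # e.g. dimensions, orientation, centre of mass
    return dict(image, attributes={"dims": image["dims"]})

def register(image, template="MNI152"):
    # placeholder for a linear or non-linear registration step
    return dict(image, registered_to=template)

def quality_control(image):
    return dict(image, qc="review-pending")

def run_pipeline(image, steps):
    for step in steps:                 # each process consumes the last output
        image = step(image)
    return image

print(run_pipeline({"dims": (256, 256, 128)},
                   [extract_attributes, register, quality_control]))
```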

  15. A Secure Web Application Providing Public Access to High-Performance Data Intensive Scientific Resources - ScalaBLAST Web Application

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Curtis, Darren S.; Peterson, Elena S.; Oehmen, Chris S.

    2008-05-04

    This work presents the ScalaBLAST Web Application (SWA), a web based application implemented using the PHP script language, MySQL DBMS, and Apache web server under a GNU/Linux platform. SWA is an application built as part of the Data Intensive Computer for Complex Biological Systems (DICCBS) project at the Pacific Northwest National Laboratory (PNNL). SWA delivers accelerated throughput of bioinformatics analysis via high-performance computing through a convenient, easy-to-use web interface. This approach greatly enhances emerging fields of study in biology such as ontology-based homology, and multiple whole genome comparisons which, in the absence of a tool like SWA, require a heroic effort to overcome the computational bottleneck associated with genome analysis. The current version of SWA includes a user account management system, a web based user interface, and a backend process that generates the files necessary for the Internet scientific community to submit a ScalaBLAST parallel processing job on a dedicated cluster.

  16. Webizing mobile augmented reality content

    NASA Astrophysics Data System (ADS)

    Ahn, Sangchul; Ko, Heedong; Yoo, Byounghyun

    2014-01-01

    This paper presents a content structure for building mobile augmented reality (AR) applications in HTML5 to achieve a clean separation of the mobile AR content and the application logic for scaling as on the Web. We propose that the content structure contains the physical world as well as virtual assets for mobile AR applications as document object model (DOM) elements and that their behaviour and user interactions are controlled through DOM events by representing objects and places with a uniform resource identifier. Our content structure enables mobile AR applications to be seamlessly developed as normal HTML documents under the current Web eco-system.

  17. Phylowood: interactive web-based animations of biogeographic and phylogeographic histories.

    PubMed

    Landis, Michael J; Bedford, Trevor

    2014-01-01

    Phylowood is a web service that uses JavaScript to generate in-browser animations of biogeographic and phylogeographic histories from annotated phylogenetic input. The animations are interactive, allowing the user to adjust spatial and temporal resolution, and highlight phylogenetic lineages of interest. All documentation and source code for Phylowood is freely available at https://github.com/mlandis/phylowood, and a live web application is available at https://mlandis.github.io/phylowood.

  18. Web Standard: PDF - When to Use, Document Metadata, PDF Sections

    EPA Pesticide Factsheets

    PDF files provide some benefits when used appropriately. PDF files should not be used for short documents (under 5 pages) unless retaining the format for printing is important. PDFs should have internal file metadata and meet Section 508 standards.

  19. 2016 e-CDRweb User Guide – Secondary Authorized Official

    EPA Pesticide Factsheets

    This document presents the user guide for the Office of Pollution Prevention and Toxics’ (OPPT) 2016 e-CDRweb tool. This document is the user guide for the Secondary Authorized Official (AO) user of the 2016 e-CDRweb tool.

  20. 77 FR 42197 - Small Business Size Standards: Construction

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-07-18

    ... ``exception'') under NAICS 237990, Other Heavy and Civil Engineering Construction, from $20 million to $30... available on its Web site at www.sba.gov/size for public review and comments. The ``Size Standards... developing, reviewing, and modifying size standards when necessary. SBA published the document on its Web...

  1. An Evaluative Methodology for Virtual Communities Using Web Analytics

    ERIC Educational Resources Information Center

    Phippen, A. D.

    2004-01-01

    The evaluation of virtual community usage and user behaviour has its roots in social science approaches such as interview, document analysis and survey. Little evaluation is carried out using traffic or protocol analysis. Business approaches to evaluating customer/business web site usage are more advanced, in particular using advanced web…

  2. Interactive Information Organization: Techniques and Evaluation

    DTIC Science & Technology

    2001-05-01

    information search and access. Locating interesting information on the World Wide Web is the main task of on-line search engines. Such engines accept a...likelihood of being relevant to the user’s request. The majority of today’s Web search engines follow this scenario. The ordering of documents in the

  3. 77 FR 26321 - Virginia Electric and Power Company

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-05-03

    ... NUCLEAR REGULATORY COMMISSION [Docket Nos. 50-338 and 50-339; NRC-2012-0051; License Nos. NPF-4...: Federal Rulemaking Web Site: Go to http://www.regulations.gov and search for Docket ID NRC-2012-0051... search, select ``ADAMS Public Documents'' and then select ``Begin Web- based ADAMS Search.'' For problems...

  4. 77 FR 31917 - Energy Conservation Program: Energy Conservation Standards for Residential Dishwashers

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-05-30

    ... the docket Web page can be found at: http://www.regulations.gov/#!docketDetail ;D=EERE-2011-BT-STD-0060. The regulations.gov Web page contains instructions on how to access all documents, including...: (202) 586-7796. Email: [email protected] . SUPPLEMENTARY INFORMATION: Table of Contents I...

  5. 32 CFR 701.102 - Online resources.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 32 National Defense 5 2011-07-01 2011-07-01 false Online resources. 701.102 Section 701.102... THE NAVY DOCUMENTS AFFECTING THE PUBLIC DON Privacy Program § 701.102 Online resources. (a) Navy PA online Web site (http://www.privacy.navy.mil). This Web site supplements this subpart and subpart G. It...

  6. 32 CFR 701.102 - Online resources.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 32 National Defense 5 2014-07-01 2014-07-01 false Online resources. 701.102 Section 701.102... THE NAVY DOCUMENTS AFFECTING THE PUBLIC DON Privacy Program § 701.102 Online resources. (a) Navy PA online Web site (http://www.privacy.navy.mil). This Web site supplements this subpart and subpart G. It...

  7. 32 CFR 701.102 - Online resources.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 32 National Defense 5 2012-07-01 2012-07-01 false Online resources. 701.102 Section 701.102... THE NAVY DOCUMENTS AFFECTING THE PUBLIC DON Privacy Program § 701.102 Online resources. (a) Navy PA online Web site (http://www.privacy.navy.mil). This Web site supplements this subpart and subpart G. It...

  8. 32 CFR 701.102 - Online resources.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 32 National Defense 5 2013-07-01 2013-07-01 false Online resources. 701.102 Section 701.102... THE NAVY DOCUMENTS AFFECTING THE PUBLIC DON Privacy Program § 701.102 Online resources. (a) Navy PA online Web site (http://www.privacy.navy.mil). This Web site supplements this subpart and subpart G. It...

  9. Paper and Other Web Coating Maximum Achievable Control Technology (MACT): Work Practice, Testing, Monitoring, Recordkeeping, and Reporting Summary Table

    EPA Pesticide Factsheets

    This April 2004 document is a table that details the various requirements of the Paper and Other Web Coating NESHAP, broken down by category. This table covers applicability, recordkeeping, emission limits, work practice standards, and other requirements.

  10. World Wide Web Page Design: A Structured Approach.

    ERIC Educational Resources Information Center

    Gregory, Gwen; Brown, M. Marlo

    1997-01-01

    Describes how to develop a World Wide Web site based on structured programming concepts. Highlights include flowcharting, first page design, evaluation, page titles, documenting source code, text, graphics, and browsers. Includes a template for HTML writers, tips for using graphics, a sample homepage, guidelines for authoring structured HTML, and…

  11. Viewing Files — EDRN Public Portal

    Cancer.gov

    In addition to standard HTML web pages, our web site contains files in other formats. You may need additional software or browser plug-ins to view some of the information available on our site. This document lists each format, along with links to the corresponding freely available plug-ins or viewers.

  12. Wikis and Collaborative Inquiry

    ERIC Educational Resources Information Center

    Lamb, Annette; Johnson, Larry

    2009-01-01

    Wikis are simply Web sites that provide easy-to-use tools for creating, editing, and sharing digital documents, images, and media files. Multiple participants can enter, submit, manage, and update a single Web workspace creating a community of authors and editors. Wiki projects help young people shift from being "consumers" of the Internet to…

  13. ICCE/ICCAI 2000 Full & Short Papers (Methodologies).

    ERIC Educational Resources Information Center

    2000

    This document contains the full text of the following full and short papers on methodologies from ICCE/ICCAI 2000 (International Conference on Computers in Education/International Conference on Computer-Assisted Instruction): (1) "A Methodology for Learning Pattern Analysis from Web Logs by Interpreting Web Page Contents" (Chih-Kai Chang and…

  14. Academic Research Integration System

    ERIC Educational Resources Information Center

    Surugiu, Iula; Velicano, Manole

    2008-01-01

    This paper summarizes the research activity done so far regarding enhanced web services and system integration. The objective of the paper is to define the software architecture for a coherent framework and methodology for enhancing existing web services into an integrated system. This document presents the research work that has…

  15. 76 FR 43960 - NARA Records Reproduction Fees

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-07-22

    ... transferred to NARA and maintain its fee schedule on NARA's Web site http://www.archives.gov. The proposed... document is faint or too dark, it requires additional time to obtain a readable image. In TABLE 1 below... our Web site (http://www.archives.gov) annually when announcing that records reproduction fees will...

  16. Biomedical Online Learning: The Route to Success

    ERIC Educational Resources Information Center

    Harvey, Patricia J.; Cookson, Barry; Meerabeau, Elizabeth; Muggleston, Diana

    2003-01-01

    The potential of the World Wide Web for rapid global communication is driving the creation of specifically tailored courses for employees, yet few practitioners have the necessary experience in on-line teaching methods, or in preparing documents for the web. Experience gained in developing six online training modules for the biotechnology and…

  17. Search Interface Design Using Faceted Indexing for Web Resources.

    ERIC Educational Resources Information Center

    Devadason, Francis; Intaraksa, Neelawat; Patamawongjariya, Pornprapa; Desai, Kavita

    2001-01-01

    Describes an experimental system designed to organize and provide access to Web documents using a faceted pre-coordinate indexing system based on the Deep Structure Indexing System (DSIS) derived from POPSI (Postulate based Permuted Subject Indexing) of Bhattacharyya, and the facet analysis and chain indexing system of Ranganathan. (AEF)

  18. WorldWide Web: Hypertext from CERN.

    ERIC Educational Resources Information Center

    Nickerson, Gord

    1992-01-01

    Discussion of software tools for accessing information on the Internet focuses on the WorldWideWeb (WWW) system, which was developed at the European Particle Physics Laboratory (CERN) in Switzerland to build a worldwide network of hypertext links using available networking technology. Its potential for use with multimedia documents is also…

  19. Soft Clustering Criterion Functions for Partitional Document Clustering

    DTIC Science & Technology

    2004-05-26

    in the cluster that it already belongs to. The refinement phase ends as soon as we perform an iteration in which no documents moved between clusters... and compare it with the one obtained by the hard criterion functions. We present a comprehensive experimental evaluation involving twelve different datasets
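
    The stopping rule this record sketches, ending refinement once an iteration moves no documents between clusters, is the classic hard-partitioning loop. A minimal Python sketch of that rule, assuming Euclidean centroids (the report itself studies soft criterion functions, which this does not reproduce):

```python
import numpy as np

def refine_partition(docs, k, max_iters=100, seed=0):
    """Hard partitional refinement: stop as soon as an iteration
    moves no documents between clusters."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, size=len(docs))
    for _ in range(max_iters):
        # Recompute each centroid from its current members.
        centroids = np.array([
            docs[labels == c].mean(axis=0) if np.any(labels == c)
            else docs[rng.integers(len(docs))]          # re-seed empty clusters
            for c in range(k)])
        # Reassign every document to its nearest centroid.
        dists = ((docs[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)
        if np.array_equal(new_labels, labels):          # nothing moved: converged
            return labels
        labels = new_labels
    return labels
```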

  20. The Online Bioinformatics Resources Collection at the University of Pittsburgh Health Sciences Library System--a one-stop gateway to online bioinformatics databases and software tools.

    PubMed

    Chen, Yi-Bu; Chattopadhyay, Ansuman; Bergen, Phillip; Gadd, Cynthia; Tannery, Nancy

    2007-01-01

    To bridge the gap between the rising information needs of biological and medical researchers and the rapidly growing number of online bioinformatics resources, we have created the Online Bioinformatics Resources Collection (OBRC) at the Health Sciences Library System (HSLS) at the University of Pittsburgh. The OBRC, containing 1542 major online bioinformatics databases and software tools, was constructed using the HSLS content management system built on the Zope Web application server. To enhance the output of search results, we further implemented the Vivísimo Clustering Engine, which automatically organizes the search results into categories created dynamically based on the textual information of the retrieved records. As the largest online collection of its kind and the only one with advanced search results clustering, OBRC is aimed at becoming a one-stop guided information gateway to the major bioinformatics databases and software tools on the Web. OBRC is available at the University of Pittsburgh's HSLS Web site (http://www.hsls.pitt.edu/guides/genetics/obrc).

  1. Cluster Analysis in Nursing Research: An Introduction, Historical Perspective, and Future Directions.

    PubMed

    Dunn, Heather; Quinn, Laurie; Corbridge, Susan J; Eldeirawi, Kamal; Kapella, Mary; Collins, Eileen G

    2017-05-01

    The use of cluster analysis in the nursing literature is limited to the creation of classifications of homogeneous groups and the discovery of new relationships. As such, it is important to provide clarity regarding its use and potential. The purpose of this article is to provide an introduction to distance-based, partitioning-based, and model-based cluster analysis methods commonly utilized in the nursing literature, provide a brief historical overview on the use of cluster analysis in nursing literature, and provide suggestions for future research. An electronic search included three bibliographic databases, PubMed, CINAHL and Web of Science. Key terms were cluster analysis and nursing. The use of cluster analysis in the nursing literature is increasing and expanding. The increased use of cluster analysis in the nursing literature is positioning this statistical method to result in insights that have the potential to change clinical practice.

  2. Mixed-Initiative Clustering

    ERIC Educational Resources Information Center

    Huang, Yifen

    2010-01-01

    Mixed-initiative clustering is a task where a user and a machine work collaboratively to analyze a large set of documents. We hypothesize that a user and a machine can both learn better clustering models through enriched communication and interactive learning from each other. The first contribution of this thesis is providing a framework of…

  3. Hierarchical Clustering: A Bibliography. Technical Report No. 1.

    ERIC Educational Resources Information Center

    Farrell, William T.

    "Classification: Purposes, Principles, Progress, Prospects" by Robert R. Sokal is reprinted in this document. It summarizes the principles of classification and cluster analysis in a manner which is of specific value to the Marine Corps Office of Manpower Utilization. Following the article is a 184 item bibliography on cluster analysis…

  4. V-TECS Career Cluster Frameworks.

    ERIC Educational Resources Information Center

    Vocational Technical Education Consortium of States, Decatur, GA.

    This document includes 16 vocational-technical crosswalk wheels relating the 14 Vocational Technical Education Consortium of States (V-TECS) Career Families to the 16 Career Clusters developed by the U.S. Department of Education. The career clusters are based on the common academic, workplace, and technical knowledge and skills that cut across all…

  5. Electronic Derivative Classifier/Reviewing Official

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Harris, Joshua C; McDuffie, Gregory P; Light, Ken L

    2017-02-17

    The electronic Derivative Classifier, Reviewing Official (eDC/RO) is a web based document management and routing system that reduces security risks and increases workflow efficiencies. The system automates the upload, notification review request, and document status tracking of documents for classification review on a secure server. It supports a variety of document formats (i.e., pdf, doc, docx, xls, xlsx, xlsm, ppt, pptx, vsd, vsdx and txt), and allows for the dynamic placement of classification markings such as the classification level, category and caveats on the document, in addition to a document footer and digital signature.

  6. Documenting the use of computers in Swedish Health Care up to 1980.

    PubMed

    Peterson, H E; Lundin, P

    2011-01-01

    This paper describes a documentation project to create, collect and preserve previously unavailable sources on informatics in Sweden (including health care as one of 16 subgroups), and to make them available on the Web. Time was critical as the personal documentation and artifacts of early pioneers could be irretrievably lost. The criteria for participation were that a person had developed a system in a clinical environment which was used by others prior to 1980. Participants were interviewed and asked for early documentation such as notes, minutes from meetings, drawings, test results and early models - together with related artifacts. The approach included traditional oral history interviews, collection of autobiographies and new self-structuring and time saving methods, such as witness seminars and an Internet-based repository of their recollections (the Writers' Web). The combination of methods obtained new information on system errors, and challenges in reaching the goals due partly to inadequacies of the early technology, and partly to the insufficient understanding of the complexity of the many problems which needed to be solved before a useful electronic patient record could be realized. A very important result was the development of a method to collect information in an easier, faster and much less expensive way than using the traditional scientific method, and still reach results that are qualitative and quantitative for the purpose of documenting the early period of computer-based health care technology. The witness seminars and the Writers' Web yielded especially large amounts of hitherto-unknown information. With all material in one database available to everyone on the Web, it is accessed very frequently - especially by students, researchers, journalists and teachers. Study of the materials explains and clarifies the reasons behind the delays and difficulties that have been encountered in developing electronic patient records, as described in an article [3] published in the IMIA Yearbook 2006.

  7. Innovative recruitment using online networks: lessons learned from an online study of alcohol and other drug use utilizing a web-based, respondent-driven sampling (webRDS) strategy.

    PubMed

    Bauermeister, José A; Zimmerman, Marc A; Johns, Michelle M; Glowacki, Pietreck; Stoddard, Sarah; Volz, Erik

    2012-09-01

    We used a web version of Respondent-Driven Sampling (webRDS) to recruit a sample of young adults (ages 18-24) and examined whether this strategy would result in alcohol and other drug (AOD) prevalence estimates comparable to national estimates (National Survey on Drug Use and Health [NSDUH]). We recruited 22 initial participants (seeds) via Facebook to complete a web survey examining AOD risk correlates. Sequential, incentivized recruitment continued until our desired sample size was achieved. After correcting for webRDS clustering effects, we contrasted our AOD prevalence estimates (past 30 days) to NSDUH estimates by comparing the 95% confidence intervals of prevalence estimates. We found comparable AOD prevalence estimates between our sample and NSDUH for the past 30 days for alcohol, marijuana, cocaine, Ecstasy (3,4-methylenedioxymethamphetamine, or MDMA), and hallucinogens. Cigarette use was lower than NSDUH estimates. WebRDS may be a suitable strategy to recruit young adults online. We discuss the unique strengths and challenges that may be encountered by public health researchers using webRDS methods.
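
    The comparison described here, checking whether the sample's 95% confidence interval overlaps the national estimate's, can be illustrated with a small sketch. Note that the study corrected for webRDS clustering effects before constructing intervals; this minimal version assumes a simple random sample, and all numbers are invented for illustration:

```python
import math

def wald_ci_95(p_hat, n):
    """Approximate 95% Wald confidence interval for a prevalence estimate."""
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return (p_hat - 1.96 * se, p_hat + 1.96 * se)

def intervals_overlap(a, b):
    """Treat two estimates as comparable when their intervals overlap."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical: 62% past-30-day alcohol use in a sample of 3,000,
# against a notional national interval of (0.59, 0.61).
print(intervals_overlap(wald_ci_95(0.62, 3000), (0.59, 0.61)))  # True
```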

  8. 78 FR 76405 - Requested Administrative Waiver of the Coastwise Trade Laws: Vessel KNIGHT HAWK; Invitation for...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-12-17

    ... this document and all documents entered into this docket is available on the World Wide Web at http... individual submitting the comment (or signing the comment, if submitted on behalf of an association, business...

  9. CLUSTAG: hierarchical clustering and graph methods for selecting tag SNPs.

    PubMed

    Ao, S I; Yip, Kevin; Ng, Michael; Cheung, David; Fong, Pui-Yee; Melhado, Ian; Sham, Pak C

    2005-04-15

    Cluster and set-cover algorithms are developed to obtain a set of tag single nucleotide polymorphisms (SNPs) that can represent all the known SNPs in a chromosomal region, subject to the constraint that all SNPs must have a squared correlation R2>C with at least one tag SNP, where C is specified by the user. http://hkumath.hku.hk/web/link/CLUSTAG/CLUSTAG.html mng@maths.hku.hk.
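
    The stated constraint, that every SNP must have squared correlation of at least C with some chosen tag, is a set-cover condition. A hedged sketch of one greedy way to satisfy it (CLUSTAG's own hierarchical-clustering and set-cover algorithms are more elaborate, and the data layout below is an assumption):

```python
def greedy_tag_snps(r2, C=0.8):
    """Greedily pick tag SNPs until every SNP has r^2 >= C with some tag.
    `r2` maps each SNP to a dict {other_snp: squared correlation};
    every SNP is taken to tag itself."""
    uncovered = set(r2)
    tags = []
    while uncovered:
        def newly_covered(s):
            return {t for t in uncovered if t == s or r2[s].get(t, 0.0) >= C}
        best = max(r2, key=lambda s: len(newly_covered(s)))  # most new SNPs covered
        tags.append(best)
        uncovered -= newly_covered(best)
    return tags
```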

  10. The Use of Web 2.0 Tools by Students in Learning and Leisure Contexts: A Study in a Portuguese Institution of Higher Education

    ERIC Educational Resources Information Center

    Costa, Carolina; Alvelos, Helena; Teixeira, Leonor

    2016-01-01

    This study analyses and compares the use of Web 2.0 tools by students in both learning and leisure contexts. Data were collected based on a questionnaire applied to 234 students from the University of Aveiro (Portugal) and the results were analysed by using descriptive analysis, paired samples t-tests, cluster analyses and Kruskal-Wallis tests.…

  11. Data Mining Meets HCI: Making Sense of Large Graphs

    DTIC Science & Technology

    2012-07-01

    graph algorithms, won the Open Source Software World Challenge, Silver Award. We have released Pegasus as free, open-source software, downloaded by...METIS [77], spectral clustering [108], and the parameter-free “Cross-associations” (CA) [26]. Belief Propagation can also be used for clustering, as...number of tools have been developed to support “landscape” views of information. These include WebBook and Web Forager [23], which use a book metaphor

  12. Frequency-sensitive competitive learning for scalable balanced clustering on high-dimensional hyperspheres.

    PubMed

    Banerjee, Arindam; Ghosh, Joydeep

    2004-05-01

    Competitive learning mechanisms for clustering, in general, suffer from poor performance for very high-dimensional (>1000) data because of "curse of dimensionality" effects. In applications such as document clustering, it is customary to normalize the high-dimensional input vectors to unit length, and it is sometimes also desirable to obtain balanced clusters, i.e., clusters of comparable sizes. The spherical kmeans (spkmeans) algorithm, which normalizes the cluster centers as well as the inputs, has been successfully used to cluster normalized text documents in 2000+ dimensional space. Unfortunately, like regular kmeans and its soft expectation-maximization-based version, spkmeans tends to generate extremely imbalanced clusters in high-dimensional spaces when the desired number of clusters is large (tens or more). This paper first shows that the spkmeans algorithm can be derived from a certain maximum likelihood formulation using a mixture of von Mises-Fisher distributions as the generative model, and in fact, it can be considered as a batch-mode version of (normalized) competitive learning. The proposed generative model is then adapted in a principled way to yield three frequency-sensitive competitive learning variants that are applicable to static data and produce high-quality and well-balanced clusters for high-dimensional data. Like kmeans, each iteration is linear in the number of data points and in the number of clusters for all three algorithms. A frequency-sensitive algorithm to cluster streaming data is also proposed. Experimental results on clustering of high-dimensional text data sets are provided to show the effectiveness and applicability of the proposed techniques. Index Terms: Balanced clustering, expectation maximization (EM), frequency-sensitive competitive learning (FSCL), high-dimensional clustering, kmeans, normalized data, scalable clustering, streaming data, text clustering.
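
    The spkmeans algorithm this record builds on is compact enough to sketch. A minimal version, assuming plain tf-idf row vectors (the paper's frequency-sensitive and streaming variants, and the von Mises-Fisher derivation, are not reproduced here):

```python
import numpy as np

def spkmeans(X, k, iters=50, seed=0):
    """Spherical k-means: documents and centroids live on the unit sphere,
    and assignment maximizes cosine similarity (a dot product)."""
    rng = np.random.default_rng(seed)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)        # unit-length docs
    centers = X[rng.choice(len(X), size=k, replace=False)]  # seed from the data
    for _ in range(iters):
        labels = (X @ centers.T).argmax(axis=1)             # cosine assignment
        for c in range(k):
            members = X[labels == c]
            if len(members):
                m = members.sum(axis=0)
                centers[c] = m / np.linalg.norm(m)          # renormalized mean
    return labels, centers
```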

  13. Evaluating the Informative Quality of Documents in SGML Format from Judgements by Means of Fuzzy Linguistic Techniques Based on Computing with Words.

    ERIC Educational Resources Information Center

    Herrera-Viedma, Enrique; Peis, Eduardo

    2003-01-01

    Presents a fuzzy evaluation method of SGML documents based on computing with words. Topics include filtering the amount of information available on the Web to assist users in their search processes; document type definitions; linguistic modeling; user-system interaction; and use with XML and other markup languages. (Author/LRW)

  14. Astronomy Fun with Mobile Devices

    NASA Astrophysics Data System (ADS)

    Pilachowski, Catherine A.; Morris, Frank

    2016-01-01

    Those mobile devices your students bring to class can do more than tweet and text. Engage your students with these web-based astronomy learning tools that allow students to manipulate astronomical data to learn important concepts. The tools are HTML5, CSS3, Javascript-based applications that provide access to the content on iPad and Android tablets. With "Three Color" students can combine monochrome astronomical images taken through different color filters or in different wavelength regions into a single color image. "Star Clusters" allows students to compare images of clusters with a pre-defined template of colors and sizes to compare clusters of different ages. An adaptation of Travis Rector's "NovaSearch" allows students to examine images of the central regions of the Andromeda Galaxy to find novae and to measure the time over which the nova fades away. New additions to our suite of applications allow students to estimate the surface temperatures of exoplanets and the probability of life elsewhere in the Universe. Further information and access to these web-based tools are available at www.astro.indiana.edu/ala/.

  15. Concepts and Technologies for a Comprehensive Information System for Historical Research and Heritage Documentation

    NASA Astrophysics Data System (ADS)

    Henze, F.; Magdalinski, N.; Schwarzbach, F.; Schulze, A.; Gerth, Ph.; Schäfer, F.

    2013-07-01

    Information systems play an important role in historical research as well as in heritage documentation. As part of a joint research project of the German Archaeological Institute, the Brandenburg University of Technology Cottbus and the Dresden University of Applied Sciences, a web-based documentation system is currently being developed, which can easily be adapted to the needs of different projects with individual scientific concepts, methods and questions. Based on open source and standardized technologies, it will focus on open and well-documented interfaces to ease the dissemination and re-use of its content via web services and to communicate with desktop applications for further evaluation and analysis. The core of the system is a generic data model that represents a wide range of topics and methods of archaeological work. By the provision of a concerted amount of initial themes and attributes, a cross-project analysis of research data will be possible. The development of enhanced search and retrieval functionalities will simplify the processing and handling of large heterogeneous data sets. To achieve a high degree of interoperability with existing external data, systems and applications, standardized interfaces will be integrated. The analysis of spatial data shall be possible through the integration of web-based GIS functions. As an extension to this, customized functions for storage, processing and provision of 3D geodata are being developed. As part of this contribution, system requirements and concepts will be presented and discussed. A particular focus will be on introducing the generic data model and the derived database schema. The research work on enhanced search and retrieval capabilities will be illustrated by prototypical developments, as well as concepts and first implementations for an integrated 2D/3D Web-GIS.

  16. Effects of brine contamination from energy development on wetland macroinvertebrate community structure in the Prairie Pothole Region

    USGS Publications Warehouse

    Preston, Todd M.; Borgreen, Michael J.; Ray, Andrew M.

    2018-01-01

    Wetlands in the Prairie Pothole Region (PPR) of North America support macroinvertebrate communities that are integral to local food webs and important to breeding waterfowl. Macroinvertebrates in PPR wetlands are primarily generalists and well adapted to within and among year changes in water permanence and salinity. The Williston Basin, a major source of U.S. energy production, underlies the southwest portion of the PPR. Development of oil and gas results in the coproduction of large volumes of highly saline, sodium chloride dominated water (brine) and the introduction of brine can alter wetland salinity. To assess potential effects of brine contamination on macroinvertebrate communities, 155 PPR wetlands spanning a range of hydroperiods and salinities were sampled between 2014 and 2016. Brine contamination was documented in 34 wetlands with contaminated wetlands having significantly higher chloride concentrations, specific conductance and percent dominant taxa, and significantly lower taxonomic richness, Shannon diversity, and Pielou evenness scores compared to uncontaminated wetlands. Non-metric multidimensional scaling found significant correlations between several water quality parameters and macroinvertebrate communities. Chloride concentration and specific conductance, which can be elevated in naturally saline wetlands, but are also associated with brine contamination, had the strongest correlations. Five wetland groups were identified from cluster analysis with many of the highly contaminated wetlands located in a single cluster. Low or moderately contaminated wetlands were distributed among the remaining clusters and had macroinvertebrate communities similar to uncontaminated wetlands. While aggregate changes in macroinvertebrate community structure were observed with brine contamination, systematic changes were not evident, likely due to the strong and potentially confounding influence of hydroperiod and natural salinity. Therefore, despite the observed negative response of macroinvertebrate communities to brine contamination, macroinvertebrate community structure alone is likely not the most sensitive indicator of brine contamination in PPR wetlands.

  17. Galaxy Transformations In The Cosmic Web

    NASA Astrophysics Data System (ADS)

    Jablonka, Pascale

    2017-06-01

    In this talk, I present a new survey, the Spatial Extended EDisCS Survey (SEEDisCS), that aims at understanding how clusters assemble and the level at which galaxies are preprocessed before falling on the cluster cores. SEEDisCS therefore focusses on the changes in galaxy properties along the large scale structures surrounding a couple of z ~ 0.5 medium-mass clusters. I first describe how spiral disc stellar populations are affected by the environment, and how we can get constraints on the timescale of star formation quenching. I then present new NOEMA and ALMA CO observations that trace the fate of the galaxy cold gas content along the infalling paths towards the cluster cores.

  18. Generalized Intelligent Framework for Tutoring (GIFT) Cloud/Virtual Open Campus Quick-Start Guide

    DTIC Science & Technology

    2016-03-01

    This document serves as the quick-start guide for GIFT Cloud, the web-based...to users with a GIFT Account at no cost. GIFT Cloud is a new implementation of GIFT. This web-based application allows learners, authors, and... GIFT Cloud is accessed via a web browser. Officially, GIFT Cloud has been tested to work on

  19. Web Application Design Using Server-Side JavaScript

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hampton, J.; Simons, R.

    1999-02-01

    This document describes the application design philosophy for the Comprehensive Nuclear Test Ban Treaty Research & Development Web Site. This design incorporates object-oriented techniques to produce a flexible and maintainable system of applications that support the web site. These techniques will be discussed at length along with the issues they address. The overall structure of the applications and their relationships with one another will also be described. The current problems and future design changes will be discussed as well.

  20. Migration of the ATLAS Metadata Interface (AMI) to Web 2.0 and cloud

    NASA Astrophysics Data System (ADS)

    Odier, J.; Albrand, S.; Fulachier, J.; Lambert, F.

    2015-12-01

    The ATLAS Metadata Interface (AMI), a mature application with more than 10 years of existence, is currently under adaptation to some recently available technologies. The web interfaces, which previously manipulated XML documents using XSL transformations, are being migrated to Asynchronous JavaScript (AJAX). Web development is considerably simplified by the introduction of a framework based on JQuery and Twitter Bootstrap. Finally, the AMI services are being migrated to an OpenStack cloud infrastructure.

  1. An investigation of document aesthetics for web-to-print repurposing of small-medium business marketing collateral

    NASA Astrophysics Data System (ADS)

    Allebach, J. P.; Ortiz Segovia, Maria; Atkins, C. Brian; O'Brien-Strain, Eamonn; Damera-Venkata, Niranjan; Bhatti, Nina; Liu, Jerry; Lin, Qian

    2010-02-01

    Businesses have traditionally relied on different types of media to communicate with existing and potential customers. With the emergence of the Web, the relation between the use of print and electronic media has continually evolved. In this paper, we investigate one possible scenario that combines the use of the Web and print. Specifically, we consider the scenario where a small- or medium-sized business (SMB) has an existing web site from which they wish to pull content to create a print piece. Our assumption is that the web site was developed by a professional designer, working in conjunction with the business owner or marketing team, and that it contains a rich assembly of content that is presented in an aesthetically pleasing manner. Our goal is to understand the process that a designer would follow to create an effective and aesthetically pleasing print piece. We are particularly interested to understand the choices made by the designer with respect to placement and size of the text and graphic elements on the page. Toward this end, we conducted an experiment in which professional designers worked with SMBs to create print pieces from their respective web pages. In this paper, we report our findings from this experiment, and examine the underlying conclusions regarding the resulting document aesthetics in the context of the existing design, engineering, and computer science literatures that address this topic.

  2. Using component technologies for web based wavelet enhanced mammographic image visualization.

    PubMed

    Sakellaropoulos, P; Costaridou, L; Panayiotakis, G

    2000-01-01

    The poor contrast detectability of mammography can be dealt with by domain specific software visualization tools. Remote desktop client access and time performance limitations of a previously reported visualization tool are addressed, aiming at more efficient visualization of mammographic image resources existing in web or PACS image servers. This effort is also motivated by the fact that at present, web browsers do not support domain-specific medical image visualization. To deal with desktop client access the tool was redesigned by exploring component technologies, enabling the integration of stand alone domain specific mammographic image functionality in a web browsing environment (web adaptation). The integration method is based on ActiveX Document Server technology. ActiveX Document is a part of Object Linking and Embedding (OLE) extensible systems object technology, offering new services in existing applications. The standard DICOM 3.0 part 10 compatible image-format specification Papyrus 3.0 is supported, in addition to standard digitization formats such as TIFF. The visualization functionality of the tool has been enhanced by including a fast wavelet transform implementation, which allows for real time wavelet based contrast enhancement and denoising operations. Initial use of the tool with mammograms of various breast structures demonstrated its potential in improving visualization of diagnostic mammographic features. Web adaptation and real time wavelet processing enhance the potential of the previously reported tool in remote diagnosis and education in mammography.
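
    The wavelet-based enhancement mentioned here follows a standard pattern: decompose, shrink detail coefficients, reconstruct. A hedged sketch with PyWavelets, with the wavelet, level, and threshold chosen arbitrarily for illustration (the tool's actual transform and parameters are not documented in this record):

```python
import pywt  # PyWavelets

def wavelet_denoise(image, wavelet="db4", level=3, thresh=10.0):
    """Soft-threshold the detail coefficients of a 2-D wavelet
    decomposition, then reconstruct; a common denoising scheme."""
    coeffs = pywt.wavedec2(image, wavelet, level=level)
    shrunk = [coeffs[0]]                     # keep approximation band as-is
    for details in coeffs[1:]:               # (horizontal, vertical, diagonal)
        shrunk.append(tuple(pywt.threshold(c, thresh, mode="soft")
                            for c in details))
    return pywt.waverec2(shrunk, wavelet)
```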

  3. Readability of ASPS and ASAPS educational web sites: an analysis of consumer impact.

    PubMed

    Aliu, Oluseyi; Chung, Kevin C

    2010-04-01

    Patients use the Internet to educate themselves about health-related topics, and learning about plastic surgery is a common activity for enthusiastic consumers in the United States. How to educate consumers regarding plastic surgical procedures is a continued concern for plastic surgeons when faced with the growing portion of the American population having relatively low health care literacy. The usefulness of health-related education materials on the Internet depends largely on their comprehensibility and understandability for all who visit the Web sites. The authors studied the readability of patient education materials related to common plastic surgery procedures from the American Society of Plastic Surgeons (ASPS) and the American Society for Aesthetic Plastic Surgery (ASAPS) Web sites and compared them with materials on similar topics from 10 popular health information-providing sites. The authors found that all analyzed documents on the ASPS and ASAPS Web sites targeted to the consumers were rated to be more difficult than the recommended reading grade level for most American adults, and these documents were consistently among the most difficult to read when compared with the other health information Web sites. The Internet is an increasingly popular avenue for patients to educate themselves about plastic surgery procedures. Patient education material provided on ASPS and ASAPS Web sites should be written at recommended reading grade levels to ensure that it is readable and comprehensible to the targeted audience.

  4. Characteristics of food industry web sites and "advergames" targeting children.

    PubMed

    Culp, Jennifer; Bell, Robert A; Cassady, Diana

    2010-01-01

    To assess the content of food industry Web sites targeting children by describing strategies used to prolong their visits and foster brand loyalty; and to document health-promoting messages on these Web sites. A content analysis was conducted of Web sites advertised on 2 children's networks, Cartoon Network and Nickelodeon. A total of 290 Web pages and 247 unique games on 19 Internet sites were examined. Games, found on 81% of Web sites, were the most predominant promotion strategy used. All games had at least 1 brand identifier, with logos being most frequently used. On average Web sites contained 1 "healthful" message for every 45 exposures to brand identifiers. Food companies use Web sites to extend their television advertising to promote brand loyalty among children. These sites almost exclusively promoted food items high in sugar and fat. Health professionals need to monitor food industry marketing practices used in "new media." Published by Elsevier Inc.

  5. Accounting Data to Web Interface Using PERL

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hargeaves, C

    2001-08-13

    This document will explain the process to create a web interface for the accounting information generated by the High Performance Storage Systems (HPSS) accounting report feature. The accounting report contains useful data but it is not easily accessed in a meaningful way. The accounting report is the only way to see summarized storage usage information. The first step is to take the accounting data, make it meaningful and store the modified data in persistent databases. The second step is to generate the various user interfaces, HTML pages, that will be used to access the data. The third step is to transfer all required files to the web server. The web pages pass parameters to Common Gateway Interface (CGI) scripts that generate dynamic web pages and graphs. The end result is a web page with specific information presented in text with or without graphs. The accounting report has a specific format that allows the use of regular expressions to verify if a line is storage data. Each storage data line is stored in a detailed database file with a name that includes the run date. The detailed database is used to create a summarized database file that also uses run date in its name. The summarized database is used to create the group.html web page that includes a list of all storage users. Scripts that query the database folder to build a list of available databases generate two additional web pages. A master script that is run monthly as part of a cron job, after the accounting report has completed, manages all of these individual scripts. All scripts are written in the PERL programming language. Whenever possible data manipulation scripts are written as filters. All scripts are written to be single source, which means they will function properly on both the open and closed networks at LLNL. The master script handles the command line inputs for all scripts, file transfers to the web server and records run information in a log file. The rest of the scripts manipulate the accounting data or use the files created to generate HTML pages. Each script will be described in detail herein. The following is a brief description of HPSS taken directly from an HPSS web site: "HPSS is a major development project, which began in 1993 as a Cooperative Research and Development Agreement (CRADA) between government and industry. The primary objective of HPSS is to move very large data objects between high performance computers, workstation clusters, and storage libraries at speeds many times faster than is possible with today's software systems. For example, HPSS can manage parallel data transfers from multiple network-connected disk arrays at rates greater than 1 Gbyte per second, making it possible to access high definition digitized video in real time." The HPSS accounting report is a canned report whose format is controlled by the HPSS developers.
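
    The pipeline described here, regex-verifying storage-data lines and writing date-stamped database files, was implemented in PERL; the following is a rough illustration of the same filtering step in Python (the line format and file names below are invented, since the report layout is not given):

```python
import re
from datetime import date

# Assumed record layout: "user  bytes_stored  file_count" columns.
STORAGE_LINE = re.compile(r"^(\w+)\s+(\d+)\s+(\d+)\s*$")

def extract_storage_lines(report_path):
    """Keep only the lines that match the storage-data pattern."""
    rows = []
    with open(report_path) as fh:
        for line in fh:
            m = STORAGE_LINE.match(line)
            if m:
                user, nbytes, nfiles = m.groups()
                rows.append((user, int(nbytes), int(nfiles)))
    return rows

def write_detailed_db(rows):
    """Write the detailed database file, named for the run date."""
    path = f"detailed-{date.today().isoformat()}.db"
    with open(path, "w") as out:
        for user, nbytes, nfiles in rows:
            out.write(f"{user}\t{nbytes}\t{nfiles}\n")
```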

  6. Searching the world wide Web

    PubMed

    Lawrence; Giles

    1998-04-03

    The coverage and recency of the major World Wide Web search engines was analyzed, yielding some surprising results. The coverage of any one engine is significantly limited: No single engine indexes more than about one-third of the "indexable Web," the coverage of the six engines investigated varies by an order of magnitude, and combining the results of the six engines yields about 3.5 times as many documents on average as compared with the results from only one engine. Analysis of the overlap between pairs of engines gives an estimated lower bound on the size of the indexable Web of 320 million pages.
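
    The lower bound quoted here comes from pairwise-overlap (capture-recapture) reasoning: if two engines sample the Web roughly independently, the total size can be estimated as N ≈ n_a · n_b / overlap. A worked sketch with invented counts (not the study's actual numbers):

```python
def capture_recapture(n_a, n_b, overlap):
    """Lincoln-Petersen-style estimate of total population size
    from two independent samples and their observed overlap."""
    if overlap == 0:
        raise ValueError("no overlap: the estimate is unbounded")
    return n_a * n_b / overlap

# Two engines each indexing ~100M pages, with 32M pages in common,
# imply an indexable Web of roughly 312M pages.
print(f"{capture_recapture(100e6, 100e6, 32e6):.3g}")  # 3.12e+08
```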

  7. Extreme Mergers from the Massive Cluster Survey

    NASA Astrophysics Data System (ADS)

    Morris, Roger

    2010-09-01

    We propose to observe two extraordinary, high-redshift galaxy clusters from the Massive Cluster Survey. Both targets are very rare, triple merger systems (one a nearly co-linear merger), and likely lie at the deepest nodes of the cosmic web. Both targets show multiple strong gravitational lensing arcs in the cluster cores. These targets only possess very short (10ks) Chandra observations, and are unobserved by XMM-Newton. The X-ray data will be used to probe the mass distribution of hot, baryonic gas, and to reveal the details of the merger physics and the process of cluster assembly. We will also search for hints of X-ray emission from filaments between the merging clumps. Subaru and Hubble Space Telescope imaging data are in hand; we request additional HST coverage for one object.

  8. Plasma Physics Calculations on a Parallel Macintosh Cluster

    NASA Astrophysics Data System (ADS)

    Decyk, Viktor; Dauger, Dean; Kokelaar, Pieter

    2000-03-01

    We have constructed a parallel cluster consisting of 16 Apple Macintosh G3 computers running the MacOS, and achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. A subset of the MPI message-passing library was implemented in Fortran77 and C. This library enabled us to port code, without modification, from other parallel processors to the Macintosh cluster. For large problems where message packets are large and relatively few in number, performance of 50-150 MFlops/node is possible, depending on the problem. This is fast enough that 3D calculations can be routinely done. Unlike Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. Full details are available on our web site: http://exodus.physics.ucla.edu/appleseed/.


  10. Compliance Timeline and Applicability Determination for Paper and Other Web Coating National Emission Standards for Hazardous Air Pollutants (NESHAP)

    EPA Pesticide Factsheets

    This February 2003 document contains a diagram of dates and events for compliance with the NESHAP for Paper and Other Web Coating. Also on this page is an April 2004 flow chart to determine if the NESHAP applies to your facility.

  11. 77 FR 33786 - NRC Enforcement Policy Revision

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-06-07

    ... methods: Federal Rulemaking Web site: Go to http://www.regulations.gov and search for Docket ID NRC-2011... search, select ``ADAMS Public Documents'' and then select ``Begin Web-based ADAMS Search.'' For problems... either 2.3.2.a. or b. must be met for the disposition of a violation as an NCV.'' The following new...

  12. Documenting historical data and accessing it on the World Wide Web

    Treesearch

    Malchus B. Baker; Daniel P. Huebner; Peter F. Ffolliott

    2000-01-01

    New computer technologies facilitate the storage, retrieval, and summarization of watershed-based data sets on the World Wide Web. These data sets are used by researchers when testing and validating predictive models, managers when planning and implementing watershed management practices, educators when learning about hydrologic processes, and decisionmakers when...

  13. The Full Monty: Locating Resources, Creating, and Presenting a Web Enhanced History Course.

    ERIC Educational Resources Information Center

    Bazillion, Richard J.; Braun, Connie L.

    2001-01-01

    Discusses how to develop a history course using the World Wide Web; course development software; full text digitized articles, electronic books, primary documents, images, and audio files; and computer equipment such as LCD projectors and interactive whiteboards. Addresses the importance of support for faculty using technology in teaching. (PAL)

  14. Does Interface Matter? A Study of Web Authoring and Editing by Inexperienced Web Writers

    ERIC Educational Resources Information Center

    Dick, Rodney F.

    2006-01-01

    This study explores the complicated nature of the interface as a mediational tool for inexperienced writers as they composed hypertext documents. Because technology can become so quickly and inextricably connected to people's everyday lives, it is essential to explore the effects of these technologies before they become invisible. Because…

  15. Web-Based Interactive Electronic Technical Manual (IETM) Common User Interface Style Guide, Version 2.0

    DTIC Science & Technology

    2003-07-01

    Technical Report WEB-BASED INTERACTIVE ELECTRONIC TECHNICAL MANUAL (IETM) COMMON USER INTERFACE STYLE GUIDE Version 2.0 – July 2003 by L. John Junod... ACKNOWLEDGEMENTS: The principal authors of this document were John Junod – NSWC, Carderock Division, Phil Deuell – AMSEC LLC, Kathleen Moore

  16. The New Frontier: Conquering the World Wide Web by Mule.

    ERIC Educational Resources Information Center

    Gresham, Morgan

    1999-01-01

    Examines effects of teaching hypertext markup language on students' perceptions of class goals in a networked composition classroom. Suggests that sending documents via file transfer protocol by command line and viewing the Web with a textual browser shifted emphasis from writing to coding. Argues that helping students identify a balance between…

  17. World Wide Web Server Standards and Guidelines.

    ERIC Educational Resources Information Center

    Stubbs, Keith M.

    This document defines the specific standards and general guidelines which the U.S. Department of Education (ED) will use to make information available on the World Wide Web (WWW). The purpose of providing such guidance is to ensure high quality and consistent content, organization, and presentation of information on ED WWW servers, in order to…

  18. 78 FR 7818 - Duane Arnold Energy Center; Application for Amendment to Facility Operating License

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-02-04

    ... methods: Federal Rulemaking Web site: Go to http://www.regulations.gov and search for Docket ID NRC-2013... search, select ``ADAMS Public Documents'' and then select ``Begin Web-based ADAMS Search.'' For problems... INFORMATION CONTACT: Karl D. Feintuch, Project Manager, Office of Nuclear Reactor Regulation, U.S. Nuclear...

  19. 77 FR 67837 - Callaway Plant, Unit 1; Application for Amendment to Facility Operating License

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-11-14

    ... methods: Federal Rulemaking Web site: Go to http://www.regulations.gov and search for Docket ID NRC-2012... search, select ``ADAMS Public Documents'' and then select ``Begin Web-based ADAMS Search.'' For problems... INFORMATION CONTACT: Carl F. Lyon, Project Manager, Office of Nuclear Reactor Regulation, U.S. Nuclear...

  20. 78 FR 36315 - Energy Conservation Program: Energy Conservation Standards for Standby Mode and Off Mode for...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-06-17

    ... publicly available, such as information that is exempt from public disclosure. A link to the docket Web.... The regulations.gov Web page will contain simple instructions on how to access all documents...: (202) 287-6307. Email: [email protected] . SUPPLEMENTARY INFORMATION: Table of Contents I. Summary...

  1. Some Thoughts on Free Textbooks

    ERIC Educational Resources Information Center

    Stewart, Robert

    2009-01-01

    The author publishes and freely distributes three online textbooks. "Introduction to Physical Oceanography" is available as a typeset book in Portable Document Format (PDF) or as web pages. "Our Ocean Planet: Oceanography in the 21st Century" and "Environmental Science in the 21st Century" are both available as web pages. All three books, which…

  2. 78 FR 52219 - Notice of Acceptance of Renewal Application for Special Nuclear Materials License From Tennessee...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-08-22

    ... Order Imposing Procedures for Access to Sensitive Unclassified Non-Safeguards Information for Contention... related to the license renewal application using any of the following methods: Federal Rulemaking Web site..., select ``ADAMS Public Documents'' and then select ``Begin Web-based ADAMS Search.'' For problems with...

  3. 76 FR 52357 - Exelon Generation Company, LLC; PSEG Nuclear, LLC; Peach Bottom Atomic Power Station, Unit 3...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-08-22

    ... Amendment to Facility Operating License, Proposed No Significant Hazards Consideration Determination, and Opportunity for a Hearing and Order Imposing Procedures for Document Access to Sensitive Unclassified Non... on the NRC Web site and on the Federal rulemaking Web site, http://www.regulations.gov . Because your...

  4. Engineering a Multi-Purpose Test Collection for Web Retrieval Experiments.

    ERIC Educational Resources Information Center

    Bailey, Peter; Craswell, Nick; Hawking, David

    2003-01-01

    Describes a test collection that was developed as a multi-purpose testbed for experiments on the Web in distributed information retrieval, hyperlink algorithms, and conventional ad hoc retrieval. Discusses inter-server connectivity, integrity of server holdings, inclusion of documents related to a wide spread of likely queries, and distribution of…

  5. The Status of African Studies Digitized Content: Three Metadata Schemes.

    ERIC Educational Resources Information Center

    Kuntz, Patricia S.

    The proliferation of Web pages and digitized material mounted on Internet servers has become unmanageable. Librarians and users are concerned that documents and information are being lost in cyberspace as a result of few bibliographic controls and common standards. Librarians in cooperation with software creators and Web page designers are…

  6. Now That We've Found the "Hidden Web," What Can We Do with It?

    ERIC Educational Resources Information Center

    Cole, Timothy W.; Kaczmarek, Joanne; Marty, Paul F.; Prom, Christopher J.; Sandore, Beth; Shreeves, Sarah

    The Open Archives Initiative (OAI) Protocol for Metadata Harvesting (PMH) is designed to facilitate discovery of the "hidden web" of scholarly information, such as that contained in databases, finding aids, and XML documents. OAI-PMH supports standardized exchange of metadata describing items in disparate collections, such as those…

  7. MetaSpider: Meta-Searching and Categorization on the Web.

    ERIC Educational Resources Information Center

    Chen, Hsinchun; Fan, Haiyan; Chau, Michael; Zeng, Daniel

    2001-01-01

    Discusses the difficulty of locating relevant information on the Web and studies two approaches to addressing the low precision and poor presentation of search results: meta-search and document categorization. Introduces MetaSpider, a meta-search engine, and presents results of a user evaluation study that compared three search engines.…

  8. Finding Information on the World Wide Web: The Retrieval Effectiveness of Search Engines.

    ERIC Educational Resources Information Center

    Pathak, Praveen; Gordon, Michael

    1999-01-01

    Describes a study that examined the effectiveness of eight search engines for the World Wide Web. Calculated traditional information-retrieval measures of recall and precision at varying numbers of retrieved documents to use as the bases for statistical comparisons of retrieval effectiveness. Also examined the overlap between search engines.…
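
    Recall and precision "at varying numbers of retrieved documents", as computed in this study, reduce to simple counts over the top of the ranking. A minimal sketch (names and data are illustrative):

```python
def precision_recall_at_k(ranked, relevant, k):
    """Precision@k and recall@k for one query: `ranked` is the ordered
    retrieval list, `relevant` the set of known relevant documents."""
    hits = sum(1 for d in ranked[:k] if d in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# 3 of the top 5 results are relevant; 6 documents are relevant overall.
print(precision_recall_at_k(
    ["d1", "d2", "d3", "d4", "d5"],
    {"d1", "d3", "d5", "d7", "d8", "d9"}, k=5))  # (0.6, 0.5)
```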

  9. UFOs, NGOs, or IGOs: Using International Documents for General Reference.

    ERIC Educational Resources Information Center

    Shreve, Catherine

    1997-01-01

    Discusses accessing and using documents from international (intergovernmental) organizations. Profiles the United Nations, the European Union and other Intergovernmental Organizations (IGOs). Discusses the librarian as "Web detective," notes questions to focus on, and presents examples to demonstrate navigation of IGO sites. Lists basic…

  10. Relevance of Web Documents:Ghosts Consensus Method.

    ERIC Educational Resources Information Center

    Gorbunov, Andrey L.

    2002-01-01

    Discusses how to improve the quality of Internet search systems and introduces the Ghosts Consensus Method, which is free from the drawbacks of digital democracy algorithms and is based on linear programming tasks. Highlights include vector space models; determining relevant documents; and enriching query terms. (LRW)
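
    The vector space model this record builds on scores documents by the angle between term vectors. A minimal cosine-ranking sketch (this illustrates the generic model only, not the Ghosts Consensus Method's linear-programming machinery):

```python
import math
from collections import Counter

def cosine(q, d):
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(q[t] * d.get(t, 0) for t in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in d.values())))
    return dot / norm if norm else 0.0

docs = [Counter("fuzzy clustering of web pages".split()),
        Counter("cooking recipes for busy weeknights".split())]
query = Counter("web clustering".split())
print(sorted(range(len(docs)), key=lambda i: cosine(query, docs[i]),
             reverse=True))  # document 0 ranks first
```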

  11. Linking to EPA Publications in the National Service Center for Environmental Publications (NSCEP)

    EPA Pesticide Factsheets

    Linking to a document at NSCEP rather than uploading your own copy meets EPA standards and best practices for web content. If you follow this procedure, you can link directly to the PDF document without NSCEP's viewing pane or navigation.

  12. 10 CFR 2.1303 - Availability of documents.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... NUCLEAR REGULATORY COMMISSION RULES OF PRACTICE FOR DOMESTIC LICENSING PROCEEDINGS AND ISSUANCE OF ORDERS Procedures for Hearings on License Transfer Applications § 2.1303 Availability of documents. Unless exempt... for a license transfer requiring Commission approval will be placed at the NRC Web site, http://www...

  13. 10 CFR 2.1303 - Availability of documents.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... NUCLEAR REGULATORY COMMISSION RULES OF PRACTICE FOR DOMESTIC LICENSING PROCEEDINGS AND ISSUANCE OF ORDERS Procedures for Hearings on License Transfer Applications § 2.1303 Availability of documents. Unless exempt... for a license transfer requiring Commission approval will be placed at the NRC Web site, http://www...

  14. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goldsmith, John E. M.; Brennan, James S.; Brubaker, Erik

    A wide range of NSC (Neutron Scatter Camera) activities were conducted under this lifecycle plan. This document outlines the highlights of those activities, broadly characterized as system improvements, laboratory measurements, and deployments, and presents sample results in these areas. Additional information can be found in the documents that reside in WebPMIS.

  15. COGNAT: a web server for comparative analysis of genomic neighborhoods.

    PubMed

    Klimchuk, Olesya I; Konovalov, Kirill A; Perekhvatov, Vadim V; Skulachev, Konstantin V; Dibrova, Daria V; Mulkidjanian, Armen Y

    2017-11-22

    In prokaryotic genomes, functionally coupled genes can be organized in conserved gene clusters enabling their coordinated regulation. Such clusters could contain one or several operons, which are groups of co-transcribed genes. Those genes that evolved from a common ancestral gene by speciation (i.e. orthologs) are expected to have similar genomic neighborhoods in different organisms, whereas those copies of the gene that are responsible for dissimilar functions (i.e. paralogs) could be found in dissimilar genomic contexts. Comparative analysis of genomic neighborhoods facilitates the prediction of co-regulated genes and helps to discern different functions in large protein families. Building on the attribution of gene sequences to the clusters of orthologous groups of proteins (COGs), we intended to provide a method for visualization and comparative analysis of genomic neighborhoods of evolutionarily related genes, as well as a respective web server. Here we introduce the COmparative Gene Neighborhoods Analysis Tool (COGNAT), a web server for comparative analysis of genomic neighborhoods. The tool is based on the COG database, as well as the Pfam protein families database. As an example, we show the utility of COGNAT in identifying a new type of membrane protein complex that is formed by paralog(s) of one of the membrane subunits of the NADH:quinone oxidoreductase of type 1 (COG1009) and a cytoplasmic protein of unknown function (COG3002). This article was reviewed by Drs. Igor Zhulin, Uri Gophna and Igor Rogozin.

  16. Human use regulatory affairs advisor (HURAA): learning about research ethics with intelligent learning modules.

    PubMed

    Hu, Xiangen; Graesser, Arthur C

    2004-05-01

    The Human Use Regulatory Affairs Advisor (HURAA) is a Web-based facility that provides help and training on the ethical use of human subjects in research, based on documents and regulations in United States federal agencies. HURAA has a number of standard features of conventional Web facilities and computer-based training, such as hypertext, multimedia, help modules, glossaries, archives, links to other sites, and page-turning didactic instruction. HURAA also has these intelligent features: (1) an animated conversational agent that serves as a navigational guide for the Web facility, (2) lessons with case-based and explanation-based reasoning, (3) document retrieval through natural language queries, and (4) a context-sensitive Frequently Asked Questions segment, called Point & Query. This article describes the functional learning components of HURAA, specifies its computational architecture, and summarizes empirical tests of the facility on learners.

  17. Semantic enrichment of medical forms - semi-automated coding of ODM-elements via web services.

    PubMed

    Breil, Bernhard; Watermann, Andreas; Haas, Peter; Dziuballe, Philipp; Dugas, Martin

    2012-01-01

    Semantic interoperability is an unsolved problem which occurs while working with medical forms from different information systems or institutions. Standards like ODM or CDA assure structural homogenization, but in order to compare elements from different data models it is necessary to use semantic concepts and codes on an item level of those structures. We developed and implemented a web-based tool which enables a domain expert to perform semi-automated coding of ODM files. For each item it is possible to query web services which result in unique concept codes without leaving the context of the document. Although it was not feasible to perform totally automated coding, we have implemented a dialog-based method to perform an efficient coding of all data elements in the context of the whole document. The proportion of codable items was comparable to results from previous studies.

  18. Web Content Management Systems: An Analysis of Forensic Investigatory Challenges.

    PubMed

    Horsman, Graeme

    2018-02-26

    With an increase in the creation and maintenance of personal websites, web content management systems are now frequently utilized. Such systems offer a low cost and simple solution for those seeking to develop an online presence, and subsequently, a platform from which reported defamatory content, abuse, and copyright infringement has been witnessed. This article provides an introductory forensic analysis of the three most popular web content management systems currently available: WordPress, Drupal, and Joomla! Test platforms have been created, and their site structures have been examined to provide guidance for forensic practitioners facing investigations of this type. Results document available metadata for establishing site ownership, user interactions, and stored content following analysis of artifacts including WordPress's wp_users and wp_comments tables, Drupal's "watchdog" records, and Joomla!'s _users and _content tables. Finally, investigatory limitations documenting the difficulties of investigating WCMS usage are noted, and analysis recommendations are offered. © 2018 American Academy of Forensic Sciences.
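
    The tables this record names (e.g., WordPress's wp_users) hold the ownership metadata in question. As a purely illustrative sketch, assuming the examiner has exported the table into a local SQLite file named site.db (live WordPress installs typically use MySQL; the column names follow WordPress's documented schema):

```python
import sqlite3

# Assumption: wp_users has been exported from the seized MySQL dump
# into "site.db" for offline examination.
conn = sqlite3.connect("site.db")
for login, email, registered in conn.execute(
        "SELECT user_login, user_email, user_registered FROM wp_users"):
    print(f"{login:20} {email:30} registered {registered}")
conn.close()
```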

  19. Documentation of Heritage Structures Through Geo-Crowdsourcing and Web-Mapping

    NASA Astrophysics Data System (ADS)

    Dhonju, H. K.; Xiao, W.; Shakya, B.; Mills, J. P.; Sarhosis, V.

    2017-09-01

    Heritage documentation has become increasingly urgent due to both natural impacts and human influences. The documentation of countless heritage sites around the globe is a massive project that requires significant amounts of financial and labour resources. With the concepts of volunteered geographic information (VGI) and citizen science, heritage data such as digital photographs can be collected through online crowd participation. Whilst photographs are not strictly geographic data, they can be geo-tagged by the participants. They can also be automatically geo-referenced into a global coordinate system if collected via mobile phones which are now ubiquitous. With the assistance of web-mapping, an online geo-crowdsourcing platform has been developed to collect and display heritage structure photographs. Details of platform development are presented in this paper. The prototype is demonstrated with several heritage examples. Potential applications and advancements are discussed.

  20. E-Portfolio Web-based for Students’ Internship Program Activities

    NASA Astrophysics Data System (ADS)

    Juhana, A.; Abdullah, A. G.; Somantri, M.; Aryadi, S.; Zakaria, D.; Amelia, N.; Arasid, W.

    2018-02-01

    The internship program is an important part of the vocational education process to improve the quality of competent graduates. A complete work documentation process on an electronic portfolio (e-Portfolio) platform will facilitate students in reporting the results of their work to both university and industry supervisors. The purpose of this research is to create a more easily accessed e-Portfolio which is appropriate for students’ and supervisors’ needs in documenting work and monitoring progress. The method used in this research is fundamental research. This research is focused on the implementation of internship e-Portfolio features by demonstrating them to students who have completed the internship program. The result of this research is a web-based e-Portfolio which can be used to facilitate students in documenting the results of their work and to aid supervisors in the monitoring process during the internship.

  1. DelPhi web server v2: incorporating atomic-style geometrical figures into the computational protocol.

    PubMed

    Smith, Nicholas; Witham, Shawn; Sarkar, Subhra; Zhang, Jie; Li, Lin; Li, Chuan; Alexov, Emil

    2012-06-15

    A new edition of the DelPhi web server, DelPhi web server v2, is released to include atomic-style presentation of geometrical figures. These geometrical objects can be used to model nano-sized objects together with real biological macromolecules. The position and size of the objects can be manipulated by the user in real time until the desired results are achieved. The server fixes structural defects, adds hydrogen atoms and calculates electrostatic energies and the corresponding electrostatic potential and ionic distributions. The web server follows a client-server architecture built on PHP and HTML and utilizes the DelPhi software. The computation is carried out on a supercomputer cluster and results are returned to the user via the HTTP protocol, including the ability to visualize the structure and the corresponding electrostatic potential via a Jmol implementation. The DelPhi web server is available at http://compbio.clemson.edu/delphi_webserver.
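
    The client-server flow described above can be pictured with the following sketch of a programmatic client; the submission route, form-field names and response shape are invented for illustration and do not document the server's actual API.

        # Hypothetical client for an HTTP submit-and-poll workflow.
        import time
        import requests

        SERVER = "http://compbio.clemson.edu/delphi_webserver"

        def submit_structure(pdb_path, interior_eps=2.0, exterior_eps=80.0):
            with open(pdb_path, "rb") as fh:
                resp = requests.post(f"{SERVER}/submit",  # illustrative route
                                     files={"structure": fh},
                                     data={"indi": interior_eps, "exdi": exterior_eps})
            resp.raise_for_status()
            job_id = resp.json()["job_id"]  # assumed response shape
            while True:  # poll until the cluster job finishes
                status = requests.get(f"{SERVER}/status/{job_id}").json()
                if status["state"] == "done":
                    return status["potential_map_url"]
                time.sleep(30)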

  2. LigSearch: a knowledge-based web server to identify likely ligands for a protein target

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Beer, Tjaart A. P. de; Laskowski, Roman A.; Duban, Mark-Eugene

    LigSearch is a web server aimed at predicting ligands that might bind to and stabilize a given protein. Identifying which ligands might bind to a protein before crystallization trials could provide a significant saving in time and resources. Using a protein sequence and/or structure, the system searches against a variety of databases, combining available knowledge, and provides a clustered and ranked output of possible ligands. LigSearch can be accessed at http://www.ebi.ac.uk/thornton-srv/databases/LigSearch.

  3. Locating and Searching Electronic Documents: A User Study of Supply Publications in the United States Marine Corps

    DTIC Science & Technology

    2007-12-01

    Boyle, “Important issues in hypertext documentation usability,” in Proceedings of the 9th Annual International Conference on Systems Documentation (Chicago, Illinois, 1991). SIGDOC '91. ACM, New York, NY… “…Tufte’s principles of information design to creating effective Web sites,” in Proceedings of the 15th Annual International Conference on Computer…

  4. Internet printing

    NASA Astrophysics Data System (ADS)

    Rahgozar, M. Armon; Hastings, Tom; McCue, Daniel L.

    1997-04-01

    The Internet is rapidly changing the traditional means of creation, distribution and retrieval of information. Today, information publishers leverage the capabilities provided by Internet technologies to rapidly communicate information to a much wider audience in unique, customized ways. As a result, the volume of published content has been increasing astronomically. This, in addition to the ease of distribution afforded by the Internet, has resulted in more and more documents being printed. This paper introduces several axes along which Internet printing may be examined and addresses some of the technological challenges that lie ahead. Some of these axes include: (1) submission--the use of Internet protocols for selecting printers and submitting documents for print; (2) administration--the management and monitoring of printing engines and other print resources via Web pages; and (3) formats--printing document formats whose spectrum now includes HTML documents with simple text, layout-enhanced documents with style sheets, and documents that contain audio, graphics and other active objects, as well as the existing desktop and PDL formats. The format axis of Internet printing becomes even more exciting when one considers that Web documents are inherently compound and that traversal into their various pieces may uncover various formats. The paper also examines some imaging-specific issues that are paramount to Internet printing. These include formats and structures for representing raster documents and images, compression, font rendering and color spaces.

  5. Old document image segmentation using the autocorrelation function and multiresolution analysis

    NASA Astrophysics Data System (ADS)

    Mehri, Maroua; Gomez-Krämer, Petra; Héroux, Pierre; Mullot, Rémy

    2013-01-01

    Recent progress in the digitization of heterogeneous collections of ancient documents has rekindled new challenges in information retrieval in digital libraries and document layout analysis. Therefore, in order to control the quality of historical document image digitization and to meet the need of a characterization of their content using intermediate-level metadata (between image and document structure), we propose a fast automatic layout segmentation of old document images based on five descriptors. Those descriptors, based on the autocorrelation function, are obtained by multiresolution analysis and used afterwards in a specific clustering method. The method proposed in this article has the advantage that it is performed without any hypothesis on the document structure, either about the document model (physical structure) or the typographical parameters (logical structure). It is also parameter-free since it automatically adapts to the image content. In this paper, firstly, we detail our proposal to characterize the content of old documents by extracting the autocorrelation features in the different areas of a page and at several resolutions. Then, we show that it is possible to automatically find the homogeneous regions defined by similar indices of autocorrelation, without knowledge of the number of clusters, using adapted hierarchical ascendant classification and consensus clustering approaches. To assess our method, we apply our algorithm to 316 old document images, which encompass six centuries (1200-1900) of French history, in order to demonstrate the performance of our proposal in terms of segmentation and characterization of heterogeneous corpus content. Moreover, we define a new evaluation metric, the homogeneity measure, which aims at evaluating the segmentation and characterization accuracy of our methodology. We obtain a mean homogeneity accuracy of 85%. These results help to represent a document by a hierarchy of layout structure and content, and to define one or more signatures for each page, on the basis of a hierarchical representation of homogeneous blocks and their topology.
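
    A compressed sketch of this pipeline is given below: each page block yields autocorrelation features (here computed via the FFT and subsampled at a few scales as a crude stand-in for the full multiresolution analysis), and the blocks are then grouped by hierarchical ascendant classification with a distance threshold rather than a preset number of clusters. Block size, scales and threshold are illustrative choices, not the paper's parameters.

        import numpy as np
        from scipy.cluster.hierarchy import fcluster, linkage

        def autocorr_features(block):
            """Summary statistics of the block's normalized autocorrelation."""
            f = np.fft.fft2(block - block.mean())
            ac = np.fft.ifft2(f * np.conj(f)).real  # Wiener-Khinchin theorem
            ac /= ac.flat[0] + 1e-12                # normalize by zero-lag energy
            h, w = ac.shape
            return [ac[0, :w // 2].mean(),          # horizontal decay (text lines)
                    ac[:h // 2, 0].mean(),          # vertical decay (line spacing)
                    ac.std()]

        def segment(page, block=64, scales=(1, 2, 4)):
            feats, cells = [], []
            for y in range(0, page.shape[0] - block + 1, block):
                for x in range(0, page.shape[1] - block + 1, block):
                    v = []
                    for s in scales:  # crude multiresolution via subsampling
                        v += autocorr_features(page[y:y + block:s, x:x + block:s])
                    feats.append(v)
                    cells.append((y, x))
            # Threshold-driven clustering: no prior knowledge of cluster count.
            labels = fcluster(linkage(np.asarray(feats), method="ward"),
                              t=2.0, criterion="distance")
            return dict(zip(cells, labels))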

  6. Data Interactive Publications

    NASA Astrophysics Data System (ADS)

    Domenico, B.; Weber, J.

    2012-04-01

    For some years now, the authors have developed examples of online documents that allowed the reader to interact directly with datasets, but there were limitations that restricted the interaction to specific desktop analysis and display tools that were not generally available to all readers of the documents. Recent advances in web service technology and related standards are making it possible to develop systems for publishing online documents that enable readers to access, analyze, and display the data discussed in the publication from the perspective and in the manner from which the author wants it to be represented. By clicking on embedded links, the reader accesses not only the usual textual information in a publication, but also data residing on a local or remote web server as well as a set of processing tools for analyzing and displaying the data. With the option of having the analysis and display processing provided on the server (or in the cloud), there are now a broader set of possibilities on the client side where the reader can interact with the data via a thin web client, a rich desktop application, or a mobile platform "app." The presentation will outline the architecture of data interactive publications along with illustrative examples.

  7. Data Interactive Publications Revisited

    NASA Astrophysics Data System (ADS)

    Domenico, B.; Weber, W. J.

    2011-12-01

    A few years back, the authors presented examples of online documents that allowed the reader to interact directly with datasets, but there were limitations that restricted the interaction to specific desktop analysis and display tools that were not generally available to all readers of the documents. Recent advances in web service technology and related standards are making it possible to develop systems for publishing online documents that enable readers to access, analyze, and display the data discussed in the publication from the perspective and in the manner from which the author wants it to be represented. By clicking on embedded links, the reader accesses not only the usual textual information in a publication, but also data residing on a local or remote web server as well as a set of processing tools for analyzing and displaying the data. With the option of having the analysis and display processing provided on the server, there are now a broader set of possibilities on the client side where the reader can interact with the data via a thin web client, a rich desktop application, or a mobile platform "app." The presentation will outline the architecture of data interactive publications along with illustrative examples.

  8. Scientometric and patentometric analyses to determine the knowledge landscape in innovative technologies: The case of 3D bioprinting

    PubMed Central

    2017-01-01

    This research proposes an innovative data model to determine the landscape of emerging technologies. It is based on a competitive technology intelligence methodology that incorporates the assessment of scientific publications and patent analysis production, and is further supported by experts’ feedback. It enables the definition of the growth rate of scientific and technological output in terms of the top countries, institutions and journals producing knowledge within the field as well as the identification of main areas of research and development by analyzing the International Patent Classification codes including keyword clusterization and co-occurrence of patent assignees and patent codes. This model was applied to the evolving domain of 3D bioprinting. Scientific documents from the Scopus and Web of Science databases, along with patents from 27 authorities and 140 countries, were retrieved. In total, 4782 scientific publications and 706 patents were identified from 2000 to mid-2016. The number of scientific documents published and patents in the last five years showed an annual average growth of 20% and 40%, respectively. Results indicate that the most prolific nations and institutions publishing on 3D bioprinting are the USA and China, including the Massachusetts Institute of Technology (USA), Nanyang Technological University (Singapore) and Tsinghua University (China), respectively. Biomaterials and Biofabrication are the predominant journals. The most prolific patenting countries are China and the USA; while Organovo Holdings Inc. (USA) and Tsinghua University (China) are the leading institutions. International Patent Classification codes reveal that most 3D bioprinting inventions intended for medical purposes apply porous or cellular materials or biologically active materials. Knowledge clusters and expert drivers indicate that there is a research focus on tissue engineering including the fabrication of organs, bioinks and new 3D bioprinting systems. Our model offers a guide to researchers to understand the knowledge production of pioneering technologies, in this case 3D bioprinting. PMID:28662187

  9. Scientometric and patentometric analyses to determine the knowledge landscape in innovative technologies: The case of 3D bioprinting.

    PubMed

    Rodríguez-Salvador, Marisela; Rio-Belver, Rosa María; Garechana-Anacabe, Gaizka

    2017-01-01

    This research proposes an innovative data model to determine the landscape of emerging technologies. It is based on a competitive technology intelligence methodology that incorporates the assessment of scientific publications and patent analysis production, and is further supported by experts' feedback. It enables the definition of the growth rate of scientific and technological output in terms of the top countries, institutions and journals producing knowledge within the field as well as the identification of main areas of research and development by analyzing the International Patent Classification codes including keyword clusterization and co-occurrence of patent assignees and patent codes. This model was applied to the evolving domain of 3D bioprinting. Scientific documents from the Scopus and Web of Science databases, along with patents from 27 authorities and 140 countries, were retrieved. In total, 4782 scientific publications and 706 patents were identified from 2000 to mid-2016. The number of scientific documents published and patents in the last five years showed an annual average growth of 20% and 40%, respectively. Results indicate that the most prolific nations and institutions publishing on 3D bioprinting are the USA and China, including the Massachusetts Institute of Technology (USA), Nanyang Technological University (Singapore) and Tsinghua University (China), respectively. Biomaterials and Biofabrication are the predominant journals. The most prolific patenting countries are China and the USA; while Organovo Holdings Inc. (USA) and Tsinghua University (China) are the leading institutions. International Patent Classification codes reveal that most 3D bioprinting inventions intended for medical purposes apply porous or cellular materials or biologically active materials. Knowledge clusters and expert drivers indicate that there is a research focus on tissue engineering including the fabrication of organs, bioinks and new 3D bioprinting systems. Our model offers a guide to researchers to understand the knowledge production of pioneering technologies, in this case 3D bioprinting.
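
    Two of the measurements in this model, the annual growth rate of output and the co-occurrence of International Patent Classification codes, can be sketched as follows; the record structure and the toy values are illustrative stand-ins for a real Scopus/Web of Science or patent export.

        from collections import Counter
        from itertools import combinations

        patents = [  # toy records standing in for a real patent export
            {"year": 2014, "ipc": ["B33Y", "A61L"]},
            {"year": 2015, "ipc": ["B33Y", "C12N"]},
            {"year": 2015, "ipc": ["A61L", "C12N", "B33Y"]},
        ]

        def mean_annual_growth(records):
            per_year = Counter(p["year"] for p in records)
            years = sorted(per_year)
            rates = [(per_year[b] - per_year[a]) / per_year[a]
                     for a, b in zip(years, years[1:])]
            return sum(rates) / len(rates)

        def ipc_cooccurrence(records):
            pairs = Counter()
            for p in records:
                pairs.update(combinations(sorted(set(p["ipc"])), 2))
            return pairs

        print(mean_annual_growth(patents))               # 1.0, i.e. 100% growth
        print(ipc_cooccurrence(patents).most_common(3))  # most linked IPC pairs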

  10. WebMOTIFS: automated discovery, filtering and scoring of DNA sequence motifs using multiple programs and Bayesian approaches

    PubMed Central

    Romer, Katherine A.; Kayombya, Guy-Richard; Fraenkel, Ernest

    2007-01-01

    WebMOTIFS provides a web interface that facilitates the discovery and analysis of DNA-sequence motifs. Several studies have shown that the accuracy of motif discovery can be significantly improved by using multiple de novo motif discovery programs and using randomized control calculations to identify the most significant motifs or by using Bayesian approaches. WebMOTIFS makes it easy to apply these strategies. Using a single submission form, users can run several motif discovery programs and score, cluster and visualize the results. In addition, the Bayesian motif discovery program THEME can be used to determine the class of transcription factors that is most likely to regulate a set of sequences. Input can be provided as a list of gene or probe identifiers. Used with the default settings, WebMOTIFS accurately identifies biologically relevant motifs from diverse data in several species. WebMOTIFS is freely available at http://fraenkel.mit.edu/webmotifs. PMID:17584794
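
    The randomized-control idea mentioned above can be illustrated with a deliberately simple sketch: a motif is scored by how much more often it matches the real sequences than shuffled versions of them. The consensus-string matching and the shuffling scheme are simplifications, not WebMOTIFS's actual statistics.

        import random
        import re

        def count_hits(motif, seqs):
            return sum(len(re.findall(motif, s)) for s in seqs)

        def motif_enrichment(motif, seqs, trials=100, seed=0):
            rng = random.Random(seed)
            observed = count_hits(motif, seqs)
            null = []
            for _ in range(trials):  # shuffled sequences form the control set
                shuffled = ["".join(rng.sample(s, len(s))) for s in seqs]
                null.append(count_hits(motif, shuffled))
            exceed = sum(1 for n in null if n >= observed)
            return observed, (exceed + 1) / (trials + 1)  # empirical p-value

        seqs = ["ACGTGACGTGTTTT", "GGACGTGAACGTGA", "TTACGTGTT"]
        print(motif_enrichment("ACGTG", seqs))  # (hit count, p-value)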

  11. Alzforum and SWAN: the present and future of scientific web communities.

    PubMed

    Clark, Tim; Kinoshita, June

    2007-05-01

    Scientists drove the early development of the World Wide Web, primarily as a means for rapid communication, document sharing and data access. They have been far slower to adopt the web as a medium for building research communities. Yet, web-based communities hold great potential for accelerating the pace of scientific research. In this article, we will describe the 10-year experience of the Alzheimer Research Forum ('Alzforum'), a unique example of a thriving scientific web community, and explain the features that contributed to its success. We will then outline the SWAN (Semantic Web Applications in Neuromedicine) project, in which Alzforum curators are collaborating with informatics researchers to develop novel approaches that will enable communities to share richly contextualized information about scientific data, claims and hypotheses.

  12. SMART (Shop floor Modeling, Analysis and Reporting Tool) Project

    NASA Technical Reports Server (NTRS)

    Centeno, Martha A.; Garcia, Maretys L.; Mendoza, Alicia C.; Molina, Louis A.; Correa, Daisy; Wint, Steve; Doice, Gregorie; Reyes, M. Florencia

    1999-01-01

    This document summarizes the design and prototype of the Shop floor Modeling, Analysis, and Reporting Tool (S.M.A.R.T.). A detailed description is found in the full documentation given to the NASA liaison. This documentation is also found on the A.R.I.S.E. Center web site, under a protected directory. Only authorized users can gain access to this site.

  13. Web-Based Evaluation System to Measure Learning Effectiveness in Kampo Medicine

    PubMed Central

    Usuku, Koichiro; Segawa, Makoto; Wang, Yue; Ogashiwa, Kahori; Fujita, Yusuke; Ogihara, Hiroyuki; Tazuma, Susumu

    2016-01-01

    Measuring the learning effectiveness of Kampo Medicine (KM) education is challenging. The aim of this study was to develop a web-based test to measure the learning effectiveness of KM education among medical students (MSs). We used an open-source Moodle platform to test 30 multiple-choice questions classified into 8-type fields (eight basic concepts of KM) including “qi-blood-fluid” and “five-element” theories, on 117 fourth-year MSs. The mean (±standard deviation [SD]) score on the web-based test was 30.2 ± 11.9 (/100). The correct answer rate ranged from 17% to 36%. A pattern-based portfolio enabled these rates to be individualized in terms of KM proficiency. MSs with scores higher (n = 19) or lower (n = 14) than mean ± 1SD were defined as high or low achievers, respectively. Cluster analysis using the correct answer rates for the 8-type field questions revealed clear divisions between high and low achievers. Interestingly, each high achiever had a different proficiency pattern. In contrast, three major clusters were evident among low achievers, all of whom responded with a low percentage of or no correct answers. In addition, a combination of three questions accurately classified high and low achievers. These findings suggest that our web-based test allows individual quantitative assessment of the learning effectiveness of KM education among MSs. PMID:27738440

  14. Web-Based Evaluation System to Measure Learning Effectiveness in Kampo Medicine.

    PubMed

    Iizuka, Norio; Usuku, Koichiro; Nakae, Hajime; Segawa, Makoto; Wang, Yue; Ogashiwa, Kahori; Fujita, Yusuke; Ogihara, Hiroyuki; Tazuma, Susumu; Hamamoto, Yoshihiko

    2016-01-01

    Measuring the learning effectiveness of Kampo Medicine (KM) education is challenging. The aim of this study was to develop a web-based test to measure the learning effectiveness of KM education among medical students (MSs). We used an open-source Moodle platform to test 30 multiple-choice questions classified into 8-type fields (eight basic concepts of KM) including "qi-blood-fluid" and "five-element" theories, on 117 fourth-year MSs. The mean (±standard deviation [SD]) score on the web-based test was 30.2 ± 11.9 (/100). The correct answer rate ranged from 17% to 36%. A pattern-based portfolio enabled these rates to be individualized in terms of KM proficiency. MSs with scores higher (n = 19) or lower (n = 14) than mean ± 1SD were defined as high or low achievers, respectively. Cluster analysis using the correct answer rates for the 8-type field questions revealed clear divisions between high and low achievers. Interestingly, each high achiever had a different proficiency pattern. In contrast, three major clusters were evident among low achievers, all of whom responded with a low percentage of or no correct answers. In addition, a combination of three questions accurately classified high and low achievers. These findings suggest that our web-based test allows individual quantitative assessment of the learning effectiveness of KM education among MSs.
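
    The clustering step reported above can be reproduced in miniature as follows: rows of a score matrix hold each student's correct-answer rates on the eight concept fields, and hierarchical clustering separates the response patterns. The matrix here is synthetic, not the study's data.

        import numpy as np
        from scipy.cluster.hierarchy import fcluster, linkage

        rng = np.random.default_rng(1)
        rates = np.vstack([rng.uniform(0.6, 1.0, (5, 8)),   # high-achiever rows
                           rng.uniform(0.0, 0.2, (5, 8))])  # low-achiever rows

        Z = linkage(rates, method="ward")  # agglomerative (ascendant) clustering
        clusters = fcluster(Z, t=2, criterion="maxclust")
        for student, label in enumerate(clusters):
            print(f"student {student}: cluster {label}")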

  15. Improving the Accuracy of Attribute Extraction using the Relatedness between Attribute Values

    NASA Astrophysics Data System (ADS)

    Bollegala, Danushka; Tani, Naoki; Ishizuka, Mitsuru

    Extracting attribute-values related to entities from web texts is an important step in numerous web related tasks such as information retrieval, information extraction, and entity disambiguation (namesake disambiguation). For example, for a search query that contains a personal name, we can not only return documents that contain that personal name, but if we have attribute-values such as the organization for which that person works, we can also suggest documents that contain information related to that organization, thereby improving the user's search experience. Despite numerous potential applications of attribute extraction, it remains a challenging task due to the inherent noise in web data -- often a single web page contains multiple entities and attributes. We propose a graph-based approach to select the correct attribute-values from a set of candidate attribute-values extracted for a particular entity. First, we build an undirected weighted graph in which attribute-values are represented by nodes and an edge connecting two nodes represents the degree of relatedness between the corresponding attribute-values. Next, we find the maximum spanning tree of this graph that connects exactly one attribute-value for each attribute-type. The proposed method outperforms previously proposed attribute extraction methods on a dataset that contains 5000 web pages.
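
    The selection step can be sketched with networkx: candidate attribute-values become nodes, relatedness scores become edge weights, and a maximum spanning tree links the mutually most related candidates. The scores below are invented, and the one-value-per-attribute-type constraint is approximated by a post-filter rather than enforced during tree construction as in the paper.

        import networkx as nx

        candidates = {  # attribute-type -> candidate values for one entity
            "organization": ["Univ. of Tokyo", "ACME Corp."],
            "occupation":   ["professor", "actor"],
        }
        relatedness = {  # hypothetical pairwise relatedness scores
            ("Univ. of Tokyo", "professor"): 0.9,
            ("ACME Corp.", "actor"): 0.4,
            ("Univ. of Tokyo", "actor"): 0.1,
            ("ACME Corp.", "professor"): 0.2,
        }

        G = nx.Graph()
        for (u, v), w in relatedness.items():
            G.add_edge(u, v, weight=w)
        tree = nx.maximum_spanning_tree(G)

        # Keep, per attribute type, the candidate best connected in the tree.
        for attr, values in candidates.items():
            best = max(values, key=lambda v: tree.degree(v, weight="weight"))
            print(attr, "->", best)  # organization -> Univ. of Tokyo, occupation -> professor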

  16. Exchanging the Context between OGC Geospatial Web clients and GIS applications using Atom

    NASA Astrophysics Data System (ADS)

    Maso, Joan; Díaz, Paula; Riverola, Anna; Pons, Xavier

    2013-04-01

    Currently, the discovery and sharing of geospatial information over the web still presents difficulties. News distribution through website content was simplified by the use of the Really Simple Syndication (RSS) and Atom syndication formats. This communication presents an extension of Atom to redistribute references to geospatial information in a distributed Spatial Data Infrastructure environment. A geospatial client can save the status of an application that involves several OGC services of different kinds, as well as direct data, and share this status with other users who need the same information but use different client vendor products, in an interoperable way. The extensibility of the Atom format was essential to define a format that could be used in RSS-enabled web browsers, mass-market map viewers and emerging geospatially enabled integrated clients that support Open Geospatial Consortium (OGC) services. Since OWS Context has been designed as an Atom extension, it is possible to view the document in common places where Atom documents are valid. Internet web browsers are able to present the document as a list of items with title, abstract, time, description and downloading features. OWS Context uses GeoRSS so that the document can be interpreted by both Google Maps and Bing Maps as items whose extent is represented on a dynamic map. Another way to exploit an OWS Context document is to develop an XSLT to transform the Atom feed into an HTML5 document that shows the exact status of the client view window that saved the context document. To accomplish this, we use the width and height of the client window, and the extent of the view in world (geographic) coordinates, in order to calculate the scale of the map. Then, we can mix elements in world coordinates (such as CF-NetCDF files or GML) with elements in pixel coordinates (such as WMS maps, WMTS tiles and direct SVG content). A smarter map browser application called MiraMon Map Browser is able to write a context document and read it again to recover the context of a previous view, or to load a context generated by another application. The possibility of storing direct links to files in OWS Context is particularly interesting for GIS desktop solutions. This communication also presents the development made in the MiraMon desktop GIS solution to include OWS Context. The MiraMon software is able to deal with local files, web services and database connections. As in any other GIS solution, the MiraMon team designed its own file format (MiraMon Map, MMM) for storing and sharing the status of a GIS session. The new OWS Context format is now adopted as an interoperable substitute for the MMM format. The extensibility of the format makes it possible to map concepts in the MMM to current OWS Context elements (such as titles, data links and extent) and to generate new elements that include all extra metadata not currently covered by OWS Context. These developments were done in the ninth edition of the OpenGIS Web Services Interoperability Experiment (OWS-9) and are demonstrated in this communication.
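
    Two small pieces of the above can be sketched directly: deriving an approximate map scale from the client window size and the geographic extent of the view, and emitting a minimal Atom entry carrying a GeoRSS box. The constants and the entry fields are illustrative; a real OWS Context document carries considerably more structure.

        import xml.etree.ElementTree as ET

        METERS_PER_DEGREE = 111320.0  # rough length of one degree at the equator
        METERS_PER_PIXEL = 0.00028    # 0.28 mm screen pixel, the OGC convention

        def map_scale(width_px, lon_min, lon_max):
            ground_width_m = (lon_max - lon_min) * METERS_PER_DEGREE
            return ground_width_m / (width_px * METERS_PER_PIXEL)

        def atom_entry(title, lat_min, lon_min, lat_max, lon_max):
            ET.register_namespace("", "http://www.w3.org/2005/Atom")
            ET.register_namespace("georss", "http://www.georss.org/georss")
            entry = ET.Element("{http://www.w3.org/2005/Atom}entry")
            ET.SubElement(entry, "{http://www.w3.org/2005/Atom}title").text = title
            box = ET.SubElement(entry, "{http://www.georss.org/georss}box")
            box.text = f"{lat_min} {lon_min} {lat_max} {lon_max}"
            return ET.tostring(entry, encoding="unicode")

        print(round(map_scale(1024, 0.0, 2.1)))  # approx. 1:815,000 view scale
        print(atom_entry("Saved view", 40.5, 0.1, 42.9, 3.3))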

  17. 78 FR 29159 - Electric Power Research Institute; Seismic Evaluation Guidance

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-05-17

    ..., conducted field investigations, and used more recent methods than were previously available. In performing... available, by searching on http://www.regulations.gov under Docket ID NRC-2013-0038. Federal Rulemaking Web... Agencywide Documents Access and Management System (ADAMS): You may access publicly-available documents online...

  18. Storing and Viewing Electronic Documents.

    ERIC Educational Resources Information Center

    Falk, Howard

    1999-01-01

    Discusses the conversion of fragile library materials to computer storage and retrieval to extend the life of the items and to improve accessibility through the World Wide Web. Highlights include entering the images, including scanning; optical character recognition; full text and manual indexing; and available document- and image-management…

  19. EVALUATION OF THE POTENTIAL CARCINOGENICITY OF ELECTROMAGNETIC FIELDS (EXTERNAL REVIEW DRAFT)

    EPA Science Inventory

    The U.S. Environmental Protection Agency (EPA or Agency) is posting on this web site a draft document related to the potential adverse human health effects resulting from exposure to electromagnetic fields (EMF). This document was never finalized after EPA activities were discon...

  20. Space station ECLSS integration analysis: Simplified General Cluster Systems Model, ECLS System Assessment Program enhancements

    NASA Technical Reports Server (NTRS)

    Ferguson, R. E.

    1985-01-01

    The data base verification of the ECLS Systems Assessment Program (ESAP) is documented, and the changes made to enhance the flexibility of the water recovery subsystem simulations are given. All changes made to the data base values are described, as are the software enhancements performed. The refined model documented herein constitutes the submittal of the General Cluster Systems Model. A source listing of the current version of ESAP is provided in Appendix A.
