Stocker, Gernot; Rieder, Dietmar; Trajanoski, Zlatko
2004-03-22
ClusterControl is a web interface that simplifies distributing and monitoring bioinformatics applications on Linux cluster systems. We have developed a modular concept that enables integration of command-line-oriented programs into the application framework of ClusterControl. The system facilitates the integration of different applications, accessed through one interface and executed on a distributed cluster system. The package is based on freely available technologies: Apache as web server, PHP as server-side scripting language and OpenPBS as queuing system. It is available free of charge for academic and non-profit institutions at http://genome.tugraz.at/Software/ClusterControl
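As a hedged illustration of the pattern this abstract describes (a web layer handing command-line jobs to an OpenPBS queue), the sketch below wraps a command in a PBS script and submits it with qsub. It assumes the OpenPBS qsub client is installed; the queue name, resource line, and BLAST command are illustrative choices, not ClusterControl's actual code.

```python
# Minimal sketch (not ClusterControl's actual code): submitting a command-line
# tool to an OpenPBS queue from server-side code. Queue name and command are
# illustrative assumptions.
import subprocess
import tempfile

def submit_pbs_job(command: str, queue: str = "batch") -> str:
    """Write a PBS script wrapping `command` and submit it with qsub."""
    script = f"""#!/bin/sh
#PBS -q {queue}
#PBS -l nodes=1
{command}
"""
    with tempfile.NamedTemporaryFile("w", suffix=".pbs", delete=False) as f:
        f.write(script)
        path = f.name
    # qsub prints the new job identifier on stdout
    result = subprocess.run(["qsub", path], capture_output=True, text=True, check=True)
    return result.stdout.strip()

job_id = submit_pbs_job("blastall -p blastp -i query.fa -d nr -o out.txt")
print("submitted:", job_id)
```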
Web Program for Development of GUIs for Cluster Computers
NASA Technical Reports Server (NTRS)
Czikmantory, Akos; Cwik, Thomas; Klimeck, Gerhard; Hua, Hook; Oyafuso, Fabiano; Vinyard, Edward
2003-01-01
WIGLAF (a Web Interface Generator and Legacy Application Facade) is a computer program that provides a Web-based, distributed, graphical-user-interface (GUI) framework that can be adapted to any of a broad range of application programs, written in any programming language, that are executed remotely on any cluster computer system. WIGLAF enables the rapid development of a GUI for controlling and monitoring a specific application program running on the cluster and for transferring data to and from the application program. The only prerequisite for the execution of WIGLAF is a Web-browser program on a user's personal computer connected with the cluster via the Internet. WIGLAF has a client/server architecture: The server component is executed on the cluster system, where it controls the application program and serves data to the client component. The client component is an applet that runs in the Web browser. WIGLAF utilizes the Extensible Markup Language to hold all data associated with the application software, Java to enable platform-independent execution on the cluster system and the display of a GUI generator through the browser, and the Java Remote Method Invocation software package to provide simple, effective client/server networking.
The design and implementation of web mining in web sites security
NASA Astrophysics Data System (ADS)
Li, Jian; Zhang, Guo-Yin; Gu, Guo-Chang; Li, Jian-Li
2003-06-01
Backdoors or information leaks in Web servers can be detected by applying Web mining techniques to abnormal Web log and Web application log data. In this way the security of Web servers can be enhanced and damage from illegal access avoided. First, a system for discovering patterns of information leakage in CGI scripts from Web log data is proposed. Second, these patterns are provided to system administrators so they can modify their code and enhance Web site security. The following aspects are described: one approach combines the web application log with the web log to extract more information, so that web data mining can discover information that firewalls and intrusion detection systems cannot find; another proposes an operation module for the web site to enhance its security. For cluster server sessions, a density-based clustering technique is used to reduce resource cost and obtain better efficiency.
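A minimal sketch of the density-based clustering step mentioned above, using scikit-learn's DBSCAN on per-session features derived from combined logs. The feature set (request count, error rate, mean inter-request gap) and the parameter values are assumptions for illustration, not the paper's exact design.

```python
# Hedged sketch: cluster per-session features from combined web and
# application logs, then treat noise points as candidates for inspection.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

sessions = np.array([
    # [requests, error_rate, mean_gap_seconds]
    [42, 0.02, 3.1],
    [39, 0.01, 2.8],
    [45, 0.03, 3.5],
    [310, 0.40, 0.2],   # burst of failing CGI requests -> likely probe
])

X = StandardScaler().fit_transform(sessions)
labels = DBSCAN(eps=0.8, min_samples=2).fit_predict(X)
for sess, label in zip(sessions, labels):
    flag = "SUSPECT" if label == -1 else f"cluster {label}"
    print(sess, "->", flag)
```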
Dynamically Allocated Virtual Clustering Management System Users Guide
2016-11-01
This report provides usage instructions for the DAVC (Dynamically Allocated Virtual Clustering) version 2.0 web application. It is separated into the following sections, which detail…
BOWS (bioinformatics open web services) to centralize bioinformatics tools in web services.
Velloso, Henrique; Vialle, Ricardo A; Ortega, J Miguel
2015-06-02
Bioinformaticians face a range of difficulties in getting locally installed tools running and producing results; they would greatly benefit from a system that could centralize most of the tools behind an easy interface for input and output. Web services, due to their universal nature and widely known interface, are a very good option to achieve this goal. Bioinformatics open web services (BOWS) is a system based on generic web services produced to allow programmatic access to applications running on high-performance computing (HPC) clusters. BOWS intermediates the access to registered tools by providing front-end and back-end web services. Programmers can install applications on HPC clusters in any programming language and use the back-end service to check for new jobs and their parameters, and then to send the results to BOWS. Programs running on ordinary computers consume the BOWS front-end service to submit new processes and read results. BOWS compiles Java clients, which encapsulate the front-end web service requests, and automatically creates a web page that lists the registered applications and clients. Registered applications can be accessed from virtually any programming language through web services, or using the standard Java clients. The back-end can run on HPC clusters, allowing bioinformaticians to remotely run high-processing-demand applications directly from their machines.
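The back-end polling pattern described above can be sketched as follows; the base URL, endpoints, and JSON fields are hypothetical stand-ins, not BOWS's actual API.

```python
# Illustrative sketch of the back-end pattern BOWS describes: an HPC-side
# worker polls a back-end web service for new jobs and posts results back.
# Endpoint URLs and JSON fields are hypothetical.
import time
import requests

BASE = "https://bows.example.org/backend"  # hypothetical base URL

def poll_and_run(tool_id: str, run_tool) -> None:
    while True:
        resp = requests.get(f"{BASE}/jobs", params={"tool": tool_id}, timeout=30)
        for job in resp.json():
            output = run_tool(job["parameters"])          # run on the cluster
            requests.post(f"{BASE}/jobs/{job['id']}/result",
                          json={"output": output}, timeout=30)
        time.sleep(10)  # check for new jobs every 10 s
```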
Brown, David K; Penkler, David L; Musyoka, Thommas M; Bishop, Özlem Tastan
2015-01-01
Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS.
A Clustering Methodology of Web Log Data for Learning Management Systems
ERIC Educational Resources Information Center
Valsamidis, Stavros; Kontogiannis, Sotirios; Kazanidis, Ioannis; Theodosiou, Theodosios; Karakos, Alexandros
2012-01-01
Learning Management Systems (LMS) collect large amounts of data. Data mining techniques can be applied to analyse their web data log files. The instructors may use this data for assessing and measuring their courses. In this respect, we have proposed a methodology for analysing LMS courses and students' activity. This methodology uses a Markov…
Food-web structure and network theory: The role of connectance and size
Dunne, Jennifer A.; Williams, Richard J.; Martinez, Neo D.
2002-01-01
Networks from a wide range of physical, biological, and social systems have been recently described as “small-world” and “scale-free.” However, studies disagree whether ecological networks called food webs possess the characteristic path lengths, clustering coefficients, and degree distributions required for membership in these classes of networks. Our analysis suggests that the disagreements are based on selective use of relatively few food webs, as well as analytical decisions that obscure important variability in the data. We analyze a broad range of 16 high-quality food webs, with 25–172 nodes, from a variety of aquatic and terrestrial ecosystems. Food webs generally have much higher complexity, measured as connectance (the fraction of all possible links that are realized in a network), and much smaller size than other networks studied, which have important implications for network topology. Our results resolve prior conflicts by demonstrating that although some food webs have small-world and scale-free structure, most do not if they exceed a relatively low level of connectance. Although food-web degree distributions do not display a universal functional form, observed distributions are systematically related to network connectance and size. Also, although food webs often lack small-world structure because of low clustering, we identify a continuum of real-world networks including food webs whose ratios of observed to random clustering coefficients increase as a power–law function of network size over 7 orders of magnitude. Although food webs are generally not small-world, scale-free networks, food-web topology is consistent with patterns found within those classes of networks. PMID:12235364
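For reference, the two quantities this abstract leans on, in the standard notation of the food-web literature; the power-law exponent is schematic, not a fitted value.

```latex
% S = number of nodes (taxa), L = number of links. Connectance is the
% fraction of possible directed links that are realized:
\[
  C \;=\; \frac{L}{S^{2}},
\]
% and the observed-to-random clustering ratio is found to grow with network
% size roughly as a power law (alpha schematic):
\[
  \frac{C_{\mathrm{cl}}^{\mathrm{obs}}}{C_{\mathrm{cl}}^{\mathrm{rand}}}
  \;\propto\; S^{\,\alpha}.
\]
```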
Tseng, Yi-Ju; Wu, Jung-Hsuan; Ping, Xiao-Ou; Lin, Hui-Chi; Chen, Ying-Yu; Shang, Rung-Ji; Chen, Ming-Yuan; Lai, Feipei
2012-01-01
Background The emergence and spread of multidrug-resistant organisms (MDROs) are causing a global crisis. Combating antimicrobial resistance requires prevention of transmission of resistant organisms and improved use of antimicrobials. Objectives To develop a Web-based information system for automatic integration, analysis, and interpretation of the antimicrobial susceptibility of all clinical isolates that incorporates rule-based classification and cluster analysis of MDROs and implements control chart analysis to facilitate outbreak detection. Methods Electronic microbiological data from a 2200-bed teaching hospital in Taiwan were classified according to predefined criteria of MDROs. The numbers of organisms, patients, and incident patients in each MDRO pattern were presented graphically to describe spatial and time information in a Web-based user interface. Hierarchical clustering with 7 upper control limits (UCL) was used to detect suspicious outbreaks. The system’s performance in outbreak detection was evaluated based on vancomycin-resistant enterococcal outbreaks determined by a hospital-wide prospective active surveillance database compiled by infection control personnel. Results The optimal UCL for MDRO outbreak detection was the upper 90% confidence interval (CI) using germ criterion with clustering (area under ROC curve (AUC) 0.93, 95% CI 0.91 to 0.95), upper 85% CI using patient criterion (AUC 0.87, 95% CI 0.80 to 0.93), and one standard deviation using incident patient criterion (AUC 0.84, 95% CI 0.75 to 0.92). The performance indicators of each UCL were statistically significantly higher with clustering than those without clustering in germ criterion (P < .001), patient criterion (P = .04), and incident patient criterion (P < .001). Conclusion This system automatically identifies MDROs and accurately detects suspicious outbreaks of MDROs based on the antimicrobial susceptibility of all clinical isolates. PMID:23195868
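A minimal sketch of the control-chart detection idea described above: flag a period as a suspicious outbreak when the incident count exceeds an upper control limit (UCL) derived from historical counts. The z values and counts are invented for illustration; the study itself compares seven UCL variants.

```python
# Hedged sketch of UCL-based outbreak flagging on weekly MDRO counts.
import statistics

history = [2, 3, 1, 4, 2, 3, 2, 5, 3, 2]   # weekly incident patients (invented)
current = 9

mean = statistics.mean(history)
sd = statistics.stdev(history)
# Illustrative z values only; the paper evaluates several UCL definitions.
for z, name in [(1.0, "mean + 1 SD"), (1.645, "~upper 90% CI limit")]:
    ucl = mean + z * sd
    status = "ALERT" if current > ucl else "ok"
    print(f"{name}: UCL={ucl:.2f} -> {status}")
```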
The web site provides guidance and technical assistance for homeowners, government officials, industry professionals, and EPA partners about how to properly develop and manage individual onsite and community cluster systems that treat domestic wastewater.
Providing Multi-Page Data Extraction Services with XWRAPComposer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Ling; Zhang, Jianjun; Han, Wei
2008-04-30
Dynamic Web data sources – sometimes known collectively as the Deep Web – increase the utility of the Web by providing intuitive access to data repositories anywhere that Web access is available. Deep Web services provide access to real-time information, like entertainment event listings, or present a Web interface to large databases or other data repositories. Recent studies suggest that the size and growth rate of the dynamic Web greatly exceed that of the static Web, yet dynamic content is often ignored by existing search engine indexers owing to the technical challenges that arise when attempting to search the Deep Web. To address these challenges, we present DYNABOT, a service-centric crawler for discovering and clustering Deep Web sources offering dynamic content. DYNABOT has three unique characteristics. First, DYNABOT utilizes a service class model of the Web implemented through the construction of service class descriptions (SCDs). Second, DYNABOT employs a modular, self-tuning system architecture for focused crawling of the Deep Web using service class descriptions. Third, DYNABOT incorporates methods and algorithms for efficient probing of the Deep Web and for discovering and clustering Deep Web sources and services through SCD-based service matching analysis. Our experimental results demonstrate the effectiveness of the service class discovery, probing, and matching algorithms and suggest techniques for efficiently managing service discovery in the face of the immense scale of the Deep Web.
Mining a Web Citation Database for Author Co-Citation Analysis.
ERIC Educational Resources Information Center
He, Yulan; Hui, Siu Cheung
2002-01-01
Proposes a mining process to automate author co-citation analysis based on the Web Citation Database, a data warehouse for storing citation indices of Web publications. Describes the use of agglomerative hierarchical clustering for author clustering and multidimensional scaling for displaying author cluster maps, and explains PubSearch, a…
Cluster Analysis of Adolescent Blogs
ERIC Educational Resources Information Center
Liu, Eric Zhi-Feng; Lin, Chun-Hung; Chen, Feng-Yi; Peng, Ping-Chuan
2012-01-01
Emerging web applications and networking systems such as blogs have become popular, and they offer unique opportunities and environments for learners, especially for adolescent learners. This study attempts to explore the writing styles and genres used by adolescents in their blogs by employing content, factor, and cluster analyses. Factor…
A Web service substitution method based on service cluster nets
NASA Astrophysics Data System (ADS)
Du, YuYue; Gai, JunJing; Zhou, MengChu
2017-11-01
Service substitution is an important research topic in the fields of Web services and service-oriented computing. This work presents a novel method to analyse and substitute Web services. A new concept, called a Service Cluster Net Unit, is proposed based on Web service clusters. A service cluster is converted into a Service Cluster Net Unit, which is then used to analyse whether the services in the cluster can satisfy given service requests. Substitution methods for atomic and composite services are also proposed. The correctness of the proposed method is proved, and its effectiveness is shown by comparison with a state-of-the-art method via an experiment. It can be readily applied to e-commerce service substitution to meet business automation needs.
Focused Crawling of the Deep Web Using Service Class Descriptions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rocco, D; Liu, L; Critchlow, T
2004-06-21
Dynamic Web data sources--sometimes known collectively as the Deep Web--increase the utility of the Web by providing intuitive access to data repositories anywhere that Web access is available. Deep Web services provide access to real-time information, like entertainment event listings, or present a Web interface to large databases or other data repositories. Recent studies suggest that the size and growth rate of the dynamic Web greatly exceed that of the static Web, yet dynamic content is often ignored by existing search engine indexers owing to the technical challenges that arise when attempting to search the Deep Web. To address these challenges, we present DynaBot, a service-centric crawler for discovering and clustering Deep Web sources offering dynamic content. DynaBot has three unique characteristics. First, DynaBot utilizes a service class model of the Web implemented through the construction of service class descriptions (SCDs). Second, DynaBot employs a modular, self-tuning system architecture for focused crawling of the Deep Web using service class descriptions. Third, DynaBot incorporates methods and algorithms for efficient probing of the Deep Web and for discovering and clustering Deep Web sources and services through SCD-based service matching analysis. Our experimental results demonstrate the effectiveness of the service class discovery, probing, and matching algorithms and suggest techniques for efficiently managing service discovery in the face of the immense scale of the Deep Web.
First Operational Experience With a High-Energy Physics Run Control System Based on Web Technologies
NASA Astrophysics Data System (ADS)
Bauer, Gerry; Beccati, Barbara; Behrens, Ulf; Biery, Kurt; Branson, James; Bukowiec, Sebastian; Cano, Eric; Cheung, Harry; Ciganek, Marek; Cittolin, Sergio; Coarasa Perez, Jose Antonio; Deldicque, Christian; Erhan, Samim; Gigi, Dominique; Glege, Frank; Gomez-Reino, Robert; Gulmini, Michele; Hatton, Derek; Hwong, Yi Ling; Loizides, Constantin; Ma, Frank; Masetti, Lorenzo; Meijers, Frans; Meschi, Emilio; Meyer, Andreas; Mommsen, Remigius K.; Moser, Roland; O'Dell, Vivian; Oh, Alexander; Orsini, Luciano; Paus, Christoph; Petrucci, Andrea; Pieri, Marco; Racz, Attila; Raginel, Olivier; Sakulin, Hannes; Sani, Matteo; Schieferdecker, Philipp; Schwick, Christoph; Shpakov, Dennis; Simon, Michal; Sumorok, Konstanty; Yoon, Andre Sungho
2012-08-01
Run control systems of modern high-energy particle physics experiments have requirements similar to those of today's Internet applications. The Compact Muon Solenoid (CMS) collaboration at CERN's Large Hadron Collider (LHC) therefore decided to build the run control system for its detector based on web technologies. The system is composed of Java Web Applications distributed over a set of Apache Tomcat servlet containers that connect to a database back-end. Users interact with the system through a web browser. The present paper reports on the successful scaling of the system from a small test setup to the production data acquisition system that comprises around 10,000 applications running on a cluster of about 1600 hosts. We report on operational aspects during the first phase of operation with colliding beams, including performance, stability, integration with the CMS Detector Control System, and tools to guide the operator.
Visualization of usability and functionality of a professional website through web-mining.
Jones, Josette F; Mahoui, Malika; Gopa, Venkata Devi Pragna
2007-10-11
Functional interface design requires understanding of the information system structure and the user. Web logs record user interactions with the interface, and thus provide some insight into user search behavior and the efficiency of the search process. The present study uses a data-mining approach with techniques such as association rules, clustering, and classification to visualize the usability and functionality of a digital library through in-depth analyses of web logs.
NASA Astrophysics Data System (ADS)
Amirnasr, Elham
It is widely recognized that basis weight non-uniformity affects various properties of nonwovens; however, few studies address this topic. Developing a definition and measurement methods for uniformity, and studying their impact on web properties such as filtration performance and air permeability, would benefit both industry and academia: they could serve as quality control tools and provide insight into nonwoven behaviors that average values alone cannot explain. We therefore developed an optical analytical tool for quantifying nonwoven web basis weight uniformity. The quadrant method and clustering analysis were incorporated into an image analysis scheme to help define "uniformity" and its spatial variation. Implementing the quadrant method in an image analysis system allows the establishment of a uniformity index that quantifies the degree of uniformity. Clustering analysis was also modified and verified using uniform and random simulated images with known parameters, determining the number of clusters and cluster properties such as size, membership, and density. We then used this new measurement method to evaluate the uniformity of nonwovens produced by different processes and investigated the impact of uniformity on filtration and permeability. The results show that the uniformity index computed by the quadrant method covers a good range of web non-uniformity. Clustering analysis was also applied to reference nonwovens of known visual uniformity; of the cluster properties, cluster size is the most promising uniformity parameter, as non-uniform nonwovens yield larger cluster sizes than uniform ones. To relate web properties to the uniformity index as a web characteristic, the filtration properties, air permeability, solidity, and uniformity index of meltblown and spunbond samples were measured. The filtration tests show some deviation between theoretical and experimental filtration efficiency depending on the type of fiber diameter considered; this deviation can arise from variation in basis weight non-uniformity, so an appropriate theory is required to predict how filtration efficiency varies with the non-uniformity of the filter medium. The air permeability tests showed a relationship between the quadrant-method uniformity index and the measured properties: air permeability decreases as the uniformity index of the nonwoven web increases.
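As one concrete reading of the quadrant method, the sketch below computes a variance-to-mean (index-of-dispersion) measure over a grid of quadrats. This classical quadrat-count statistic is an assumption for illustration, not necessarily the thesis's exact index definition.

```python
# Sketch of a quadrat-based (non-)uniformity index: partition a grayscale web
# image into q x q quadrats and use the variance-to-mean ratio of quadrat
# sums as the measure. Larger values indicate a clumpier basis weight.
import numpy as np

def uniformity_index(image: np.ndarray, q: int = 8) -> float:
    """Variance-to-mean ratio over a q x q grid; small for uniform webs."""
    h, w = image.shape
    h, w = h - h % q, w - w % q          # crop to a multiple of q
    blocks = image[:h, :w].reshape(q, h // q, q, w // q)
    sums = blocks.sum(axis=(1, 3))       # basis-weight proxy per quadrat
    return float(sums.var() / sums.mean())

rng = np.random.default_rng(0)
uniform_web = rng.poisson(100, size=(256, 256)).astype(float)
clumped_web = uniform_web.copy()
clumped_web[:64, :64] *= 3               # inject a dense cluster
print(uniformity_index(uniform_web), uniformity_index(clumped_web))
```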
Chen, Yi-Bu; Chattopadhyay, Ansuman; Bergen, Phillip; Gadd, Cynthia; Tannery, Nancy
2007-01-01
To bridge the gap between the rising information needs of biological and medical researchers and the rapidly growing number of online bioinformatics resources, we have created the Online Bioinformatics Resources Collection (OBRC) at the Health Sciences Library System (HSLS) at the University of Pittsburgh. The OBRC, containing 1542 major online bioinformatics databases and software tools, was constructed using the HSLS content management system built on the Zope Web application server. To enhance the output of search results, we further implemented the Vivísimo Clustering Engine, which automatically organizes the search results into categories created dynamically based on the textual information of the retrieved records. As the largest online collection of its kind and the only one with advanced search results clustering, OBRC is aimed at becoming a one-stop guided information gateway to the major bioinformatics databases and software tools on the Web. OBRC is available at the University of Pittsburgh's HSLS Web site (http://www.hsls.pitt.edu/guides/genetics/obrc).
Pragmatic service development and customisation with the CEDA OGC Web Services framework
NASA Astrophysics Data System (ADS)
Pascoe, Stephen; Stephens, Ag; Lowe, Dominic
2010-05-01
The CEDA OGC Web Services framework (COWS) emphasises rapid service development by providing a lightweight layer of OGC web service logic on top of Pylons, a mature web application framework for the Python language. This approach gives developers a flexible web service development environment without compromising access to the full range of web application tools and patterns: Model-View-Controller paradigm, XML templating, Object-Relational-Mapper integration and authentication/authorization. We have found this approach useful for exploring evolving standards and implementing protocol extensions to meet the requirements of operational deployments. This paper outlines how COWS is being used to implement customised WMS, WCS, WFS and WPS services in a variety of web applications from experimental prototypes to load-balanced cluster deployments serving 10-100 simultaneous users. In particular we will cover 1) The use of Climate Science Modeling Language (CSML) in complex-feature aware WMS, WCS and WFS services, 2) Extending WMS to support applications with features specific to earth system science and 3) A cluster-enabled Web Processing Service (WPS) supporting asynchronous data processing. The COWS WPS underpins all backend services in the UK Climate Projections User Interface where users can extract, plot and further process outputs from a multi-dimensional probabilistic climate model dataset. The COWS WPS supports cluster job execution, result caching, execution time estimation and user management. The COWS WMS and WCS components drive the project-specific NCEO and QESDI portals developed by the British Atmospheric Data Centre. These portals use CSML as a backend description format and implement features such as multiple WMS layer dimensions and climatology axes that are beyond the scope of general purpose GIS tools and yet vital for atmospheric science applications.
Web service discovery among large service pools utilising semantic similarity and clustering
NASA Astrophysics Data System (ADS)
Chen, Fuzan; Li, Minqiang; Wu, Harris; Xie, Lingli
2017-03-01
With the rapid development of electronic business, Web services have attracted much attention in recent years. Enterprises can combine individual Web services to provide new value-added services. An emerging challenge is the timely discovery of close matches to service requests among large service pools. In this study, we first define a new semantic similarity measure combining functional similarity and process similarity. We then present a service discovery mechanism that utilises the new semantic similarity measure for service matching. All the published Web services are pre-grouped into functional clusters prior to the matching process. For a user's service request, the discovery mechanism first identifies matching services clusters and then identifies the best matching Web services within these matching clusters. Experimental results show that the proposed semantic discovery mechanism performs better than a conventional lexical similarity-based mechanism.
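A hedged sketch of the two-stage discovery mechanism: pre-cluster service descriptions, then match a request only within the nearest cluster. TF-IDF vectors and k-means stand in for the paper's semantic similarity measure and clustering method; the toy descriptions are invented.

```python
# Pre-cluster services, then match a request inside its nearest cluster only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity

services = [
    "book flight ticket airline reservation",
    "reserve airline seat flight booking",
    "currency exchange rate conversion",
    "convert money currency rates",
]
vec = TfidfVectorizer()
X = vec.fit_transform(services)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

request = vec.transform(["cheap flight reservation"])
cluster = km.predict(request)[0]                      # nearest functional cluster
members = [i for i, c in enumerate(km.labels_) if c == cluster]
scores = cosine_similarity(request, X[members]).ravel()
best = members[scores.argmax()]
print("best match:", services[best])
```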
NASA Astrophysics Data System (ADS)
Fume, Kosei; Ishitani, Yasuto
2008-01-01
We propose a document categorization method based on a document model that can be defined externally for each task and that categorizes Web content or business documents into a target category in accordance with the similarity of the model. The main feature of the proposed method consists of two aspects of semantics extraction from an input document. The semantics of terms are extracted by the semantic pattern analysis and implicit meanings of document substructure are specified by a bottom-up text clustering technique focusing on the similarity of text line attributes. We have constructed a system based on the proposed method for trial purposes. The experimental results show that the system achieves more than 80% classification accuracy in categorizing Web content and business documents into 15 or 70 categories.
A web portal for hydrodynamical, cosmological simulations
NASA Astrophysics Data System (ADS)
Ragagnin, A.; Dolag, K.; Biffi, V.; Cadolle Bel, M.; Hammer, N. J.; Krukau, A.; Petkova, M.; Steinborn, D.
2017-07-01
This article describes a data centre hosting a web portal for accessing and sharing the output of large, cosmological, hydro-dynamical simulations with a broad scientific community. It also allows users to receive related scientific data products by directly processing the raw simulation data on a remote computing cluster. The data centre has a multi-layer structure: a web portal, a job control layer, a computing cluster and an HPC storage system. The outer layer enables users to choose an object from the simulations. Objects can be selected by visually inspecting 2D maps of the simulation data, by performing highly compounded and elaborated queries, or graphically by plotting arbitrary combinations of properties. The user can then run analysis tools on the chosen object; these services operate directly on the raw simulation data. The job control layer is responsible for handling and performing the analysis jobs, which are executed on a computing cluster. The innermost layer is formed by an HPC storage system which hosts the large, raw simulation data. The following services are available for the users: (I) CLUSTERINSPECT visualizes properties of member galaxies of a selected galaxy cluster; (II) SIMCUT returns the raw data of a sub-volume around a selected object from a simulation, containing all the original, hydro-dynamical quantities; (III) SMAC creates idealized 2D maps of various, physical quantities and observables of a selected object; (IV) PHOX generates virtual X-ray observations with specifications of various current and upcoming instruments.
Load Balancing in Distributed Web Caching: A Novel Clustering Approach
NASA Astrophysics Data System (ADS)
Tiwari, R.; Kumar, K.; Khan, G.
2010-11-01
The World Wide Web suffers from scaling and reliability problems due to overloaded and congested proxy servers. Caching at local proxy servers helps, but cannot satisfy more than a third to half of requests; the remaining requests are still sent to the remote origin servers. In this paper we develop an algorithm for a Distributed Web Cache that incorporates cooperation among the proxy servers of one cluster. The algorithm combines Distributed Web Cache concepts with a static hierarchy of geographically based clusters of level-one proxy servers and a dynamic proxy-server mechanism that takes over during congestion of one cluster. The congestion and scalability problems are addressed by the clustering concept used in our approach. This results in a higher cache hit ratio and lower latency for requested pages. The algorithm also guarantees data consistency between the original server objects and the proxy cache objects.
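A toy sketch of the intra-cluster cooperation described above: a proxy checks its own cache, then its cluster peers, and only then the origin server. Data structures are heavily simplified for illustration and are not the paper's algorithm.

```python
# Cooperative lookup among cluster peers before falling back to the origin.
class Proxy:
    def __init__(self, name):
        self.name, self.cache, self.peers = name, {}, []

    def fetch(self, url, origin):
        if url in self.cache:                       # local hit
            return self.cache[url], f"hit:{self.name}"
        for peer in self.peers:                     # cooperative cluster hit
            if url in peer.cache:
                self.cache[url] = peer.cache[url]
                return self.cache[url], f"hit:{peer.name}"
        self.cache[url] = origin[url]               # miss -> origin server
        return self.cache[url], "miss:origin"

origin = {"/index.html": "<html>...</html>"}
a, b = Proxy("A"), Proxy("B")
a.peers, b.peers = [b], [a]
print(b.fetch("/index.html", origin))   # miss, fetched from origin
print(a.fetch("/index.html", origin))   # served from peer B's cache
```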
P43-S Computational Biology Applications Suite for High-Performance Computing (BioHPC.net)
Pillardy, J.
2007-01-01
One of the challenges of high-performance computing (HPC) is user accessibility. At the Cornell University Computational Biology Service Unit, which is also a Microsoft HPC institute, we have developed a computational biology application suite that allows researchers from biological laboratories to submit their jobs to the parallel cluster through an easy-to-use Web interface. Through this system, we are providing users with popular bioinformatics tools including BLAST, HMMER, InterproScan, and MrBayes. The system is flexible and can be easily customized to include other software. It is also scalable; the installation on our servers currently processes approximately 8500 job submissions per year, many of them requiring massively parallel computations. It also has a built-in user management system, which can limit software and/or database access to specified users. TAIR, the major database of the plant model organism Arabidopsis, and SGN, the international tomato genome database, are both using our system for storage and data analysis. The system consists of a Web server running the interface (ASP.NET C#), Microsoft SQL server (ADO.NET), compute cluster running Microsoft Windows, ftp server, and file server. Users can interact with their jobs and data via a Web browser, ftp, or e-mail. The interface is accessible at http://cbsuapps.tc.cornell.edu/.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Curtis, Darren S.; Peterson, Elena S.; Oehmen, Chris S.
2008-05-04
This work presents the ScalaBLAST Web Application (SWA), a web based application implemented using the PHP script language, MySQL DBMS, and Apache web server under a GNU/Linux platform. SWA is an application built as part of the Data Intensive Computer for Complex Biological Systems (DICCBS) project at the Pacific Northwest National Laboratory (PNNL). SWA delivers accelerated throughput of bioinformatics analysis via high-performance computing through a convenient, easy-to-use web interface. This approach greatly enhances emerging fields of study in biology such as ontology-based homology, and multiple whole genome comparisons which, in the absence of a tool like SWA, require a heroic effort to overcome the computational bottleneck associated with genome analysis. The current version of SWA includes a user account management system, a web based user interface, and a backend process that generates the files necessary for the Internet scientific community to submit a ScalaBLAST parallel processing job on a dedicated cluster.
A Science Portal and Archive for Extragalactic Globular Cluster Systems Data
NASA Astrophysics Data System (ADS)
Young, Michael; Rhode, Katherine L.; Gopu, Arvind
2015-01-01
For several years we have been carrying out a wide-field imaging survey of the globular cluster populations of a sample of giant spiral, S0, and elliptical galaxies with distances of ~10-30 Mpc. We use mosaic CCD cameras on the WIYN 3.5-m and Kitt Peak 4-m telescopes to acquire deep BVR imaging of each galaxy and then analyze the data to derive global properties of the globular cluster system. In addition to measuring the total numbers, specific frequencies, spatial distributions, and color distributions for the globular cluster populations, we have produced deep, high-quality images and lists of tens to thousands of globular cluster candidates for the ~40 galaxies included in the survey. With the survey nearing completion, we have been exploring how to efficiently disseminate not only the overall results, but also all of the relevant data products, to the astronomical community. Here we present our solution: a scientific portal and archive for extragalactic globular cluster systems data. With a modern and intuitive web interface built on the same framework as the WIYN One Degree Imager Portal, Pipeline, and Archive (ODI-PPA), our system will provide public access to the survey results and the final stacked mosaic images of the target galaxies. In addition, the astrometric and photometric data for thousands of identified globular cluster candidates, as well as for all point sources detected in each field, will be indexed and searchable. Where available, spectroscopic follow-up data will be paired with the candidates. Advanced imaging tools will enable users to overlay the cluster candidates and other sources on the mosaic images within the web interface, while metadata charting tools will allow users to rapidly and seamlessly plot the survey results for each galaxy and the data for hundreds of thousands of individual sources. Finally, we will appeal to other researchers with similar data products and work toward making our portal a central repository for data related to well-studied giant galaxy globular cluster systems. This work is supported by NSF Faculty Early Career Development (CAREER) award AST-0847109.
ΛGR Centennial: Cosmic Web in Dark Energy Background
NASA Astrophysics Data System (ADS)
Chernin, A. D.
The basic building blocks of the Cosmic Web are groups and clusters of galaxies, super-clusters (pancakes) and filaments embedded in the universal dark energy background. The background produces antigravity, and the antigravity effect is strong in groups, clusters and superclusters. Antigravity is very weak in filaments where matter (dark matter and baryons) produces gravity dominating in the filament internal dynamics. Gravity-antigravity interplay on the large scales is a grandiose phenomenon predicted by ΛGR theory and seen in modern observations of the Cosmic Web.
ICM: a web server for integrated clustering of multi-dimensional biomedical data.
He, Song; He, Haochen; Xu, Wenjian; Huang, Xin; Jiang, Shuai; Li, Fei; He, Fuchu; Bo, Xiaochen
2016-07-08
Large-scale efforts for parallel acquisition of multi-omics profiling continue to generate extensive amounts of multi-dimensional biomedical data. Thus, integrated clustering of multiple types of omics data is essential for developing individual-based treatments and precision medicine. However, while rapid progress has been made, methods for integrated clustering lack an intuitive web interface that serves biomedical researchers without sufficient programming skills. Here, we present a web tool, named Integrated Clustering of Multi-dimensional biomedical data (ICM), that provides an interface from which to fuse, cluster and visualize multi-dimensional biomedical data and knowledge. With ICM, users can explore the heterogeneity of a disease or a biological process by identifying subgroups of patients. The results obtained can then be interactively modified by using an intuitive user interface. Researchers can also exchange the results from ICM with collaborators via a web link containing a Project ID number that will directly pull up the analysis results being shared. ICM also supports incremental clustering, which allows users to add new sample data to the data of a previous study to obtain a clustering result. Currently, the ICM web server is available with no login requirement and at no cost at http://biotech.bmi.ac.cn/icm/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Trust estimation of the semantic web using semantic web clustering
NASA Astrophysics Data System (ADS)
Shirgahi, Hossein; Mohsenzadeh, Mehran; Haj Seyyed Javadi, Hamid
2017-05-01
Development of the semantic web and social networks is undeniable in today's Internet world. The widespread nature of the semantic web makes assessing trust in this field very challenging, and in recent years extensive research has been done to estimate the trust of the semantic web. Since trust of the semantic web is a multidimensional problem, in this paper we use parameters of social network authority, the authority value of page links, and semantic authority to assess trust. Owing to the large space of the semantic network, we restrict the problem scope to clusters of semantic subnetworks, obtain the trust of each cluster's elements locally, and calculate the trust of outside resources from their local trusts and the trust of clusters in each other. According to the experimental results, the proposed method achieves an F-score of more than 79%, which is on average about 11.9% higher than the Eigen, Tidal, and centralised trust methods. The mean error of the proposed method is 12.936, on average about 9.75% lower than the Eigen and Tidal trust methods.
USDA-ARS?s Scientific Manuscript database
Genome wide analysis of orthologous clusters is an important component of comparative genomics studies. Identifying the overlap among orthologous clusters can enable us to elucidate the function and evolution of proteins across multiple species. Here, we report a web platform named OrthoVenn that i...
Query Results Clustering by Extending SPARQL with CLUSTER BY
NASA Astrophysics Data System (ADS)
Ławrynowicz, Agnieszka
The task of dynamic clustering of the search results proved to be useful in the Web context, where the user often does not know the granularity of the search results in advance. The goal of this paper is to provide a declarative way for invoking dynamic clustering of the results of queries submitted over Semantic Web data. To achieve this goal the paper proposes an approach that extends SPARQL by clustering abilities. The approach introduces a new statement, CLUSTER BY, into the SPARQL grammar and proposes semantics for such extension.
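An illustrative query in the proposed extended grammar, wrapped in a Python string for presentation; the clustering target and exact surface syntax are assumptions, since the paper itself defines the grammar extension.

```python
# Illustrative only: a SPARQL query extended with the CLUSTER BY statement
# the paper proposes. Prefixes, predicates and the clustered variable are
# invented; the paper's concrete syntax may differ.
query = """
SELECT ?paper ?title ?year
WHERE {
  ?paper rdf:type ex:Publication .
  ?paper ex:title ?title .
  ?paper ex:year  ?year .
}
CLUSTER BY (?title)    # dynamically group result rows by induced topic
"""
print(query)
```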
Kohonen, Pekka; Benfenati, Emilio; Bower, David; Ceder, Rebecca; Crump, Michael; Cross, Kevin; Grafström, Roland C; Healy, Lyn; Helma, Christoph; Jeliazkova, Nina; Jeliazkov, Vedrin; Maggioni, Silvia; Miller, Scott; Myatt, Glenn; Rautenberg, Michael; Stacey, Glyn; Willighagen, Egon; Wiseman, Jeff; Hardy, Barry
2013-01-01
The aim of the SEURAT-1 (Safety Evaluation Ultimately Replacing Animal Testing-1) research cluster, comprised of seven EU FP7 Health projects co-financed by Cosmetics Europe, is to generate a proof-of-concept to show how the latest technologies, systems toxicology and toxicogenomics can be combined to deliver a test replacement for repeated dose systemic toxicity testing on animals. The SEURAT-1 strategy is to adopt a mode-of-action framework to describe repeated dose toxicity, combining in vitro and in silico methods to derive predictions of in vivo toxicity responses. ToxBank is the cross-cluster infrastructure project whose activities include the development of a data warehouse to provide a web-accessible shared repository of research data and protocols, a physical compounds repository, reference or "gold compounds" for use across the cluster (available via wiki.toxbank.net), and a reference resource for biomaterials. Core technologies used in the data warehouse include the ISA-Tab universal data exchange format, REpresentational State Transfer (REST) web services, the W3C Resource Description Framework (RDF) and the OpenTox standards. We describe the design of the data warehouse based on cluster requirements, the implementation based on open standards, and finally the underlying concepts and initial results of a data analysis utilizing public data related to the gold compounds. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The Sargassum Early Advisory System (SEAS)
NASA Astrophysics Data System (ADS)
Armstrong, D.; Gallegos, S. C.
2016-02-01
The Sargassum Early Advisory System (SEAS) web app was designed to automatically detect Sargassum at sea, forecast movement of the seaweed, and alert users of potential landings. Inspired to help address the economic hardships caused by large landings of Sargassum, the web app automates and enhances the manual tasks conducted by the SEAS group of Texas A&M University at Galveston. The SEAS web app is a modular, mobile-friendly tool that automates the entire workflow from data acquisition to user management. The modules include: 1) an Imagery Retrieval Module to automatically download Landsat-8 Operational Land Imagery (OLI) from the United States Geological Survey (USGS); 2) a Processing Module for automatic detection of Sargassum in the OLI imagery and subsequent mapping of these patches onto the HYCOM grid, producing maps that show Sargassum clusters; 3) a Forecasting engine fed by HYbrid Coordinate Ocean Model (HYCOM) currents and winds from weather buoys; and 4) a mobile-phone-optimized geospatial user interface. The user can view the last known position of Sargassum clusters and trajectory and location projections for the next 24, 72 and 168 hrs. Users can also subscribe to alerts generated for particular areas. Currently, the SEAS web app produces advisories for Texas beaches, and the forecasted Sargassum landing locations are validated by reports from Texas beach managers. However, the SEAS web app was designed to easily expand to other areas, and future plans call for extending it to Mexico and the Caribbean islands. SEAS web app development is led by NASA, with participation by ASRC Federal/Computer Science Corporation and the Naval Research Laboratory, all at Stennis Space Center, and Texas A&M University at Galveston.
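The forecasting step can be sketched as simple advection of detected cluster centroids by surface currents; here a constant current and a forward-Euler drift step stand in for the HYCOM fields and buoy winds SEAS actually uses. The position and velocities are invented.

```python
# Schematic drift forecast for a detected Sargassum cluster centroid.
import math

def advect(lat, lon, u_ms, v_ms, hours):
    """Forward-Euler drift: u eastward, v northward, both in m/s."""
    dt = hours * 3600.0
    dlat = (v_ms * dt) / 111_000.0                        # ~m per degree latitude
    dlon = (u_ms * dt) / (111_000.0 * math.cos(math.radians(lat)))
    return lat + dlat, lon + dlon

cluster = (27.5, -95.0)          # Gulf of Mexico, illustrative position
for h in (24, 72, 168):          # the advisory horizons named above
    print(h, "h ->", advect(*cluster, u_ms=0.10, v_ms=0.05, hours=h))
```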
Observations of a nearby filament of galaxy clusters with the Sardinia Radio Telescope
NASA Astrophysics Data System (ADS)
Vacca, Valentina; Murgia, M.; Loi, F.; Govoni, F.; Vazza, F.; Finoguenov, A.; Carretti, E.; Feretti, L.; Giovannini, G.; Concu, R.; Melis, A.; Gheller, C.; Paladino, R.; Poppi, S.; Valente, G.; Bernardi, G.; Boschin, W.; Brienza, M.; Clarke, T. E.; Colafrancesco, S.; Enßlin, T.; Ferrari, C.; de Gasperin, F.; Gastaldello, F.; Girardi, M.; Gregorini, L.; Johnston-Hollitt, M.; Junklewitz, H.; Orrù, E.; Parma, P.; Perley, R.; Taylor, G. B.
2018-05-01
We report the detection of diffuse radio emission which might be connected to a large-scale filament of the cosmic web covering a 8° × 8° area in the sky, likely associated with a z≈0.1 over-density traced by nine massive galaxy clusters. In this work, we present radio observations of this region taken with the Sardinia Radio Telescope. Two of the clusters in the field host a powerful radio halo sustained by violent ongoing mergers and provide direct proof of intra-cluster magnetic fields. In order to investigate the presence of large-scale diffuse radio synchrotron emission in and beyond the galaxy clusters in this complex system, we combined the data taken at 1.4 GHz with the Sardinia Radio Telescope with higher resolution data taken with the NRAO VLA Sky Survey. We found 28 candidate new sources with a size larger and X-ray emission fainter than known diffuse large-scale synchrotron cluster sources for a given radio power. This new population is potentially the tip of the iceberg of a class of diffuse large-scale synchrotron sources associated with the filaments of the cosmic web. In addition, we found in the field a candidate new giant radio galaxy.
Webs on surfaces, rings of invariants, and clusters.
Fomin, Sergey; Pylyavskyy, Pavlo
2014-07-08
We construct and study cluster algebra structures in rings of invariants of the special linear group action on collections of 3D vectors, covectors, and matrices. The construction uses Kuperberg's calculus of webs on marked surfaces with boundary.
Yokohama, Noriya; Tsuchimoto, Tadashi; Oishi, Masamichi; Itou, Katsuya
2007-01-20
It has been noted that the downtime of medical informatics systems is often long. Many systems encounter downtimes of hours or even days, which can have a critical effect on daily operations. Such systems remain especially weak in the areas of databases and medical imaging data. The scheme design shows the three-layer architecture of the system: application, database, and storage layers. The application layer uses the DICOM protocol (Digital Imaging and Communication in Medicine) and HTTP (Hyper Text Transport Protocol) with AJAX (Asynchronous JavaScript+XML). The database is designed to be decentralized in parallel using cluster technology; consequently, the database can be restored with ease and with improved retrieval speed. The storage layer, a network RAID (Redundant Array of Independent Disks) system, makes it possible to construct exabyte-scale parallel file systems that exploit distributed storage. Development and evaluation of the test-bed have been successful for medical information data backup and recovery in a network environment. This paper presents a schematic design of the new medical informatics system, covering recovery and the dynamic Web application for medical imaging distribution using AJAX.
Li, Jun; Tai, Cui; Deng, Zixin; Zhong, Weihong; He, Yongqun; Ou, Hong-Yu
2017-01-10
VRprofile is a Web server that facilitates rapid investigation of virulence and antibiotic resistance genes, as well as the transfer-related genetic contexts of these traits, in newly sequenced pathogenic bacterial genomes. The backend database, MobilomeDB, was first built on sets of known gene cluster loci of bacterial type III/IV/VI/VII secretion systems and mobile genetic elements, including integrative and conjugative elements, prophages, class I integrons, IS elements and pathogenicity/antibiotic resistance islands. VRprofile is thus able to co-localize the homologs of these conserved gene clusters using HMMer or BLASTp searches. By integrating the homologous gene cluster search module with a sequence composition module, VRprofile exhibits better performance for island-like region prediction than the other widely used methods. In addition, VRprofile provides an integrated Web interface for aligning and visualizing identified gene clusters against MobilomeDB-archived gene clusters or a variety of bacterial genomes. VRprofile might help meet the increasing demand for re-annotation of bacterial variable regions, and aid in the real-time definition of disease-relevant gene clusters in pathogenic bacteria of interest. VRprofile is freely available at http://bioinfo-mml.sjtu.edu.cn/VRprofile. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Pienaar, Rudolph; Rannou, Nicolas; Bernal, Jorge; Hahn, Daniel; Grant, P Ellen
2015-01-01
The utility of web browsers for general purpose computing, long anticipated, is only now coming to fruition. In this paper we present a web-based medical image data and information management software platform called ChRIS ([Boston] Children's Research Integration System). ChRIS' deep functionality allows for easy retrieval of medical image data from resources typically found in hospitals, organizes and presents information in a modern feed-like interface, provides access to a growing library of plugins that process these data (typically on a connected High Performance Compute Cluster), allows for easy data sharing between users and instances of ChRIS, and provides powerful 3D visualization and real time collaboration.
Wang, Yi; Coleman-Derr, Devin; Chen, Guoping; Gu, Yong Q
2015-07-01
Genome wide analysis of orthologous clusters is an important component of comparative genomics studies. Identifying the overlap among orthologous clusters can enable us to elucidate the function and evolution of proteins across multiple species. Here, we report a web platform named OrthoVenn that is useful for genome wide comparisons and visualization of orthologous clusters. OrthoVenn provides coverage of vertebrates, metazoa, protists, fungi, plants and bacteria for the comparison of orthologous clusters and also supports uploading of customized protein sequences from user-defined species. An interactive Venn diagram, summary counts, and functional summaries of the disjunction and intersection of clusters shared between species are displayed as part of the OrthoVenn result. OrthoVenn also includes in-depth views of the clusters using various sequence analysis tools. Furthermore, OrthoVenn identifies orthologous clusters of single copy genes and allows for a customized search of clusters of specific genes through key words or BLAST. OrthoVenn is an efficient and user-friendly web server freely accessible at http://probes.pw.usda.gov/OrthoVenn or http://aegilops.wheat.ucdavis.edu/OrthoVenn. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Designing Web-based Telemedicine Training for Military Health Care Providers.
ERIC Educational Resources Information Center
Bangert, David; Doktor, Boert; Johnson, Erik
2001-01-01
Interviews with 48 military health care professionals identified 20 objectives and 4 learning clusters for a telemedicine training curriculum. From these clusters, web-based modules were developed addressing clinical learning, technology, organizational issues, and introduction to telemedicine. (Contains 19 references.) (SK)
Free Factories: Unified Infrastructure for Data Intensive Web Services
Zaranek, Alexander Wait; Clegg, Tom; Vandewege, Ward; Church, George M.
2010-01-01
We introduce the Free Factory, a platform for deploying data-intensive web services using small clusters of commodity hardware and free software. Independently administered virtual machines called Freegols give application developers the flexibility of a general purpose web server, along with access to distributed batch processing, cache and storage services. Each cluster exploits idle RAM and disk space for cache, and reserves disks in each node for high bandwidth storage. The batch processing service uses a variation of the MapReduce model. Virtualization allows every CPU in the cluster to participate in batch jobs. Each 48-node cluster can achieve 4-8 gigabytes per second of disk I/O. Our intent is to use multiple clusters to process hundreds of simultaneous requests on multi-hundred terabyte data sets. Currently, our applications achieve 1 gigabyte per second of I/O with 123 disks by scheduling batch jobs on two clusters, one of which is located in a remote data center. PMID:20514356
CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats.
Grissa, Ibtissem; Vergnaud, Gilles; Pourcel, Christine
2007-07-01
Clustered regularly interspaced short palindromic repeats (CRISPRs) constitute a particular family of tandem repeats found in a wide range of prokaryotic genomes (half of eubacteria and almost all archaea). They consist of a succession of highly conserved regions (DR) varying in size from 23 to 47 bp, separated by similarly sized unique sequences (spacer) of usually viral origin. A CRISPR cluster is flanked on one side by an AT-rich sequence called the leader and assumed to be a transcriptional promoter. Recent studies suggest that this structure represents a putative RNA-interference-based immune system. Here we describe CRISPRFinder, a web service offering tools to (i) detect CRISPRs including the shortest ones (one or two motifs); (ii) define DRs and extract spacers; (iii) get the flanking sequences to determine the leader; (iv) blast spacers against Genbank database and (v) check if the DR is found elsewhere in prokaryotic sequenced genomes. CRISPRFinder is freely accessible at http://crispr.u-psud.fr/Server/CRISPRfinder.php.
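A deliberately simplified sketch of CRISPR-like detection, finding a direct repeat (DR) that recurs with similarly sized unique spacers between copies. The real CRISPRFinder additionally handles mismatches, the full 23-47 bp DR length range, leader detection, and spacer extraction; the sequence below is invented.

```python
# Toy detection of a repeated DR with bounded spacer sizes (exact matches only).
def find_crispr_like(seq, dr_len=25, min_spacer=20, max_spacer=60, min_copies=3):
    hits = []
    for start in range(len(seq) - dr_len):
        dr = seq[start:start + dr_len]
        positions, pos = [start], start
        while True:
            nxt = seq.find(dr, pos + dr_len + min_spacer,
                           pos + dr_len + max_spacer + dr_len)
            if nxt == -1:
                break
            positions.append(nxt)
            pos = nxt
        if len(positions) >= min_copies:
            hits.append((dr, positions))
            break                      # report the first candidate array only
    return hits

dr = "GTTTTAGAGCTAGAAATAGCAAGTT"       # illustrative 25 bp repeat
spacers = ["A" * 30, "C" * 32]         # invented unique spacers
array = dr + spacers[0] + dr + spacers[1] + dr
print(find_crispr_like(array))
```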
Burns, Randal; Roncal, William Gray; Kleissas, Dean; Lillaney, Kunal; Manavalan, Priya; Perlman, Eric; Berger, Daniel R; Bock, Davi D; Chung, Kwanghun; Grosenick, Logan; Kasthuri, Narayanan; Weiler, Nicholas C; Deisseroth, Karl; Kazhdan, Michael; Lichtman, Jeff; Reid, R Clay; Smith, Stephen J; Szalay, Alexander S; Vogelstein, Joshua T; Vogelstein, R Jacob
2013-01-01
We describe a scalable database cluster for the spatial analysis and annotation of high-throughput brain imaging data, initially for 3-d electron microscopy image stacks, but for time-series and multi-channel data as well. The system was designed primarily for workloads that build connectomes (neural connectivity maps of the brain) using the parallel execution of computer vision algorithms on high-performance compute clusters. These services and open-science data sets are publicly available at openconnecto.me. The system design inherits much from NoSQL scale-out and data-intensive computing architectures. We distribute data to cluster nodes by partitioning a spatial index. We direct I/O to different systems (reads to parallel disk arrays and writes to solid-state storage) to avoid I/O interference and maximize throughput. All programming interfaces are RESTful Web services, which are simple and stateless, improving scalability and usability. We include a performance evaluation of the production system, highlighting the effectiveness of spatial data organization.
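The spatial-index partitioning mentioned above can be sketched with a Morton (Z-order) key whose contiguous ranges map to cluster nodes; the key width and node count here are arbitrary illustrative choices, not openconnecto.me's actual scheme.

```python
# Interleave voxel coordinate bits into a Morton key, then map contiguous
# key ranges to nodes so spatially nearby data tends to be co-located.
def morton3(x: int, y: int, z: int, bits: int = 10) -> int:
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key

def node_for(x, y, z, n_nodes=8, bits=10):
    # contiguous Z-order ranges -> nodes
    return morton3(x, y, z, bits) * n_nodes // (1 << (3 * bits))

print(node_for(5, 7, 2), node_for(6, 7, 2))   # neighbors usually co-located
```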
Cosmic web type dependence of halo clustering
NASA Astrophysics Data System (ADS)
Fisher, J. D.; Faltenbacher, A.
2018-01-01
We use the Millennium Simulation to show that halo clustering varies significantly with cosmic web type. Haloes are classified as node, filament, sheet and void haloes based on the eigenvalue decomposition of the velocity shear tensor. The velocity field is sampled by the peculiar velocities of a fixed number of neighbouring haloes, and spatial derivatives are computed using a kernel borrowed from smoothed particle hydrodynamics. The classification scheme is used to examine the clustering of haloes as a function of web type for haloes with masses larger than 10^11 h^-1 M⊙. We find that node haloes show positive bias, filament haloes show negligible bias and void and sheet haloes are antibiased, independent of halo mass. Our findings suggest that the mass dependence of halo clustering is rooted in the composition of web types as a function of halo mass. The substantial fraction of node-type haloes at halo masses ≳2 × 10^13 h^-1 M⊙ leads to positive bias. Filament-type haloes prevail at intermediate masses, 10^12-10^13 h^-1 M⊙, resulting in unbiased clustering. The large contribution of sheet-type haloes at low halo masses, ≲10^12 h^-1 M⊙, generates antibiasing.
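To illustrate the classification rule, a common convention counts the shear-tensor eigenvalues above a threshold; the threshold value 0.44 below and the hand-built tensor are assumptions for illustration, not the paper's calibrated values:

```python
import numpy as np

WEB_TYPES = {3: "node", 2: "filament", 1: "sheet", 0: "void"}

def classify_web_type(shear_tensor: np.ndarray, lambda_th: float = 0.44) -> str:
    eigvals = np.linalg.eigvalsh(shear_tensor)        # real, ascending order
    n_collapsing = int(np.sum(eigvals > lambda_th))   # collapsing directions
    return WEB_TYPES[n_collapsing]

sigma = np.array([[0.9, 0.1, 0.0],
                  [0.1, 0.6, 0.0],
                  [0.0, 0.0, -0.2]])
print(classify_web_type(sigma))  # two eigenvalues above 0.44 -> "filament"
```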
LigSearch: a knowledge-based web server to identify likely ligands for a protein target
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beer, Tjaart A. P. de; Laskowski, Roman A.; Duban, Mark-Eugene
Identifying which ligands might bind to a protein before crystallization trials could provide a significant saving in time and resources. To this end, LigSearch, a web server aimed at predicting ligands that might bind to and stabilize a given protein, has been developed. Using a protein sequence and/or structure, the system searches against a variety of databases, combining available knowledge, and provides a clustered and ranked output of possible ligands. LigSearch can be accessed at http://www.ebi.ac.uk/thornton-srv/databases/LigSearch.
A New MI-Based Visualization Aided Validation Index for Mining Big Longitudinal Web Trial Data
Zhang, Zhaoyang; Fang, Hua; Wang, Honggang
2016-01-01
Web-delivered clinical trials generate big, complex data. To help untangle the heterogeneity of treatment effects, unsupervised learning methods have been widely applied. However, identifying valid patterns is a priority but challenging issue for these methods. This paper, built upon our previous research on multiple imputation (MI)-based fuzzy clustering and validation, proposes a new MI-based visualization-aided validation index (MIVOOS) to determine the optimal number of clusters for big incomplete longitudinal Web-trial data with inflated zeros. Different from a recently developed fuzzy clustering validation index, MIVOOS uses overlap and separation measures more suitable for Web-trial data and, unlike the widely used Xie and Beni (XB) index, does not depend on the choice of fuzzifier. Through optimizing the view angles of 3-D projections using Sammon mapping, the optimal 2-D projection-guided MIVOOS is obtained to better visualize and verify the patterns in conjunction with trajectory patterns. Compared with XB and VOS, our newly proposed MIVOOS shows robustness in validating big Web-trial data under different missing-data mechanisms, using real and simulated Web-trial data. PMID:27482473
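As context for the XB comparison, a minimal NumPy sketch of the classical Xie-Beni index (lower is better: compact, well-separated fuzzy clusters); the MIVOOS formulas themselves are not reproduced here:

```python
import numpy as np

def xie_beni(X, centers, U, m=2.0):
    """X: (n, d) data; centers: (c, d); U: (c, n) memberships; m: fuzzifier."""
    d2 = ((centers[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # (c, n) sq. dists
    compactness = (U ** m * d2).sum()
    sep = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(sep, np.inf)                # ignore i == j separations
    return compactness / (X.shape[0] * sep.min())
```

Note that the fuzzifier m appears explicitly, which is exactly the dependence the abstract says MIVOOS avoids.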
Information Clustering Based on Fuzzy Multisets.
ERIC Educational Resources Information Center
Miyamoto, Sadaaki
2003-01-01
Proposes a fuzzy multiset model for information clustering with application to information retrieval on the World Wide Web. Highlights include search engines; term clustering; document clustering; algorithms for calculating cluster centers; theoretical properties concerning clustering algorithms; and examples to show how the algorithms work.…
NASA Astrophysics Data System (ADS)
Wu, Zhihao; Lin, Youfang; Zhao, Yiji; Yan, Hongyan
2018-02-01
Networks can represent a wide range of complex systems, such as social, biological and technological systems. Link prediction is one of the most important problems in network analysis and has attracted much research interest recently. Many link prediction methods have been proposed to solve this problem with various techniques. We note that clustering information plays an important role in solving the link prediction problem. In the previous literature, the node clustering coefficient appears frequently in link prediction methods. However, the node clustering coefficient is limited in describing the role a common neighbor plays in different local networks, because it cannot distinguish a node's different clustering abilities with respect to different node pairs. In this paper, we shift our focus from nodes to links and propose the concept of the asymmetric link clustering (ALC) coefficient. Further, we improve three node-clustering-based link prediction methods via the concept of ALC. The experimental results demonstrate that ALC-based methods outperform node-clustering-based methods, achieving especially remarkable improvements on food web, hamster friendship and Internet networks. Besides, compared with other methods, the performance of ALC-based methods is very stable in both globalized and personalized top-L link prediction tasks.
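For context, a sketch of the node-clustering-coefficient baseline (often called CCLP) that ALC-type methods refine: each common neighbour of a candidate pair contributes its clustering coefficient to the link score. The ALC coefficient's exact link-level, asymmetric formula is not reproduced here:

```python
import networkx as nx

def top_cclp_links(G: nx.Graph, k=5):
    cc = nx.clustering(G)  # node clustering coefficients, computed once
    cand = [(u, v) for u in G for v in G if u < v and not G.has_edge(u, v)]
    score = lambda e: sum(cc[z] for z in nx.common_neighbors(G, *e))
    return sorted(cand, key=score, reverse=True)[:k]

G = nx.karate_club_graph()
print(top_cclp_links(G))  # five most likely missing links under CCLP
```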
Liu, Yan-Lin; Shih, Cheng-Ting; Chang, Yuan-Jen; Chang, Shu-Jun; Wu, Jay
2014-01-01
The rapid development of picture archiving and communication systems (PACSs) has thoroughly changed the way medical information is communicated and managed. However, as the scale of a hospital's operations increases, the large amount of digital images transferred over the network inevitably decreases system efficiency. In this study, a server cluster consisting of two server nodes was constructed. Network load balancing (NLB), distributed file system (DFS), and structured query language (SQL) duplication services were installed. A total of 1 to 16 workstations were used to transfer computed radiography (CR), computed tomography (CT), and magnetic resonance (MR) images simultaneously to simulate the clinical situation. The average transmission rate (ATR) was analyzed between the cluster and noncluster servers. In the download scenario, the ATRs of CR, CT, and MR images increased by 44.3%, 56.6%, and 100.9%, respectively, when using the server cluster, whereas the ATRs increased by 23.0%, 39.2%, and 24.9% in the upload scenario. In the mixed scenario, the transmission performance increased by 45.2% when using eight computer units. The fault tolerance mechanisms of the server cluster maintained system availability and image integrity. The server cluster can improve transmission efficiency while maintaining high reliability and continuous availability in a healthcare environment. PMID:24701580
Beyond Information Retrieval: Ways To Provide Content in Context.
ERIC Educational Resources Information Center
Wiley, Deborah Lynne
1998-01-01
Provides an overview of information retrieval from mainframe systems to Web search engines; discusses collaborative filtering, data extraction, data visualization, agent technology, pattern recognition, classification and clustering, and virtual communities. Argues that rather than huge data-storage centers and proprietary software, we need…
Cosmic Web of Galaxies in the COSMOS Field
NASA Astrophysics Data System (ADS)
Darvish, Behnam; Martin, Christopher D.; Mobasher, Bahram; Scoville, Nicholas; Sobral, David; COSMOS science Team
2017-01-01
We use a mass complete sample of galaxies with accurate photometric redshifts in the COSMOS field to estimate the density field and to extract the components of the cosmic web. The cosmic web extraction algorithm relies on the signs and the ratios of the eigenvalues of the Hessian matrix and is able to partition the density field into clusters, filaments and the field. We show that at z < 0.8, the median star-formation rate in the cosmic web gradually declines from the field to clusters and this decline is especially sharp for satellite galaxies (~1 dex vs. ~0.4 dex for centrals). However, at z > 0.8, the trend flattens out. For star-forming galaxies only, the median star-formation rate declines by ~0.3-0.4 dex from the field to clusters for both satellites and centrals, only at z < 0.5. We argue that for satellite galaxies, the main role of the cosmic web environment is to control their star-forming/quiescent fraction, whereas for centrals, it is mainly to control their overall star-formation rate. Given these results, we suggest that most satellite galaxies experience a rapid quenching mechanism as they fall from the field into clusters through the channel of filaments, whereas for central galaxies, quenching is mostly due to a slow process. Our preliminary results highlight the importance of the large-scale cosmic web on the evolution of galaxies.
Multipolar moments of weak lensing signal around clusters. Weighing filaments in harmonic space
NASA Astrophysics Data System (ADS)
Gouin, C.; Gavazzi, R.; Codis, S.; Pichon, C.; Peirani, S.; Dubois, Y.
2017-09-01
Context. Upcoming weak lensing surveys such as Euclid will provide an unprecedented opportunity to quantify the geometry and topology of the cosmic web, in particular in the vicinity of lensing clusters. Aims: Understanding the connectivity of the cosmic web with unbiased mass tracers, such as weak lensing, is of prime importance to probe the underlying cosmology, seek dynamical signatures of dark matter, and quantify environmental effects on galaxy formation. Methods: Mock catalogues of galaxy clusters are extracted from the N-body PLUS simulation. For each cluster, the aperture multipolar moments of the convergence are calculated in two annuli (inside and outside the virial radius). By stacking their modulus, a statistical estimator is built to characterise the angular mass distribution around clusters. The moments are compared to predictions from perturbation theory and spherical collapse. Results: The main weakly chromatic excess of multipolar power on large scales is understood as arising from the contraction of the primordial cosmic web driven by the growing potential well of the cluster. Besides this boost, the quadrupole prevails in the cluster (ellipsoidal) core, while at the outskirts, harmonic distortions are spread on small angular modes, and trace the non-linear sharpening of the filamentary structures. Predictions for the signal amplitude as a function of the cluster-centric distance, mass, and redshift are presented. The prospects of measuring this signal are estimated for current and future lensing data sets. Conclusions: The Euclid mission should provide all the necessary information for studying the cosmic evolution of the connectivity of the cosmic web around lensing clusters using multipolar moments and probing unique signatures of, for example, baryons and warm dark matter.
Extreme Mergers from the Massive Cluster Survey
NASA Astrophysics Data System (ADS)
Morris, Roger
2010-09-01
We propose to observe two extraordinary, high-redshift galaxy clusters from the Massive Cluster Survey. Both targets are very rare, triple merger systems (one a nearly co-linear merger), and likely lie at the deepest nodes of the cosmic web. Both targets show multiple strong gravitational lensing arcs in the cluster cores. These targets only possess very short (10ks) Chandra observations, and are unobserved by XMM-Newton. The X-ray data will be used to probe the mass distribution of hot, baryonic gas, and to reveal the details of the merger physics and the process of cluster assembly. We will also search for hints of X-ray emission from filaments between the merging clumps. Subaru and Hubble Space Telescope imaging data are in hand; we request additional HST coverage for one object.
Plasma Physics Calculations on a Parallel Macintosh Cluster
NASA Astrophysics Data System (ADS)
Decyk, Viktor; Dauger, Dean; Kokelaar, Pieter
2000-03-01
We have constructed a parallel cluster consisting of 16 Apple Macintosh G3 computers running the MacOS, and achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. A subset of the MPI message-passing library was implemented in Fortran77 and C. This library enabled us to port code, without modification, from other parallel processors to the Macintosh cluster. For large problems where message packets are large and relatively few in number, performance of 50-150 MFlops/node is possible, depending on the problem. This is fast enough that 3D calculations can be routinely done. Unlike Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. Full details are available on our web site: http://exodus.physics.ucla.edu/appleseed/.
Cluster outskirts and the missing baryons
NASA Astrophysics Data System (ADS)
Eckert, D.
2016-06-01
Galaxy clusters are located at the crossroads of intergalactic filaments and are still forming through the continuous merging and accretion of smaller structures from the surrounding cosmic web. Deep, wide-field X-ray studies of the outskirts of the most massive clusters bring us valuable insight into the processes leading to the growth of cosmic structures. In addition, cluster outskirts are privileged sites to search for the missing baryons, which are thought to reside within the filaments of the cosmic web. I will present the XMM cluster outskirts project, a Very Large Programme (VLP) that aims at mapping the outskirts of 13 nearby clusters. Based on the results obtained with this program, I will then explore ideas to exploit the capabilities of XMM during the next decade.
NeAT: a toolbox for the analysis of biological networks, clusters, classes and pathways.
Brohée, Sylvain; Faust, Karoline; Lima-Mendez, Gipsi; Sand, Olivier; Janky, Rekin's; Vanderstocken, Gilles; Deville, Yves; van Helden, Jacques
2008-07-01
The network analysis tools (NeAT) (http://rsat.ulb.ac.be/neat/) provide a user-friendly web access to a collection of modular tools for the analysis of networks (graphs) and clusters (e.g. microarray clusters, functional classes, etc.). A first set of tools supports basic operations on graphs (comparison between two graphs, neighborhood of a set of input nodes, path finding and graph randomization). Another set of programs makes the connection between networks and clusters (graph-based clustering, cliques discovery and mapping of clusters onto a network). The toolbox also includes programs for detecting significant intersections between clusters/classes (e.g. clusters of co-expression versus functional classes of genes). NeAT are designed to cope with large datasets and provide a flexible toolbox for analyzing biological networks stored in various databases (protein interactions, regulation and metabolism) or obtained from high-throughput experiments (two-hybrid, mass-spectrometry and microarrays). The web interface interconnects the programs in predefined analysis flows, enabling users to address a series of questions about networks of interest. Each tool can also be used separately by entering custom data for a specific analysis. NeAT can also be used as web services (SOAP/WSDL interface), in order to design programmatic workflows and integrate them with other available resources.
Revealing the Cosmic Web-dependent Halo Bias
NASA Astrophysics Data System (ADS)
Yang, Xiaohu; Zhang, Youcai; Lu, Tianhuan; Wang, Huiyuan; Shi, Feng; Tweed, Dylan; Li, Shijie; Luo, Wentao; Lu, Yi; Yang, Lei
2017-10-01
Halo bias is one of the key ingredients of halo models. At a given redshift it was shown to depend, to first order, only on halo mass. In this study, four types of cosmic web environments (clusters, filaments, sheets, and voids) are defined within a state-of-the-art high-resolution N-body simulation. Within these environments, we use both halo-dark matter cross-correlation and halo-halo autocorrelation functions to probe the clustering properties of halos. The nature of the halo bias differs strongly between the four different cosmic web environments described here. With respect to the overall population, halos in clusters have significantly lower biases in the 10^11.0-10^13.5 h^-1 M⊙ mass range. In other environments, however, halos show strongly enhanced biases, up to a factor of 10 in voids for halos of mass ~10^12.0 h^-1 M⊙. Such a strong cosmic web environment dependence of the halo bias may play an important role in future cosmological and galaxy formation studies. Within this cosmic web framework, the age dependence of halo bias is found to be significant only in clusters and filaments for relatively small halos, ≲10^12.5 h^-1 M⊙.
Mapping Dark Matter in Simulated Galaxy Clusters
NASA Astrophysics Data System (ADS)
Bowyer, Rachel
2018-01-01
Galaxy clusters are the most massive bound objects in the Universe with most of their mass being dark matter. Cosmological simulations of structure formation show that clusters are embedded in a cosmic web of dark matter filaments and large scale structure. It is thought that these filaments are found preferentially close to the long axes of clusters. We extract galaxy clusters from the simulations "cosmo-OWLS" in order to study their properties directly and also to infer their properties from weak gravitational lensing signatures. We investigate various stacking procedures to enhance the signal of the filaments and large scale structure surrounding the clusters to better understand how the filaments of the cosmic web connect with galaxy clusters. This project was supported in part by the NSF REU grant AST-1358980 and by the Nantucket Maria Mitchell Association.
Java bioinformatics analysis web services for multiple sequence alignment--JABAWS:MSA.
Troshin, Peter V; Procter, James B; Barton, Geoffrey J
2011-07-15
JABAWS is a web services framework that simplifies the deployment of web services for bioinformatics. JABAWS:MSA provides services for five multiple sequence alignment (MSA) methods (Probcons, T-coffee, Muscle, Mafft and ClustalW), and is the system employed by the Jalview multiple sequence analysis workbench since version 2.6. A fully functional, easy to set up server is provided as a Virtual Appliance (VA), which can be run on most operating systems that support a virtualization environment such as VMware or Oracle VirtualBox. JABAWS is also distributed as a Web Application aRchive (WAR) and can be configured to run on a single computer and/or a cluster managed by Grid Engine, LSF or other queuing systems that support DRMAA. JABAWS:MSA provides clients with full access to each application's parameters and allows administrators to specify named parameter preset combinations and execution limits for each application through simple configuration files. The JABAWS command-line client allows integration of JABAWS services into conventional scripts. JABAWS is made freely available under the Apache 2 license and can be obtained from: http://www.compbio.dundee.ac.uk/jabaws.
Fuzzy Document Clustering Approach using WordNet Lexical Categories
NASA Astrophysics Data System (ADS)
Gharib, Tarek F.; Fouad, Mohammed M.; Aref, Mostafa M.
Text mining refers generally to the process of extracting interesting information and knowledge from unstructured text. This area is growing rapidly, mainly because of the strong need to analyse the huge amount of textual data that resides on internal file systems and the Web. Text document clustering provides an effective navigation mechanism for organizing this large amount of data by grouping documents into a small number of meaningful classes. In this paper we propose a fuzzy text document clustering approach using WordNet lexical categories and the fuzzy c-means algorithm. Experiments are performed to compare the efficiency of the proposed approach with recently reported approaches. Experimental results show that fuzzy clustering leads to strong performance: the fuzzy c-means algorithm outperforms classical clustering algorithms such as k-means and bisecting k-means in both clustering quality and running-time efficiency.
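A compact NumPy sketch of the fuzzy c-means step; the document vectors X are assumed to already come from the WordNet-based feature extraction (e.g. TF-IDF over lexical categories), which is not reproduced here:

```python
import numpy as np

def fuzzy_c_means(X, c=3, m=2.0, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=X.shape[0]).T       # (c, n), cols sum to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W @ X) / W.sum(axis=1, keepdims=True)    # fuzzy-weighted means
        d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=-1) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=0, keepdims=True)            # membership update
    return centers, U

# Three synthetic "document" clouds in 2-d feature space:
X = np.vstack([np.random.default_rng(1).normal(loc, 0.1, (20, 2))
               for loc in ([0, 0], [1, 1], [0, 1])])
centers, U = fuzzy_c_means(X, c=3)
print(U.argmax(axis=0)[:5])  # hardened cluster labels for first five documents
```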
Cloud computing for comparative genomics with the Windows Azure platform.
Kim, Insik; Jung, Jae-Yoon; Deluca, Todd F; Nelson, Tristan H; Wall, Dennis P
2012-01-01
Cloud computing services have emerged as a cost-effective alternative to cluster systems as the number of genomes and the computational power required to analyze them have increased in recent years. Here we introduce the Microsoft Azure platform with detailed execution steps and a cost comparison with Amazon Web Services. PMID:23032609
Teaching Analytics: A Clustering and Triangulation Study of Digital Library User Data
ERIC Educational Resources Information Center
Xu, Beijie; Recker, Mimi
2012-01-01
Teachers and students increasingly enjoy unprecedented access to abundant web resources and digital libraries to enhance and enrich their classroom experiences. However, due to the distributed nature of such systems, conventional educational research methods, such as surveys and observations, provide only limited snapshots. In addition,…
Advanced Cyber Attack Modeling Analysis and Visualization
2010-03-01
Extreme Mergers from the Massive Cluster Survey
NASA Astrophysics Data System (ADS)
Morris, R.
2010-09-01
We will observe an extraordinary, high-redshift galaxy cluster from the Massive Cluster Survey. The target is a very rare, triple merger system, and likely lies at one of the deepest nodes of the cosmic web. The target shows multiple strong gravitational lensing arcs in the cluster core. This target only possesses a very short (10 ks) Chandra observation, and is unobserved by XMM-Newton. The X-ray data from this joint Chandra/HST proposal will be used to probe the mass distribution of hot, baryonic gas, and to reveal the details of the merger physics and the process of cluster assembly. We will also search for hints of X-ray emission from filaments between the merging clumps. Subaru and some Hubble Space Telescope imaging data are in hand; we will gather additional HST coverage for a lensing analysis.
NASA Astrophysics Data System (ADS)
Darvish, Behnam; Mobasher, Bahram; Martin, D. Christopher; Sobral, David; Scoville, Nick; Stroe, Andra; Hemmati, Shoubaneh; Kartaltepe, Jeyhan
2017-03-01
We use a mass complete (log(M/M⊙) ≥ 9.6) sample of galaxies with accurate photometric redshifts in the COSMOS field to construct the density field and the cosmic web to z = 1.2. The cosmic web extraction relies on the density field's Hessian matrix and breaks the density field into clusters, filaments, and the field. We provide the density field and cosmic web measures to the community. We show that at z ≲ 0.8, the median star formation rate (SFR) in the cosmic web gradually declines from the field to clusters and this decline is especially sharp for satellites (~1 dex versus ~0.5 dex for centrals). However, at z ≳ 0.8, the trend flattens out for the overall galaxy population and satellites. For star-forming (SF) galaxies only, the median SFR is constant at z ≳ 0.5 but declines by ~0.3-0.4 dex from the field to clusters for satellites and centrals at z ≲ 0.5. We argue that for satellites, the main role of the cosmic web environment is to control their SF fraction, whereas for centrals, it is mainly to control their overall SFR at z ≲ 0.5 and to set their fraction at z ≳ 0.5. We suggest that most satellites experience a rapid quenching mechanism as they fall from the field into clusters through filaments, whereas centrals mostly undergo a slow environmental quenching at z ≲ 0.5 and a fast mechanism at higher redshifts. Our preliminary results highlight the importance of the large-scale cosmic web on galaxy evolution.
Designing Web-based telemedicine training for military health care providers.
Bangert, D; Doktor, R; Johnson, E
2001-01-01
The purpose of the study was to ascertain those learning objectives that will initiate increased use of telemedicine by military health care providers. Telemedicine is increasingly moving to the center of the health care industry's service offerings. As this migration occurs, health professionals will require training for proper and effective change management. The United States Department of Defense (DoD) is embracing the use of telemedicine and wishes to use Web-based training as a tool for effective change management to increase use. This article summarizes the findings of an educational needs assessment of military health care providers for the creation of the DoD Web-based telemedicine training curriculum. Forty-eight health care professionals were interviewed and surveyed to capture their opinions on what learning objectives a telemedicine training curriculum should include. Twenty learning objectives were found to be needed in a telemedicine training program. These 20 learning objectives were grouped into four learning clusters that formed the structure for the training program. In order of importance, the learning clusters were clinical, technical, organizational, and introduction to telemedicine. From these clusters, five Web-based modules were created, with two addressing clinical learning needs and one for each of the other learning objective clusters.
ERIC Educational Resources Information Center
Chen, Hsinchun
2003-01-01
Discusses information retrieval techniques used on the World Wide Web. Topics include machine learning in information extraction; relevance feedback; information filtering and recommendation; text classification and text clustering; Web mining, based on data mining techniques; hyperlink structure; and Web size. (LRW)
Architecture of marine food webs: To be or not be a 'small-world'.
Marina, Tomás Ignacio; Saravia, Leonardo A; Cordone, Georgina; Salinas, Vanesa; Doyle, Santiago R; Momo, Fernando R
2018-01-01
The search for general properties in network structure has been a central issue for food web studies in recent years. One such property is the small-world topology, which combines high clustering with a small distance between nodes of the network. This property may increase food web resilience but may also make webs more sensitive to the extinction of connected species. Food web theory has been developed principally from freshwater and terrestrial ecosystems, largely omitting marine habitats. Whether the theory needs to be modified to accommodate observations from marine ecosystems, which differ in several topological characteristics, is still under debate. Here we investigated whether the small-world topology is a common structural pattern in marine food webs. We developed a novel, simple and statistically rigorous method to examine the largest set of complex marine food webs to date. More than half of the analyzed marine networks exhibited a similar or lower characteristic path length than the random expectation, whereas 39% of the webs presented significantly higher clustering than their random counterparts. Our method showed that 5 out of 28 networks fulfilled both features of the small-world topology: short path length and high clustering. This work represents the first rigorous analysis of the small-world topology and its associated features in high-quality marine networks. We conclude that such topology is a structural pattern that is not maximized in marine food webs; thus it is probably not an effective model for studying the robustness, stability and feasibility of marine ecosystems.
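A simple Monte Carlo version of the small-world test, comparing a network's average clustering C and characteristic path length L against Erdos-Renyi random graphs with the same numbers of nodes and edges; the paper's statistical method is more rigorous, and an undirected example graph stands in for a food web here:

```python
import networkx as nx

def small_world_stats(G, n_random=100, seed=0):
    n, m = G.number_of_nodes(), G.number_of_edges()
    C, L = nx.average_clustering(G), nx.average_shortest_path_length(G)
    Cs, Ls = [], []
    for i in range(n_random):
        R = nx.gnm_random_graph(n, m, seed=seed + i)  # same n and m as G
        if nx.is_connected(R):                         # L needs a connected graph
            Cs.append(nx.average_clustering(R))
            Ls.append(nx.average_shortest_path_length(R))
    return C, L, sum(Cs) / len(Cs), sum(Ls) / len(Ls)

C, L, C_rand, L_rand = small_world_stats(nx.les_miserables_graph())
print(f"C={C:.3f} (random {C_rand:.3f}), L={L:.3f} (random {L_rand:.3f})")
# Small-world: L comparable to random, C much higher than random.
```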
Jayashree, B; Rajgopal, S; Hoisington, D; Prasanth, V P; Chandra, S
2008-09-24
Structure is a widely used software tool to investigate population genetic structure with multi-locus genotyping data. The software uses an iterative algorithm to group individuals into "K" clusters, representing possibly K genetically distinct subpopulations. The serial implementation of this programme is processor-intensive even with small datasets. We describe an implementation of the program within a parallel framework. Speedup was achieved by running different replicates and values of K on each node of the cluster. A web-based, user-oriented GUI has been implemented in PHP, through which the user can specify input parameters for the programme. The number of processors to be used can be specified in the background command. A web-based visualization tool, "Visualstruct", written in PHP (HTML and JavaScript embedded), allows for the graphical display of population clusters output from Structure, where each individual may be visualized as a line segment with K colors defining its possible genomic composition with respect to the K genetic sub-populations. The advantage over available programs is the increased number of individuals that can be visualized. The analyses of real datasets indicate a speedup of up to four when comparing execution on clusters of eight processors with execution on one desktop. The software package is freely available to interested users upon request.
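A sketch of the replicate-level parallelization idea using a local process pool: each (K, replicate) pair is an independent Structure run, so the jobs can be farmed out across cores or cluster nodes. The structure binary's -m/-K/-o flags, the mainparams file, and the results/ directory are assumptions based on typical installations, not this paper's exact setup:

```python
from concurrent.futures import ProcessPoolExecutor
import itertools, subprocess

def run_structure(args):
    k, rep = args
    out = f"results/K{k}_rep{rep}"          # results/ must exist beforehand
    cmd = ["structure", "-m", "mainparams", "-K", str(k), "-o", out]  # assumed CLI
    subprocess.run(cmd, check=True)
    return out

jobs = list(itertools.product(range(1, 11), range(1, 5)))  # K = 1..10, 4 replicates
with ProcessPoolExecutor(max_workers=8) as pool:
    for outfile in pool.map(run_structure, jobs):
        print("finished", outfile)
```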
Prosdocimi, Francisco; Bittencourt, Daniela; da Silva, Felipe Rodrigues; Kirst, Matias; Motta, Paulo C.; Rech, Elibio L.
2011-01-01
Characterized by distinctive evolutionary adaptations, spiders provide a comprehensive system for evolutionary and developmental studies of anatomical organs, including silk and venom production. Here we performed cDNA sequencing using massively parallel sequencers (454 GS-FLX Titanium) to generate ∼80,000 reads from the spinning gland of Actinopus spp. (infraorder: Mygalomorphae) and Gasteracantha cancriformis (infraorder: Araneomorphae, Orbiculariae clade). Actinopus spp. retains primitive characteristics in web usage and presents a single undifferentiated spinning gland, while the orbiculariae spiders have seven differentiated spinning glands and complex patterns of web usage. MIRA, Celera Assembler and CAP3 software were used to cluster NGS reads for each spider. CAP3 unigenes passed through a pipeline for automatic annotation, classification by biological function, and comparative transcriptomics. Genes related to spider silks were manually curated and analyzed. Although a single spidroin gene family was found in Actinopus spp., a vast repertoire of specialized spider silk proteins was encountered in orbiculariae. Astacin-like metalloproteases (meprin subfamily) were shown to be some of the most sampled unigenes and duplicated gene families in G. cancriformis since its evolutionary split from mygalomorphs. Our results confirm that the evolution of the molecular repertoire of silk proteins was accompanied by the (i) anatomical differentiation of spinning glands and (ii) behavioral complexification in web usage. Finally, a phylogenetic tree was constructed to cluster most of the known spidroins in gene clades. This is the first large-scale, multi-organism transcriptome for spider spinning glands and a first step into a broad understanding of spider web systems biology and evolution. PMID:21738742
Ergatis: a web interface and scalable software system for bioinformatics workflows
Orvis, Joshua; Crabtree, Jonathan; Galens, Kevin; Gussman, Aaron; Inman, Jason M.; Lee, Eduardo; Nampally, Sreenath; Riley, David; Sundaram, Jaideep P.; Felix, Victor; Whitty, Brett; Mahurkar, Anup; Wortman, Jennifer; White, Owen; Angiuoli, Samuel V.
2010-01-01
Motivation: The growth of sequence data has been accompanied by an increasing need to analyze data on distributed computer clusters. The use of these systems for routine analysis requires scalable and robust software for data management of large datasets. Software is also needed to simplify data management and make large-scale bioinformatics analysis accessible and reproducible to a wide class of target users. Results: We have developed a workflow management system named Ergatis that enables users to build, execute and monitor pipelines for computational analysis of genomics data. Ergatis contains preconfigured components and template pipelines for a number of common bioinformatics tasks such as prokaryotic genome annotation and genome comparisons. Outputs from many of these components can be loaded into a Chado relational database. Ergatis was designed to be accessible to a broad class of users and provides a user friendly, web-based interface. Ergatis supports high-throughput batch processing on distributed compute clusters and has been used for data management in a number of genome annotation and comparative genomics projects. Availability: Ergatis is an open-source project and is freely available at http://ergatis.sourceforge.net Contact: jorvis@users.sourceforge.net PMID:20413634
How Teachers Use and Manage Their Blogs? A Cluster Analysis of Teachers' Blogs in Taiwan
ERIC Educational Resources Information Center
Liu, Eric Zhi-Feng; Hou, Huei-Tse
2013-01-01
The development of Web 2.0 has ushered in a new set of web-based tools, including blogs. This study focused on how teachers use and manage their blogs. A sample of 165 teachers' blogs in Taiwan was analyzed by factor analysis, cluster analysis and qualitative content analysis. First, the teachers' blogs were analyzed according to six criteria…
Grid Computing Application for Brain Magnetic Resonance Image Processing
NASA Astrophysics Data System (ADS)
Valdivia, F.; Crépeault, B.; Duchesne, S.
2012-02-01
This work emphasizes the use of grid computing and web technology for automatic post-processing of brain magnetic resonance images (MRI) in the context of neuropsychiatric (Alzheimer's disease) research. Post-acquisition image processing is achieved through the interconnection of several individual processes into pipelines. Each process has input and output data ports, options and execution parameters, and performs single tasks such as: a) extracting individual image attributes (e.g. dimensions, orientation, center of mass), b) performing image transformations (e.g. scaling, rotation, skewing, intensity standardization, linear and non-linear registration), c) performing image statistical analyses, and d) producing the necessary quality control images and/or files for user review. The pipelines are built to perform specific sequences of tasks on the alphanumeric data and MRIs contained in our database. The web application is coded in PHP and allows the creation of scripts to create, store and execute pipelines and their instances either on our local cluster or on high-performance computing platforms. To run an instance on an external cluster, the web application opens a communication tunnel through which it copies the necessary files, submits the execution commands and collects the results. We present results of system tests for the processing of a set of 821 brain MRIs from the Alzheimer's Disease Neuroimaging Initiative study via a nonlinear registration pipeline composed of 10 processes. Our results show successful execution on both local and external clusters, and a 4-fold increase in performance if using the external cluster. However, the latter's performance does not scale linearly as queue waiting times and execution overhead increase with the number of tasks to be executed.
Ding, Yongxia; Zhang, Peili
2018-06-12
Problem-based learning (PBL) is an effective and highly efficient teaching approach that is extensively applied in education systems across a variety of countries. This study aimed to investigate the effectiveness of web-based PBL teaching pedagogies in large classes. The cluster sampling method was used to separate two college-level nursing student classes (graduating class of 2013) into two groups. The experimental group (n = 162) was taught using a web-based PBL teaching approach, while the control group (n = 166) was taught using conventional teaching methods. We subsequently assessed the satisfaction of the experimental group with the web-based PBL teaching mode. This assessment was performed after comparing teaching activity outcomes pertaining to exams and self-learning capacity between the two groups. The examination scores and self-learning capabilities were significantly higher in the experimental group than in the control group (P < 0.01). In addition, 92.6% of students in the experimental group expressed satisfaction with the new web-based PBL teaching approach. In a large class-size teaching environment, the web-based PBL teaching approach appears to be more optimal than traditional teaching methods. These results demonstrate the effectiveness of web-based teaching technologies in problem-based learning. Copyright © 2018. Published by Elsevier Ltd.
Graph and Network for Model Elicitation (GNOME Phase 2)
2013-02-01
The server-side service can run and generate data asynchronously, allowing a cluster of servers to run the sampling.
Web-Based Evaluation System to Measure Learning Effectiveness in Kampo Medicine.
Iizuka, Norio; Usuku, Koichiro; Nakae, Hajime; Segawa, Makoto; Wang, Yue; Ogashiwa, Kahori; Fujita, Yusuke; Ogihara, Hiroyuki; Tazuma, Susumu; Hamamoto, Yoshihiko
2016-01-01
Measuring the learning effectiveness of Kampo Medicine (KM) education is challenging. The aim of this study was to develop a web-based test to measure the learning effectiveness of KM education among medical students (MSs). We used an open-source Moodle platform to test 30 multiple-choice questions classified into 8-type fields (eight basic concepts of KM) including "qi-blood-fluid" and "five-element" theories, on 117 fourth-year MSs. The mean (±standard deviation [SD]) score on the web-based test was 30.2 ± 11.9 (/100). The correct answer rate ranged from 17% to 36%. A pattern-based portfolio enabled these rates to be individualized in terms of KM proficiency. MSs with scores higher (n = 19) or lower (n = 14) than mean ± 1SD were defined as high or low achievers, respectively. Cluster analysis using the correct answer rates for the 8-type field questions revealed clear divisions between high and low achievers. Interestingly, each high achiever had a different proficiency pattern. In contrast, three major clusters were evident among low achievers, all of whom responded with a low percentage of or no correct answers. In addition, a combination of three questions accurately classified high and low achievers. These findings suggest that our web-based test allows individual quantitative assessment of the learning effectiveness of KM education among MSs. PMID:27738440
Automatic document classification of biological literature
Chen, David; Müller, Hans-Michael; Sternberg, Paul W
2006-01-01
Background: Document classification is a widespread problem with many applications, from organizing search engine snippets to spam filtering. We previously described Textpresso, a text-mining system for biological literature, which marks up full text according to a shallow ontology that includes terms of biological interest. This project investigates document classification in the context of biological literature, making use of the Textpresso markup of a corpus of Caenorhabditis elegans literature. Results: We present a two-step text categorization algorithm to classify a corpus of C. elegans papers. Our classification method first uses a support vector machine-trained classifier, followed by a novel, phrase-based clustering algorithm. This clustering step autonomously creates cluster labels that are descriptive and understandable by humans. This clustering engine performed better on a standard test set (Reuters 21578) compared to previously published results (F-value of 0.55 vs. 0.49), while producing cluster descriptions that appear more useful. A web interface allows researchers to quickly navigate through the hierarchy and look for documents that belong to a specific concept. Conclusion: We have demonstrated a simple method to classify biological documents that embodies an improvement over current methods. While the classification results are currently optimized for Caenorhabditis elegans papers by human-created rules, the classification engine can be adapted to different types of documents. We have demonstrated this by presenting a web interface that allows researchers to quickly navigate through the hierarchy and look for documents that belong to a specific concept. PMID:16893465
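A rough, assumption-laden analogue of the two-step pipeline with scikit-learn: an SVM assigns documents to categories, then documents are clustered to produce subtopics for labelling. The toy corpus, labels, and the use of plain k-means in place of the paper's phrase-based labelling engine are all illustrative substitutions:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.cluster import KMeans

docs = ["unc-22 mutant phenotype", "RNAi knockdown assay",
        "genome sequence assembly", "contig scaffold mapping",
        "mutant allele screen", "scaffold N50 statistics"]
labels = ["genetics", "genetics", "genomics", "genomics", None, None]

vec = TfidfVectorizer()
X = vec.fit_transform(docs)
known = [i for i, l in enumerate(labels) if l is not None]
clf = LinearSVC().fit(X[known], [labels[i] for i in known])

# Step 1: the SVM assigns unlabelled documents to categories.
unknown = [i for i, l in enumerate(labels) if l is None]
print(list(zip([docs[i] for i in unknown], clf.predict(X[unknown]))))

# Step 2: cluster the corpus into subtopics that a labelling step would name.
print(KMeans(n_clusters=2, n_init=10).fit(X).labels_)
```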
SILVA tree viewer: interactive web browsing of the SILVA phylogenetic guide trees.
Beccati, Alan; Gerken, Jan; Quast, Christian; Yilmaz, Pelin; Glöckner, Frank Oliver
2017-09-30
Phylogenetic trees are an important tool to study the evolutionary relationships among organisms. The huge number of available taxa makes their interactive visualization difficult, which hampers collecting user feedback for the further improvement of the taxonomic framework. The SILVA Tree Viewer is a web application designed for visualizing large phylogenetic trees without requiring the download of any software tool or data files. The SILVA Tree Viewer is based on Web Geographic Information Systems (Web-GIS) technology with a PostgreSQL backend. It enables zoom and pan functionalities similar to Google Maps. The SILVA Tree Viewer enables access to two phylogenetic (guide) trees provided by the SILVA database: the SSU Ref NR99, inferred from high-quality, full-length small subunit sequences clustered at 99% sequence identity, and the LSU Ref, inferred from high-quality, full-length large subunit sequences. The Tree Viewer provides tree navigation, search and browse tools, as well as an interactive feedback system to collect all kinds of requests, ranging from taxonomy to data curation and improving the tool itself.
The BioExtract Server: a web-based bioinformatic workflow platform
Lushbough, Carol M.; Jennewein, Douglas M.; Brendel, Volker P.
2011-01-01
The BioExtract Server (bioextract.org) is an open, web-based system designed to aid researchers in the analysis of genomic data by providing a platform for the creation of bioinformatic workflows. Scientific workflows are created within the system by recording tasks performed by the user. These tasks may include querying multiple, distributed data sources, saving query results as searchable data extracts, and executing local and web-accessible analytic tools. The series of recorded tasks can then be saved as a reproducible, sharable workflow available for subsequent execution with the original or modified inputs and parameter settings. Integrated data resources include interfaces to the National Center for Biotechnology Information (NCBI) nucleotide and protein databases, the European Molecular Biology Laboratory (EMBL-Bank) non-redundant nucleotide database, the Universal Protein Resource (UniProt), and the UniProt Reference Clusters (UniRef) database. The system offers access to numerous preinstalled, curated analytic tools and also provides researchers with the option of selecting computational tools from a large list of web services including the European Molecular Biology Open Software Suite (EMBOSS), BioMoby, and the Kyoto Encyclopedia of Genes and Genomes (KEGG). The system further allows users to integrate local command line tools residing on their own computers through a client-side Java applet. PMID:21546552
Price comparisons on the internet based on computational intelligence.
Kim, Jun Woo; Ha, Sung Ho
2014-01-01
Information-intensive Web services such as price comparison sites have recently been gaining popularity. However, most users including novice shoppers have difficulty in browsing such sites because of the massive amount of information gathered and the uncertainty surrounding Web environments. Even conventional price comparison sites face various problems, which suggests the necessity of a new approach to address these problems. Therefore, for this study, an intelligent product search system was developed that enables price comparisons for online shoppers in a more effective manner. In particular, the developed system adopts linguistic price ratings based on fuzzy logic to accommodate user-defined price ranges, and personalizes product recommendations based on linguistic product clusters, which help online shoppers find desired items in a convenient manner. PMID:25268901
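A toy sketch of linguistic price ratings via triangular fuzzy memberships: a price is mapped to degrees of "cheap", "moderate" and "expensive" relative to a category's price range. The labels, breakpoints, and functions below are illustrative assumptions, not the system's actual membership design:

```python
def triangular(x, a, b, c):
    """Triangular membership rising on [a, b] and falling on [b, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def linguistic_rating(price, lo, hi):
    mid = (lo + hi) / 2
    return {
        "cheap":     triangular(price, lo - 1, lo, mid),
        "moderate":  triangular(price, lo, mid, hi),
        "expensive": triangular(price, mid, hi, hi + 1),
    }

print(linguistic_rating(420.0, lo=300.0, hi=900.0))
# -> {'cheap': 0.6, 'moderate': 0.4, 'expensive': 0.0}
```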
The case for electron re-acceleration at galaxy cluster shocks
NASA Astrophysics Data System (ADS)
van Weeren, Reinout J.; Andrade-Santos, Felipe; Dawson, William A.; Golovich, Nathan; Lal, Dharam V.; Kang, Hyesung; Ryu, Dongsu; Brüggen, Marcus; Ogrean, Georgiana A.; Forman, William R.; Jones, Christine; Placco, Vinicius M.; Santucci, Rafael M.; Wittman, David; Jee, M. James; Kraft, Ralph P.; Sobral, David; Stroe, Andra; Fogarty, Kevin
2017-01-01
On the largest scales, the Universe consists of voids and filaments making up the cosmic web. Galaxy clusters are located at the knots in this web, at the intersection of filaments. Clusters grow through accretion from these large-scale filaments and by mergers with other clusters and groups. In a growing number of galaxy clusters, elongated Mpc-sized radio sources have been found [1,2]. Also known as radio relics, these regions of diffuse radio emission are thought to trace relativistic electrons in the intracluster plasma accelerated by low-Mach-number shocks generated by cluster-cluster merger events [3]. A long-standing problem is how low-Mach-number shocks can accelerate electrons efficiently enough to explain the observed radio relics. Here, we report the discovery of a direct connection between a radio relic and a radio galaxy in the merging galaxy cluster Abell 3411-3412 by combining radio, X-ray and optical observations. This discovery indicates that fossil relativistic electrons from active galactic nuclei are re-accelerated at cluster shocks. It also implies that radio galaxies play an important role in governing the non-thermal component of the intracluster medium in merging clusters.
van Engen-Verheul, Mariëtte M; Peek, Niels; Haafkens, Joke A; Joukes, Erik; Vromen, Tom; Jaspers, Monique W M; de Keizer, Nicolette F
2017-01-01
Evidence on successful quality improvement (QI) in health care requires quantitative information from randomized clinical trials (RCTs) on the effectiveness of QI interventions, but also qualitative information from professionals to understand factors influencing QI implementation. Using a structured qualitative approach, concept mapping, this study identifies the factors cardiac rehabilitation (CR) teams consider necessary to successfully implement a web-based audit and feedback (A&F) intervention with outreach visits to improve the quality of CR care. Participants included 49 CR professionals from 18 Dutch CR centres who had worked with the A&F system during an RCT. In three focus group sessions, participants formulated statements on factors needed to implement QI successfully. Subsequently, participants rated all statements for importance and feasibility and grouped them thematically. Multidimensional scaling was used to produce a final concept map. Forty-two unique statements were formulated and grouped into five thematic clusters in the concept map. The cluster with the highest importance was QI team commitment, followed by organisational readiness, presence of an adequate A&F system, access to an external quality assessor, and future use and functionalities of the A&F system. Concept mapping appeared efficient and useful for understanding contextual factors influencing QI implementation as perceived by healthcare teams. While the presence of a web-based A&F system and an external quality assessor was seen as instrumental for gaining insight into performance and formulating QI actions, QI team commitment and organisational readiness were perceived as essential to actually implement and carry out these actions. These two sociotechnical factors should be taken into account when implementing and evaluating the success of QI implementations in future research. Copyright © 2016. Published by Elsevier Ireland Ltd.
Web page sorting algorithm based on query keyword distance relation
NASA Astrophysics Data System (ADS)
Yang, Han; Cui, Hong Gang; Tang, Hao
2017-08-01
To optimize web page ranking, we propose a query-keyword clustering idea based on the positional relationships between search keywords within a web page, which is converted into a degree of aggregation of the search keywords in the page. Building on the PageRank algorithm, a clustering-degree factor for the query keywords is added so that it can take part in the quantitative calculation. This paper thus proposes an improved PageRank algorithm based on the distance relations between search keywords. The experimental results show the feasibility and effectiveness of the method.
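A hedged sketch of the idea: blend link-based PageRank with a keyword-aggregation factor so that pages whose query keywords occur close together rank higher. The inverse-gap measure and the multiplicative combination below are assumptions for illustration, not the paper's exact formulas:

```python
import networkx as nx

def aggregation_degree(positions):
    """positions: sorted token offsets of the query keywords in one page."""
    if len(positions) < 2:
        return 0.0
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return sum(1.0 / g for g in gaps) / len(gaps)   # closer keywords -> higher

def rank_pages(web_graph, keyword_positions, alpha=0.5):
    pr = nx.pagerank(web_graph)                      # link-based score
    boost = lambda p: 1 + alpha * aggregation_degree(keyword_positions.get(p, []))
    return sorted(web_graph, key=lambda p: pr[p] * boost(p), reverse=True)

G = nx.DiGraph([("a", "b"), ("b", "c"), ("c", "a"), ("a", "c")])
print(rank_pages(G, {"a": [3, 4, 9], "c": [10, 80]}))
```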
ESTminer: a Web interface for mining EST contig and cluster databases.
Huang, Yecheng; Pumphrey, Janie; Gingle, Alan R
2005-03-01
ESTminer is a Web application and database schema for interactive mining of expressed sequence tag (EST) contig and cluster datasets. The Web interface contains a query frame that allows the selection of contigs/clusters with a specific cDNA library makeup or a threshold number of members. The results are displayed as color-coded tree nodes, where the color indicates the fractional size of each cDNA library component. The nodes are expandable, revealing library statistics as well as EST or contig members, with links to sequence data, GenBank records or user-configurable links. Also, the interface allows 'queries within queries', where the result set of a query is further filtered by the subsequent query. ESTminer is implemented in Java/JSP and the package, including MySQL and Oracle schema creation scripts, is available from http://cggc.agtec.uga.edu/Data/download.asp (contact: agingle@uga.edu).
NASA Astrophysics Data System (ADS)
Cole, M.; Bambacus, M.; Lynnes, C.; Sauer, B.; Falke, S.; Yang, W.
2007-12-01
NASA's vast array of scientific data within its Distributed Active Archive Centers (DAACs) is especially valuable to traditional research scientists as well as the emerging market of Earth Science Information Partners. For example, the air quality science and management communities are increasingly using satellite-derived observations in their analyses and decision making. The Air Quality Cluster in the Federation of Earth Science Information Partners (ESIP) uses web infrastructures of interoperability, or Service Oriented Architecture (SOA), to extend data exploration, use, and analysis and provides a user environment for DAAC products. In an effort to continually offer these NASA data to the broadest research community audience, and reusing emerging technologies, both NASA's Goddard Earth Science (GES) and Land Processes (LP) DAACs have engaged in a web services pilot project. Through these projects both GES and LP have exposed data through the Open Geospatial Consortium's (OGC) Web Services standards. Reusing several different existing applications and implementation techniques, GES and LP successfully exposed a variety of data through distributed systems to be ingested into multiple end-user systems. The results of this project will enable researchers worldwide to access some of NASA's GES and LP DAAC data through OGC protocols. This functionality encourages interdisciplinary research while increasing data use through advanced technologies. This paper will concentrate on the implementation and use of OGC Web Services, specifically Web Map and Web Coverage Services (WMS, WCS), at the GES and LP DAACs, and on the value of these services within scientific applications, including integration with the DataFed air quality web infrastructure and the development of data analysis web applications.
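As a point of reference, consuming such an OGC WMS endpoint from Python might look like the sketch below, using the OWSLib package; the endpoint URL, layer name and bounding box are placeholders, not the DAACs' actual service values.

```python
# Sketch: request a map image from a (hypothetical) OGC Web Map Service.
from owslib.wms import WebMapService

wms = WebMapService("https://example.gov/wms", version="1.1.1")  # placeholder URL
print(list(wms.contents))             # layers advertised by the server

img = wms.getmap(layers=["AerosolOpticalDepth"],   # hypothetical layer name
                 srs="EPSG:4326",
                 bbox=(-180, -90, 180, 90),
                 size=(1024, 512),
                 format="image/png",
                 transparent=True)
with open("aod.png", "wb") as f:
    f.write(img.read())
```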
SeMPI: a genome-based secondary metabolite prediction and identification web server.
Zierep, Paul F; Padilla, Natàlia; Yonchev, Dimitar G; Telukunta, Kiran K; Klementz, Dennis; Günther, Stefan
2017-07-03
The secondary metabolism of bacteria, fungi and plants yields a vast number of bioactive substances. The constantly increasing amount of published genomic data provides the opportunity for an efficient identification of gene clusters by genome mining. Conversely, for many natural products with resolved structures, the encoding gene clusters have not been identified yet. Even though genome mining tools have become significantly more efficient in the identification of biosynthetic gene clusters, structural elucidation of the actual secondary metabolite is still challenging, especially due to as yet unpredictable post-modifications. Here, we introduce SeMPI, a web server providing a prediction and identification pipeline for natural products synthesized by type I modular polyketide synthases (PKS). In order to limit the possible structures of PKS products and to include putative tailoring reactions, a structural comparison with annotated natural products was introduced. Furthermore, a benchmark was designed based on 40 gene clusters with annotated PKS products. The web server of the pipeline (SeMPI) is freely available at: http://www.pharmaceutical-bioinformatics.de/sempi. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
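One common way to perform the kind of structural comparison described above is fingerprint-based similarity; the RDKit sketch below illustrates the general idea only, under the assumption of Morgan fingerprints and Tanimoto scoring, and does not reproduce SeMPI's actual metric or database.

```python
# Sketch: compare a toy predicted polyketide scaffold against a small set of
# annotated natural products via Morgan fingerprints and Tanimoto similarity.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

predicted = Chem.MolFromSmiles("CC(=O)CC(O)CC(=O)O")   # toy predicted product
database = {"6-methylsalicylic acid": "Cc1cccc(O)c1C(=O)O",  # illustrative entries
            "triketide": "CC(O)CC(=O)CC(=O)O"}

fp_pred = AllChem.GetMorganFingerprintAsBitVect(predicted, 2, nBits=2048)
for name, smi in database.items():
    fp = AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(smi), 2, nBits=2048)
    print(name, round(DataStructs.TanimotoSimilarity(fp_pred, fp), 3))
```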
ZBIT Bioinformatics Toolbox: A Web-Platform for Systems Biology and Expression Data Analysis
Römer, Michael; Eichner, Johannes; Dräger, Andreas; Wrzodek, Clemens; Wrzodek, Finja; Zell, Andreas
2016-01-01
Bioinformatics analysis has become an integral part of research in biology. However, installation and use of scientific software can be difficult and often requires technical expert knowledge. Reasons are dependencies on certain operating systems or required third-party libraries, missing graphical user interfaces and documentation, or nonstandard input and output formats. In order to make bioinformatics software easily accessible to researchers, we here present a web-based platform. The Center for Bioinformatics Tuebingen (ZBIT) Bioinformatics Toolbox provides web-based access to a collection of bioinformatics tools developed for systems biology, protein sequence annotation, and expression data analysis. Currently, the collection encompasses software for conversion and processing of the community standards SBML and BioPAX, transcription factor analysis, and analysis of microarray data from transcriptomics and proteomics studies. All tools are hosted on a customized Galaxy instance and run on a dedicated computation cluster. Users only need a web browser and an active internet connection in order to benefit from this service. The web platform is designed to facilitate the usage of the bioinformatics tools for researchers without an advanced technical background. Users can combine tools for complex analyses or use predefined, customizable workflows. All results are stored persistently and are reproducible. For each tool, we provide documentation, tutorials, and example data to maximize usability. The ZBIT Bioinformatics Toolbox is freely available at https://webservices.cs.uni-tuebingen.de/. PMID:26882475
a Web-Based Interactive Platform for Co-Clustering Spatio-Temporal Data
NASA Astrophysics Data System (ADS)
Wu, X.; Poorthuis, A.; Zurita-Milla, R.; Kraak, M.-J.
2017-09-01
Since current studies on clustering analysis mainly focus on exploring spatial or temporal patterns separately, a co-clustering algorithm is utilized in this study to enable the concurrent analysis of spatio-temporal patterns. To allow users to adopt and adapt the algorithm for their own analysis, it is integrated within the server side of an interactive web-based platform. The client side of the platform, running within any modern browser, is a graphical user interface (GUI) with multiple linked visualizations that facilitates the understanding, exploration and interpretation of the raw dataset and co-clustering results. Users can also upload their own datasets and adjust clustering parameters within the platform. To illustrate the use of this platform, an annual temperature dataset from 28 weather stations over 20 years in the Netherlands is used. After the dataset is loaded, it is visualized in a set of linked visualizations: a geographical map, a timeline and a heatmap. This aids the user in understanding the nature of their dataset and the appropriate selection of co-clustering parameters. Once the dataset is processed by the co-clustering algorithm, the results are visualized in small multiples, a heatmap and a timeline to provide various views for better understanding and further interpretation. Since the visualization and analysis are integrated in a seamless platform, the user can explore different sets of co-clustering parameters and instantly view the results in order to do iterative, exploratory data analysis. As such, this interactive web-based platform allows users to analyze spatio-temporal data using the co-clustering method and also helps the understanding of the results using multiple linked visualizations.
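For orientation, a co-clustering of a stations-by-years matrix can be sketched with scikit-learn's spectral co-clustering; the simulated data, cluster count and algorithm choice below are illustrative and may differ from the platform's own implementation.

```python
# Sketch: co-cluster a 28 stations x 20 years temperature matrix so that
# groups of stations and groups of years are found simultaneously.
import numpy as np
from sklearn.cluster import SpectralCoclustering

rng = np.random.default_rng(1)
data = rng.normal(loc=10, scale=2, size=(28, 20))  # hypothetical annual means

model = SpectralCoclustering(n_clusters=4, random_state=0).fit(data)
print("station clusters:", model.row_labels_)
print("year clusters:   ", model.column_labels_)
```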
Pipelining Architecture of Indexing Using Agglomerative Clustering
NASA Astrophysics Data System (ADS)
Goyal, Deepika; Goyal, Deepti; Gupta, Parul
2010-11-01
The World Wide Web is an interlinked collection of billions of documents. Ironically, the huge size of this collection has become an obstacle to information retrieval. To access information on the Internet, a search engine is used, which retrieves pages from an indexer. This paper introduces a novel pipelining technique for structuring the core index-building system that substantially reduces index construction time, together with a clustering algorithm that partitions the set of documents into ordered clusters so that documents within the same cluster are similar and are assigned closer document identifiers. After the documents are assigned to clusters, a hierarchy of indexes is created so that searching is efficient: pairs of clusters are merged into super clusters, and these in turn into mega clusters. The pipeline architecture builds the index so that it is efficient in both space and time, directing a search from higher levels of the index (or clusters) to lower levels so that the user obtains the best possible matches quickly. Because each cluster is formed by merging only two clusters, the search at each level of the index is limited to two clusters, and so on down the hierarchy, which saves time.
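The identifier-assignment idea, in which similar documents receive close document IDs, can be sketched with off-the-shelf agglomerative clustering; the vectors and parameters below are illustrative, not the paper's pipeline.

```python
# Sketch: hierarchically cluster documents, then number them in dendrogram
# leaf order so that similar documents receive close document identifiers.
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list

rng = np.random.default_rng(2)
doc_vectors = rng.random((8, 16))        # hypothetical term vectors, 8 documents

tree = linkage(doc_vectors, method="average")  # pairwise (binary) agglomeration
order = leaves_list(tree)                      # similar docs end up adjacent

doc_ids = {int(doc): new_id for new_id, doc in enumerate(order)}
print(doc_ids)                                 # original index -> assigned docID
```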
How Japanese students characterize information from web-sites.
Iwahara, A; Yamada, M; Hatta, T; Kawakami, A; Okamoto, M
2000-12-01
How 352 Japanese university students regard web-site information was investigated in two surveys. Application of correspondence analysis and cluster analysis to the questionnaire responses to the web-site advertisement showed that students regarded a web-site as a new, alien medium different from existing media. Students regarded web-sites as simply complicated, intellectual, and impermanent or not memorable. Students obtained precise information from web-sites but did not use it in making decisions to purchase goods.
Halligan, Brian D; Geiger, Joey F; Vallejos, Andrew K; Greene, Andrew S; Twigger, Simon N
2009-06-01
One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step-by-step instructions on how to implement the virtual proteomics analysis clusters as well as a list of currently available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases on the Medical College of Wisconsin Proteomics Center Web site (http://proteomics.mcw.edu/vipdac).
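A minimal sketch of provisioning such preconfigured instances with boto3 follows; the AMI ID and instance type are placeholders rather than the actual published machine images.

```python
# Sketch: launch preconfigured analysis nodes from a machine image on EC2.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder for a preconfigured AMI
    InstanceType="c5.2xlarge",         # assumed size; choose per workload/cost
    MinCount=1,
    MaxCount=4,                        # scale the virtual cluster by count
)
for inst in resp["Instances"]:
    print(inst["InstanceId"], inst["State"]["Name"])
```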
The case for electron re-acceleration at galaxy cluster shocks
DOE Office of Scientific and Technical Information (OSTI.GOV)
van Weeren, Reinout J.; Andrade-Santos, Felipe; Dawson, William A.
2017-01-04
On the largest scales, the Universe consists of voids and filaments making up the cosmic web. Galaxy clusters are located at the knots in this web, at the intersection of filaments. Clusters grow through accretion from these large-scale filaments and by mergers with other clusters and groups. In a growing number of galaxy clusters, elongated Mpc-sized radio sources have been found. Also known as radio relics, these regions of diffuse radio emission are thought to trace relativistic electrons in the intracluster plasma accelerated by low-Mach-number shocks generated by cluster–cluster merger events. A long-standing problem is how low-Mach-number shocks can accelerate electrons so efficiently to explain the observed radio relics. Here, we report the discovery of a direct connection between a radio relic and a radio galaxy in the merging galaxy cluster Abell 3411–3412 by combining radio, X-ray and optical observations. This discovery indicates that fossil relativistic electrons from active galactic nuclei are re-accelerated at cluster shocks. Lastly, it also implies that radio galaxies play an important role in governing the non-thermal component of the intracluster medium in merging clusters.
Zhang, Zhaoyang; Fang, Hua; Wang, Honggang
2016-06-01
Web-delivered trials are an important component in eHealth services. These trials, mostly behavior-based, generate big heterogeneous data that are longitudinal and high dimensional, with missing values. Unsupervised learning methods have been widely applied in this area; however, validating the optimal number of clusters has been challenging. Built upon our multiple imputation (MI) based fuzzy clustering, MIfuzzy, we proposed a new multiple imputation based validation (MIV) framework and corresponding MIV algorithms for clustering big longitudinal eHealth data with missing values, and more generally for fuzzy-logic based clustering methods. Specifically, we detect the optimal number of clusters by auto-searching and -synthesizing a suite of MI-based validation methods and indices, including conventional (bootstrap or cross-validation based) and emerging (modularity-based) validation indices for general clustering methods, as well as the specific Xie and Beni index for fuzzy clustering. The MIV performance was demonstrated on a big longitudinal dataset from a real web-delivered trial and using simulation. The results indicate that the MI-based Xie and Beni index for fuzzy clustering is more appropriate for detecting the optimal number of clusters for such complex data. The MIV concept and algorithms could be easily adapted to different types of clustering that could process big incomplete longitudinal trial data in eHealth services.
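A hedged sketch of the Xie and Beni index named above is given below, assuming memberships and centers from some fuzzy c-means run; the crude membership construction is illustrative only and is not MIfuzzy's procedure.

```python
# Sketch: Xie-Beni validity index for a fuzzy partition.
import numpy as np

def xie_beni(X, u, v, m=2.0):
    """X: (n, d) data; u: (c, n) memberships; v: (c, d) cluster centers."""
    d2 = ((X[None, :, :] - v[:, None, :]) ** 2).sum(axis=2)  # (c, n) sq. dists
    compactness = (u ** m * d2).sum()
    sep = min(((v[i] - v[j]) ** 2).sum()
              for i in range(len(v)) for j in range(len(v)) if i != j)
    return compactness / (len(X) * sep)   # smaller = better-separated clusters

X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5])
v = np.array([[0.0, 0.0], [5.0, 5.0]])
d = np.linalg.norm(X[None] - v[:, None], axis=2) + 1e-12
u = (1 / d ** 2) / (1 / d ** 2).sum(axis=0)   # crude fuzzy memberships
print(xie_beni(X, u, v))
```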
World Wide Web Based Image Search Engine Using Text and Image Content Features
NASA Astrophysics Data System (ADS)
Luo, Bo; Wang, Xiaogang; Tang, Xiaoou
2003-01-01
Using both text and image content features, a hybrid image retrieval system for the World Wide Web is developed in this paper. We first use a text-based image meta-search engine to retrieve images from the Web based on the text information on the image host pages to provide an initial image set. Because of the high-speed and low-cost nature of the text-based approach, we can easily retrieve a broad coverage of images with a high recall rate and a relatively low precision. An image-content-based ordering is then performed on the initial image set. All the images are clustered into different folders based on the image content features. In addition, the images can be re-ranked by the content features according to the user feedback. Such a design makes it truly practical to use both text and image content for image retrieval over the Internet. Experimental results confirm the efficiency of the system.
Scalar Potential Model progress
NASA Astrophysics Data System (ADS)
Hodge, John
2007-04-01
Because observations of galaxies and clusters have been found inconsistent with General Relativity (GR), the focus of effort in developing a Scalar Potential Model (SPM) has been on the examination of galaxies and clusters. The SPM has been found to be consistent with cluster cellular structure, the flow of IGM from spiral galaxies to elliptical galaxies, intergalactic redshift without an expanding universe, discrete redshift, rotation curve (RC) data without dark matter, asymmetric RCs, galaxy central mass, galaxy central velocity dispersion, and the Pioneer Anomaly. In addition, the SPM suggests a model of past expansion, past contraction, and current expansion of the universe. GR corresponds to the SPM in the limit in which a flat and static scalar potential field replaces the Sources and Sinks, such as between clusters and on the solar system scale, which is small relative to the distance to a Source. The papers may be viewed at http://web.infoave.net/~scjh/.
Kim, Jin Hae; Bothe, Jameson R.; Alderson, T. Reid; Markley, John L.
2014-01-01
Proteins containing iron–sulfur (Fe–S) clusters arose early in evolution and are essential to life. Organisms have evolved machinery consisting of specialized proteins that operate together to assemble Fe–S clusters efficiently so as to minimize cellular exposure to their toxic constituents: iron and sulfide ions. To date, the best studied system is the iron sulfur cluster (isc) operon of Escherichia coli, and the eight ISC proteins it encodes. Our investigations over the past five years have identified two functional conformational states for the scaffold protein (IscU) and have shown that the other ISC proteins that interact with IscU prefer to bind one conformational state or the other. From analyses of the NMR spectroscopy-derived network of interactions of ISC proteins and small-angle X-ray scattering (SAXS), chemical crosslinking experiments, and functional assays, we have constructed working models for Fe–S cluster assembly and delivery. Future work is needed to validate and refine what has been learned about the E. coli system and to extend these findings to the homologous Fe–S cluster biosynthetic machinery of yeast and human mitochondria. This article is part of a Special Issue entitled: Fe/S proteins: Analysis, structure, function, biogenesis and diseases. PMID:25450980
NASA Technical Reports Server (NTRS)
Newman, Doug; Mitchell, Andrew
2016-01-01
During the development of CSW (Catalog Service for the Web) support for the Common Metadata Repository (CMR) of the Earth Observing System Data and Information System (EOSDIS), a number of best practices came to light. Given that the ESIP (Earth Science Information Partners) Discovery Cluster is committed to interoperability and standards in earth data discovery, this seemed like a convenient moment to provide Best Practices to the organization for this widely-used standard, in the same way we did for OpenSearch.
The topology of the cosmic web in terms of persistent Betti numbers
NASA Astrophysics Data System (ADS)
Pranav, Pratyush; Edelsbrunner, Herbert; van de Weygaert, Rien; Vegter, Gert; Kerber, Michael; Jones, Bernard J. T.; Wintraecken, Mathijs
2017-03-01
We introduce a multiscale topological description of the Megaparsec web-like cosmic matter distribution. Betti numbers and topological persistence offer a powerful means of describing the rich connectivity structure of the cosmic web and of its multiscale arrangement of matter and galaxies. Emanating from algebraic topology and Morse theory, Betti numbers and persistence diagrams represent an extension and deepening of the cosmologically familiar topological genus measure and the related geometric Minkowski functionals. In addition to a description of the mathematical background, this study presents the computational procedure for computing Betti numbers and persistence diagrams for density field filtrations. The field may be computed starting from a discrete spatial distribution of galaxies or simulation particles. The main emphasis of this study concerns an extensive and systematic exploration of the imprint of different web-like morphologies and different levels of multiscale clustering in the corresponding computed Betti numbers and persistence diagrams. To this end, we use Voronoi clustering models as templates for a rich variety of web-like configurations and the fractal-like Soneira-Peebles models exemplify a range of multiscale configurations. We have identified the clear imprint of cluster nodes, filaments, walls, and voids in persistence diagrams, along with that of the nested hierarchy of structures in multiscale point distributions. We conclude by outlining the potential of persistent topology for understanding the connectivity structure of the cosmic web, in large simulations of cosmic structure formation and in the challenging context of the observed galaxy distribution in large galaxy surveys.
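For readers wanting to experiment with such filtrations, a brief sketch using the GUDHI library follows; the random array stands in for a sampled density field, and the superlevel-set convention (negating the field) is an assumption about the analysis, not a detail taken from the paper.

```python
# Sketch: persistence and Betti numbers for a density-field filtration.
import numpy as np
import gudhi

rng = np.random.default_rng(3)
density = rng.random((32, 32, 32))               # hypothetical density grid

# Superlevel-set filtrations are common for density fields; GUDHI filters by
# sublevel sets, so negate the field.
cc = gudhi.CubicalComplex(top_dimensional_cells=-density)
diagram = cc.persistence()                       # list of (dim, (birth, death))
print("Betti numbers:", cc.betti_numbers())      # [b0, b1, b2] connectivity summary
```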
Storage and Retrieval of Large RDF Graph Using Hadoop and MapReduce
NASA Astrophysics Data System (ADS)
Farhan Husain, Mohammad; Doshi, Pankil; Khan, Latifur; Thuraisingham, Bhavani
Handling huge amounts of data scalably has long been a matter of concern, and the same is true for semantic web data; current semantic web frameworks lack this ability. In this paper, we describe a framework that we built using Hadoop to store and retrieve large numbers of RDF triples. We describe our schema to store RDF data in the Hadoop Distributed File System. We also present our algorithms to answer a SPARQL query. We make use of Hadoop's MapReduce framework to actually answer the queries. Our results reveal that we can store huge amounts of semantic web data in Hadoop clusters built mostly from cheap commodity-class hardware and still answer queries fast enough. We conclude that ours is a scalable framework, able to handle large amounts of RDF data efficiently.
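To convey the flavor of the MapReduce approach, here is a hedged sketch of a Hadoop Streaming mapper that matches one SPARQL-style triple pattern over N-Triples input; the paper's framework uses its own storage schema and query algorithms, which this does not reproduce.

```python
# Sketch: Hadoop Streaming mapper emitting triples matching the pattern
# (?s <hasAuthor> ?o) from N-Triples lines on standard input.
import sys

PREDICATE = "<http://example.org/hasAuthor>"   # hypothetical predicate to match

for line in sys.stdin:
    parts = line.rstrip(" .\n").split(None, 2)  # subject, predicate, object
    if len(parts) == 3 and parts[1] == PREDICATE:
        subj, _, obj = parts
        print(f"{subj}\t{obj}")                 # key/value pair for the reducer
```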
DOE Office of Scientific and Technical Information (OSTI.GOV)
Riot, V.
2009-06-05
The software is a modification to the Mantis BT V1.5 open source application provided by the Mantis BT group to support clustered web servers. It also provides various cosmetic modifications used at LLNL.
Templet Web: the use of volunteer computing approach in PaaS-style cloud
NASA Astrophysics Data System (ADS)
Vostokin, Sergei; Artamonov, Yuriy; Tsarev, Daniil
2018-03-01
This article presents the Templet Web cloud service. The service is designed for the automation of high-performance scientific computing. The use of high-performance technology is specifically required by new fields of computational science such as data mining, artificial intelligence, machine learning, and others. Cloud technologies provide a significant cost reduction for high-performance scientific applications. The main objectives for achieving this cost reduction in the Templet Web service design are: (a) the implementation of "on-demand" access; (b) source code deployment management; (c) automation of high-performance computing program development. The distinctive feature of the service is an approach mainly used in the field of volunteer computing, in which a person who has access to a computer system delegates their access rights to the requesting user. We developed an access procedure, algorithms, and software for the utilization of free computational resources of the academic cluster system in line with the methods of volunteer computing. The Templet Web service has been in operation for five years. It has been successfully used for conducting laboratory workshops and solving research problems, some of which are considered in this article. The article also provides an overview of research directions related to service development.
WordCluster: detecting clusters of DNA words and genomic elements
2011-01-01
Background Many k-mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well-established examples are genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds. Results We introduce here an algorithm to detect clusters of DNA words (k-mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method into a web server connected to a MySQL backend, which also determines the co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between the inside and outside of the clusters. As another example, we used WordCluster to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome. Conclusions WordCluster seems to predict biologically meaningful clusters of DNA words (k-mers) and genomic entities. The implementation of the method into a web server is available at http://bioinfo2.ugr.es/wordCluster/wordCluster.php including additional features like the detection of co-localization with gene regions or the annotation enrichment tool for functional analysis of overlapped genes. PMID:21261981
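The core idea, scoring distances between consecutive copies against a null model, can be sketched as follows; the exponential null and the significance threshold are illustrative assumptions, not WordCluster's published statistics.

```python
# Sketch: flag runs of consecutive word occurrences whose gaps are improbably
# short under a uniform-placement null (gaps ~ exponential with rate n/L).
import math

def word_clusters(positions, genome_len, alpha=1e-3):
    lam = len(positions) / genome_len          # expected occurrence density
    clusters, current = [], [positions[0]]
    for a, b in zip(positions, positions[1:]):
        p = 1 - math.exp(-lam * (b - a))       # P(gap this short or shorter)
        if p < alpha:                          # surprisingly close copies
            current.append(b)
        else:
            if len(current) > 1:
                clusters.append(current)
            current = [b]
    if len(current) > 1:
        clusters.append(current)
    return clusters

pos = [100, 150, 190, 240, 900_000, 5_000_000, 5_000_040, 5_000_090]
print(word_clusters(pos, genome_len=10_000_000))
```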
RSAT 2015: Regulatory Sequence Analysis Tools
Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A.; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M.; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques
2015-01-01
RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/. PMID:25904632
Interactive Parallel Data Analysis within Data-Centric Cluster Facilities using the IPython Notebook
NASA Astrophysics Data System (ADS)
Pascoe, S.; Lansdowne, J.; Iwi, A.; Stephens, A.; Kershaw, P.
2012-12-01
The data deluge is making traditional analysis workflows for many researchers obsolete. Support for parallelism within popular tools such as MATLAB, IDL and NCO is not well developed and rarely used. However, parallelism is necessary for processing modern data volumes on a timescale conducive to curiosity-driven analysis. Furthermore, for peta-scale datasets such as the CMIP5 archive, it is no longer practical to bring an entire dataset to a researcher's workstation for analysis, or even to their institutional cluster. Therefore, there is an increasing need to develop new analysis platforms which both enable processing at the point of data storage and provide parallelism. Such an environment should, where possible, maintain the convenience and familiarity of our current analysis environments to encourage curiosity-driven research. We describe how we are combining the interactive python shell (IPython) with our JASMIN data-cluster infrastructure. IPython has been specifically designed to bridge the gap between HPC-style parallel workflows and the opportunistic curiosity-driven analysis usually carried out using domain-specific languages and scriptable tools. IPython offers a web-based interactive environment, the IPython notebook, and a cluster engine for parallelism, all underpinned by the well-respected Python/Scipy scientific programming stack. JASMIN is designed to support the data analysis requirements of the UK and European climate and earth system modeling community. JASMIN, with its sister facility CEMS focused on the earth observation community, has 4.5 PB of fast parallel disk storage alongside over 370 computing cores that provide local computation. Through the IPython interface to JASMIN, users can make efficient use of JASMIN's multi-core virtual machines to perform interactive analysis on all cores simultaneously or can configure IPython clusters across multiple VMs. Larger-scale clusters can be provisioned through JASMIN's batch scheduling system. Outputs can be summarised and visualised using the full power of Python's many scientific tools, including Scipy, Matplotlib, Pandas and CDAT. This rich user experience is delivered through the user's web browser, maintaining the interactive feel of a workstation-based environment with the parallel power of a remote data-centric processing facility.
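A short sketch of the interactive-parallel pattern described above, using the ipyparallel package (the modern descendant of IPython's cluster engine); the analysis function and file paths are stand-ins for real per-file climate processing.

```python
# Sketch: fan a per-file analysis out over an already-running IPython cluster.
import ipyparallel as ipp

rc = ipp.Client()            # connect to the running cluster's engines
view = rc[:]                 # one engine per core/VM

def annual_mean(path):
    # placeholder for reading one model-output file and reducing it
    import random
    return path, random.random()

files = [f"/data/cmip5/file_{i}.nc" for i in range(8)]   # hypothetical paths
results = view.map_sync(annual_mean, files)               # fan out, gather back
print(results)
```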
Diagnostic Accuracy Comparison of Artificial Immune Algorithms for Primary Headaches.
Çelik, Ufuk; Yurtay, Nilüfer; Koç, Emine Rabia; Tepe, Nermin; Güllüoğlu, Halil; Ertaş, Mustafa
2015-01-01
The present study evaluated the diagnostic accuracy of immune system algorithms with the aim of classifying the primary types of headache that are not related to any organic etiology. They are divided into four types: migraine, tension, cluster, and other primary headaches. With this main objective in mind, three different neurologists entered the medical records of 850 patients into our web-based expert system hosted on our project web site. In the evaluation process, Artificial Immune Systems (AIS) were used as the classification algorithms. The AIS are classification algorithms inspired by the biological immune system mechanism, which involves significant and distinct capabilities. These algorithms simulate specialties of the immune system, such as discrimination, learning, and the memorizing process, so that they can be used for classification, optimization, or pattern recognition. According to the results, the accuracy of the classifiers used in this study ranged from 95% to 99%, except for one that yielded 71% accuracy.
NASA Astrophysics Data System (ADS)
Cautun, Marius; van de Weygaert, Rien; Jones, Bernard J. T.; Frenk, Carlos S.
2014-07-01
The cosmic web is the largest scale manifestation of the anisotropic gravitational collapse of matter. It represents the transitional stage between linear and non-linear structures and contains easily accessible information about the early phases of structure formation processes. Here we investigate the characteristics and the time evolution of morphological components. Our analysis involves the application of the NEXUS Multiscale Morphology Filter technique, predominantly its NEXUS+ version, to high resolution and large volume cosmological simulations. We quantify the cosmic web components in terms of their mass and volume content, their density distribution and halo populations. We employ new analysis techniques to determine the spatial extent of filaments and sheets, like their total length and local width. This analysis identifies clusters and filaments as the most prominent components of the web. In contrast, while voids and sheets take most of the volume, they correspond to underdense environments and are devoid of group-sized and more massive haloes. At early times the cosmos is dominated by tenuous filaments and sheets, which, during subsequent evolution, merge together, such that the present-day web is dominated by fewer, but much more massive, structures. The analysis of the mass transport between environments clearly shows how matter flows from voids into walls, and then via filaments into cluster regions, which form the nodes of the cosmic web. We also study the properties of individual filamentary branches, to find long, almost straight, filaments extending to distances larger than 100 h-1 Mpc. These constitute the bridges between massive clusters, which seem to form along approximately straight lines.
Using Cluster Analysis for Data Mining in Educational Technology Research
ERIC Educational Resources Information Center
Antonenko, Pavlo D.; Toy, Serkan; Niederhauser, Dale S.
2012-01-01
Cluster analysis is a group of statistical methods that has great potential for analyzing the vast amounts of web server-log data to understand student learning from hyperlinked information resources. In this methodological paper we provide an introduction to cluster analysis for educational technology researchers and illustrate its use through…
Clustering header categories extracted from web tables
NASA Astrophysics Data System (ADS)
Nagy, George; Embley, David W.; Krishnamoorthy, Mukkai; Seth, Sharad
2015-01-01
Revealing related content among heterogeneous web tables is part of our long term objective of formulating queries over multiple sources of information. Two hundred HTML tables from institutional web sites are segmented and each table cell is classified according to the fundamental indexing property of row and column headers. The categories that correspond to the multi-dimensional data cube view of a table are extracted by factoring the (often multi-row/column) headers. To reveal commonalities between tables from diverse sources, the Jaccard distances between pairs of category headers (and also table titles) are computed. We show how about one third of our heterogeneous collection can be clustered into a dozen groups that exhibit table-title and header similarities that can be exploited for queries.
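A minimal sketch of the pairwise comparison step, assuming header categories have already been extracted as sets; the table names and category labels are hypothetical.

```python
# Sketch: Jaccard distances between header category sets of web tables.
from itertools import combinations

headers = {                                 # hypothetical extracted categories
    "enrolment_2014": {"year", "department", "students"},
    "enrolment_2015": {"year", "department", "students", "gender"},
    "budget": {"year", "department", "expenditure"},
}

def jaccard_distance(a, b):
    return 1 - len(a & b) / len(a | b)      # 0 = identical, 1 = disjoint

for (n1, h1), (n2, h2) in combinations(headers.items(), 2):
    print(n1, n2, round(jaccard_distance(h1, h2), 2))
```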
xQTL workbench: a scalable web environment for multi-level QTL analysis.
Arends, Danny; van der Velde, K Joeri; Prins, Pjotr; Broman, Karl W; Möller, Steffen; Jansen, Ritsert C; Swertz, Morris A
2012-04-01
xQTL workbench is a scalable web platform for the mapping of quantitative trait loci (QTLs) at multiple levels: for example gene expression (eQTL), protein abundance (pQTL), metabolite abundance (mQTL) and phenotype (phQTL) data. Popular QTL mapping methods for model organism and human populations are accessible via the web user interface. Large calculations scale easily on to multi-core computers, clusters and Cloud. All data involved can be uploaded and queried online: markers, genotypes, microarrays, NGS, LC-MS, GC-MS, NMR, etc. When new data types become available, xQTL workbench is quickly customized using the Molgenis software generator. xQTL workbench runs on all common platforms, including Linux, Mac OS X and Windows. An online demo system, installation guide, tutorials, software and source code are available under the LGPL3 license from http://www.xqtl.org. Contact: m.a.swertz@rug.nl.
ERIC Educational Resources Information Center
White, Marilyn Domas; Abels, Eileen G.; Gordon-Murnane, Laura
1998-01-01
Reports on methodological developments in a project to assess the adoption of the Web by publishers of business information for electronic commerce. Describes the approach used on a sample of 20 business publishers to identify five clusters of publishers ranging from traditionalist to innovator. Distinguishes between adopters and nonadopters of…
TethysCluster: A comprehensive approach for harnessing cloud resources for hydrologic modeling
NASA Astrophysics Data System (ADS)
Nelson, J.; Jones, N.; Ames, D. P.
2015-12-01
Advances in water resources modeling are improving the information that can be supplied to support decisions affecting the safety and sustainability of society. However, as water resources models become more sophisticated and data-intensive they require more computational power to run. Purchasing and maintaining the computing facilities needed to support certain modeling tasks has been cost-prohibitive for many organizations. With the advent of the cloud, the computing resources needed to address this challenge are now available and cost-effective, yet there still remains a significant technical barrier to leverage these resources. This barrier inhibits many decision makers and even trained engineers from taking advantage of the best science and tools available. Here we present the Python tools TethysCluster and CondorPy, which have been developed to lower the barrier to model computation in the cloud by providing (1) programmatic access to dynamically scalable computing resources, (2) a batch scheduling system to queue and dispatch the jobs to the computing resources, (3) data management for job inputs and outputs, and (4) the ability to dynamically create, submit, and monitor computing jobs. These Python tools leverage the open source, computing-resource management, and job management software, HTCondor, to offer a flexible and scalable distributed-computing environment. While TethysCluster and CondorPy can be used independently to provision computing resources and perform large modeling tasks, they have also been integrated into Tethys Platform, a development platform for water resources web apps, to enable computing support for modeling workflows and decision-support systems deployed as web apps.
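As a rough illustration, a CondorPy job submission might look like the sketch below; the template name and job attributes follow the CondorPy documentation as best recalled here and should be treated as assumptions to verify against the library before use.

```python
# Hedged sketch: queue a model run on HTCondor via CondorPy.
from condorpy import Job, Templates  # template/attribute names are assumptions

job = Job("hydro_run", Templates.vanilla_transfer_files)  # assumed template
job.executable = "run_model.py"            # hypothetical model driver script
job.arguments = "--scenario flood_2015"    # hypothetical arguments
job.transfer_input_files = "inputs.tar.gz"

job.submit()                               # dispatch to the HTCondor scheduler
print(job.status)
```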
Towards a Web-Enabled Geovisualization and Analytics Platform for the Energy and Water Nexus
NASA Astrophysics Data System (ADS)
Sanyal, J.; Chandola, V.; Sorokine, A.; Allen, M.; Berres, A.; Pang, H.; Karthik, R.; Nugent, P.; McManamay, R.; Stewart, R.; Bhaduri, B. L.
2017-12-01
Interactive data analytics are playing an increasingly vital role in the generation of new, critical insights regarding the complex dynamics of the energy/water nexus (EWN) and its interactions with climate variability and change. Integration of impacts, adaptation, and vulnerability (IAV) science with emerging, and increasingly critical, data science capabilities offers promising potential to meet the needs of the EWN community. To enable the exploration of pertinent research questions, a web-based geospatial visualization platform is being built that integrates a data analysis toolbox with advanced data fusion and data visualization capabilities to create a knowledge discovery framework for the EWN. The system, when fully built out, will offer several geospatial visualization capabilities, including statistical visual analytics, clustering, principal-component analysis and dynamic time warping; it will support uncertainty visualization and the exploration of data provenance, as well as machine learning discoveries, and will render diverse types of geospatial data and facilitate interactive analysis. Key components in the system architecture include NASA's WebWorldWind, the Globus toolkit, PostgreSQL, and other custom-built software modules.
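Dynamic time warping, one of the toolbox methods listed above, admits a compact reference implementation; the demand series below are hypothetical.

```python
# Sketch: classic dynamic time warping distance between two time series.
import numpy as np

def dtw_distance(x, y):
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

energy = np.array([3.1, 3.4, 4.0, 4.2, 3.9])   # hypothetical demand series
water = np.array([2.9, 3.2, 3.3, 4.1, 4.3, 4.0])
print(dtw_distance(energy, water))
```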
Determining the trophic guilds of fishes and macroinvertebrates in a seagrass food web
Luczkovich, J.J.; Ward, G.P.; Johnson, J.C.; Christian, R.R.; Baird, D.; Neckles, H.; Rizzo, W.M.
2002-01-01
We established trophic guilds of macroinvertebrate and fish taxa using correspondence analysis and a hierarchical clustering strategy for a seagrass food web in winter in the northeastern Gulf of Mexico. To create the diet matrix, we characterized the trophic linkages of macroinvertebrate and fish taxa present in Halodule wrightii seagrass habitat areas within the St. Marks National Wildlife Refuge (Florida) using binary data, combining dietary links obtained from relevant literature for macroinvertebrates with stomach analysis of common fishes collected during January and February of 1994. Hierarchical average-linkage cluster analysis of the 73 taxa of fishes and macroinvertebrates in the diet matrix yielded 14 clusters with diet similarity ≥ 0.60. We then used correspondence analysis with three factors to jointly plot the coordinates of the consumers (identified by cluster membership) and of the 33 food sources. Correspondence analysis served as a visualization tool for assigning each taxon to one of eight trophic guilds: herbivores, detritivores, suspension feeders, omnivores, molluscivores, meiobenthos consumers, macrobenthos consumers, and piscivores. These trophic groups, cross-classified with major taxonomic groups, were further used to develop consumer compartments in a network analysis model of carbon flow in this seagrass ecosystem. The method presented here should greatly improve the development of future network models of food webs by providing an objective procedure for aggregating trophic groups.
T-RMSD: a web server for automated fine-grained protein structural classification.
Magis, Cedrik; Di Tommaso, Paolo; Notredame, Cedric
2013-07-01
This article introduces the T-RMSD web server (tree-based on root-mean-square deviation), a service allowing the online computation of structure-based protein classification. It has been developed to address the relation between structural and functional similarity in proteins, and it allows a fine-grained structural clustering of a given protein family or group of structurally related proteins using distance RMSD (dRMSD) variations. These distances are computed between all pairs of equivalent residues, as defined by the ungapped columns within a given multiple sequence alignment. Using these generated distance matrices (one per equivalent position), T-RMSD produces a structural tree with support values for each cluster node, reminiscent of bootstrap values. These values, associated with the tree topology, allow a quantitative estimate of structural distances between proteins or group of proteins defined by the tree topology. The clusters thus defined have been shown to be structurally and functionally informative. The T-RMSD web server is a free website open to all users and available at http://tcoffee.crg.cat/apps/tcoffee/do:trmsd.
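The underlying distance-RMSD measure can be sketched as follows: compare intramolecular distance matrices over equivalent (aligned, ungapped) residue positions in two structures. T-RMSD itself builds one distance matrix per alignment column and a tree on top; this toy comparison is only the basic quantity.

```python
# Sketch: distance RMSD (dRMSD) between two sets of equivalent residues.
import numpy as np

def drmsd(coords_a, coords_b):
    """coords_*: (n, 3) arrays of equivalent residue coordinates."""
    da = np.linalg.norm(coords_a[:, None] - coords_a[None, :], axis=-1)
    db = np.linalg.norm(coords_b[:, None] - coords_b[None, :], axis=-1)
    iu = np.triu_indices(len(coords_a), k=1)      # unique residue pairs
    return np.sqrt(((da[iu] - db[iu]) ** 2).mean())

rng = np.random.default_rng(4)
a = rng.random((50, 3)) * 20                      # toy CA coordinates
b = a + rng.normal(scale=0.5, size=a.shape)       # slightly perturbed structure
print(round(drmsd(a, b), 3))
```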
Reddy, Vinod; Swanson, Stanley M; Segelke, Brent; Kantardjieff, Katherine A; Sacchettini, James C; Rupp, Bernhard
2003-12-01
Anticipating a continuing increase in the number of structures solved by molecular replacement in high-throughput crystallography and drug-discovery programs, a user-friendly web service for automated molecular replacement, map improvement, bias removal and real-space correlation structure validation has been implemented. The service is based on an efficient bias-removal protocol, Shake&wARP, and implemented using EPMR and the CCP4 suite of programs, combined with various shell scripts and Fortran90 routines. The service returns improved maps, converted data files and real-space correlation and B-factor plots. User data are uploaded through a web interface and the CPU-intensive iteration cycles are executed on a low-cost Linux multi-CPU cluster using the Condor job-queuing package. Examples of map improvement at various resolutions are provided and include model completion and reconstruction of absent parts, sequence correction, and ligand validation in drug-target structures.
The web graph of a tourism system
NASA Astrophysics Data System (ADS)
Baggio, Rodolfo
2007-06-01
The website network of a tourism destination is examined. Network-theoretic metrics are used to gauge the static and dynamic characteristics of the webspace. The topology of the network is found to be partly similar to that exhibited by similar systems. However, some differences are found, mainly due to the relatively poor connectivity and clustering of the network. These results are interpreted by considering the formation mechanisms and the connotation of the linkages between websites. Clustering and assortativity coefficients are proposed as quantitative estimations of the degree of collaboration and cooperation among destination stakeholders.
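The two proposed estimators are standard network measures; a short sketch on a toy website network using networkx follows, with node names hypothetical.

```python
# Sketch: clustering coefficient and degree assortativity of a website graph.
import networkx as nx

# Hypothetical destination webspace: nodes are stakeholder websites.
G = nx.Graph([("hotel_a", "dmo"), ("hotel_b", "dmo"), ("hotel_a", "hotel_b"),
              ("museum", "dmo"), ("restaurant", "hotel_a")])

print("average clustering:", nx.average_clustering(G))
print("degree assortativity:", nx.degree_assortativity_coefficient(G))
```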
A multimembership catalogue for 1876 open clusters using UCAC4 data
NASA Astrophysics Data System (ADS)
Sampedro, L.; Dias, W. S.; Alfaro, E. J.; Monteiro, H.; Molino, A.
2017-10-01
The main objective of this work is to determine the cluster members of 1876 open clusters, using positions and proper motions of the astrometric fourth United States Naval Observatory (USNO) CCD Astrograph Catalog (UCAC4). For this purpose, we apply three different methods, all based on a Bayesian approach, but with different formulations: a purely parametric method, another completely non-parametric algorithm and a third, recently developed by Sampedro & Alfaro, using both formulations at different steps of the whole process. The first and second statistical moments of the members' phase-space subspace, obtained after applying the three methods, are compared for every cluster. Although, on average, the three methods yield similar results, there are also specific differences between them, as well as for some particular clusters. The comparison with other published catalogues shows good agreement. We have also estimated, for the first time, the mean proper motion for a sample of 18 clusters. The results are organized in a single catalogue formed by two main files, one with the most relevant information for each cluster, partially including that in UCAC4, and the other showing the individual membership probabilities for each star in the cluster area. The final catalogue, with an interface design that enables an easy interaction with the user, is available in electronic format at the Stellar Systems Group (SSG-IAA) web site (http://ssg.iaa.es/en/content/sampedro-cluster-catalog).
Hira, A Y; Nebel de Mello, A; Faria, R A; Odone Filho, V; Lopes, R D; Zuffo, M K
2006-01-01
This article discusses a telemedicine model for emerging countries through the description of ONCONET, a telemedicine initiative applied to pediatric oncology in Brazil. The ONCONET core technology is a Web-based system that offers health information and other services specialized in childhood cancer, such as electronic medical records and cooperative protocols for complex treatments. All Web-based services are supported by a high-performance computing infrastructure based on clusters of commodity computers. The system was fully implemented using an open-source and free-software approach. Aspects of modeling, implementation and integration are covered. A model, both technologically and economically viable, was created through the research and development of in-house solutions adapted to the reality of emerging countries and with a focus on scalability, both in the total number of patients and in the national infrastructure.
Tsui, Fu-Chiang; Espino, Jeremy U; Weng, Yan; Choudary, Arvinder; Su, Hoah-Der; Wagner, Michael M
2005-01-01
The National Retail Data Monitor (NRDM) has monitored over-the-counter (OTC) medication sales in the United States since December 2002. The NRDM collects data from over 18,600 retail stores and processes over 0.6 million sales records per day. This paper describes key architectural features that we have found necessary for a data utility component in a national biosurveillance system. These elements include an event-driven architecture to provide analyses of data in near real time, multiple levels of caching to improve query response time, high availability through the use of clustered servers, scalable data storage through the use of storage area networks, and a web-service function for interoperation with affiliated systems. The methods and architectural principles are relevant to the design of any production data utility for public health surveillance: systems that collect data from multiple sources in near real time for use by analytic programs and user interfaces that have substantial requirements for time-series data aggregated in multiple dimensions.
SEEDisCs: How Clusters Form and Galaxies Transform in the Cosmic Web
NASA Astrophysics Data System (ADS)
Jablonka, P.
2017-08-01
This presentation introduces a new survey, the Spatial Extended EDisCS Survey (SEEDisCS), which aims at understanding how clusters assemble and the level at which galaxies are preprocessed before falling on the cluster cores. I focus on the changes in galaxy properties in the cluster large scale environments, and how we can get constraints on the timescale of star formation quenching. I also discuss new ALMA CO observations, which trace the fate of the galaxy cold gas content along the infalling paths towards the cluster cores.
InCHlib - interactive cluster heatmap for web applications.
Skuta, Ctibor; Bartůněk, Petr; Svozil, Daniel
2014-12-01
Hierarchical clustering is an exploratory data analysis method that reveals the groups (clusters) of similar objects. The result of hierarchical clustering is a tree structure called a dendrogram that shows the arrangement of individual clusters. To investigate the row/column hierarchical cluster structure of a data matrix, a visualization tool called a 'cluster heatmap' is commonly employed. In the cluster heatmap, the data matrix is displayed as a heatmap, a 2-dimensional array in which the colour of each element corresponds to its value. The rows/columns of the matrix are ordered such that similar rows/columns are near each other. The ordering is given by the dendrogram, which is displayed on the side of the heatmap. We developed InCHlib (Interactive Cluster Heatmap Library), a highly interactive and lightweight JavaScript library for cluster heatmap visualization and exploration. InCHlib enables the user to select individual or clustered heatmap rows, to zoom in and out of clusters or to flexibly modify heatmap appearance. The cluster heatmap can be augmented with additional metadata displayed in a different colour scale. In addition, to further enhance the visualization, the cluster heatmap can be interconnected with external data sources or analysis tools. Data clustering and the preparation of the input file for InCHlib are facilitated by the Python utility script inchlib_clust. The cluster heatmap is one of the most popular visualizations of large chemical and biomedical data sets originating, e.g., in high-throughput screening, genomics or transcriptomics experiments. The presented JavaScript library InCHlib is a client-side solution for cluster heatmap exploration. InCHlib can be easily deployed into any modern web application and configured to cooperate with external tools and data sources. Though InCHlib is primarily intended for the analysis of chemical or biological data, it is a versatile tool whose application domain is not limited to the life sciences.
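A hedged sketch of the kind of preprocessing inchlib_clust performs before visualization is shown below: hierarchical clustering and leaf-order reordering of the rows of a data matrix. The exact JSON schema InCHlib consumes is documented on its site and is not reproduced here.

```python
# Sketch: cluster rows of a matrix and reorder them for a cluster heatmap.
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

rng = np.random.default_rng(5)
data = rng.random((10, 6))                 # hypothetical assay matrix

row_tree = linkage(data, method="ward")
order = dendrogram(row_tree, no_plot=True)["leaves"]
heatmap_rows = data[order]                 # rows reordered for the heatmap
print(order)
```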
Document Clustering Approach for Meta Search Engine
NASA Astrophysics Data System (ADS)
Kumar, Naresh, Dr.
2017-08-01
The size of the WWW grows exponentially with every change in technology, producing huge amounts of information and long lists of URLs; it is not possible to visit each page individually. If page-ranking algorithms are used properly, the user's search space can be restricted to a few pages of the returned results. However, the available literature shows that no single search system can provide high-quality results across all domains. This paper addresses that problem by introducing a new meta search engine that determines the relevancy of each web page to the query and clusters the results accordingly. The proposed approach reduces user effort and improves the quality of results and the performance of the meta search engine.
Web Service Distributed Management Framework for Autonomic Server Virtualization
NASA Astrophysics Data System (ADS)
Solomon, Bogdan; Ionescu, Dan; Litoiu, Marin; Mihaescu, Mircea
Virtualization for the x86 platform has imposed itself recently as a new technology that can improve the usage of machines in data centers and decrease the cost and energy of running a high number of servers. Similar to virtualization, autonomic computing, and more specifically self-optimization, aims to improve server farm usage through provisioning and deprovisioning of instances as needed by the system. Autonomic systems are able to determine the optimal number of server machines - real or virtual - to use at a given time, and add or remove servers from a cluster in order to achieve optimal usage. While provisioning and deprovisioning of servers is very important, the way the autonomic system is built matters as well: a robust and open framework is needed. One such management framework is the Web Service Distributed Management (WSDM) system, an open standard of the Organization for the Advancement of Structured Information Standards (OASIS). This paper presents an open framework built on top of the WSDM specification, which aims to provide self-optimization for application servers residing on virtual machines.
Comprehensive cluster analysis with Transitivity Clustering.
Wittkop, Tobias; Emig, Dorothea; Truss, Anke; Albrecht, Mario; Böcker, Sebastian; Baumbach, Jan
2011-03-01
Transitivity Clustering is a method for the partitioning of biological data into groups of similar objects, such as genes, for instance. It provides integrated access to various functions addressing each step of a typical cluster analysis. To facilitate this, Transitivity Clustering is accessible online and offers three user-friendly interfaces: a powerful stand-alone version, a web interface, and a collection of Cytoscape plug-ins. In this paper, we describe three major workflows: (i) protein (super)family detection with Cytoscape, (ii) protein homology detection with incomplete gold standards and (iii) clustering of gene expression data. This protocol guides the user through the most important features of Transitivity Clustering and takes ∼1 h to complete.
Blaya, Joaquín A; Shin, Sonya; Contreras, Carmen; Yale, Gloria; Suarez, Carmen; Asencios, Luis; Kim, Jihoon; Rodriguez, Pablo; Cegielski, Peter; Fraser, Hamish S F
2011-01-01
To evaluate the time to communicate laboratory results to health centers (HCs) between the e-Chasqui web-based information system and the pre-existing paper-based system. Cluster randomized controlled trial in 78 HCs in Peru. In the intervention group, 12 HCs had web access to results via e-Chasqui (point-of-care HCs) and forwarded results to 17 peripheral HCs. In the control group, 22 point-of-care HCs received paper results directly and forwarded them to 27 peripheral HCs. Baseline data were collected for 15 months. Post-randomization data were collected for at least 2 years. Comparisons were made between intervention and control groups, stratified by point-of-care versus peripheral HCs. For point-of-care HCs, the intervention group took less time to receive drug susceptibility tests (DSTs) (median 9 vs 16 days, p<0.001) and culture results (4 vs 8 days, p<0.001) and had a lower proportion of 'late' DSTs taking >60 days to arrive (p<0.001) than the control. For peripheral HCs, the intervention group had similar communication times for DST (median 22 vs 19 days, p=0.30) and culture (10 vs 9 days, p=0.10) results, as well as proportion of 'late' DSTs (p=0.57) compared with the control. Only point-of-care HCs with direct access to the e-Chasqui information system had reduced communication times and fewer results with delays of >2 months. Peripheral HCs had no benefits from the system. This suggests that health establishments should have point-of-care access to reap the benefits of electronic laboratory reporting.
DMINDA: an integrated web server for DNA motif identification and analyses
Ma, Qin; Zhang, Hanyuan; Mao, Xizeng; Zhou, Chuan; Liu, Bingqiang; Chen, Xin; Xu, Ying
2014-01-01
DMINDA (DNA motif identification and analyses) is an integrated web server for DNA motif identification and analyses, which is accessible at http://csbl.bmb.uga.edu/DMINDA/. This web site is freely available to all users and there is no login requirement. This server provides a suite of cis-regulatory motif analysis functions on DNA sequences, which are important to elucidation of the mechanisms of transcriptional regulation: (i) de novo motif finding for a given set of promoter sequences along with statistical scores for the predicted motifs derived based on information extracted from a control set, (ii) scanning motif instances of a query motif in provided genomic sequences, (iii) motif comparison and clustering of identified motifs, and (iv) co-occurrence analyses of query motifs in given promoter sequences. The server is powered by a backend computer cluster with over 150 computing nodes, and is particularly useful for motif prediction and analyses in prokaryotic genomes. We believe that DMINDA, as a new and comprehensive web server for cis-regulatory motif finding and analyses, will benefit the genomic research community in general and prokaryotic genome researchers in particular. PMID:24753419
NASA Astrophysics Data System (ADS)
ZuHone, J. A.; Kowalik, K.; Öhman, E.; Lau, E.; Nagai, D.
2018-01-01
We present the “Galaxy Cluster Merger Catalog.” This catalog provides an extensive suite of mock observations and related data for N-body and hydrodynamical simulations of galaxy cluster mergers and clusters from cosmological simulations. These mock observations consist of projections of a number of important observable quantities in several different wavebands, as well as along different lines of sight through each simulation domain. The web interface to the catalog consists of easily browsable images over epoch and projection direction, as well as download links for the raw data and a JS9 interface for interactive data exploration. The data are presented within a consistent format so that comparison between simulations is straightforward. All of the data products are provided in the standard Flexible Image Transport System file format. The data are being stored on the yt Hub (http://hub.yt), which allows for remote access and analysis using a Jupyter notebook server. Future versions of the catalog will include simulations from a number of research groups and a variety of research topics related to the study of interactions of galaxy clusters with each other and with their member galaxies. The catalog is located at http://gcmc.hub.yt.
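Since the catalog distributes its products as standard FITS files, they can be opened with astropy, for example as below; the file name is hypothetical, and the projected image is assumed to sit in the primary HDU.

    from astropy.io import fits

    # Hypothetical catalog product; mock observations are standard FITS.
    with fits.open("cluster_merger_xray_proj_z.fits") as hdul:
        hdul.info()                            # list the HDUs
        image = hdul[0].data                   # 2D projected quantity
        bunit = hdul[0].header.get("BUNIT", "unknown units")
        print(image.shape, bunit)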
Software analysis in the semantic web
NASA Astrophysics Data System (ADS)
Taylor, Joshua; Hall, Robert T.
2013-05-01
Many approaches in software analysis, particularly dynamic malware analysis, benefit greatly from the use of linked data and other Semantic Web technology. In this paper, we describe AIS, Inc.'s Semantic Extractor (SemEx) component from the Malware Analysis and Attribution through Genetic Information (MAAGI) effort, funded under DARPA's Cyber Genome program. The SemEx generates OWL-based semantic models of high- and low-level behaviors in malware samples from system call traces generated by AIS's introspective hypervisor, IntroVirt™. Within MAAGI, these semantic models were used by modules that cluster malware samples by functionality and construct "genealogical" malware lineages. Herein, we describe the design, implementation, and use of the SemEx, as well as the C2DB, an OWL ontology used for representing software behavior and cyber-environments.
Web service module for access to g-Lite
NASA Astrophysics Data System (ADS)
Goranova, R.; Goranov, G.
2012-10-01
g-Lite is a lightweight middleware for grid computing, installed on all clusters of the European Grid Infrastructure (EGI). The middleware is only partially service-oriented and does not provide well-defined Web services for job management. The existing Web services in the environment cannot be directly used by grid users for building service compositions in the EGI. In this article we present a module of well-defined Web services for job management in the EGI. We describe the architecture of the module and the design of the developed Web services. The presented Web services are composable and can participate in service compositions (workflows). An example of using the module with service-composition tools in g-Lite is shown.
Beyond accuracy: creating interoperable and scalable text-mining web services.
Wei, Chih-Hsuan; Leaman, Robert; Lu, Zhiyong
2016-06-15
The biomedical literature is a knowledge-rich resource and an important foundation for future research. With over 24 million articles in PubMed and an increasing growth rate, research in automated text processing is becoming increasingly important. We report here our recently developed web-based text mining services for biomedical concept recognition and normalization. Unlike most text-mining software tools, our web services integrate several state-of-the-art entity tagging systems (DNorm, GNormPlus, SR4GN, tmChem and tmVar) and offer a batch-processing mode able to process arbitrary text input (e.g. scholarly publications, patents and medical records) in multiple formats (e.g. BioC). We support multiple standards to make our service interoperable and allow simpler integration with other text-processing pipelines. To maximize scalability, we have preprocessed all PubMed articles, and use a computer cluster for processing large requests of arbitrary text. Our text-mining web service is freely available at http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmTools/#curl. Contact: Zhiyong.Lu@nih.gov. Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the US.
Migration to Earth Observation Satellite Product Dissemination System at JAXA
NASA Astrophysics Data System (ADS)
Ikehata, Y.; Matsunaga, M.
2017-12-01
JAXA released "G-Portal", a portal web site for searching and delivering Earth observation satellite data, in February 2013. G-Portal handles data from ten satellites (GPM, TRMM, Aqua, ADEOS-II, ALOS (search only), ALOS-2 (search only), MOS-1, MOS-1b, ERS-1 and JERS-1) and archives 5.17 million products and 14 million catalogues in total. Users can search these products/catalogues through the GUI web search and through a catalogue interface (CSW/OpenSearch). In this fiscal year we will replace it with "Next G-Portal", and we have been carrying out integration, testing and migration. Next G-Portal will handle data from satellites planned for future launch, in addition to those handled by G-Portal. From a system architecture perspective, G-Portal adopted a "cluster system" for redundancy, so improving performance means replacing the servers with higher-specification machines (a "scale-up" approach), which incurs a large cost at every improvement. To avoid this, Next G-Portal adopts a "scale-out" system: load-balancing interfaces, a distributed file system and distributed databases (reported at the AGU Fall Meeting 2015, IN23D-1748). From a usability perspective, G-Portal provides a complicated interface: a "step by step" web design, randomly generated URLs, and sftp (which needs a non-standard TCP port). Customers complained about these interfaces, and the support team grew tired of answering them. To solve this problem, Next G-Portal adopts simple interfaces: a one-page web design, RESTful URLs, and normal FTP (reported at the AGU Fall Meeting 2016, IN23B-1778). Furthermore, Next G-Portal must absorb the GCOM-W data dissemination system, which will be terminated next March, as well as the current G-Portal. This may raise some difficulties, since the current G-Portal and GCOM-W data dissemination systems are quite different from Next G-Portal. The presentation reports the knowledge obtained from the process of merging these systems.
GRAMM-X public web server for protein–protein docking
Tovchigrechko, Andrey; Vakser, Ilya A.
2006-01-01
Protein docking software GRAMM-X and its web interface extend the original GRAMM Fast Fourier Transformation methodology by employing smoothed potentials, a refinement stage, and knowledge-based scoring. The web server frees users from the complex installation of database-dependent parallel software and from maintaining the large hardware resources needed for protein docking simulations. Docking problems submitted to the GRAMM-X server are processed by a 320-processor Linux cluster. The server was extensively tested through benchmarking, several months of public use, and participation in the CAPRI server track. PMID:16845016
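The FFT methodology mentioned above scores every relative translation of the two molecules at once via a Fourier-domain correlation. The toy NumPy sketch below illustrates that core idea only, not GRAMM-X's smoothed potentials, refinement stage, or knowledge-based scoring; the grids are invented.

    import numpy as np

    # Toy occupancy grids for receptor and ligand on a shared 3D lattice.
    N = 32
    receptor = np.zeros((N, N, N)); receptor[10:20, 10:20, 10:20] = 1.0
    ligand = np.zeros((N, N, N)); ligand[:5, :5, :5] = 1.0

    # One FFT correlation evaluates the overlap score for all translations.
    score = np.fft.ifftn(np.fft.fftn(receptor) * np.conj(np.fft.fftn(ligand))).real
    best = np.unravel_index(np.argmax(score), score.shape)
    print("best translation (grid units):", best)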
View of Arabella, one of two Skylab spiders and her web
NASA Technical Reports Server (NTRS)
1973-01-01
A close-up view of Arabella, one of the two Skylab 3 common cross spiders ('Araneus diadematus'), and the web it had spun in the zero gravity of space aboard the Skylab space station cluster in Earth orbit. During the 59-day Skylab 3 mission the two spiders, Arabella and Anita, were housed in an enclosure onto which a motion picture camera and a still camera were attached to record the spiders' attempts to build webs in the weightless environment.
NASA Astrophysics Data System (ADS)
Chen, Xiuhong; Huang, Xianglei; Jiao, Chaoyi; Flanner, Mark G.; Raeker, Todd; Palen, Brock
2017-01-01
The suites of numerical models used for simulating the climate of our planet are usually run on dedicated high-performance computing (HPC) resources. This study investigates an alternative to the usual approach, i.e. carrying out climate model simulations in a commercially available cloud computing environment. We test the performance and reliability of running the CESM (Community Earth System Model), a flagship climate model in the United States developed by the National Center for Atmospheric Research (NCAR), on Amazon Web Services (AWS) EC2, the cloud computing environment of Amazon.com, Inc. StarCluster is used to create a virtual computing cluster on AWS EC2 for the CESM simulations. The wall-clock time for one year of CESM simulation on the AWS EC2 virtual cluster is comparable to the time spent on the same simulation on a local dedicated high-performance computing cluster with InfiniBand connections. The CESM simulation can be efficiently scaled with the number of CPU cores on the AWS EC2 virtual cluster environment up to 64 cores. For the standard configuration of the CESM at a spatial resolution of 1.9° latitude by 2.5° longitude, increasing the number of cores from 16 to 64 reduces the wall-clock running time by more than 50% and the scaling is nearly linear. Beyond 64 cores, the communication latency starts to outweigh the benefit of distributed computing and the parallel speedup plateaus.
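The reported scaling (nearly linear up to 64 cores, then flat) can be sanity-checked with Amdahl's law; the serial fractions in the sketch below are illustrative assumptions, not measurements from the study.

    # Observed: 16 -> 64 cores cuts wall-clock time by more than half.
    def amdahl_speedup(f, p):
        """Speedup on p cores when a fraction f of the work is serial."""
        return 1.0 / (f + (1.0 - f) / p)

    for f in (0.0, 0.01, 0.05):
        t_ratio = amdahl_speedup(f, 16) / amdahl_speedup(f, 64)
        print(f"serial fraction {f:.2f}: t(64 cores)/t(16 cores) = {t_ratio:.2f}")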
CoryneRegNet 4.0 – A reference database for corynebacterial gene regulatory networks
Baumbach, Jan
2007-01-01
Background Detailed information on DNA-binding transcription factors (the key players in the regulation of gene expression) and on transcriptional regulatory interactions of microorganisms, deduced from literature-derived knowledge, computer predictions and global DNA microarray hybridization experiments, has opened the way for the genome-wide analysis of transcriptional regulatory networks. The large-scale reconstruction of these networks allows the in silico analysis of cell behavior in response to changing environmental conditions. We previously published CoryneRegNet, an ontology-based data warehouse of corynebacterial transcription factors and regulatory networks. Initially, it was designed to provide methods for the analysis and visualization of the gene regulatory network of Corynebacterium glutamicum. Results Now we introduce CoryneRegNet release 4.0, which integrates data on the gene regulatory networks of 4 corynebacteria, 2 mycobacteria and the model organism Escherichia coli K12. As in previous versions, CoryneRegNet provides a web-based user interface to access the database content, to allow various queries, and to support the reconstruction, analysis and visualization of regulatory networks at different hierarchical levels. In this article, we present the further improved database content of CoryneRegNet along with novel analysis features. The network visualization feature GraphVis now allows inter-species comparison of reconstructed gene regulatory networks and the projection of gene expression levels onto those networks. To support this, we added stimulon data directly into the database, and also provide Web Service access to the DNA microarray analysis platform EMMA. Additionally, CoryneRegNet now provides a SOAP-based Web Service server, which can easily be consumed by other bioinformatics software systems. Stimulons (imported from the database, or uploaded by the user) can be analyzed in the context of known transcriptional regulatory networks to predict putative contradictions or further gene regulatory interactions. Furthermore, it integrates protein clusters by means of heuristically solving the weighted graph cluster editing problem. In addition, it provides Web Service-based access to up-to-date gene annotation data from GenDB. Conclusion The release 4.0 of CoryneRegNet is a comprehensive system for the integrated analysis of prokaryotic gene regulatory networks. It is a versatile systems biology platform to support the efficient and large-scale analysis of transcriptional regulation of gene expression in microorganisms. It is publicly available at . PMID:17986320
Improving clustering with metabolic pathway data.
Milone, Diego H; Stegmayer, Georgina; López, Mariana; Kamenetzky, Laura; Carrari, Fernando
2014-04-10
It is a common practice in bioinformatics to validate each group returned by a clustering algorithm through manual analysis, according to a priori biological knowledge. This procedure helps find functionally related patterns and propose hypotheses for their behavior and the biological processes involved. This knowledge is therefore used only as a second step, after the data have been clustered according to their expression patterns. Thus, it could be very useful to improve the clustering of biological data by incorporating prior knowledge into the cluster formation itself, in order to enhance the biological value of the clusters. A novel training algorithm for clustering is presented, which evaluates the biological internal connections of the data points while the clusters are being formed. Within this training algorithm, the calculation of distances among data points and neuron centroids includes a new term based on information from well-known metabolic pathways. Standard self-organizing map (SOM) training and the biologically inspired SOM (bSOM) training were tested with two real data sets of transcripts and metabolites from the Solanum lycopersicum and Arabidopsis thaliana species. Classical data mining validation measures were used to evaluate the clustering solutions obtained by both algorithms. Moreover, a new measure that takes into account the biological connectivity of the clusters was applied. The results of bSOM show important improvements in convergence and performance for the proposed clustering method in comparison with standard SOM training, in particular from the application point of view. Analyses of the clusters obtained with bSOM indicate that including biological information during training can certainly increase the biological value of the clusters found with the proposed method. It is worth highlighting that this effectively improves the results and can simplify their further analysis. The algorithm is available as a web demo at http://fich.unl.edu.ar/sinc/web-demo/bsom-lite/. The source code and the data sets supporting the results of this article are available at http://sourceforge.net/projects/sourcesinc/files/bsom.
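A minimal sketch of the assignment step of such a biologically informed SOM follows; the pathway-term matrix and the weight alpha are illustrative stand-ins, since the abstract does not give the exact form of the pathway-based term.

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.random((100, 8))       # expression profiles (toy data)
    W = rng.random((16, 8))        # centroids of 16 SOM neurons
    alpha = 0.5                    # weight of the biological term (assumed)

    # Hypothetical pathway term: small when a profile is metabolically
    # well connected to the genes already mapped to that neuron.
    pathway_term = rng.random((100, 16))

    # bSOM-style assignment: Euclidean distance plus the biological term.
    d = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2) + alpha * pathway_term
    bmu = d.argmin(axis=1)         # best-matching unit per data point
    print(np.bincount(bmu, minlength=16))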
Galaxy CloudMan: delivering cloud compute clusters.
Afgan, Enis; Baker, Dannon; Coraor, Nate; Chapman, Brad; Nekrutenko, Anton; Taylor, James
2010-12-21
Widespread adoption of high-throughput sequencing has greatly increased the scale and sophistication of computational infrastructure needed to perform genomic research. An alternative to building and maintaining local infrastructure is "cloud computing", which, in principle, offers on demand access to flexible computational infrastructure. However, cloud computing resources are not yet suitable for immediate "as is" use by experimental biologists. We present a cloud resource management system that makes it possible for individual researchers to compose and control an arbitrarily sized compute cluster on Amazon's EC2 cloud infrastructure without any informatics requirements. Within this system, an entire suite of biological tools packaged by the NERC Bio-Linux team (http://nebc.nerc.ac.uk/tools/bio-linux) is available for immediate consumption. The provided solution makes it possible, using only a web browser, to create a completely configured compute cluster ready to perform analysis in less than five minutes. Moreover, we provide an automated method for building custom deployments of cloud resources. This approach promotes reproducibility of results and, if desired, allows individuals and labs to add or customize an otherwise available cloud system to better meet their needs. The expected knowledge and associated effort with deploying a compute cluster in the Amazon EC2 cloud is not trivial. The solution presented in this paper eliminates these barriers, making it possible for researchers to deploy exactly the amount of computing power they need, combined with a wealth of existing analysis software, to handle the ongoing data deluge.
Density-based parallel skin lesion border detection with webCL.
Lemon, James; Kockara, Sinan; Halic, Tansel; Mete, Mutlu
2015-01-01
Dermoscopy is a highly effective and noninvasive imaging technique used in the diagnosis of melanoma and other pigmented skin lesions. Many aspects of the lesion under consideration are defined in relation to the lesion border. This makes border detection one of the most important steps in dermoscopic image analysis. In current practice, dermatologists often delineate borders through a hand-drawn representation based upon visual inspection. Due to the subjective nature of this technique, intra- and inter-observer variations are common. Because of this, the automated assessment of lesion borders in dermoscopic images has become an important area of study. A fast density-based skin lesion border detection method has been implemented in parallel with a new parallel technology called WebCL. WebCL utilizes client-side computing capabilities to use available hardware resources such as multiple cores and GPUs, and the developed WebCL-parallel density-based border detection method runs efficiently from web browsers. Previous research indicates that some of the highest accuracy rates can be achieved using density-based clustering techniques for skin lesion border detection. While these algorithms do have unfavorable time complexities, this effect can be mitigated by parallel implementation. In this study, the density-based clustering technique for skin lesion border detection is parallelized and redesigned to run very efficiently on heterogeneous platforms (e.g. tablets, smartphones, multi-core CPUs, GPUs, and fully integrated Accelerated Processing Units) by transforming the technique into a series of independent concurrent operations. Heterogeneous computing is adopted to support accessibility, portability and multi-device use in clinical settings. For this, we used WebCL, an emerging technology that enables an HTML5 web browser to execute code in parallel on heterogeneous platforms. We describe WebCL and our parallel algorithm design. In addition, we tested the parallel code on 100 dermoscopy images and measured the execution speedups with respect to the serial version. Results indicate that the parallel (WebCL) and serial versions of the density-based lesion border detection method generate the same accuracy rates for the 100 dermoscopy images: the mean border error is 6.94%, mean recall is 76.66%, and mean precision is 99.29%. Moreover, the WebCL version's speedup factor for the lesion border detection of the 100 dermoscopy images averages around ~491.2. Given the large number of high-resolution dermoscopy images encountered in a typical clinical setting, and the critical importance of detecting and diagnosing melanoma early, before metastasis, the importance of fast processing of dermoscopy images is obvious. In this paper, we introduce WebCL and its use for biomedical image processing applications. WebCL is a JavaScript binding of OpenCL that takes advantage of GPU computing from a web browser. The WebCL-parallel version of density-based skin lesion border detection introduced in this study can therefore supplement expert dermatologists and aid them in the early diagnosis of skin lesions. While WebCL is currently an emerging technology, full adoption of WebCL into the HTML5 standard would allow this implementation to run on a very large set of hardware and software systems. WebCL takes full advantage of parallel computational resources, including multi-cores and GPUs, on a local machine, and allows compiled code to run directly from the web browser.
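For reference, the serial baseline of a density-based pass like the one parallelized here can be run with scikit-learn's DBSCAN; the synthetic points and parameters below are illustrative only and do not reproduce the authors' WebCL implementation.

    import numpy as np
    from sklearn.cluster import DBSCAN

    # Toy stand-in for lesion pixels: a dense blob plus background noise.
    rng = np.random.default_rng(2)
    lesion = rng.normal(loc=[64.0, 64.0], scale=6.0, size=(500, 2))
    noise = rng.uniform(0.0, 128.0, size=(100, 2))
    pts = np.vstack([lesion, noise])

    labels = DBSCAN(eps=3.0, min_samples=10).fit_predict(pts)
    print("lesion-like points:", (labels == 0).sum(), "| noise:", (labels == -1).sum())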
Greaney, Mary L; Puleo, Elaine; Bennett, Gary G; Haines, Jess; Viswanath, K; Gillman, Matthew W; Sprunck-Harrild, Kim; Coeling, Molly; Rusinak, Donna; Emmons, Karen M
2014-02-01
Many U.S. adults have multiple behavioral risk factors, and effective, scalable interventions are needed to promote population-level health. In the health care setting, interventions are often provided in print; print materials, although accessible to nearly everyone, are brief (e.g., pamphlets), are not interactive, and can require some logistics around distribution. Web-based interventions offer more interactivity but may not be accessible to all. Healthy Directions 2 was a primary care-based cluster randomized controlled trial designed to improve five behavioral cancer risk factors among a diverse sample of adults (n = 2,440) in metropolitan Boston. Intervention materials were available in print or on the web. Purpose: to (a) describe the Healthy Directions 2 study design and (b) identify baseline factors associated with whether participants opted for print or web-based materials. Hierarchical regression models corrected for clustering by physician were built to examine factors associated with the choice of intervention modality. At baseline, just 4.0% of participants met all behavioral recommendations. Nearly equivalent numbers of intervention participants opted for print and web-based materials (44.6% vs 55.4%). Participants choosing web-based materials were younger and reported better financial status, better perceived health, greater computer comfort, and more frequent Internet use (p < .05) than those opting for print. In addition, White participants were more likely than Black participants to choose web-based materials. Interventions addressing multiple behaviors are needed in the primary care setting, but they should be available in both web and print formats, as nearly equal numbers of participants chose each option and there are significant differences in the population groups using each modality.
Ariga, Katsuhiko; Urakawa, Toshihiro; Michiue, Atsuo; Kikuchi, Jun-ichi
2004-08-03
As a novel category of two-dimensional lipid clusters, dendrimers having an amphiphilic structure in every unit were synthesized and labeled "spider-web amphiphiles". Amphiphilic units based on a Lys-Lys-Glu tripeptide with hydrophobic tails at the C-terminal and a polar head at the N-terminal are dendrically connected through stepwise peptide coupling. This structural design allowed us to separately introduce the polar head and hydrophobic tails. Accordingly, we demonstrated the synthesis of the spider-web amphiphile series in three combinations: acetyl head/C16 chain, acetyl head/C18 chain, and ammonium head/C16 chain. All the spider-web amphiphiles were synthesized in satisfactory yields, and characterized by 1H NMR, MALDI-TOFMS, GPC, and elemental analyses. Surface pressure (π)-molecular area (A) isotherms showed the formation of expanded monolayers except for the C18-chain amphiphile at 10 °C, for which the molecular area in the condensed phase is consistent with the cross-sectional area assigned for all the alkyl chains. In all the spider-web amphiphiles, the molecular areas at a given pressure in the expanded phase increased in proportion to the number of units, indicating that alkyl chains freely fill the inner space of the dendritic core. The mixing of octadecanoic acid with the spider-web amphiphiles at the air-water interface induced condensation of the molecular area. From the molecular area analysis, the inclusion of the octadecanoic acid bears a stoichiometric characteristic; i.e., the number of captured octadecanoic acids in the spider-web amphiphile roughly agrees with the number of branching points in the spider-web amphiphile.
Application of microarray analysis on computer cluster and cloud platforms.
Bernau, C; Boulesteix, A-L; Knaus, J
2013-01-01
Analysis of recent high-dimensional biological data tends to be computationally intensive as many common approaches such as resampling or permutation tests require the basic statistical analysis to be repeated many times. A crucial advantage of these methods is that they can be easily parallelized due to the computational independence of the resampling or permutation iterations, which has induced many statistics departments to establish their own computer clusters. An alternative is to rent computing resources in the cloud, e.g. at Amazon Web Services. In this article we analyze whether a selection of statistical projects, recently implemented at our department, can be efficiently realized on these cloud resources. Moreover, we illustrate an opportunity to combine computer cluster and cloud resources. In order to compare the efficiency of computer cluster and cloud implementations and their respective parallelizations we use microarray analysis procedures and compare their runtimes on the different platforms. Amazon Web Services provide various instance types which meet the particular needs of the different statistical projects we analyzed in this paper. Moreover, the network capacity is sufficient and the parallelization is comparable in efficiency to standard computer cluster implementations. Our results suggest that many statistical projects can be efficiently realized on cloud resources. It is important to mention, however, that workflows can change substantially as a result of a shift from computer cluster to cloud computing.
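The computational independence mentioned above is what makes these analyses easy to parallelize, whether on a departmental cluster or on rented cloud nodes. The toy permutation test below shows the pattern with a local process pool; the data and iteration count are illustrative.

    import numpy as np
    from multiprocessing import Pool

    rng = np.random.default_rng(3)
    group_a = rng.normal(0.0, 1.0, 50)
    group_b = rng.normal(0.3, 1.0, 50)
    observed = group_a.mean() - group_b.mean()
    pooled = np.concatenate([group_a, group_b])

    def one_permutation(seed):
        # Each iteration is independent, so the loop parallelizes trivially.
        r = np.random.default_rng(seed)
        p = r.permutation(pooled)
        return p[:50].mean() - p[50:].mean()

    if __name__ == "__main__":
        with Pool(4) as pool:
            null = pool.map(one_permutation, range(10_000))
        print("p-value:", float((np.abs(null) >= abs(observed)).mean()))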
Cloud-based Predictive Modeling System and its Application to Asthma Readmission Prediction
Chen, Robert; Su, Hang; Khalilia, Mohammed; Lin, Sizhe; Peng, Yue; Davis, Tod; Hirsh, Daniel A; Searles, Elizabeth; Tejedor-Sojo, Javier; Thompson, Michael; Sun, Jimeng
2015-01-01
The predictive modeling process is time consuming and requires clinical researchers to handle complex electronic health record (EHR) data in restricted computational environments. To address this problem, we implemented a cloud-based predictive modeling system via a hybrid setup combining a secure private server with the Amazon Web Services (AWS) Elastic MapReduce platform. EHR data is preprocessed on a private server and the resulting de-identified event sequences are hosted on AWS. Based on user-specified modeling configurations, an on-demand web service launches a cluster of Elastic Compute Cloud (EC2) instances on AWS to perform feature selection and classification algorithms in a distributed fashion. Afterwards, the secure private server aggregates results and displays them via interactive visualization. We tested the system on a pediatric asthma readmission task on a de-identified EHR dataset of 2,967 patients. We conducted a larger scale experiment on the CMS Linkable 2008–2010 Medicare Data Entrepreneurs’ Synthetic Public Use File dataset of 2 million patients, which achieves over 25-fold speedup compared to sequential execution. PMID:26958172
Tsui, Fu-Chiang; Espino, Jeremy U.; Weng, Yan; Choudary, Arvinder; Su, Hoah-Der; Wagner, Michael M.
2005-01-01
The National Retail Data Monitor (NRDM) has monitored over-the-counter (OTC) medication sales in the United States since December 2002. The NRDM collects data from over 18,600 retail stores and processes over 0.6 million sales records per day. This paper describes key architectural features that we have found necessary for a data utility component in a national biosurveillance system. These elements include event-driven architecture to provide analyses of data in near real time, multiple levels of caching to improve query response time, high availability through the use of clustered servers, scalable data storage through the use of storage area networks and a web-service function for interoperation with affiliated systems. The methods and architectural principles are relevant to the design of any production data utility for public health surveillance—systems that collect data from multiple sources in near real time for use by analytic programs and user interfaces that have substantial requirements for time-series data aggregated in multiple dimensions. PMID:16779138
Dear, R F; Barratt, A L; Askie, L M; Butow, P N; McGeechan, K; Crossing, S; Currow, D C; Tattersall, M H N
2012-07-01
Cancer patients want access to reliable information about currently recruiting clinical trials. Oncologists and their patients were randomly assigned to access a consumer-friendly cancer clinical trials web site [Australian Cancer Trials (ACT), www.australiancancertrials.gov.au] or to usual care in a cluster randomized controlled trial. The primary outcome, measured from audio recordings of oncologist-patient consultations, was the proportion of patients with whom participation in any clinical trial was discussed. Analysis was by intention-to-treat accounting for clustering and stratification. Thirty medical oncologists and 493 patients were recruited. Overall, 46% of consultations in the intervention group compared with 34% in the control group contained a discussion about clinical trials (P=0.08). The mean consultation length in both groups was 29 min (P=0.69). The proportion consenting to a trial was 10% in both groups (P=0.65). Patients' knowledge about randomized trials was lower in the intervention than the control group (mean score 3.0 versus 3.3, P=0.03) but decisional conflict scores were similar (mean score 42 versus 43, P=0.83). Good communication between patients and physicians is essential. Within this context, a web site such as Australian Cancer Trials may be an important tool to encourage discussion about clinical trial participation.
Pereira, Celina Andrade; Wen, Chao Lung; Miguel, Eurípedes Constantino; Polanczyk, Guilherme V
2015-08-01
Children affected by mental disorders are largely unrecognised and untreated across the world. Community resources, including the school system and teachers, are important elements in actions directed to promoting child mental health and preventing and treating mental disorders, especially in low- and middle-income countries. We developed a web-based program to educate primary school teachers on mental disorders in childhood and conducted a cluster-randomised controlled trial to test the effectiveness of the web-based program intervention in comparison with the same program based on text and video materials only and to a waiting-list control group. All nine schools of a single city in the state of São Paulo, Brazil, were randomised to the three groups, and teachers completed the educational programs during 3 weeks. Data were analysed according to complete cases and intention-to-treat approaches. In terms of gains of knowledge about mental disorders, the web-based program intervention was superior to the intervention with text and video materials, and to the waiting-list control group. In terms of beliefs and attitudes about mental disorders, the web-based program intervention group presented less stigmatised concepts than the text and video group and more non-stigmatised concepts than the waiting-list group. No differences were detected in terms of teachers' attitudes. This study demonstrated initial data on the effectiveness of a web-based program in educating schoolteachers on child mental disorders. Future studies are necessary to replicate and extend the findings.
Existence and significance of communities in the World Trade Web
NASA Astrophysics Data System (ADS)
Piccardi, Carlo; Tajoli, Lucia
2012-06-01
The World Trade Web (WTW), which models the international transactions among countries, is a fundamental tool for studying the economics of trade flows, their evolution over time, and their implications for a number of phenomena, including the propagation of economic shocks among countries. In this respect, the possible existence of communities is a key point, because it would imply that countries are organized in groups of preferential partners. In this paper, we use four approaches to analyze communities in the WTW between 1962 and 2008, based, respectively, on modularity optimization, cluster analysis, stability functions, and persistence probabilities. Overall, the four methods agree in finding no evidence of significant partitions. A few weak communities emerge from the analysis, but they do not represent secluded groups of countries, as intercommunity linkages are also strong, supporting the view of a truly globalized trading system.
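Of the four approaches, modularity optimization is the most widely used and is easy to reproduce; the NetworkX sketch below shows the mechanics on an invented mini trade graph (the countries and weights are illustrative, not WTW data).

    import networkx as nx
    from networkx.algorithms.community import greedy_modularity_communities, modularity

    # Invented weighted trade graph standing in for one year of the WTW.
    G = nx.Graph()
    G.add_weighted_edges_from([
        ("USA", "CAN", 5.0), ("USA", "MEX", 4.0), ("CAN", "MEX", 1.0),
        ("DEU", "FRA", 6.0), ("DEU", "ITA", 3.0), ("FRA", "ITA", 2.0),
        ("USA", "DEU", 2.5),  # inter-community trade keeps the web connected
    ])
    parts = greedy_modularity_communities(G, weight="weight")
    print([sorted(c) for c in parts])
    print("modularity:", round(modularity(G, parts, weight="weight"), 3))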
Network dynamics: The World Wide Web
NASA Astrophysics Data System (ADS)
Adamic, Lada Ariana
Despite its rapidly growing and dynamic nature, the Web displays a number of strong regularities which can be understood by drawing on methods of statistical physics. This thesis finds power-law distributions in website sizes, traffic, and links, and more importantly, develops a stochastic theory which explains them. Power-law link distributions are shown to lead to network characteristics which are especially suitable for scalable localized search. It is also demonstrated that the Web is a "small world": to reach one site from any other takes an average of only 4 hops, while most related sites cluster together. Additional dynamical properties of the Web graph are extracted from diffusion processes.
Fast segmentation of satellite images using SLIC, WebGL and Google Earth Engine
NASA Astrophysics Data System (ADS)
Donchyts, Gennadii; Baart, Fedor; Gorelick, Noel; Eisemann, Elmar; van de Giesen, Nick
2017-04-01
Google Earth Engine (GEE) is a parallel geospatial processing platform which harmonizes access to petabytes of freely available satellite images. It provides a very rich API, allowing the development of dedicated algorithms to extract useful geospatial information from these images. At the same time, modern GPUs provide thousands of computing cores, which are mostly not utilized in this context. In recent years, WebGL has become a popular and well-supported API that allows fast image processing directly in web browsers. In this work, we evaluate the applicability of WebGL for fast segmentation of satellite images. A new implementation of the Simple Linear Iterative Clustering (SLIC) algorithm using GPU shaders will be presented. SLIC is a simple and efficient method to decompose an image into visually homogeneous regions; it adapts a k-means clustering approach to generate superpixels efficiently. While this approach will be hard to scale, due to the significant amount of data to be transferred to the client, it should significantly improve exploratory possibilities and simplify the development of dedicated algorithms for geoscience applications. Our prototype implementation will be used to improve surface water detection of reservoirs using multispectral satellite imagery.
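For comparison with the GPU-shader variant described above, the reference CPU implementation of SLIC ships with scikit-image; the example below uses a stock test image in place of satellite imagery.

    from skimage import data, segmentation

    # Stock test image standing in for a satellite scene.
    img = data.astronaut()

    # SLIC: k-means in (colour, x, y) space; `compactness` trades colour
    # similarity against spatial regularity of the superpixels.
    labels = segmentation.slic(img, n_segments=250, compactness=10, start_label=1)
    print(labels.max(), "superpixels")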
BAGEL4: a user-friendly web server to thoroughly mine RiPPs and bacteriocins.
van Heel, Auke J; de Jong, Anne; Song, Chunxu; Viel, Jakob H; Kok, Jan; Kuipers, Oscar P
2018-05-21
Interest in secondary metabolites such as RiPPs (ribosomally synthesized and posttranslationally modified peptides) is increasing worldwide. To facilitate the research in this field we have updated our mining web server. BAGEL4 is faster than its predecessor and is now fully independent from ORF-calling. Gene clusters of interest are discovered using the core-peptide database and/or through HMM motifs that are present in associated context genes. The databases used for mining have been updated and extended with literature references and links to UniProt and NCBI. Additionally, we have included automated promoter and terminator prediction and the option to upload RNA expression data, which can be displayed along with the identified clusters. Further improvements include the annotation of the context genes, which is now based on a fast blast against the prokaryote part of the UniRef90 database, and the improved web-BLAST feature that dynamically loads structural data such as internal cross-linking from UniProt. Overall BAGEL4 provides the user with more information through a user-friendly web-interface which simplifies data evaluation. BAGEL4 is freely accessible at http://bagel4.molgenrug.nl.
Comparing cosmic web classifiers using information theory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leclercq, Florent; Lavaux, Guilhem; Wandelt, Benjamin
We introduce a decision scheme for optimally choosing a classifier, which segments the cosmic web into different structure types (voids, sheets, filaments, and clusters). Our framework, based on information theory, accounts for the design aims of different classes of possible applications: (i) parameter inference, (ii) model selection, and (iii) prediction of new observations. As an illustration, we use cosmographic maps of web-types in the Sloan Digital Sky Survey to assess the relative performance of the classifiers T-WEB, DIVA and ORIGAMI for: (i) analyzing the morphology of the cosmic web, (ii) discriminating dark energy models, and (iii) predicting galaxy colors. Our study substantiates a data-supported connection between cosmic web analysis and information theory, and paves the path towards principled design of analysis procedures for the next generation of galaxy surveys. We have made the cosmic web maps, galaxy catalog, and analysis scripts used in this work publicly available.
Baryons at the edge of the X-ray-brightest galaxy cluster.
Simionescu, Aurora; Allen, Steven W; Mantz, Adam; Werner, Norbert; Takei, Yoh; Morris, R Glenn; Fabian, Andrew C; Sanders, Jeremy S; Nulsen, Paul E J; George, Matthew R; Taylor, Gregory B
2011-03-25
Studies of the diffuse x-ray-emitting gas in galaxy clusters have provided powerful constraints on cosmological parameters and insights into plasma astrophysics. However, measurements of the faint cluster outskirts have become possible only recently. Using data from the Suzaku x-ray telescope, we determined an accurate, spatially resolved census of the gas, metals, and dark matter out to the edge of the Perseus Cluster. Contrary to previous results, our measurements of the cluster baryon fraction are consistent with the expected universal value at half of the virial radius. The apparent baryon fraction exceeds the cosmic mean at larger radii, suggesting a clumpy distribution of the gas, which is important for understanding the ongoing growth of clusters from the surrounding cosmic web.
Warm-hot baryons comprise 5-10 per cent of filaments in the cosmic web.
Eckert, Dominique; Jauzac, Mathilde; Shan, HuanYuan; Kneib, Jean-Paul; Erben, Thomas; Israel, Holger; Jullo, Eric; Klein, Matthias; Massey, Richard; Richard, Johan; Tchernin, Céline
2015-12-03
Observations of the cosmic microwave background indicate that baryons account for 5 per cent of the Universe's total energy content. In the local Universe, the census of all observed baryons falls short of this estimate by a factor of two. Cosmological simulations indicate that the missing baryons have not condensed into virialized haloes, but reside throughout the filaments of the cosmic web (where matter density is larger than average) as a low-density plasma at temperatures of 10^5 to 10^7 kelvin, known as the warm-hot intergalactic medium. There have been previous claims of the detection of warm-hot baryons along the line of sight to distant blazars and of hot gas between interacting clusters. These observations were, however, unable to trace the large-scale filamentary structure, or to estimate the total amount of warm-hot baryons in a representative volume of the Universe. Here we report X-ray observations of filamentary structures of gas at 10^7 kelvin associated with the galaxy cluster Abell 2744. Previous observations of this cluster were unable to resolve and remove coincidental X-ray point sources. After subtracting these, we find hot gas structures that are coherent over scales of 8 megaparsecs. The filaments coincide with over-densities of galaxies and dark matter, with 5-10 per cent of their mass in baryonic gas. This gas has been heated up by the cluster's gravitational pull and is now feeding its core. Our findings strengthen evidence for a picture of the Universe in which a large fraction of the missing baryons reside in the filaments of the cosmic web.
New atlas of open star clusters
NASA Astrophysics Data System (ADS)
Seleznev, Anton F.; Avvakumova, Ekaterina; Kulesh, Maxim; Filina, Julia; Tsaregorodtseva, Polina; Kvashnina, Alvira
2017-11-01
Due to numerous new discoveries of open star clusters in the last two decades, astronomers need an easy-to-use resource to get visual information on the relative positions of clusters in the sky. We therefore propose a new atlas of open star clusters. It is based on a table compiled from the largest modern cluster catalogues. The atlas shows the positions and sizes of 3291 clusters and associations, and consists of two parts. The first contains 108 maps of 12 by 12 degrees with an overlap of 2 degrees, in three strips along the Galactic equator. The second is an online web application, which on request shows a square field of arbitrary size, either in equatorial or in galactic coordinates. The atlas is proposed for the sampling of clusters and cluster stars for further investigation. Another use is the identification of clusters among overdensities in stellar density maps or among stellar groups in images of the sky.
Stein, Mart L; van Steenbergen, Jim E; Chanyasanha, Charnchudhi; Tipayamongkholgul, Mathuros; Buskens, Vincent; van der Heijden, Peter G M; Sabaiwan, Wasamon; Bengtsson, Linus; Lu, Xin; Thorson, Anna E; Kretzschmar, Mirjam E E
2014-01-01
Information on social interactions is needed to understand the spread of airborne infections through a population. Previous studies mostly collected egocentric information from independent respondents with self-reported information about contacts. Respondent-driven sampling (RDS) is a sampling technique allowing respondents to recruit contacts from their social network. We explored the feasibility of webRDS for studying contact patterns relevant for the spread of respiratory pathogens. We developed a webRDS system for facilitating and tracking recruitment by Facebook and email. One-day diary surveys were conducted by applying webRDS among a convenience sample of Thai students. Students were asked to record numbers of contacts in different settings and self-reported influenza-like-illness symptoms, and to recruit four contacts whom they had met in the previous week. Contacts were asked to do the same to create a network tree of socially connected individuals. Correlations between linked individuals were analysed to investigate assortativity within networks. We reached up to 6 waves of contacts of initial respondents, using only non-material incentives. Forty-four (23.0%) of the initially approached students recruited one or more contacts. In total 257 persons participated, of whom 168 (65.4%) were recruited by others. Facebook was the most popular recruitment option (45.1%). Strong assortative mixing was seen by age, gender and education, indicating a tendency of respondents to connect to contacts with similar characteristics. Random mixing was seen by reported number of daily contacts. Despite methodological challenges (e.g. clustering among respondents and their contacts), applying RDS provides new insights into mixing patterns relevant for close-contact infections in real-world networks. Such information increases our knowledge of the transmission of respiratory infections within populations and can be used to improve existing modelling approaches. It is worthwhile to further develop and explore webRDS for the detection of clusters of respiratory symptoms in social networks.
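The assortativity analysis described above amounts to correlating attributes across recruiter-recruit links. A minimal sketch, assuming a recruitment tree with illustrative node attributes (this is not the authors' code; the data are made up):

```python
# Hedged sketch: measuring assortative mixing in a webRDS-style
# recruitment tree. Nodes carry placeholder "age" and "gender" attributes.
import networkx as nx

G = nx.DiGraph()  # edges point from recruiter to recruit
G.add_edges_from([(1, 2), (1, 3), (2, 4), (3, 5)])
nx.set_node_attributes(G, {1: 21, 2: 22, 3: 21, 4: 23, 5: 21}, "age")
nx.set_node_attributes(G, {1: "f", 2: "f", 3: "m", 4: "f", 5: "m"}, "gender")

# Coefficients near +1 mean recruits resemble their recruiters;
# values near 0 indicate random mixing.
print(nx.numeric_assortativity_coefficient(G, "age"))
print(nx.attribute_assortativity_coefficient(G, "gender"))
```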
Sharma, Parichit; Mantri, Shrikant S
2014-01-01
The function of a newly sequenced gene can be discovered by determining its sequence homology with known proteins. BLAST is the most extensively used sequence analysis program for sequence similarity search in large databases of sequences. With the advent of next generation sequencing technologies it has now become possible to study genes and their expression at a genome-wide scale through RNA-seq and metagenome sequencing experiments. Functional annotation of all the genes is done by sequence similarity search against multiple protein databases. This annotation task is computationally very intensive and can take days to obtain complete results. The program mpiBLAST, an open-source parallelization of BLAST that achieves superlinear speedup, can be used to accelerate large-scale annotation by using supercomputers and high performance computing (HPC) clusters. Although many parallel bioinformatics applications using the Message Passing Interface (MPI) are available in the public domain, researchers are reluctant to use them due to lack of expertise in the Linux command line and relevant programming experience. With these limitations, it becomes difficult for biologists to use mpiBLAST for accelerating annotation. No web interface is available in the open-source domain for mpiBLAST. We have developed WImpiBLAST, a user-friendly open-source web interface for parallel BLAST searches. It is implemented in Struts 1.3 using a Java backbone and runs atop the open-source Apache Tomcat Server. WImpiBLAST supports script creation and job submission features and also provides a robust job management interface for system administrators. It combines script creation and modification features with job monitoring and management through the Torque resource manager on a Linux-based HPC cluster. Use case information highlights the acceleration of annotation analysis achieved by using WImpiBLAST. Here, we describe the WImpiBLAST web interface features and architecture, explain design decisions, describe workflows and provide a detailed analysis.
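Under the hood, an interface like WImpiBLAST has to turn form input into a Torque job script and hand it to qsub. A minimal sketch of that step, assuming blastall-style mpiBLAST flags and illustrative queue, path and node settings (not WImpiBLAST's actual code):

```python
# Hedged sketch: write a Torque/PBS script for an mpiBLAST run and
# submit it. Resource counts, queue name and file paths are assumptions.
import subprocess
import textwrap

script = textwrap.dedent("""\
    #!/bin/bash
    #PBS -N wimpiblast_demo
    #PBS -l nodes=4:ppn=8
    #PBS -q batch
    cd $PBS_O_WORKDIR
    mpirun -np 32 mpiblast -p blastp -d nr -i query.fasta -o hits.txt
""")

with open("blast_job.pbs", "w") as f:
    f.write(script)

# qsub prints the Torque job id, which a job-management UI would then poll.
job_id = subprocess.check_output(["qsub", "blast_job.pbs"], text=True).strip()
print("submitted:", job_id)
```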
Unveiling the Synchrotron Cosmic Web: Pilot Study
NASA Astrophysics Data System (ADS)
Brown, Shea; Rudnick, Lawrence; Pfrommer, Christoph; Jones, Thomas
2011-10-01
The overall goal of this project is to challenge our current theoretical understanding of the relativistic particle populations in the inter-galactic medium (IGM) through deep 1.4 GHz observations of 13 massive, high-redshift clusters of galaxies. Designed to complement and extend the GMRT radio halo survey (Venturi et al. 2007), these observations will attempt to detect the peaks of the purported synchrotron cosmic web, and place serious limits on models of CR acceleration and magnetic field amplification during large-scale structure formation. The primary goals of this survey are: 1) confirm the bi-modal nature of the radio halo population, which favors turbulent re-acceleration of cosmic-ray electrons (CRe) during cluster mergers as the source of the diffuse radio emission; 2) directly test hadronic secondary models which predict the presence of cosmic-ray protons (CRp) in the cores of massive X-ray clusters; 3) search in polarization for shock structures, a potential source of CR acceleration in the IGM.
Processing ARM VAP data on an AWS cluster
NASA Astrophysics Data System (ADS)
Martin, T.; Macduff, M.; Shippert, T.
2017-12-01
The Atmospheric Radiation Measurement (ARM) Data Management Facility (DMF) manages over 18,000 processes and 1.3 TB of data each day. This includes many Value Added Products (VAPs) that make use of multiple instruments to produce the derived products that are scientifically relevant. A thermodynamic and cloud profile VAP is being developed to provide input to the ARM Large-Eddy Simulation (LES) ARM Symbiotic Simulation and Observation (LASSO) project (https://www.arm.gov/capabilities/vaps/lasso-122). This algorithm is CPU intensive, and the processing requirements exceeded the available DMF computing capacity. Amazon Web Services (AWS), along with CfnCluster, was investigated to see how it would perform. This cluster environment is cost effective and scales dynamically based on demand. We were able to take advantage of autoscaling, which allowed the cluster to grow and shrink based on the size of the processing queue. We were also able to take advantage of the Amazon Web Services spot market to further reduce the cost. Our test was very successful and found that cloud resources can be used to efficiently and effectively process time series data. This poster will present the resources and methodology used to successfully run the algorithm.
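Choosing cheap capacity on the spot market, as described above, starts with querying price history. A hedged boto3 sketch; the region, instance types and the CfnCluster setup itself are illustrative assumptions, not ARM's actual configuration:

```python
# Hedged sketch: survey AWS spot prices before sizing a processing cluster.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
history = ec2.describe_spot_price_history(
    InstanceTypes=["c4.2xlarge", "c4.4xlarge"],  # illustrative types
    ProductDescriptions=["Linux/UNIX"],
    MaxResults=20,
)
for record in history["SpotPriceHistory"]:
    print(record["InstanceType"], record["AvailabilityZone"], record["SpotPrice"])
```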
WIWS: a protein structure bioinformatics Web service collection.
Hekkelman, M L; Te Beek, T A H; Pettifer, S R; Thorne, D; Attwood, T K; Vriend, G
2010-07-01
The WHAT IF molecular-modelling and drug design program is widely distributed in the world of protein structure bioinformatics. Although originally designed as an interactive application, its highly modular design and inbuilt control language have recently enabled its deployment as a collection of programmatically accessible web services. We report here a collection of WHAT IF-based protein structure bioinformatics web services: these relate to structure quality, the use of symmetry in crystal structures, structure correction and optimization, adding hydrogens and optimizing hydrogen bonds, and a series of geometric calculations. The freely accessible web services are based on the industry standard WS-I profile and the EMBRACE technical guidelines, and are available via both REST and SOAP paradigms. The web services run on a dedicated computational cluster; their function and availability are monitored daily.
Online interactive analysis of protein structure ensembles with Bio3D-web.
Skjærven, Lars; Jariwala, Shashank; Yao, Xin-Qiu; Grant, Barry J
2016-11-15
Bio3D-web is an online application for analyzing the sequence, structure and conformational heterogeneity of protein families. Major functionality is provided for identifying protein structure sets for analysis, their alignment and refined structure superposition, sequence and structure conservation analysis, mapping and clustering of conformations and the quantitative comparison of their predicted structural dynamics. Bio3D-web is based on the Bio3D and Shiny R packages. All major browsers are supported and full source code is available under a GPL2 license from http://thegrantlab.org/bio3d-web. Contact: bjgrant@umich.edu or lars.skjarven@uib.no. © The Author 2016. Published by Oxford University Press.
A Web Interface for Eco System Modeling
NASA Astrophysics Data System (ADS)
McHenry, K.; Kooper, R.; Serbin, S. P.; LeBauer, D. S.; Desai, A. R.; Dietze, M. C.
2012-12-01
We have developed the Predictive Ecosystem Analyzer (PEcAn) as an open-source scientific workflow system and ecoinformatics toolbox that manages the flow of information in and out of regional-scale terrestrial biosphere models, facilitates heterogeneous data assimilation, tracks data provenance, and enables more effective feedback between models and field research. The over-arching goal of PEcAn is to make otherwise complex analyses transparent, repeatable, and accessible to a diverse array of researchers, allowing both novice and expert users to focus on using the models to examine complex ecosystems rather than having to deal with complex computer system setup and configuration questions in order to run the models. Through the developed web interface we hide much of the data and model details and allow the user to simply select locations, ecosystem models, and desired data sources as inputs to the model. Novice users are guided by the web interface through setting up a model execution and plotting the results. At the same time expert users are given enough freedom to modify specific parameters before the model gets executed. This will become more important as more models are added to the PEcAn workflow and as more data become available when NEON comes online. On the backend we support the execution of potentially computationally expensive models on different High Performance Computers (HPC) and/or clusters. The system can be configured with a single XML file, which gives it the flexibility needed for configuring and running the different models on different systems using a combination of information stored in a database as well as pointers to files on the hard disk. While the web interface usually creates this configuration file, expert users can still edit it directly to fine-tune the configuration. Once a workflow is finished the web interface allows for the easy creation of plots over result data while also allowing the user to download the results for further processing. The current workflow in the web interface is a simple linear workflow, but will be expanded to allow for more complex workflows. We are working with Kepler and Cyberintegrator to allow for these more complex workflows as well as collecting provenance of the workflow being executed. This provenance regarding model executions is stored in a database along with the derived results. All of this information is then accessible using the BETY database web frontend.
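As a rough illustration of the single-XML-file configuration mentioned above, a script can assemble and write such a file programmatically. The element names below ("site", "model", "database") are hypothetical stand-ins, not PEcAn's published schema:

```python
# Hedged sketch: build a PEcAn-style XML configuration file.
# All element and attribute names here are illustrative assumptions.
import xml.etree.ElementTree as ET

root = ET.Element("pecan")
ET.SubElement(root, "site", id="US-WCr")       # hypothetical site id
model = ET.SubElement(root, "model")
ET.SubElement(model, "name").text = "ED2"      # hypothetical model name
db = ET.SubElement(root, "database")
ET.SubElement(db, "host").text = "localhost"

ET.ElementTree(root).write("pecan.xml", encoding="utf-8", xml_declaration=True)
```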
MR-Tandem: parallel X!Tandem using Hadoop MapReduce on Amazon Web Services.
Pratt, Brian; Howbert, J Jeffry; Tasman, Natalie I; Nilsson, Erik J
2012-01-01
MR-Tandem adapts the popular X!Tandem peptide search engine to work with Hadoop MapReduce for reliable parallel execution of large searches. MR-Tandem runs on any Hadoop cluster but offers special support for Amazon Web Services for creating inexpensive on-demand Hadoop clusters, enabling search volumes that might not otherwise be feasible with the compute resources a researcher has at hand. MR-Tandem is designed to drop in wherever X!Tandem is already in use and requires no modification to existing X!Tandem parameter files, and only minimal modification to X!Tandem-based workflows. MR-Tandem is implemented as a lightly modified X!Tandem C++ executable and a Python script that drives Hadoop clusters including Amazon Web Services (AWS) Elastic Map Reduce (EMR), using the modified X!Tandem program as a Hadoop Streaming mapper and reducer. The modified X!Tandem C++ source code is Artistic licensed, supports pluggable scoring, and is available as part of the Sashimi project at http://sashimi.svn.sourceforge.net/viewvc/sashimi/trunk/trans_proteomic_pipeline/extern/xtandem/. The MR-Tandem Python script is Apache licensed and available as part of the Insilicos Cloud Army project at http://ica.svn.sourceforge.net/viewvc/ica/trunk/mr-tandem/. Full documentation and a windows installer that configures MR-Tandem, Python and all necessary packages are available at this same URL. brian.pratt@insilicos.com
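The driving pattern described above — a Python script that provisions a Hadoop cluster on AWS and runs an executable as a Hadoop Streaming mapper and reducer — looks roughly like the following boto3 sketch. This is not MR-Tandem's own code; bucket names, mapper/reducer scripts and instance settings are illustrative:

```python
# Hedged sketch: launch an on-demand EMR cluster with one Hadoop
# streaming step, in the spirit of MR-Tandem's driver script.
import boto3

emr = boto3.client("emr", region_name="us-east-1")
response = emr.run_job_flow(
    Name="streaming-search-demo",
    ReleaseLabel="emr-5.36.0",
    Instances={
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 4,
        "KeepJobFlowAliveWhenNoSteps": False,  # tear down when done
    },
    Steps=[{
        "Name": "streaming search",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "hadoop-streaming",
                "-input", "s3://my-bucket/spectra/",
                "-output", "s3://my-bucket/results/",
                "-mapper", "search_mapper.py",
                "-reducer", "search_reducer.py",
            ],
        },
    }],
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print(response["JobFlowId"])
```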
Assessing the Amazon Cloud Suitability for CLARREO's Computational Needs
NASA Technical Reports Server (NTRS)
Goldin, Daniel; Vakhnin, Andrei A.; Currey, Jon C.
2015-01-01
In this document we compare the performance of Amazon Web Services (AWS), also known as the Amazon Cloud, with the CLARREO (Climate Absolute Radiance and Refractivity Observatory) cluster and assess its suitability for the computational needs of the CLARREO mission. A benchmark executable to process one month and one year of PARASOL (Polarization and Anisotropy of Reflectances for Atmospheric Sciences coupled with Observations from a Lidar) data was used. With the optimal AWS configuration, adequate data-processing times, comparable to the CLARREO cluster, were found. The assessment of alternatives to the CLARREO cluster continues, and several options, such as a NASA-based cluster, are being considered.
Self-aggregation in scaled principal component space
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ding, Chris H.Q.; He, Xiaofeng; Zha, Hongyuan
2001-10-05
Automatic grouping of voluminous data into meaningful structures is a challenging task frequently encountered in broad areas of science, engineering and information processing. These data clustering tasks are frequently performed in Euclidean space or a subspace chosen from principal component analysis (PCA). Here we describe a space obtained by a nonlinear scaling of PCA in which data objects self-aggregate automatically into clusters. Projection into this space gives sharp distinctions among clusters. Gene expression profiles of cancer tissue subtypes, Web hyperlink structure and Internet newsgroups are analyzed to illustrate interesting properties of the space.
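The construction the abstract alludes to can be illustrated with a generic spectral scaling: embed objects using the leading eigenvectors of the normalized similarity matrix D^{-1/2} W D^{-1/2}, in which well-separated groups self-aggregate. A sketch under that assumption (not the authors' exact algorithm):

```python
# Hedged sketch: a scaled-PCA-style embedding in which clusters
# self-aggregate; generic spectral construction for illustration.
import numpy as np

def scaled_embedding(X, sigma=1.0, k=2):
    # Gaussian similarity between all pairs of rows of X.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / (2 * sigma**2))
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))      # D^{-1/2} W D^{-1/2}
    vals, vecs = np.linalg.eigh(S)
    return vecs[:, -k:]                  # top-k eigenvectors as coordinates

# Two well-separated synthetic blobs separate sharply in the embedding.
X = np.vstack([np.random.randn(20, 5) - 3, np.random.randn(20, 5) + 3])
print(scaled_embedding(X).shape)  # (40, 2)
```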
Metsalu, Tauno; Vilo, Jaak
2015-01-01
Principal Component Analysis (PCA) is a widely used method for reducing the dimensionality of high-dimensional data, often followed by visualizing two of the components on a scatterplot. Although widely used, the method lacks an easy-to-use web interface that scientists with little programming skill could use to make plots of their own data. The same applies to creating heatmaps: it is possible to add conditional formatting to Excel cells to show colored heatmaps, but for more advanced features such as clustering and experimental annotations, more sophisticated analysis tools have to be used. We present a web tool called ClustVis that aims to have an intuitive user interface. Users can upload data from a simple delimited text file that can be created in a spreadsheet program. It is possible to modify data processing methods and the final appearance of the PCA and heatmap plots by using drop-down menus, text boxes, sliders, etc. Appropriate defaults are given to reduce the time needed by the user to specify input parameters. As output, users can download the PCA plot and heatmap in one of the preferred file formats. This web server is freely available at http://biit.cs.ut.ee/clustvis/. PMID:25969447
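For readers who want the same two outputs in a script rather than a browser, a hedged Python analogue of the ClustVis workflow (PCA scatterplot plus a hierarchically clustered heatmap; ClustVis itself is an R/Shiny tool, and the random data here are placeholders):

```python
# Hedged sketch: ClustVis-style outputs from a script.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.decomposition import PCA

data = np.random.rand(30, 8)             # rows = samples, cols = features
pcs = PCA(n_components=2).fit_transform(data)

fig, ax = plt.subplots()
ax.scatter(pcs[:, 0], pcs[:, 1])
ax.set_xlabel("PC1"); ax.set_ylabel("PC2")
fig.savefig("pca_scatter.png")

# clustermap hierarchically clusters both rows and columns.
sns.clustermap(data).savefig("heatmap.png")
```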
Astrophysical data mining with GPU. A case study: Genetic classification of globular clusters
NASA Astrophysics Data System (ADS)
Cavuoti, S.; Garofalo, M.; Brescia, M.; Paolillo, M.; Pescapè, A.; Longo, G.; Ventre, G.
2014-01-01
We present a multi-purpose genetic algorithm, designed and implemented with GPGPU/CUDA parallel computing technology. The model was derived from our CPU serial implementation, named GAME (Genetic Algorithm Model Experiment). It was successfully tested and validated on the detection of candidate Globular Clusters in deep, wide-field, single band HST images. The GPU version of GAME will be made available to the community by integrating it into the web application DAMEWARE (DAta Mining Web Application REsource, http://dame.dsf.unina.it/beta_info.html), a public data mining service specialized on massive astrophysical data. Since genetic algorithms are inherently parallel, the GPGPU computing paradigm leads to a speedup of a factor of 200× in the training phase with respect to the CPU based version.
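The "inherently parallel" claim comes down to fitness evaluations being independent across the population, so they map directly onto many workers — CUDA threads in GAME's GPU version, ordinary processes in the sketch below. The toy objective is an assumption; GAME's real fitness scores candidate Globular Cluster classifications:

```python
# Hedged sketch: the embarrassingly parallel fitness-evaluation step
# of a genetic algorithm, run across processes.
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def fitness(candidate):
    # Placeholder objective; a real GA would compare model output
    # against labelled detections.
    return -np.sum((candidate - 0.5) ** 2)

def evaluate_population(pop):
    with ProcessPoolExecutor() as pool:
        return list(pool.map(fitness, pop))

if __name__ == "__main__":
    population = [np.random.rand(10) for _ in range(64)]
    print(max(evaluate_population(population)))
```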
Needham, Robert; Stebbins, Julie; Chockalingam, Nachiappan
2016-01-01
To review the current scientific literature on the assessment of three-dimensional movement of the lumbar spine, with a focus on the utilisation of a 3D cluster. The electronic databases PubMed, OVID, CINAHL, The Cochrane Library, ScienceDirect, ProQuest and Web of Knowledge were searched between 1966 and March 2015. The reference lists of the articles that met the inclusion criteria were also searched. From the 1530 articles identified through the initial search, 16 articles met the inclusion criteria. All information relating to methodology and kinematic modelling of the lumbar segment, along with the outcome measures, was extracted from the studies identified for synthesis. Guidelines detailing 3D cluster construction were limited in the identified articles, and the lack of information presented makes it difficult to assess the external validity of this technique. Scarce information was presented detailing time-series angle data of the lumbar spine during gait. Further developments of the 3D cluster technique are required, and it is essential that authors provide clear instructions, definitions and standards in their manuscripts to improve clarity and reproducibility.
VO-KOREL: A Fourier Disentangling Service of the Virtual Observatory
NASA Astrophysics Data System (ADS)
Škoda, Petr; Hadrava, Petr; Fuchs, Jan
2012-04-01
VO-KOREL is a web service exploiting the technology of the Virtual Observatory to provide astronomers with an intuitive graphical front-end and a distributed computing back-end running the most recent version of the Fourier disentangling code KOREL. The system integrates the ideas of the e-shop basket, conserving the privacy of every user by transfer encryption and access authentication, with the features of a laboratory notebook, allowing easy housekeeping of both input parameters and final results, and it explores the newly emerging technology of cloud computing. While the web-based front-end allows the user to submit data and parameter files, edit parameters, manage a job list, resubmit or cancel running jobs and, above all, watch the text and graphical results of a disentangling process, the main part of the back-end is a simple job queue submission system executing in parallel multiple instances of the FORTRAN code KOREL. This may be easily extended for GRID-based deployment on massively parallel computing clusters. A short introduction to the underlying technologies is given, briefly mentioning advantages as well as bottlenecks of the design used.
Dunn, Heather; Quinn, Laurie; Corbridge, Susan J; Eldeirawi, Kamal; Kapella, Mary; Collins, Eileen G
2017-05-01
The use of cluster analysis in the nursing literature is limited to the creation of classifications of homogeneous groups and the discovery of new relationships. As such, it is important to provide clarity regarding its use and potential. The purpose of this article is to provide an introduction to distance-based, partitioning-based, and model-based cluster analysis methods commonly utilized in the nursing literature, provide a brief historical overview on the use of cluster analysis in nursing literature, and provide suggestions for future research. An electronic search included three bibliographic databases, PubMed, CINAHL and Web of Science. Key terms were cluster analysis and nursing. The use of cluster analysis in the nursing literature is increasing and expanding. The increased use of cluster analysis in the nursing literature is positioning this statistical method to result in insights that have the potential to change clinical practice.
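The three families named in this overview map directly onto standard implementations; a minimal sketch with synthetic data (illustrative only, not drawn from any nursing study):

```python
# Hedged sketch: the three clustering families named above.
# Hierarchical = distance-based, k-means = partitioning, Gaussian
# mixture = model-based.
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.mixture import GaussianMixture

X = np.vstack([np.random.randn(50, 3), np.random.randn(50, 3) + 4])

labels_distance = AgglomerativeClustering(n_clusters=2).fit_predict(X)
labels_partition = KMeans(n_clusters=2, n_init=10).fit_predict(X)
labels_model = GaussianMixture(n_components=2).fit_predict(X)

print(labels_distance[:5], labels_partition[:5], labels_model[:5])
```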
ClassLess: A Comprehensive Database of Young Stellar Objects
NASA Astrophysics Data System (ADS)
Hillenbrand, Lynne; Baliber, Nairn
2015-01-01
We have designed and constructed a database housing published measurements of Young Stellar Objects (YSOs) within ~1 kpc of the Sun. ClassLess, so called because it includes YSOs in all stages of evolution, is a relational database in which user interaction is conducted via HTML web browsers, queries are performed in scientific language, and all data are linked to the sources of publication. Each star is associated with a cluster (or clusters), and both spatially resolved and unresolved measurements are stored, allowing proper use of data from multiple star systems. With this fully searchable tool, myriad ground- and space-based instruments and surveys across wavelength regimes can be exploited. In addition to primary measurements, the database self consistently calculates and serves higher level data products such as extinction, luminosity, and mass. As a result, searches for young stars with specific physical characteristics can be completed with just a few mouse clicks.
Bauermeister, José A; Zimmerman, Marc A; Johns, Michelle M; Glowacki, Pietreck; Stoddard, Sarah; Volz, Erik
2012-09-01
We used a web version of Respondent-Driven Sampling (webRDS) to recruit a sample of young adults (ages 18-24) and examined whether this strategy would result in alcohol and other drug (AOD) prevalence estimates comparable to national estimates (National Survey on Drug Use and Health [NSDUH]). We recruited 22 initial participants (seeds) via Facebook to complete a web survey examining AOD risk correlates. Sequential, incentivized recruitment continued until our desired sample size was achieved. After correcting for webRDS clustering effects, we contrasted our AOD prevalence estimates (past 30 days) to NSDUH estimates by comparing the 95% confidence intervals of prevalence estimates. We found comparable AOD prevalence estimates between our sample and NSDUH for the past 30 days for alcohol, marijuana, cocaine, Ecstasy (3,4-methylenedioxymethamphetamine, or MDMA), and hallucinogens. Cigarette use was lower than NSDUH estimates. WebRDS may be a suitable strategy to recruit young adults online. We discuss the unique strengths and challenges that may be encountered by public health researchers using webRDS methods.
CLUSTAG: hierarchical clustering and graph methods for selecting tag SNPs.
Ao, S I; Yip, Kevin; Ng, Michael; Cheung, David; Fong, Pui-Yee; Melhado, Ian; Sham, Pak C
2005-04-15
Cluster and set-cover algorithms are developed to obtain a set of tag single nucleotide polymorphisms (SNPs) that can represent all the known SNPs in a chromosomal region, subject to the constraint that all SNPs must have a squared correlation R^2 > C with at least one tag SNP, where C is specified by the user. http://hkumath.hku.hk/web/link/CLUSTAG/CLUSTAG.html mng@maths.hku.hk.
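The set-cover formulation is easy to sketch: greedily pick the SNP that covers the most uncovered SNPs, where "covers" means R^2 > C against that SNP. CLUSTAG's algorithms are more refined; the random correlation matrix below is purely illustrative:

```python
# Hedged sketch: greedy set cover for tag-SNP selection.
import numpy as np

def greedy_tag_snps(r2, C=0.8):
    n = r2.shape[0]
    uncovered, tags = set(range(n)), []
    while uncovered:
        # Each candidate j covers itself and every SNP with r2 > C to it.
        best = max(range(n),
                   key=lambda j: sum(1 for i in uncovered if r2[i, j] > C))
        tags.append(best)
        uncovered -= {i for i in uncovered if r2[i, best] > C}
    return tags

rng = np.random.default_rng(0)
M = rng.random((12, 12))
r2 = (M + M.T) / 2                 # symmetric, illustrative R^2 values
np.fill_diagonal(r2, 1.0)
print(greedy_tag_snps(r2))
```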
ERIC Educational Resources Information Center
Costa, Carolina; Alvelos, Helena; Teixeira, Leonor
2016-01-01
This study analyses and compares the use of Web 2.0 tools by students in both learning and leisure contexts. Data were collected based on a questionnaire applied to 234 students from the University of Aveiro (Portugal) and the results were analysed by using descriptive analysis, paired samples t-tests, cluster analyses and Kruskal-Wallis tests.…
Data Mining Meets HCI: Making Sense of Large Graphs
2012-07-01
graph algorithms, won the Open Source Software World Challenge, Silver Award. We have released Pegasus as free, open-source software, downloaded by... METIS [77], spectral clustering [108], and the parameter-free “Cross-associations” (CA) [26]. Belief Propagation can also be used for clustering, as... number of tools have been developed to support “landscape” views of information. These include WebBook and Web Forager [23], which use a book metaphor
Supporting NEESPI with Data Services - The SIB-ESS-C e-Infrastructure
NASA Astrophysics Data System (ADS)
Gerlach, R.; Schmullius, C.; Frotscher, K.
2009-04-01
Data discovery and retrieval is commonly among the first steps performed for any Earth science study. The way scientific data are searched and accessed has changed significantly over the past two decades. In particular, the development of the World Wide Web and the technologies that evolved along with it shortened the data discovery and data exchange process. On the other hand, the amount of data collected and distributed by earth scientists has increased exponentially, requiring new concepts for data management and sharing. One such concept to meet the demand is to build up Spatial Data Infrastructures (SDI) or e-Infrastructures. These infrastructures usually contain components for data discovery allowing users (or other systems) to query a catalogue or registry and retrieve metadata information on available data holdings and services. Data access is typically granted using FTP/HTTP protocols or, more advanced, through Web Services. A Service Oriented Architecture (SOA) approach based on standardized services enables users to benefit from interoperability among different systems and to integrate distributed services into their applications. The Siberian Earth System Science Cluster (SIB-ESS-C) being established at the University of Jena (Germany) is such a spatial data infrastructure, following these principles and implementing standards published by the Open Geospatial Consortium (OGC) and the International Organization for Standardization (ISO). The prime objective is to provide researchers focusing on Siberia with the technical means for data discovery, data access, data publication and data analysis. The region of interest covers the entire Asian part of the Russian Federation from the Ural to the Pacific Ocean, including the Ob, Lena and Yenissey river catchments. The aim of SIB-ESS-C is to provide a comprehensive set of data products for Earth system science in this region. Although SIB-ESS-C will be equipped with processing capabilities for in-house data generation (mainly from Earth Observation), current data holdings of SIB-ESS-C have been created in collaboration with a number of partners in previous and ongoing research projects (e.g. SIBERIA-II, SibFORD, IRIS). At the current development stage the SIB-ESS-C system comprises a federated metadata catalogue accessible through the SIB-ESS-C Web Portal or from any OGC-CSW compliant client. Due to full interoperability with other metadata catalogues, users of the SIB-ESS-C Web Portal are able to search external metadata repositories. The Web Portal also contains a simple visualization component, which will be extended to a comprehensive visualization and analysis tool in the near future. All data products are already accessible as a Web Mapping Service and will be made available as Web Feature and Web Coverage Services soon, allowing users to directly incorporate the data into their applications. The SIB-ESS-C infrastructure will be further developed as one node in a network of similar systems (e.g. NASA GIOVANNI) in the NEESPI region.
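Querying such an OGC-CSW catalogue programmatically is brief with the OWSLib client; a hedged sketch with a placeholder endpoint URL (consult the SIB-ESS-C portal for the real service address):

```python
# Hedged sketch: full-text search against an OGC CSW catalogue
# using OWSLib. The endpoint URL below is a placeholder.
from owslib.csw import CatalogueServiceWeb
from owslib.fes import PropertyIsLike

csw = CatalogueServiceWeb("https://example.org/csw")  # placeholder URL
query = PropertyIsLike("csw:AnyText", "%Siberia%")
csw.getrecords2(constraints=[query], maxrecords=10)

for rec_id, rec in csw.records.items():
    print(rec_id, rec.title)
```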
Astronomy Fun with Mobile Devices
NASA Astrophysics Data System (ADS)
Pilachowski, Catherine A.; Morris, Frank
2016-01-01
Those mobile devices your students bring to class can do more than tweet and text. Engage your students with these web-based astronomy learning tools that allow students to manipulate astronomical data to learn important concepts. The tools are HTML5, CSS3, Javascript-based applications that provide access to the content on iPad and Android tablets. With "Three Color" students can combine monochrome astronomical images taken through different color filters or in different wavelength regions into a single color image. "Star Clusters" allows students to match images of clusters against a pre-defined template of colors and sizes to compare clusters of different ages. An adaptation of Travis Rector's "NovaSearch" allows students to examine images of the central regions of the Andromeda Galaxy to find novae and to measure the time over which each nova fades away. New additions to our suite of applications allow students to estimate the surface temperatures of exoplanets and the probability of life elsewhere in the Universe. Further information and access to these web-based tools are available at www.astro.indiana.edu/ala/.
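What a tool like "Three Color" does internally is simple to sketch: normalize three aligned monochrome filter images and stack them as RGB channels. A hedged Python version with placeholder FITS file names:

```python
# Hedged sketch: combine three aligned filter images into an RGB composite.
import numpy as np
from astropy.io import fits

def load_norm(path):
    data = fits.getdata(path).astype(float)
    lo, hi = np.percentile(data, [1, 99])        # clip outliers for display
    return np.clip((data - lo) / (hi - lo), 0, 1)

rgb = np.dstack([load_norm(f) for f in
                 ("red.fits", "green.fits", "blue.fits")])  # placeholders
print(rgb.shape)  # (ny, nx, 3), ready for plt.imshow
```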
Galaxy Transformations In The Cosmic Web
NASA Astrophysics Data System (ADS)
Jablonka, Pascale
2017-06-01
In this talk, I present a new survey, the Spatial Extended EDisCS Survey (SEEDisCS), that aims at understanding how clusters assemble and the level at which galaxies are preprocessed before falling onto the cluster cores. SEEDisCS therefore focuses on the changes in galaxy properties along the large-scale structures surrounding two z ~ 0.5 medium-mass clusters. I first describe how spiral disc stellar populations are affected by the environment, and how we can get constraints on the timescale of star formation quenching. I then present new NOEMA and ALMA CO observations that trace the fate of the galaxy cold gas content along the infalling paths towards the cluster cores.
BioTextQuest: a web-based biomedical text mining suite for concept discovery.
Papanikolaou, Nikolas; Pafilis, Evangelos; Nikolaou, Stavros; Ouzounis, Christos A; Iliopoulos, Ioannis; Promponas, Vasilis J
2011-12-01
BioTextQuest combines automated discovery of significant terms in article clusters with structured knowledge annotation, via Named Entity Recognition services, offering interactive user-friendly visualization. The terms labeling each document cluster are illustrated as a tag cloud and semantically annotated according to the biological entity, and a list of document titles enables users to simultaneously compare terms and documents of each cluster, facilitating concept association and hypothesis generation. BioTextQuest allows customization of analysis parameters, e.g. clustering/stemming algorithms, exclusion of documents/significant terms, to better match the biological question addressed. http://biotextquest.biol.ucy.ac.cy vprobon@ucy.ac.cy; iliopj@med.uoc.gr Supplementary data are available at Bioinformatics online.
The ALICE Software Release Validation cluster
NASA Astrophysics Data System (ADS)
Berzano, D.; Krzewicki, M.
2015-12-01
One of the most important steps of the software lifecycle is Quality Assurance: this process comprises both automatic tests and manual reviews, and all of them must pass successfully before the software is approved for production. Some tests, such as source code static analysis, are executed on a single dedicated service: in High Energy Physics, a full simulation and reconstruction chain on a distributed computing environment, backed with a sample "golden" dataset, is also necessary for the quality sign-off. The ALICE experiment uses dedicated and virtualized computing infrastructures for the Release Validation in order not to taint the production environment (i.e. CVMFS and the Grid) with non-validated software and validation jobs: the ALICE Release Validation cluster is a disposable virtual cluster appliance based on CernVM and the Virtual Analysis Facility, capable of deploying on demand, and with a single command, a dedicated virtual HTCondor cluster with an automatically scalable number of virtual workers on any cloud supporting the standard EC2 interface. Input and output data are externally stored on EOS, and a dedicated CVMFS service is used to provide the software to be validated. We will show how the Release Validation cluster deployment and disposal are completely transparent for the Release Manager, who simply triggers the validation from the ALICE build system's web interface. CernVM 3, based entirely on CVMFS, makes it possible to boot any snapshot of the operating system in time: we will show how this allows us to certify each ALICE software release for an exact CernVM snapshot, addressing the problem of Long Term Data Preservation by ensuring a consistent environment for software execution and data reprocessing in the future.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dong, Han; Sharma, Diksha; Badano, Aldo, E-mail: aldo.badano@fda.hhs.gov
2014-12-15
Purpose: Monte Carlo simulations play a vital role in the understanding of the fundamental limitations, design, and optimization of existing and emerging medical imaging systems. Efforts in this area have resulted in the development of a wide variety of open-source software packages. One such package, hybridMANTIS, uses a novel hybrid concept to model indirect scintillator detectors by balancing the computational load using dual CPU and graphics processing unit (GPU) processors, obtaining computational efficiency with reasonable accuracy. In this work, the authors describe two open-source visualization interfaces, webMANTIS and visualMANTIS, to facilitate the setup of computational experiments via hybridMANTIS. Methods: The visualization tools visualMANTIS and webMANTIS enable the user to control simulation properties through a user interface. In the case of webMANTIS, control via a web browser allows access through mobile devices such as smartphones or tablets. webMANTIS acts as a server back-end and communicates with an NVIDIA GPU computing cluster that can support multiuser environments where users can execute different experiments in parallel. Results: The output consists of the point response and pulse-height spectrum, and optical transport statistics generated by hybridMANTIS. The users can download the output images and statistics through a zip file for future reference. In addition, webMANTIS provides a visualization window that displays a few selected optical photon paths as they get transported through the detector columns and allows the user to trace the history of the optical photons. Conclusions: The visualization tools visualMANTIS and webMANTIS provide features such as on-the-fly generation of pulse-height spectra and response functions for microcolumnar x-ray imagers while allowing users to save simulation parameters and results from prior experiments. The graphical interfaces simplify the simulation setup and allow the user to go directly from specifying input parameters to receiving visual feedback for the model predictions.
COGNAT: a web server for comparative analysis of genomic neighborhoods.
Klimchuk, Olesya I; Konovalov, Kirill A; Perekhvatov, Vadim V; Skulachev, Konstantin V; Dibrova, Daria V; Mulkidjanian, Armen Y
2017-11-22
In prokaryotic genomes, functionally coupled genes can be organized in conserved gene clusters enabling their coordinated regulation. Such clusters could contain one or several operons, which are groups of co-transcribed genes. Those genes that evolved from a common ancestral gene by speciation (i.e. orthologs) are expected to have similar genomic neighborhoods in different organisms, whereas those copies of the gene that are responsible for dissimilar functions (i.e. paralogs) could be found in dissimilar genomic contexts. Comparative analysis of genomic neighborhoods facilitates the prediction of co-regulated genes and helps to discern different functions in large protein families. We intended, building on the attribution of gene sequences to the clusters of orthologous groups of proteins (COGs), to provide a method for visualization and comparative analysis of genomic neighborhoods of evolutionary related genes, as well as a respective web server. Here we introduce the COmparative Gene Neighborhoods Analysis Tool (COGNAT), a web server for comparative analysis of genomic neighborhoods. The tool is based on the COG database, as well as the Pfam protein families database. As an example, we show the utility of COGNAT in identifying a new type of membrane protein complex that is formed by paralog(s) of one of the membrane subunits of the NADH:quinone oxidoreductase of type 1 (COG1009) and a cytoplasmic protein of unknown function (COG3002). This article was reviewed by Drs. Igor Zhulin, Uri Gophna and Igor Rogozin.
Smith, Nicholas; Witham, Shawn; Sarkar, Subhra; Zhang, Jie; Li, Lin; Li, Chuan; Alexov, Emil
2012-06-15
A new edition of the DelPhi web server, DelPhi web server v2, is released to include atomic presentation of geometrical figures. These geometrical objects can be used to model nano-size objects together with real biological macromolecules. The position and size of the objects can be manipulated by the user in real time until the desired results are achieved. The server fixes structural defects, adds hydrogen atoms and calculates electrostatic energies and the corresponding electrostatic potential and ionic distributions. The web server follows a client-server architecture built on PHP and HTML and utilizes the DelPhi software. The computation is carried out on a supercomputer cluster and results are given back to the user via the HTTP protocol, including the ability to visualize the structure and corresponding electrostatic potential via a Jmol implementation. The DelPhi web server is available from http://compbio.clemson.edu/delphi_webserver.
Romer, Katherine A.; Kayombya, Guy-Richard; Fraenkel, Ernest
2007-01-01
WebMOTIFS provides a web interface that facilitates the discovery and analysis of DNA-sequence motifs. Several studies have shown that the accuracy of motif discovery can be significantly improved by using multiple de novo motif discovery programs and using randomized control calculations to identify the most significant motifs or by using Bayesian approaches. WebMOTIFS makes it easy to apply these strategies. Using a single submission form, users can run several motif discovery programs and score, cluster and visualize the results. In addition, the Bayesian motif discovery program THEME can be used to determine the class of transcription factors that is most likely to regulate a set of sequences. Input can be provided as a list of gene or probe identifiers. Used with the default settings, WebMOTIFS accurately identifies biologically relevant motifs from diverse data in several species. WebMOTIFS is freely available at http://fraenkel.mit.edu/webmotifs. PMID:17584794
Next Generation Models for Storage and Representation of Microbial Biological Annotation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Quest, Daniel J; Land, Miriam L; Brettin, Thomas S
2010-01-01
Background Traditional genome annotation systems were developed in a very different computing era, one where the World Wide Web was just emerging. Consequently, these systems are built as centralized black boxes focused on generating high quality annotation submissions to GenBank/EMBL supported by expert manual curation. The exponential growth of sequence data drives a growing need for increasingly higher quality and automatically generated annotation. Typical annotation pipelines utilize traditional database technologies, clustered computing resources, Perl, C, and UNIX file systems to process raw sequence data, identify genes, and predict and categorize gene function. These technologies tightly couple the annotation software system to hardware and third party software (e.g. relational database systems and schemas). This makes annotation systems hard to reproduce, inflexible to modification over time, difficult to assess, difficult to partition across multiple geographic sites, and difficult to understand for those who are not domain experts. These systems are not readily open to scrutiny and therefore not scientifically tractable. The advent of Semantic Web standards such as Resource Description Framework (RDF) and OWL Web Ontology Language (OWL) enables us to construct systems that address these challenges in a new comprehensive way. Results Here, we develop a framework for linking traditional data to OWL-based ontologies in genome annotation. We show how data standards can decouple hardware and third party software tools from annotation pipelines, thereby making annotation pipelines easier to reproduce and assess. An illustrative example shows how TURTLE (Terse RDF Triple Language) can be used as a human readable, but also semantically-aware, equivalent to GenBank/EMBL files. Conclusions The power of this approach lies in its ability to assemble annotation data from multiple databases across multiple locations into a representation that is understandable to researchers. In this way, all researchers, experimental and computational, will more easily understand the informatics processes constructing genome annotation and ultimately be able to help improve the systems that produce them.
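The paper's central idea — a Turtle serialization as a human-readable, semantically aware stand-in for a GenBank/EMBL record — can be sketched with rdflib. The vocabulary URIs below are illustrative placeholders, not the ontology the authors propose:

```python
# Hedged sketch: express one gene annotation as RDF triples and
# serialize it to Turtle. The namespace is an illustrative placeholder.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/annotation/")
g = Graph()
g.bind("ex", EX)

gene = EX["gene_0001"]
g.add((gene, RDF.type, EX.Gene))
g.add((gene, EX.product, Literal("DNA polymerase III subunit alpha")))
g.add((gene, EX.start, Literal(1520)))
g.add((gene, EX.end, Literal(5003)))

print(g.serialize(format="turtle"))
```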
Myria: Scalable Analytics as a Service
NASA Astrophysics Data System (ADS)
Howe, B.; Halperin, D.; Whitaker, A.
2014-12-01
At the UW eScience Institute, we're working to empower non-experts, especially in the sciences, to write and use data-parallel algorithms. To this end, we are building Myria, a web-based platform for scalable analytics and data-parallel programming. Myria's internal model of computation is the relational algebra extended with iteration, such that every program is inherently data-parallel, just as every query in a database is inherently data-parallel. But unlike databases, iteration is a first-class concept, allowing us to express machine learning tasks, graph traversal tasks, and more. Programs can be expressed in a number of languages and can be executed on a number of execution environments, but we emphasize a particular language called MyriaL that supports both imperative and declarative styles and a particular execution engine called MyriaX that uses an in-memory column-oriented representation and asynchronous iteration. We deliver Myria over the web as a service, providing an editor, performance analysis tools, and catalog browsing features in a single environment. We find that this web-based "delivery vector" is critical in reaching non-experts: they are insulated from the irrelevant technical work associated with installation, configuration, and resource management. The MyriaX backend, one of several execution runtimes we support, is a main-memory, column-oriented, RDBMS-on-the-worker system that supports cyclic data flows as a first-class citizen and has been shown to outperform competitive systems on 100-machine cluster sizes. I will describe the Myria system, give a demo, and present some new results in large-scale oceanographic microbiology.
A web server for analysis, comparison and prediction of protein ligand binding sites.
Singh, Harinder; Srivastava, Hemant Kumar; Raghava, Gajendra P S
2016-03-25
One of the major challenges in the field of systems biology is to understand the interaction between a wide range of proteins and ligands. In the past, methods have been developed for predicting binding sites in a protein for a limited number of ligands. In order to address this problem, we developed a web server named 'LPIcom' to facilitate users in understanding protein-ligand interaction. Analysis, comparison and prediction modules are available in the 'LPIcom' server to predict protein-ligand interacting residues for 824 ligands. Each ligand must have at least 30 protein binding sites in PDB. The analysis module of the server can identify residues preferred in interaction and the binding motif for a given ligand; for example, residues glycine, lysine and arginine are preferred in ATP binding sites. The comparison module of the server allows comparing protein-binding sites of multiple ligands to understand the similarity between ligands based on their binding sites. This module indicates that ATP, ADP and GTP ligands are in the same cluster and thus their binding sites or interacting residues exhibit a high level of similarity. A propensity-based prediction module has been developed for predicting ligand-interacting residues in a protein for more than 800 ligands. In addition, a number of web-based tools have been integrated to facilitate users in creating web logos and two-sample logos comparing ligand-interacting and non-interacting residues. In summary, this manuscript presents a web server for the analysis of ligand-interacting residues. This server is available for public use at http://crdd.osdd.net/raghava/lpicom.
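A propensity score of the kind underpinning such a prediction module divides each residue type's frequency in binding sites by its background frequency. A minimal sketch with made-up counts (values above 1 flag residues over-represented in binding sites, matching the glycine/lysine/arginine preference for ATP noted above):

```python
# Hedged sketch: residue propensity = binding-site frequency divided
# by background frequency. All counts here are invented.
from collections import Counter

binding = Counter({"G": 40, "K": 35, "R": 30, "L": 10, "A": 12})
background = Counter({"G": 70, "K": 55, "R": 50, "L": 90, "A": 85})

n_bind = sum(binding.values())
n_back = sum(background.values())
propensity = {aa: (binding[aa] / n_bind) / (background[aa] / n_back)
              for aa in background}

for aa, p in sorted(propensity.items(), key=lambda kv: -kv[1]):
    print(f"{aa}: {p:.2f}")
```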
Promoting Interests in Atmospheric Science at a Liberal Arts Institution
NASA Astrophysics Data System (ADS)
Roussev, S.; Sherengos, P. M.; Limpasuvan, V.; Xue, M.
2007-12-01
Coastal Carolina University (CCU) students in Computer Science participated in a project to set up an operational weather forecast for the local community. The project involved the construction of two computing clusters and the automation of daily forecasting. Funded by NSF-MRI, two high-performance clusters were successfully established to run the University of Oklahoma's Advanced Regional Prediction System (ARPS). Daily weather predictions are made over South Carolina and North Carolina at 3-km horizontal resolution (roughly 1.9 miles) using initial and boundary condition data provided by UNIDATA. At this high resolution, the model is cloud-resolving, thus providing a detailed picture of heavy thunderstorms and precipitation. Forecast results are displayed on CCU's website (https://marc.coastal.edu/HPC) to complement observations at the National Weather Service in Wilmington, N.C. Present efforts include providing forecasts at 1-km resolution (or finer), comparisons with other models like the Weather Research and Forecasting (WRF) model, and the examination of local phenomena (like waterspouts and tornadoes). Through these activities the students learn about shell scripting, cluster operating systems, and web design. More importantly, students are introduced to Atmospheric Science, the processes involved in making weather forecasts, and the interpretation of their forecasts. Simulations generated by the forecasts will be integrated into the contents of CCU's courses such as Fluid Dynamics, Atmospheric Sciences, Atmospheric Physics, and Remote Sensing. Operated jointly between the departments of Applied Physics and Computer Science, the clusters are expected to be used by CCU faculty and students for future research and inquiry-based projects in Computer Science, Applied Physics, and Marine Science.
Interactive visual exploration and analysis of origin-destination data
NASA Astrophysics Data System (ADS)
Ding, Linfang; Meng, Liqiu; Yang, Jian; Krisp, Jukka M.
2018-05-01
In this paper, we propose a visual analytics approach for the exploration of spatiotemporal interaction patterns of massive origin-destination data. Firstly, we visually query the movement database for data at certain time windows. Secondly, we conduct interactive clustering to allow the users to select input variables/features (e.g., origins, destinations, distance, and duration) and to adjust clustering parameters (e.g. distance threshold). The agglomerative hierarchical clustering method is applied for the multivariate clustering of the origin-destination data. Thirdly, we design a parallel coordinates plot for visualizing the precomputed clusters and for further exploration of interesting clusters. Finally, we propose a gradient line rendering technique to show the spatial and directional distribution of origin-destination clusters on a map view. We implement the visual analytics approach in a web-based interactive environment and apply it to real-world floating car data from Shanghai. The experiment results show the origin/destination hotspots and their spatial interaction patterns. They also demonstrate the effectiveness of our proposed approach.
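The interactive clustering step described above reduces to hierarchical agglomeration on user-selected, scaled features with a user-adjustable cut threshold. A minimal sketch with random trips standing in for the floating car data:

```python
# Hedged sketch: multivariate agglomerative clustering of
# origin-destination records, cut at a distance threshold.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.preprocessing import StandardScaler

# columns: origin_x, origin_y, dest_x, dest_y, distance_km, duration_min
trips = np.random.rand(200, 6)                  # placeholder data
features = StandardScaler().fit_transform(trips)

Z = linkage(features, method="average")
labels = fcluster(Z, t=2.5, criterion="distance")  # t = the interactive knob
print(len(set(labels)), "clusters")
```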
NASA Astrophysics Data System (ADS)
Eberle, J.; Gerlach, R.; Hese, S.; Schmullius, C.
2012-04-01
To provide earth observation products for the area of Siberia, the Siberian Earth System Science Cluster (SIB-ESS-C) was established as a spatial data infrastructure at the University of Jena (Germany), Department for Earth Observation. This spatial data infrastructure implements standards published by the Open Geospatial Consortium (OGC) and the International Organization for Standardization (ISO) for data discovery, data access, data processing and data analysis. The objective of SIB-ESS-C is to facilitate environmental research and Earth system science in Siberia. The region for this project covers the entire Asian part of the Russian Federation, approximately between 58°E - 170°W and 48°N - 80°N. To provide discovery, access and analysis services, a web portal was published for searching and visualising the available data. This web portal is based on current web technologies such as AJAX, with the Drupal Content Management System as backend software and a user-friendly interface with drag-and-drop and further mouse interactions. To offer a wide range of regularly updated earth observation products, several products from the MODIS sensor on the Aqua and Terra satellites are processed. A direct connection to NASA archive servers makes it possible to download MODIS Level 3 and 4 products and integrate them into the SIB-ESS-C infrastructure. These data can be downloaded in a file format called Hierarchical Data Format (HDF). For visualisation and further analysis, the data are reprojected, converted to GeoTIFF, and the global products clipped to the project area. All these steps are implemented as an automatic process chain: whenever new MODIS data become available within the infrastructure, the chain is executed. Through the link to a MODIS catalogue system, the system receives new data daily. With the implemented analysis processes, time-series data can be analysed, for example to plot a trend or to compare different time series against one another. Scientists working in this area with MODIS data can make use of this service through the web portal, instead of manually searching the NASA archive, processing the data and downloading the results themselves; the regularly updated products are available directly.
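The reproject/convert/clip chain described above maps naturally onto GDAL's Python bindings. A hedged sketch in which the MODIS subdataset string, target projection and bounding box are illustrative values only:

```python
# Hedged sketch: reproject an HDF subdataset, convert it to GeoTIFF and
# clip it to a project area in one gdal.Warp call. All names and bounds
# below are illustrative, not the SIB-ESS-C production values.
from osgeo import gdal

def process_tile(src_hdf_subdataset, dst_tif):
    gdal.Warp(
        dst_tif,
        src_hdf_subdataset,
        dstSRS="EPSG:4326",                      # reproject
        format="GTiff",                          # HDF subdataset -> GeoTIFF
        outputBounds=(58.0, 48.0, 180.0, 80.0),  # clip (minX, minY, maxX, maxY)
    )

process_tile('HDF4_EOS:EOS_GRID:"MOD13A2.hdf":MODIS_Grid:NDVI',
             "ndvi_siberia.tif")
```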
van Engen-Verheul, Mariëtte M.; Gude, Wouter T.; van der Veer, Sabine N.; Kemps, Hareld M.C.; Jaspers, Monique M.W.; de Keizer, Nicolette F.; Peek, Niels
2015-01-01
Despite their widespread use, audit and feedback (A&F) interventions show variable effectiveness in improving professional performance. Based on known facilitators of successful A&F interventions, we developed a web-based A&F intervention with indicator-based performance feedback, benchmark information, action planning and outreach visits. The goal of the intervention was to engage with multidisciplinary teams to overcome barriers to guideline concordance and to improve overall team performance in the field of cardiac rehabilitation (CR). To assess its effectiveness we conducted a cluster-randomized trial in 18 CR clinics (14,847 patients) already working with computerized decision support (CDS). Our preliminary results showed no increase in concordance with guideline recommendations regarding prescription of CR therapies. Future analyses will investigate whether our intervention improved team performance on other quality indicators. PMID:26958310
Probing Gas Stripping with Resolved Star-Formation Maps of Virgo Filament Galaxies
NASA Astrophysics Data System (ADS)
Collova, Natasha
2018-01-01
We are conducting a multi-wavelength study of the gas in galaxies at a variety of positions in the cosmic web surrounding the Virgo cluster, one of the best studied regions of high density in the Universe. Galaxies are very likely pre-processed in filaments before falling into clusters, and our goal is to understand how galaxies are altered as they move through the cosmic web and enter the densest regions. We present spatially-resolved H-alpha imaging results from the KPNO 0.9-m and INT 2.54-m telescopes for a preliminary sample of 30 galaxies. We will combine the star-formation maps with observations of molecular and atomic gas to calculate gas consumption timescales, characterize multiple phases of the galactic gas, and look for signatures of environmentally-driven depletion. This work is supported in part by NSF grant AST-1716657.
Web-based Quality Control Tool used to validate CERES products on a cluster of Linux servers
NASA Astrophysics Data System (ADS)
Chu, C.; Sun-Mack, S.; Heckert, E.; Chen, Y.; Mlynczak, P.; Mitrescu, C.; Doelling, D.
2014-12-01
There have been a few popular desktop tools used in the Earth Science community to validate science data. Because of the limited capacity of desktop hardware, such as disk space and CPUs, those tools cannot display large amounts of data from files. This poster describes an in-house web-based tool built on a cluster of Linux servers, which allows users to take advantage of several Linux servers working in parallel to generate hundreds of images in a short period of time. The poster will demonstrate: (1) the hardware and software architecture used to provide high throughput of images; (2) the software structure that can incorporate new products and new requirements quickly; (3) the user interface through which users can manipulate the data and control how the images are displayed.
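The parallel image generation idea can be sketched as follows. In the described system the work is spread across several Linux servers; here Python's multiprocessing on a single node stands in, and render_granule() is a hypothetical placeholder for the real plotting routine:

```python
# Sketch of the high-throughput idea: fan plotting jobs out to a pool of
# workers so hundreds of images render in parallel. A real deployment
# would distribute across several Linux servers; multiprocessing on one
# node stands in here. render_granule() is a hypothetical placeholder.
from multiprocessing import Pool
import matplotlib
matplotlib.use("Agg")               # headless rendering on servers
import matplotlib.pyplot as plt
import numpy as np

def render_granule(i):
    data = np.random.rand(100, 100)     # placeholder for a science data granule
    fig, ax = plt.subplots()
    ax.imshow(data, cmap="viridis")
    fig.savefig(f"granule_{i:04d}.png", dpi=100)
    plt.close(fig)
    return i

if __name__ == "__main__":
    with Pool(processes=8) as pool:
        done = pool.map(render_granule, range(200))
    print(f"rendered {len(done)} images")
```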
One dark matter mystery: halos in the cosmic web
NASA Astrophysics Data System (ADS)
Gaite, Jose
2015-01-01
The current cold dark matter cosmological model explains the large scale cosmic web structure but is challenged by the observation of a relatively smooth distribution of matter in galactic clusters. We consider various aspects of modeling the dark matter around galaxies as distributed in smooth halos and, especially, the smoothness of the dark matter halos seen in N-body cosmological simulations. We conclude that the problems of the cold dark matter cosmology on small scales are more serious than normally admitted.
Swanson, Jonathan O; Plotner, David; Franklin, Holly L; Swanson, David L; Lokomba Bolamba, Victor; Lokangaka, Adrien; Sayury Pineda, Irma; Figueroa, Lester; Garces, Ana; Muyodi, David; Esamai, Fabian; Kanaiza, Nancy; Mirza, Waseem; Naqvi, Farnaz; Saleem, Sarah; Mwenechanya, Musaku; Chiwila, Melody; Hamsumonde, Dorothy; McClure, Elizabeth M; Goldenberg, Robert L; Nathan, Robert O
2016-01-01
High quality is important in medical imaging, yet in many geographic areas, highly skilled sonographers are in short supply. Advances in Internet capacity along with the development of reliable portable ultrasounds have created an opportunity to provide centralized remote quality assurance (QA) for ultrasound exams performed at rural sites worldwide. We sought to harness these advances by developing a web-based tool to facilitate QA activities for newly trained sonographers who were taking part in a cluster randomized trial investigating the role of limited obstetric ultrasound to improve pregnancy outcomes in 5 low- and middle-income countries. We were challenged by connectivity issues, by country-specific needs for website usability, and by the overall need for a high-throughput system. After systematically addressing these needs, the resulting QA website helped drive ultrasound quality improvement across all 5 countries. It now offers the potential for adoption by future ultrasound- or imaging-based global health initiatives. PMID:28031304
Hu, Zhongkai; Jin, Bo; Shin, Andrew Y; Zhu, Chunqing; Zhao, Yifan; Hao, Shiying; Zheng, Le; Fu, Changlin; Wen, Qiaojun; Ji, Jun; Li, Zhen; Wang, Yong; Zheng, Xiaolin; Dai, Dorothy; Culver, Devore S; Alfreds, Shaun T; Rogow, Todd; Stearns, Frank; Sylvester, Karl G; Widen, Eric; Ling, Xuefeng B
2015-01-13
An easily accessible real-time Web-based utility to assess patient risks of future emergency department (ED) visits can help the health care provider guide the allocation of resources to better manage higher-risk patient populations and thereby reduce unnecessary use of EDs. Our main objective was to develop a Health Information Exchange-based, next 6-month ED risk surveillance system in the state of Maine. Data on electronic medical record (EMR) encounters integrated by HealthInfoNet (HIN), Maine's Health Information Exchange, were used to develop the Web-based surveillance system for predicting a population's future 6-month ED risk. For model development, a retrospective cohort of 829,641 patients with comprehensive clinical histories from January 1 to December 31, 2012, was used for training; the model was then tested on a prospective cohort of 875,979 patients from July 1, 2012, to June 30, 2013. The multivariate statistical analysis identified 101 variables predictive of the defined future 6-month risk of an ED visit: 4 age groups, history of 8 different encounter types, history of 17 primary and 8 secondary diagnoses, 8 specific chronic diseases, 28 laboratory test results, history of 3 radiographic tests, and history of 25 outpatient prescription medications. The c-statistics for the retrospective and prospective cohorts were 0.739 and 0.732, respectively. Integration of our method into the HIN secure statewide data system in real time prospectively validated its performance. Cluster analysis in both the retrospective and prospective analyses revealed discrete subpopulations of high-risk patients, grouped around multiple "anchoring" demographics and chronic conditions. With the Web-based population risk-monitoring enterprise dashboards, the effectiveness of the active case finding algorithm has been validated by clinicians and caregivers in Maine. The active case finding model and associated real-time Web-based app were designed to track the evolving nature of total population risk, in a longitudinal manner, for ED visits across all payers, all diseases, and all age groups. Therefore, providers can implement targeted care management strategies for the patient subgroups with similar patterns of clinical histories, driving the delivery of more efficient and effective health care interventions. To the best of our knowledge, this prospectively validated EMR-based, Web-based tool is the first to allow real-time total population risk assessment for statewide ED visits.
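The reported c-statistic is the area under the ROC curve. A minimal sketch of that evaluation, with scikit-learn's logistic regression and synthetic stand-ins for the 101 EMR-derived predictors (not the authors' actual model or data):

```python
# The c-statistic is the ROC AUC. Sketch of fitting a multivariate risk
# model and computing it on a held-out cohort; the data are synthetic
# stand-ins for the EMR features described in the abstract.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 101))               # 101 predictors, as in the paper
beta = rng.normal(scale=0.1, size=101)
p = 1 / (1 + np.exp(-(X @ beta - 2.0)))
y = rng.binomial(1, p)                         # 1 = ED visit within 6 months

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
risk = model.predict_proba(X_te)[:, 1]
print(f"c-statistic: {roc_auc_score(y_te, risk):.3f}")
```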
NASA Astrophysics Data System (ADS)
Eberle, Jonas; Urban, Marcel; Hüttich, Christian; Schmullius, Christiane
2014-05-01
Numerous datasets providing temperature information from meteorological stations or remote sensing satellites are available. The challenge, however, is to search the archives and process the time-series information for further analysis. These steps can be automated for each individual product, provided certain preconditions are met, e.g., data access through web services (HTTP, FTP) or the legal right to redistribute the datasets. We therefore developed a Python-based package that provides data access and processing tools for MODIS Land Surface Temperature (LST) data from the NASA Land Processes Distributed Active Archive Center (LP DAAC), as well as for the Global Surface Summary of the Day (GSOD) and the Global Historical Climatology Network (GHCN) daily datasets provided by the NOAA National Climatic Data Center (NCDC). The package is exposed as web services used by an interactive web portal for simple data access and analysis. Tools for time-series analysis were linked to the system, e.g., time-series plotting, decomposition, aggregation (monthly, seasonal, etc.), trend analysis, and breakpoint detection. For temperature data in particular, a plot was integrated for comparing two temperature datasets, based on the work of Urban et al. (2013). As a first result, a kernel density plot compares daily MODIS LST from the Aqua and Terra satellites with daily means from the GSOD and GHCN datasets. Without any data download or processing of their own, users can analyse different time-series datasets in an easy-to-use web portal. As a first use case, we built up this complementary system of remotely sensed MODIS data and in situ measurements from meteorological stations for Siberia within the Siberian Earth System Science Cluster (www.sibessc.uni-jena.de). References: Urban, Marcel; Eberle, Jonas; Hüttich, Christian; Schmullius, Christiane; Herold, Martin. 2013. "Comparison of Satellite-Derived Land Surface Temperature and Air Temperature from Meteorological Stations on the Pan-Arctic Scale." Remote Sens. 5, no. 5: 2348-2367. Further materials: Eberle, Jonas; Clausnitzer, Siegfried; Hüttich, Christian; Schmullius, Christiane. 2013. "Multi-Source Data Processing Middleware for Land Monitoring within a Web-Based Spatial Data Infrastructure for Siberia." ISPRS Int. J. Geo-Inf. 2, no. 3: 553-576.
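The linked time-series tools (aggregation, decomposition, trend estimation) can be illustrated with a short sketch; the daily series below is synthetic, and statsmodels stands in for whatever the portal runs server-side:

```python
# Sketch of the described time-series tools (aggregation, decomposition,
# trend) applied to a synthetic daily land-surface-temperature series.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

idx = pd.date_range("2010-01-01", "2013-12-31", freq="D")
doy = idx.dayofyear.to_numpy()
lst = (-10 + 25 * np.sin(2 * np.pi * (doy - 100) / 365.25)
       + 0.002 * np.arange(len(idx))                 # weak warming trend
       + np.random.default_rng(1).normal(0, 2, len(idx)))
series = pd.Series(lst, index=idx, name="LST")

monthly = series.resample("MS").mean()               # monthly aggregation
parts = seasonal_decompose(monthly, model="additive", period=12)
trend_slope = np.polyfit(np.arange(len(monthly)), monthly.to_numpy(), 1)[0]
print(f"trend: {trend_slope:.3f} deg/month")
print(f"seasonal amplitude: {parts.seasonal.max() - parts.seasonal.min():.1f} deg")
```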
The baryon content of the Cosmic Web
Eckert, Dominique; Jauzac, Mathilde; Shan, HuanYuan; Kneib, Jean-Paul; Erben, Thomas; Israel, Holger; Jullo, Eric; Klein, Matthias; Massey, Richard; Richard, Johan; Tchernin, Céline
2015-01-01
Big-Bang nucleosynthesis indicates that baryons account for 5% of the Universe’s total energy content[1]. In the local Universe, the census of all observed baryons falls short of this estimate by a factor of two[2,3]. Cosmological simulations indicate that the missing baryons have not yet condensed into virialised halos, but reside throughout the filaments of the cosmic web: a low-density plasma at temperatures of 10^5–10^7 K known as the warm-hot intergalactic medium (WHIM)[3,4,5,6]. There have been previous claims of the detection of warm baryons along the line of sight to distant blazars[7,8,9,10] and hot gas between interacting clusters[11,12,13,14]. These observations were however unable to trace the large-scale filamentary structure, or to estimate the total amount of warm baryons in a representative volume of the Universe. Here we report X-ray observations of filamentary structures of ten-million-degree gas associated with the galaxy cluster Abell 2744. Previous observations of this cluster[15] were unable to resolve and remove coincidental X-ray point sources. After subtracting these, we reveal hot gas structures that are coherent over 8 Mpc scales. The filaments coincide with over-densities of galaxies and dark matter, with 5-10% of their mass in baryonic gas. This gas has been heated up by the cluster's gravitational pull and is now feeding its core. PMID:26632589
Stellar and Binary Evolution in Star Clusters
NASA Technical Reports Server (NTRS)
McMillan, Stephen L. W.
2001-01-01
This paper presents a final report on research activities on Stellar and Binary Evolution in Star Clusters. Substantial progress was made in the development and dissemination of the "Starlab" software environment. Significant improvements were made to "kira," an N-body simulation program tailored to the study of dense stellar systems such as star clusters and galactic nuclei. Key advances include (1) the inclusion of stellar and binary evolution in a self-consistent manner, (2) proper treatment of the anisotropic Galactic tidal field, (3) numerous technical enhancements in the treatment of binary dynamics and interactions, and (4) full support for the special-purpose GRAPE-4 hardware, boosting the program's performance by a factor of 10-100 over the unaccelerated version. The data-reduction and analysis tools in Starlab were also substantially expanded. A Starlab Web site (http://www.sns.ias.edu/~starlab) was created and developed. The site contains detailed information on the structure and function of the various tools that comprise the package, as well as download information, "how to" tips and examples of common operations, demonstration programs, animations, etc. All versions of the software are freely distributed to all interested users, along with detailed installation instructions.
Youpi: A Web-based Astronomical Image Processing Pipeline
NASA Astrophysics Data System (ADS)
Monnerville, M.; Sémah, G.
2010-12-01
Youpi stands for “YOUpi is your processing PIpeline”. It is a portable, easy to use web application providing high level functionalities to perform data reduction on scientific FITS images. It is built on top of open source processing tools that are released to the community by Terapix, in order to organize your data on a computer cluster, to manage your processing jobs in real time and to facilitate teamwork by allowing fine-grain sharing of results and data. On the server side, Youpi is written in the Python programming language and uses the Django web framework. On the client side, Ajax techniques are used along with the Prototype and script.aculo.us JavaScript libraries.
Božičević, Alen; Dobrzyński, Maciej; De Bie, Hans; Gafner, Frank; Garo, Eliane; Hamburger, Matthias
2017-12-05
The technological development of LC-MS instrumentation has led to significant improvements in performance and sensitivity, enabling high-throughput analysis of complex samples, such as plant extracts. Most software suites allow preprocessing of LC-MS chromatograms to obtain comprehensive information on single constituents. However, more advanced processing needs, such as the systematic and unbiased comparative metabolite profiling of large numbers of complex LC-MS chromatograms, remain a challenge. Currently, users have to rely on different tools to perform such data analyses. We developed a two-step protocol comprising a comparative metabolite profiling tool integrated in the ACD/MS Workbook Suite, and a web platform developed in the R language designed for clustering and visualization of chromatographic data. Initially, all relevant chromatographic and spectroscopic data (retention time, molecular ions with the respective ion abundance, and sample names) are automatically extracted and assembled in an Excel spreadsheet. The file is then loaded into an online web application that includes various statistical algorithms and provides the user with tools to compare and visualize the results in intuitive 2D heatmaps. We applied this workflow to LC-ESIMS profiles obtained from 69 honey samples. Within a few hours of calculation on a standard PC, the honey samples were preprocessed and organized in clusters based on their metabolite profile similarities, thereby highlighting the common metabolite patterns and distributions among samples. Implementation in the ACD/Laboratories software package enables subsequent integration of other analytical data and of in silico prediction tools for modern drug discovery.
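The clustering and heatmap-ordering stage can be sketched as follows, with a synthetic sample-by-feature abundance matrix standing in for the aligned LC-MS profiles of the 69 honey samples; the distance metric and linkage are illustrative choices, not necessarily those of the web platform:

```python
# Sketch of the clustering stage: group samples by similarity of their
# metabolite profiles and order them for a 2D heatmap. The 69x200 matrix
# is a synthetic stand-in for aligned LC-MS feature abundances.
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(7)
profiles = rng.lognormal(mean=1.0, sigma=0.8, size=(69, 200))

# Correlation distance emphasizes shared metabolite patterns rather
# than absolute intensities.
d = pdist(profiles, metric="correlation")
Z = linkage(d, method="average")

order = leaves_list(Z)                  # row order for the heatmap
groups = fcluster(Z, t=4, criterion="maxclust")
print("heatmap row order:", order[:10], "...")
print("cluster sizes:", np.bincount(groups)[1:])
```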
Kinetics and Structure of Superagglomerates Produced by Silane and Acetylene
NASA Technical Reports Server (NTRS)
Mulholland, G. W.; Hamins, A.; Sivathanu, Y.
1999-01-01
The evolution of smoke in a laminar diffusion flame involves several steps. The first step is particle inception/nucleation in the high-temperature fuel-rich region of the flame, followed by surface growth and coagulation/coalescence of the small particles. As the primary spheres grow in size and lose hydrogen, the colliding particles no longer coalesce but retain their identity as a cluster of primary spheres, termed an agglomerate. Finally, in the upper portion of the flame, the particles enter an oxidizing environment which may lead to partial or complete burnout of the agglomerates. Currently there is no quantitative model describing the growth of smoke agglomerates up to superagglomerates with an overall dimension of 10 microns and greater. Such particles are produced during the burning of acetylene and of fuels containing benzene rings, such as toluene and polystyrene. In the case of polystyrene, smoke agglomerates in excess of 1 mm have been observed "raining" out from large fires. Evidence of the formation of superagglomerates in a laminar acetylene/air diffusion flame has recently been reported; acetylene was chosen as the fuel since the particulate loading in acetylene/air diffusion flames is very high. Photographs of the "stream" of soot just above the flame were obtained by Sorensen using a microsecond xenon lamp. For low flow rates of acetylene, only submicrometer soot clusters are produced, and they give rise to the homogeneous appearance of the soot stream. When the flow rate is increased to 1.7 cu cm/s, soot clusters up to 10 microns are formed and they are responsible for the graininess; at a flow rate of 3.4 cu cm/s, a web of interconnected clusters as large as the width of the flame is seen. This interconnecting web of superagglomerates is described as a gel state by Sorensen et al. (1998), the first observation of a gel in a gas-phase system. It was observed that this gel state immediately breaks up into agglomerates due to buoyancy-induced turbulence and gravitational sedimentation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Krishnamurthy, Dheepak
This paper is an overview of the Power System Simulation Toolbox (psst). psst is an open-source Python application for the simulation and analysis of power system models. psst simulates wholesale market operation by solving a DC Optimal Power Flow (DCOPF), a Security Constrained Unit Commitment (SCUC) and a Security Constrained Economic Dispatch (SCED). psst also includes models for the various entities in a power system, such as Generator Companies (GenCos), Load Serving Entities (LSEs) and an Independent System Operator (ISO). psst features an open, modular, object-oriented architecture that makes it useful for researchers to customize, expand, and experiment beyond solving traditional problems. psst also includes a web-based Graphical User Interface (GUI) that allows for user-friendly interaction and for deployment on remote High Performance Computing (HPC) clusters for parallelized operations. This paper also provides an illustrative application of psst and benchmarks with standard IEEE test cases to show the advanced features and the performance of the toolbox.
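The flavor of the market-clearing problems psst solves can be conveyed by a toy economic dispatch: minimize generation cost subject to a system-wide power balance and generator limits, with network and security constraints omitted. All numbers are illustrative:

```python
# Toy economic dispatch in the spirit of the DCOPF/SCED problems psst
# solves: minimize generation cost subject to a system power balance and
# generator limits (network and security constraints omitted).
from scipy.optimize import linprog

cost = [20.0, 35.0, 50.0]          # $/MWh for three GenCos
p_min = [50.0, 0.0, 0.0]           # MW lower limits
p_max = [200.0, 150.0, 100.0]      # MW upper limits
demand = 300.0                     # MW total LSE load

res = linprog(
    c=cost,
    A_eq=[[1.0, 1.0, 1.0]], b_eq=[demand],   # total generation = load
    bounds=list(zip(p_min, p_max)),
)
print("dispatch (MW):", [round(p, 1) for p in res.x])
print(f"total cost: ${res.fun:.0f}/h")
```

The cheapest unit is dispatched to its limit first, then the next, which is exactly the merit-order behavior a DCOPF generalizes with network constraints.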
Vipie: web pipeline for parallel characterization of viral populations from multiple NGS samples.
Lin, Jake; Kramna, Lenka; Autio, Reija; Hyöty, Heikki; Nykter, Matti; Cinek, Ondrej
2017-05-15
Next generation sequencing (NGS) technology allows laboratories to investigate virome composition in clinical and environmental samples in a culture-independent way. There is a need for bioinformatic tools capable of parallel processing of virome sequencing data by exactly identical methods: this is especially important in studies of multifactorial diseases, or in parallel comparison of laboratory protocols. We have developed a web-based application allowing direct upload of sequences from multiple virome samples using custom parameters. The samples are then processed in parallel using an identical protocol, and can be easily reanalyzed. The pipeline performs de-novo assembly, taxonomic classification of viruses as well as sample analyses based on user-defined grouping categories. Tables of virus abundance are produced from cross-validation by remapping the sequencing reads to a union of all observed reference viruses. In addition, read sets and reports are created after processing unmapped reads against known human and bacterial ribosome references. Secured interactive results are dynamically plotted with population and diversity charts, clustered heatmaps and a sortable and searchable abundance table. The Vipie web application is a unique tool for multi-sample metagenomic analysis of viral data, producing searchable hits tables, interactive population maps, alpha diversity measures and clustered heatmaps that are grouped in applicable custom sample categories. Known references such as human genome and bacterial ribosomal genes are optionally removed from unmapped ('dark matter') reads. Secured results are accessible and shareable on modern browsers. Vipie is a freely available web-based tool whose code is open source.
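The abundance-table step can be sketched as follows; the per-sample mapping counts are mocked here, whereas a real run would parse the read-remapping output (e.g., BAM files):

```python
# Sketch of the abundance-table step: after remapping each sample's reads
# against the union of observed reference viruses, tabulate hits per
# sample. The counts below are mocked placeholders.
import pandas as pd

remapped = {
    "sample_A": {"Enterovirus A": 1520, "Anellovirus": 87},
    "sample_B": {"Enterovirus A": 12, "Norovirus GII": 944},
    "sample_C": {"Anellovirus": 310, "Norovirus GII": 55},
}

table = pd.DataFrame(remapped).T.fillna(0).astype(int)
table["total_mapped"] = table.sum(axis=1)
relative = table.drop(columns="total_mapped").div(table["total_mapped"], axis=0)
print(table)
print(relative.round(3))
```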
Ten Billion Years of Brightest Cluster Galaxy Alignments
NASA Astrophysics Data System (ADS)
West, Michael J.
2017-07-01
Astronomers long assumed that galaxies are randomly oriented in space. However, it's now clear that some have preferred orientations with respect to their surroundings. Chief among these are the giant ellipticals found at the centers of rich galaxy clusters, whose major axes are often aligned with those of their host clusters - a remarkable coherence of structures over millions of light years. A better understanding of these alignments can yield new insights into the processes that have shaped galaxies over the history of the universe. Using Hubble Space Telescope observations of high-redshift galaxy clusters, we show for the first time that such alignments are seen at epochs when the universe was only one-third its current age. These results suggest that the brightest galaxies in clusters are the product of a special formation history, one influenced by development of the cosmic web over billions of years.
NASA Technical Reports Server (NTRS)
Nelson, Michael L.; Maly, Kurt; Shen, Stewart N. T.; Zubair, Mohammad
1998-01-01
We describe NCSTRL+, a unified, canonical digital library for scientific and technical information (STI). NCSTRL+ is based on the Networked Computer Science Technical Report Library (NCSTRL), a World Wide Web (WWW) accessible digital library (DL) that provides access to over 100 university departments and laboratories. NCSTRL+ implements two new technologies: cluster functionality and publishing buckets. We have extended Dienst, the protocol underlying NCSTRL, to provide the ability to cluster independent collections into a logically centralized digital library based upon subject category classification, type of organization, and genres of material. The bucket construct provides a mechanism for publishing and managing logically linked entities with multiple data forms as a single object. The NCSTRL+ prototype DL contains the holdings of NCSTRL and the NASA Technical Report Server (NTRS). The prototype demonstrates the feasibility of publishing into a multi-cluster DL, searching across clusters, and storing and presenting buckets of information.
Clustering and Dimensionality Reduction to Discover Interesting Patterns in Binary Data
NASA Astrophysics Data System (ADS)
Palumbo, Francesco; D'Enza, Alfonso Iodice
Attention to binary data coding has increased considerably over the last decade, for several reasons. The analysis of binary data characterizes several fields of application, such as market basket analysis, DNA microarray data, image mining, text mining and web-clickstream mining. The paper illustrates two different approaches exploiting a profitable combination of clustering and dimensionality reduction for the identification of non-trivial association structures in binary data. An application in the Association Rules framework supports the theory with empirical evidence.
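One concrete combination of the two ingredients, not the authors' specific method, is to reduce a binary transaction matrix with truncated SVD and then cluster in the reduced space:

```python
# Illustrative combination of dimensionality reduction and clustering on
# binary data (not the paper's specific approach): truncated SVD on a
# market-basket matrix, then k-means in the reduced space.
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
# 1000 market baskets x 50 items, with two planted purchasing profiles.
profiles = np.array([rng.random(50) < 0.3, rng.random(50) < 0.3])
membership = rng.integers(0, 2, size=1000)
baskets = (rng.random((1000, 50)) < profiles[membership] * 0.9).astype(int)

reduced = TruncatedSVD(n_components=5, random_state=0).fit_transform(baskets)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(reduced)
print("recovered cluster sizes:", np.bincount(labels))
```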
The BioPrompt-box: an ontology-based clustering tool for searching in biological databases.
Corsi, Claudio; Ferragina, Paolo; Marangoni, Roberto
2007-03-08
High-throughput molecular biology provides new data at an incredible rate, so that the increase in the size of biological databanks is enormous and very rapid. This scenario generates severe problems not only at indexing time, where suitable algorithmic techniques for data indexing and retrieval are required, but also at query time, since a user query may produce such a large set of results that their browsing and "understanding" becomes humanly impractical. This problem is well known to the Web community, where a new generation of Web search engines is being developed, like Vivisimo. These tools organize on-the-fly the results of a user query in a hierarchy of labeled folders that ease their browsing and knowledge extraction. We investigate this approach on biological data, and propose the so-called BioPrompt-box software system, which deploys ontology-driven clustering strategies for making the searching process of biologists more efficient and effective. The BioPrompt-box (Bpb) defines a document as a biological sequence plus its associated meta-data taken from the underlying databank, like references to ontologies or to external databanks, and plain texts such as comments of researchers and the title, abstract or even body of papers. Bpb offers several tools to customize the search and the clustering process over its indexed documents. The user can search a set of keywords within a specific field of the document schema, or can execute Blast to find documents relative to homologous sequences. In both cases the search task returns a set of documents (hits) which constitute the answer to the user query. Since the number of hits may be large, Bpb clusters them into groups of homogeneous content, organized as a hierarchy of labeled clusters. The user can choose among several ontology-based hierarchical clustering strategies, each offering a different "view" of the returned hits. Bpb computes these views by exploiting the meta-data present within the retrieved documents, such as the references to Gene Ontology, the taxonomy lineage, the organism and the keywords. Of course, the approach is flexible enough to leave room for future additions of other meta-information. The ultimate goal of the clustering process is to provide the user with several different readings of the (maybe numerous) query results and show possible hidden correlations among them, thus improving their browsing and understanding. Bpb is a powerful search engine that makes it very easy to perform complex queries over the indexed databanks (currently only UNIPROT is considered). The ontology-based clustering approach is efficient and effective, and could thus be applied successfully to larger databanks, like GenBank or EMBL.
webPIPSA: a web server for the comparison of protein interaction properties
Richter, Stefan; Wenzel, Anne; Stein, Matthias; Gabdoulline, Razif R.; Wade, Rebecca C.
2008-01-01
Protein molecular interaction fields are key determinants of protein functionality. PIPSA (Protein Interaction Property Similarity Analysis) is a procedure to compare and analyze protein molecular interaction fields, such as the electrostatic potential. PIPSA may assist in protein functional assignment, classification of proteins, the comparison of binding properties and the estimation of enzyme kinetic parameters. webPIPSA is a web server that enables the use of PIPSA to compare and analyze protein electrostatic potentials. While PIPSA can be run with downloadable software (see http://projects.eml.org/mcm/software/pipsa), webPIPSA extends and simplifies a PIPSA run. This allows non-expert users to perform PIPSA for their protein datasets. With input protein coordinates, the superposition of protein structures, as well as the computation and analysis of electrostatic potentials, is automated. The results are provided as electrostatic similarity matrices from an all-pairwise comparison of the proteins which can be subjected to clustering and visualized as epograms (tree-like diagrams showing electrostatic potential differences) or heat maps. webPIPSA is freely available at: http://pipsa.eml.org. PMID:18420653
The role of penetrating gas streams in setting the dynamical state of galaxy clusters
NASA Astrophysics Data System (ADS)
Zinger, E.; Dekel, A.; Birnboim, Y.; Kravtsov, A.; Nagai, D.
2016-09-01
We utilize cosmological simulations of 16 galaxy clusters at redshifts z = 0 and z = 0.6 to study the effect of inflowing streams on the properties of the X-ray emitting intracluster medium. We find that the mass accretion occurs predominantly along streams that originate from the cosmic web and consist of heated gas. Clusters that are unrelaxed in terms of their X-ray morphology are characterized by higher mass inflow rates and deeper penetration of the streams, typically into the inner third of the virial radius. The penetrating streams generate elevated random motions, bulk flows and cold fronts. The degree of penetration of the streams may change over time such that clusters can switch from being unrelaxed to relaxed over a time-scale of several gigayears.
NASA Astrophysics Data System (ADS)
Wright, Dawn; Sayre, Roger; Breyer, Sean; Butler, Kevin; VanGraafeiland, Keith; Goodin, Kathy; Kavanaugh, Maria; Costello, Mark; Cressie, Noel; Basher, Zeenatul; Harris, Peter; Guinotte, John
2017-04-01
A data-derived, ecological stratification-based ecosystem mapping approach was recently demonstrated by Sayre et al. for terrestrial ecosystems, resulting in a standardized map of nearly 4000 global ecological land units (ELUs) at a base spatial resolution of 250 m. The map was commissioned by the Group on Earth Observations for eventual use by the Global Earth Observation System of Systems (GEOSS), and was also a contribution to the Climate Data Initiative of US President Barack Obama. We now present a similar environmental stratification approach for extending a global ecosystems map into the oceans through the delineation of analog global ecological marine units (EMUs). EMUs are comprised of a global point mesh framework, created from over 52 million points from NOAA's World Ocean Atlas with a spatial resolution of ¼ by ¼ degree (~27 x 27 km at the equator) at varying depths and a temporal resolution that is currently decadal. Each point carries attributes of chemical and physical oceanographic structure (temperature, salinity, dissolved oxygen, nitrate, silicate, phosphate) that are likely drivers of many marine ecosystem responses. We used a k-means statistical clustering algorithm to identify physically distinct, relatively homogeneous, volumetric regions within the water column (the EMUs). Backwards stepwise discriminant analysis determined whether all six of the variables contributed significantly to the clustering, and a pseudo F-statistic indicated an optimal number of 37 clusters worldwide. Canonical discriminant analysis verified that all 37 clusters were significantly different from one another. A major intent of the EMUs is to support marine biodiversity conservation assessments, economic valuation studies of marine ecosystem goods and services, and studies of ocean acidification and other impacts (e.g., pollution, resource exploitation, etc.). As such, they represent a rich geospatial accounting framework for these types of studies, as well as for scientific research on species distributions and their relationships to the marine physical environment. To further benefit the community and facilitate collaborative knowledge building, data products are shared openly and interoperably via www.esri.com/ecological-marine-units. This includes provision of the 3D point mesh and EMU clusters at the surface, at the bottom, and within the water column in varying formats via download, web services or web apps, as well as generic algorithms and GIS workflows that scale from global to regional and local. A major aim is for community members to move the research forward with higher-resolution data from their own field studies or areas of interest, with the original EMU project team assisting with GIS implementation (especially via a new online discussion forum) or hosting additional data products as needed.
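The statistical core of the EMU delineation can be sketched with scikit-learn: standardize the six oceanographic variables, then run k-means with the paper's optimum of 37 clusters. The points below are synthetic placeholders for the World Ocean Atlas mesh:

```python
# Sketch of the EMU delineation: k-means on the six oceanographic drivers,
# standardized first so each variable contributes comparably. Synthetic
# points stand in for the ~52-million-point World Ocean Atlas mesh; the
# paper's optimum of 37 clusters is reused.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

rng = np.random.default_rng(11)
cols = ["temperature", "salinity", "oxygen", "nitrate", "silicate", "phosphate"]
X = rng.normal(size=(100_000, len(cols)))        # placeholder ocean mesh points

Xz = StandardScaler().fit_transform(X)
emu = KMeans(n_clusters=37, n_init=4, random_state=0).fit(Xz)
print("points per EMU (first 5):", np.bincount(emu.labels_)[:5])
```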
Halligan, Brian D.; Geiger, Joey F.; Vallejos, Andrew K.; Greene, Andrew S.; Twigger, Simon N.
2009-01-01
One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step by step instructions on how to implement the virtual proteomics analysis clusters as well as a list of current available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases on the Medical College of Wisconsin Proteomics Center website (http://proteomics.mcw.edu/vipdac). PMID:19358578
75 FR 27986 - Electronic Filing System-Web (EFS-Web) Contingency Option
Federal Register 2010, 2011, 2012, 2013, 2014
2010-05-19
Electronic Filing System--Web (EFS-Web) Contingency Option. AGENCY: United States Patent and Trademark Office... availability of its patent electronic filing system, Electronic Filing System--Web (EFS-Web) by providing a new contingency option when the primary portal to EFS-Web has an unscheduled outage. Previously, the entire EFS...
The Vainshtein mechanism in the cosmic web
DOE Office of Scientific and Technical Information (OSTI.GOV)
Falck, Bridget; Koyama, Kazuya; Zhao, Gong-bo
We investigate the dependence of the Vainshtein screening mechanism on the cosmic web morphology of both dark matter particles and halos as determined by ORIGAMI. Unlike chameleon and symmetron screening, which come into effect in regions of high density, Vainshtein screening instead depends on the dimensionality of the system, and screened bodies can still feel external fields. ORIGAMI is well-suited to this problem because it defines morphologies according to the dimensionality of the collapsing structure and does not depend on a smoothing scale or density threshold parameter. We find that halo particles are screened while filament, wall, and void particles are unscreened, and this is independent of the particle density. However, after separating halos according to their large scale cosmic web environment, we find no difference in the screening properties of halos in filaments versus halos in clusters. We find that the fifth force enhancement of dark matter particles in halos is greatest well outside the virial radius. We confirm the theoretical expectation that even if the internal field is suppressed by the Vainshtein mechanism, the object still feels the fifth force generated by the external fields, by measuring peculiar velocities and velocity dispersions of halos. Finally, we investigate the morphology and gravity model dependence of halo spins, concentrations, and shapes.
The inverse niche model for food webs with parasites
Warren, Christopher P.; Pascual, Mercedes; Lafferty, Kevin D.; Kuris, Armand M.
2010-01-01
Although parasites represent an important component of ecosystems, few field and theoretical studies have addressed the structure of parasites in food webs. We evaluate the structure of parasitic links in an extensive salt marsh food web, with a new model distinguishing parasitic links from non-parasitic links among free-living species. The proposed model is an extension of the niche model for food web structure, motivated by the potential role of size (and related metabolic rates) in structuring food webs. The proposed extension captures several properties observed in the data, including patterns of clustering and nestedness, better than does a random model. By relaxing specific assumptions, we demonstrate that two essential elements of the proposed model are the similarity of a parasite's hosts and the increasing degree of parasite specialization, along a one-dimensional niche axis. Thus, inverting one of the basic rules of the original model, the one determining consumers' generality appears critical. Our results support the role of size as one of the organizing principles underlying niche space and food web topology. They also strengthen the evidence for the non-random structure of parasitic links in food webs and open the door to addressing questions concerning the consequences and origins of this structure.
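For context, the original niche model that the paper extends can be sketched in a few lines (following the standard Williams-Martinez construction; the species count S and connectance C below are illustrative):

```python
# Sketch of the original niche model (Williams & Martinez 2000) that the
# inverse niche model extends: each species gets a niche value, a feeding
# range drawn from a beta distribution, and a range center; it eats every
# species whose niche value falls within that range.
import numpy as np

def niche_model(S=30, C=0.15, seed=0):
    rng = np.random.default_rng(seed)
    n = np.sort(rng.random(S))                   # niche values
    beta = 1.0 / (2.0 * C) - 1.0                 # so expected range fraction = 2C
    r = n * rng.beta(1.0, beta, size=S)          # feeding ranges
    c = rng.uniform(r / 2.0, n)                  # range centers
    # adjacency[i, j] = 1 if species i eats species j
    adj = (np.abs(n[None, :] - c[:, None]) <= r[:, None] / 2.0).astype(int)
    return adj

A = niche_model()
print("realized connectance:", A.sum() / A.size)
```

The inverse niche model modifies the rule that sets consumer generality along the niche axis, so that parasite specialization increases with niche value; the sketch above shows only the baseline it inverts.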
Astronomers Uncover One of the Youngest and Brightest Galaxies in the Early Universe
2008-02-12
A massive cluster of yellowish galaxies is seemingly caught in a spider web of eerily distorted background galaxies in the left-hand image, taken with the Advanced Camera for Surveys (ACS) aboard NASA's Hubble Space Telescope.
Peels, Denise Astrid; Bolman, Catherine; Golsteijn, Rianne Henrica Johanna; De Vries, Hein; Mudde, Aart Nicolaas; van Stralen, Maartje Marieke; Lechner, Lilian
2012-12-17
The Internet has the potential to provide large populations with individual health promotion advice at a relatively low cost. Despite the high rates of Internet access, actual reach by Web-based interventions is often disappointingly low, and differences in use between demographic subgroups are present. Furthermore, Web-based interventions often have to deal with high rates of attrition. This study aims to assess user characteristics related to participation and attrition when comparing Web-based and print-delivered tailored interventions containing similar content and thereby to provide recommendations in choosing the appropriate delivery mode for a particular target audience. We studied the distribution of a Web-based and a print-delivered version of the Active Plus intervention in a clustered randomized controlled trial (RCT). Participants were recruited via direct mailing within the participating Municipal Health Council regions and randomized to the printed or Web-based intervention by their region. Based on the answers given in a prior assessment, participants received tailored advice on 3 occasions: (1) within 2 weeks after the baseline, (2) 2 months after the baseline, and (3) within 4 months after the baseline (based on a second assessment at 3 months). The baseline (printed or Web-based) results were analyzed using ANOVA and chi-square tests to establish the differences in user characteristics between both intervention groups. We used logistic regression analyses to study the interaction between the user characteristics and the delivery mode in the prediction of dropout rate within the intervention period. The printed intervention resulted in a higher participation rate (19%) than the Web-based intervention (12%). Participants of the Web-based intervention were significantly younger (P<.001), more often men (P=.01), had a higher body mass index (BMI) (P=.001) and a lower intention to be physically active (P=.03) than participants of the printed intervention. The dropout rate was significantly higher in the Web-based intervention group (53%) compared to the print-delivered intervention (39%, P<.001). A low intention to be physically active was a strong predictor for dropout within both delivery modes (P<.001). The difference in dropout rate between the Web-based and the printed intervention was not explained by user characteristics. The reach of the same tailored physical activity (PA) intervention in a printed or Web-based delivery mode differed between sociodemographic subgroups of participants over 50 years of age. Although the reach of the Web-based intervention is lower, Web-based interventions can be a good channel to reach high-risk populations (lower PA intention and higher BMI). While the dropout rate was significantly higher in the Web-based intervention group, no specific user characteristics explained the difference in dropout rates between the delivery modes. More research is needed to determine what caused the high rate of dropout in the Web-based intervention. Dutch Trial Register (NTR): 2297: http://www.trialregister.nl/trialreg/admin/rctview.asp?TC=2297 (Archived by WebCite at http://www.webcitation.org/65TkwoESp).
Combining Mixture Components for Clustering
Baudry, Jean-Patrick; Raftery, Adrian E.; Celeux, Gilles; Lo, Kenneth; Gottardo, Raphaël
2010-01-01
Model-based clustering consists of fitting a mixture model to data and identifying each cluster with one of its components. Multivariate normal distributions are typically used. The number of clusters is usually determined from the data, often using BIC. In practice, however, individual clusters can be poorly fitted by Gaussian distributions, and in that case model-based clustering tends to represent one non-Gaussian cluster by a mixture of two or more Gaussian distributions. If the number of mixture components is interpreted as the number of clusters, this can lead to overestimation of the number of clusters. This is because BIC selects the number of mixture components needed to provide a good approximation to the density, rather than the number of clusters as such. We propose first selecting the total number of Gaussian mixture components, K, using BIC and then combining them hierarchically according to an entropy criterion. This yields a unique soft clustering for each number of clusters less than or equal to K. These clusterings can be compared on substantive grounds, and we also describe an automatic way of selecting the number of clusters via a piecewise linear regression fit to the rescaled entropy plot. We illustrate the method with simulated data and a flow cytometry dataset. Supplemental Materials are available on the journal Web site and described at the end of the paper. PMID:20953302
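A rough sketch of the two-stage procedure (choose K by BIC, then merge the component pair that most reduces the soft-assignment entropy), using scikit-learn; one merge step is shown, whereas the paper's method iterates hierarchically and adds further details:

```python
# Sketch of the two-stage idea: pick the number of Gaussian components K
# by BIC, then merge the pair of components that most reduces the total
# soft-assignment entropy -sum(tau * log tau). One merge step shown.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)
# One non-Gaussian (lognormal) cluster plus one Gaussian cluster.
data = np.vstack([rng.lognormal(0, 0.6, (300, 2)), rng.normal(5, 1, (300, 2))])

fits = [GaussianMixture(k, random_state=0).fit(data) for k in range(1, 7)]
best = min(fits, key=lambda m: m.bic(data))
tau = best.predict_proba(data)                    # soft assignments
K = tau.shape[1]
print("K chosen by BIC:", K)

def entropy(t):
    return -np.sum(t * np.log(np.clip(t, 1e-12, None)))

def merged(t, i, j):
    keep = [k for k in range(K) if k not in (i, j)]
    return np.column_stack([t[:, keep], t[:, i] + t[:, j]])

# Evaluate every pairwise merge; keep the one with the lowest entropy.
pairs = [(i, j) for i in range(K) for j in range(i + 1, K)]
i, j = min(pairs, key=lambda p: entropy(merged(tau, *p)))
print(f"merge components {i} and {j}; entropy {entropy(tau):.1f} -> "
      f"{entropy(merged(tau, i, j)):.1f}")
```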
Galaxy clusters in the cosmic web
NASA Astrophysics Data System (ADS)
Acebrón, A.; Durret, F.; Martinet, N.; Adami, C.; Guennou, L.
2014-12-01
Simulations of large scale structure formation in the universe predict that matter is essentially distributed along filaments at the intersection of which lie galaxy clusters. We have analysed 9 clusters in the redshift range 0.4
Jones, James Brian; Weiner, Jonathan P; Shah, Nirav R; Stewart, Walter F
2015-02-20
As providers develop an electronic health record-based infrastructure, patients are increasingly using Web portals to access their health information and participate electronically in the health care process. Little is known about how such portals are actually used. In this paper, our goal was to describe the types and patterns of portal users in an integrated delivery system. We analyzed 12 months of data from Web server log files on 2282 patients using a Web-based portal to their electronic health record (EHR). We obtained data for patients with cardiovascular disease and/or diabetes who had a Geisinger Clinic primary care provider and were registered "MyGeisinger" Web portal users. Hierarchical cluster analysis was applied to longitudinal data to profile users based on their frequency, intensity, and consistency of use. User types were characterized by basic demographic data from the EHR. We identified eight distinct portal user groups. The two largest groups (41.98%, 948/2258 and 24.84%, 561/2258) logged into the portal infrequently but had markedly different levels of engagement with their medical record. Other distinct groups were characterized by tracking biometric measures (10.54%, 238/2258), sending electronic messages to their provider (9.25%, 209/2258), preparing for an office visit (5.98%, 135/2258), and tracking laboratory results (4.16%, 94/2258). There are naturally occurring groups of EHR Web portal users within a population of adult primary care patients with chronic conditions. More than half of the patient cohort exhibited distinct patterns of portal use linked to key features. These patterns of portal access and interaction provide insight into opportunities for electronic patient engagement strategies.
Computational Science in Armenia (Invited Talk)
NASA Astrophysics Data System (ADS)
Marandjian, H.; Shoukourian, Yu.
This survey is devoted to the development of informatics and computer science in Armenia. The results in theoretical computer science (algebraic models, solutions to systems of general form recursive equations, the methods of coding theory, pattern recognition and image processing) constitute the theoretical basis for developing problem-solving-oriented environments. Examples include a synthesizer of optimized distributed recursive programs, software tools for cluster-oriented implementations of two-dimensional cellular automata, and a grid-aware web interface with advanced service trading for linear algebra calculations. In the direction of solving scientific problems that require high-performance computing resources, completed projects include physics (parallel computing of complex quantum systems), astrophysics (Armenian virtual laboratory), biology (molecular dynamics study of the human red blood cell membrane), and meteorology (implementing and evaluating the Weather Research and Forecasting model for the territory of Armenia). The overview also notes that the Institute for Informatics and Automation Problems of the National Academy of Sciences of Armenia has established a scientific and educational infrastructure uniting the computing clusters of scientific and educational institutions of the country, which provides the scientific community with access to local and international computational resources, a strong support for computational science in Armenia.
Cooperation in Harsh Environments and the Emergence of Spatial Patterns.
Smaldino, Paul E
2013-11-01
This paper concerns the confluence of two important areas of research in mathematical biology: spatial pattern formation and cooperative dilemmas. Mechanisms through which social organisms form spatial patterns are not fully understood. Prior work connecting cooperation and pattern formation has often included unrealistic assumptions that shed doubt on the applicability of those models toward understanding real biological patterns. I investigated a more biologically realistic model of cooperation among social actors. The environment is harsh, so that interactions with cooperators are strictly needed to survive. Harshness is implemented via a constant energy deduction. I show that this model can generate spatial patterns similar to those seen in many naturally occurring systems. Moreover, for each payoff matrix there is an associated critical value of the energy deduction that separates two distinct dynamical processes. In low-harshness environments, the growth of cooperator clusters is impeded by defectors, but these clusters gradually expand to form dense dendritic patterns. In very harsh environments, cooperators expand rapidly but defectors can subsequently make inroads to form reticulated patterns. The resulting web-like patterns are reminiscent of transportation networks observed in slime mold colonies and other biological systems.
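A minimal grid sketch of the kind of model described, with cooperators conferring a benefit on neighbors at a cost to themselves and a constant energy deduction implementing harshness, is given below. The parameters and update rules are illustrative simplifications, not the paper's exact specification:

```python
# Minimal sketch of the described model class: agents on a grid gain
# energy from neighboring cooperators, cooperators pay a cost per
# occupied neighbor, and every living agent pays a constant "harshness"
# deduction; agents die below zero energy and reproduce into empty cells.
import numpy as np

rng = np.random.default_rng(2)
N, STEPS = 60, 200
BENEFIT, COST, HARSHNESS = 1.0, 0.4, 0.6
EMPTY, DEFECT, COOP = 0, 1, 2

grid = rng.choice([EMPTY, DEFECT, COOP], size=(N, N), p=[0.8, 0.1, 0.1])
energy = np.where(grid != EMPTY, 2.0, 0.0)

def neighbor_sum(mask):
    # Count of True neighbors (von Neumann), with wraparound edges.
    m = mask.astype(int)
    return sum(np.roll(m, s, a) for s in (1, -1) for a in (0, 1))

for _ in range(STEPS):
    occupied = grid != EMPTY
    coop_nbrs = neighbor_sum(grid == COOP)
    occ_nbrs = neighbor_sum(occupied)
    energy += np.where(occupied, BENEFIT * coop_nbrs, 0.0)
    energy -= np.where(grid == COOP, COST * occ_nbrs, 0.0)
    energy -= np.where(occupied, HARSHNESS, 0.0)
    dead = occupied & (energy < 0)
    grid[dead], energy[dead] = EMPTY, 0.0
    # Reproduction: an empty cell is colonized by a random energetic neighbor.
    for x, y in zip(*np.where(grid == EMPTY)):
        nbrs = [((x + dx) % N, (y + dy) % N)
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
        live = [p for p in nbrs if grid[p] != EMPTY and energy[p] > 1.0]
        if live:
            px, py = live[rng.integers(len(live))]
            grid[x, y] = grid[px, py]
            energy[px, py] /= 2
            energy[x, y] = energy[px, py]

print("cooperators:", int((grid == COOP).sum()),
      "defectors:", int((grid == DEFECT).sum()))
```

Raising HARSHNESS past a critical value in such a model shifts the dynamics from defector-impeded cluster growth toward the rapid cooperator expansion the abstract describes.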
Effects of a Web-based course on nursing skills and knowledge learning.
Lu, Der-Fa; Lin, Zu-Chun; Li, Yun-Ju
2009-02-01
The purpose of the study was to assess the effectiveness of supplementing traditional classroom teaching with a Web-based learning design when teaching intramuscular injection nursing skills. Four clusters of nursing students at a junior college in eastern Taiwan were randomly assigned to experimental and control groups. A total of 147 students (80 in the experimental group, 67 in the control group) completed the study. All participants received the same classroom lectures and skill demonstration. The experimental group interacted using a Web-based course and were able to view the content on demand. The students and instructor interacted via a chatroom, the bulletin board, and e-mail. Participants in the experimental group had significantly higher scores on both intramuscular injection knowledge and skill learning. A Web-based design can be an effective supplementary learning tool for teaching nursing knowledge and skills.
Fernandez, Nicolas F.; Gundersen, Gregory W.; Rahman, Adeeb; Grimes, Mark L.; Rikova, Klarisa; Hornbeck, Peter; Ma’ayan, Avi
2017-01-01
Most tools developed to visualize hierarchically clustered heatmaps generate static images. Clustergrammer is a web-based visualization tool with interactive features such as: zooming, panning, filtering, reordering, sharing, performing enrichment analysis, and providing dynamic gene annotations. Clustergrammer can be used to generate shareable interactive visualizations by uploading a data table to a web-site, or by embedding Clustergrammer in Jupyter Notebooks. The Clustergrammer core libraries can also be used as a toolkit by developers to generate visualizations within their own applications. Clustergrammer is demonstrated using gene expression data from the cancer cell line encyclopedia (CCLE), original post-translational modification data collected from lung cancer cells lines by a mass spectrometry approach, and original cytometry by time of flight (CyTOF) single-cell proteomics data from blood. Clustergrammer enables producing interactive web-based visualizations for the analysis of diverse biological data. PMID:28994825
Interactive visual exploration and refinement of cluster assignments.
Kern, Michael; Lex, Alexander; Gehlenborg, Nils; Johnson, Chris R
2017-09-12
With ever-increasing amounts of data produced in biology research, scientists are in need of efficient data analysis methods. Cluster analysis, combined with visualization of the results, is one such method that can be used to make sense of large data volumes. At the same time, cluster analysis is known to be imperfect and depends on the choice of algorithms, parameters, and distance measures. Most clustering algorithms don't properly account for ambiguity in the source data, as records are often assigned to discrete clusters even if an assignment is unclear. While there are metrics and visualization techniques that allow analysts to compare clusterings or to judge cluster quality, there is no comprehensive method that allows analysts to evaluate, compare, and refine cluster assignments based on the source data, derived scores, and contextual data. In this paper, we introduce a method that explicitly visualizes the quality of cluster assignments, allows comparisons of clustering results, and enables analysts to manually curate and refine cluster assignments. Our methods are applicable to matrix data clustered with partitional, hierarchical, and fuzzy clustering algorithms. Furthermore, we enable analysts to explore clustering results in the context of other data, for example, to observe whether a clustering of genomic data results in a meaningful differentiation in phenotypes. Our methods are integrated into Caleydo StratomeX, a popular, web-based, disease subtype analysis tool. We show in a usage scenario that our approach can reveal ambiguities in cluster assignments and produce improved clusterings that better differentiate genotypes and phenotypes.
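As a generic illustration of the kind of per-record assignment quality such a tool can surface (not the Caleydo StratomeX implementation), silhouette values flag ambiguous assignments:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_samples

X = np.random.default_rng(1).normal(size=(300, 20))   # stand-in data matrix
labels = KMeans(n_clusters=4, n_init=10, random_state=1).fit_predict(X)

# Per-record silhouette: values near zero or negative mark records whose
# cluster assignment is ambiguous and worth flagging for manual curation.
sil = silhouette_samples(X, labels)
ambiguous = np.where(sil < 0.05)[0]
print(f"{len(ambiguous)} of {len(X)} records have ambiguous assignments")
```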
Parmodel: a web server for automated comparative modeling of proteins.
Uchôa, Hugo Brandão; Jorge, Guilherme Eberhart; Freitas Da Silveira, Nelson José; Camera, João Carlos; Canduri, Fernanda; De Azevedo, Walter Filgueira
2004-12-24
Parmodel is a web server for automated comparative modeling and evaluation of protein structures. The aim of this tool is to help inexperienced users to perform modeling, assessment, visualization, and optimization of protein models, as well as to help crystallographers evaluate experimentally solved structures. It is subdivided into four modules: Parmodel Modeling, Parmodel Assessment, Parmodel Visualization, and Parmodel Optimization. The main module is Parmodel Modeling, which allows the building of several models for the same protein in a reduced time through the distribution of modeling processes on a Beowulf cluster. Parmodel automates and integrates the main software packages used in comparative modeling, such as MODELLER, Whatcheck, Procheck, Raster3D, Molscript, and Gromacs. This web server is freely accessible at .
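A single comparative-modeling run of the kind Parmodel distributes across the Beowulf cluster can be sketched with MODELLER's documented automodel class; the alignment file, template code, and target name below are placeholders, and Parmodel's own wrapping differs.

```python
# Minimal MODELLER run of the kind Parmodel fans out over cluster nodes;
# file names, template code, and sequence name are placeholders.
from modeller import environ
from modeller.automodel import automodel

env = environ()
a = automodel(env,
              alnfile='target-template.ali',  # target/template alignment (PIR)
              knowns='1abcA',                 # known template structure
              sequence='target')              # target sequence name
a.starting_model = 1
a.ending_model = 10                           # build several models per protein
a.make()
```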
On the topology of the world exchange arrangements web
NASA Astrophysics Data System (ADS)
Li, Xiang; Jin, Yu Ying; Chen, Guanrong
2004-11-01
Exchange arrangements among countries around the world are foundations of the world economy and generally stand behind day-to-day economic evolution. As the first study of the world exchange arrangements web (WEAW), we built a bipartite network with countries as one type of node and currencies as the other, and found it to have a prominent scale-free feature with a power-law degree distribution. In a further empirical study of the currency section of the WEAW, we calculated the clustering coefficients, average nearest-neighbors degree, and average shortest distance. As an essential economic network, the WEAW is found to be a correlated disassortative network with a hierarchical structure, possessing a more prominent scale-free feature than the world trade web (WTW).
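The reported measures are standard network statistics; below is a sketch with networkx on a toy country-currency bipartite graph, where the edges are hypothetical examples rather than the paper's data.

```python
import networkx as nx

# Toy bipartite graph: country -- currency arrangement (hypothetical edges).
B = nx.Graph()
B.add_edges_from([("Ecuador", "USD"), ("Panama", "USD"), ("Zimbabwe", "USD"),
                  ("Zimbabwe", "EUR"), ("France", "EUR"), ("Denmark", "EUR"),
                  ("Denmark", "DKK")])

# One-mode projection onto the currency section, weighted by shared countries.
currencies = {"USD", "EUR", "DKK"}
proj = nx.bipartite.weighted_projected_graph(B, currencies)

print("degree sequence:", sorted(d for _, d in B.degree()))
print("avg nearest-neighbor degree:", nx.average_neighbor_degree(proj))
print("clustering coefficients:", nx.clustering(proj))
print("avg shortest distance:", nx.average_shortest_path_length(proj))
```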
View of Arabella, one of the two Skylab 3 spiders used in experiment
NASA Technical Reports Server (NTRS)
1973-01-01
A close-up view of Arabella, one of the two Skylab 3 common cross spiders 'Araneus diadematus,' and the web it had spun in the zero gravity of space aboard the Skylab space station cluster in Earth orbit. This is a photographic reproduction made from a color television transmission aboard Skylab. Arabella and Anita were housed in an enclosure onto which a motion picture camera and a still camera were attached to record the spiders' attempts to build a web in the weightless environment.
Sun Protection Belief Clusters: Analysis of Amazon Mechanical Turk Data.
Santiago-Rivas, Marimer; Schnur, Julie B; Jandorf, Lina
2016-12-01
This study aimed (i) to determine whether people could be differentiated on the basis of their sun protection belief profiles and individual characteristics and (ii) to explore the use of a crowdsourcing web service for the assessment of sun protection beliefs. A sample of 500 adults completed an online survey of sun protection belief items using Amazon Mechanical Turk. A two-phase cluster analysis (i.e., hierarchical followed by non-hierarchical K-means) was used to determine clusters of sun protection barriers and facilitators. Results yielded three distinct clusters of sun protection barriers and three distinct clusters of sun protection facilitators. Significant associations between gender, age, sun sensitivity, and cluster membership were identified. Results also showed an association between barrier and facilitator cluster membership. The results of this study provide a potential alternative approach to developing future sun protection promotion initiatives. Findings add to our knowledge regarding individuals who support, oppose, or are ambivalent toward sun protection and inform intervention research by identifying distinct subtypes that may best benefit from (or have a higher need for) skin cancer prevention efforts.
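The two-phase procedure, hierarchical clustering to suggest the number of clusters followed by non-hierarchical K-means, is a standard recipe; the sketch below uses assumed data and an illustrative dendrogram cut height, not the authors' exact analysis.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.cluster import KMeans

X = np.random.default_rng(2).normal(size=(500, 12))  # stand-in survey items

# Phase 1: Ward hierarchical clustering to choose a cluster count
# (the cut height is illustrative; in practice inspect the dendrogram).
Z = linkage(X, method="ward")
k = len(set(fcluster(Z, t=40, criterion="distance")))

# Phase 2: non-hierarchical K-means with that k.
profiles = KMeans(n_clusters=k, n_init=10, random_state=2).fit_predict(X)
print("clusters found:", k, "sizes:", np.bincount(profiles))
```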
The eNanoMapper database for nanomaterial safety information.
Jeliazkova, Nina; Chomenidis, Charalampos; Doganis, Philip; Fadeel, Bengt; Grafström, Roland; Hardy, Barry; Hastings, Janna; Hegi, Markus; Jeliazkov, Vedrin; Kochev, Nikolay; Kohonen, Pekka; Munteanu, Cristian R; Sarimveis, Haralambos; Smeets, Bart; Sopasakis, Pantelis; Tsiliki, Georgia; Vorgrimmler, David; Willighagen, Egon
2015-01-01
The NanoSafety Cluster, a cluster of projects funded by the European Commission, identified the need for a computational infrastructure for toxicological data management of engineered nanomaterials (ENMs). Ontologies, open standards, and interoperable designs were envisioned to empower a harmonized approach to European research in nanotechnology. This setting provides a number of opportunities and challenges in the representation of nanomaterials data and the integration of ENM information originating from diverse systems. Within this cluster, eNanoMapper works towards supporting the collaborative safety assessment for ENMs by creating a modular and extensible infrastructure for data sharing, data analysis, and building computational toxicology models for ENMs. The eNanoMapper database solution builds on the previous experience of the consortium partners in supporting diverse data through flexible data storage, open source components and web services. We have recently described the design of the eNanoMapper prototype database along with a summary of challenges in the representation of ENM data and an extensive review of existing nano-related data models, databases, and nanomaterials-related entries in chemical and toxicogenomic databases. This paper continues with a focus on the database functionality exposed through its application programming interface (API), and its use in visualisation and modelling. Considering the preferred community practice of using spreadsheet templates, we developed a configurable spreadsheet parser facilitating user-friendly data preparation and data upload. We further present a web application able to retrieve the experimental data via the API and analyze it with multiple data preprocessing and machine learning algorithms. We demonstrate how the eNanoMapper database is used to import and publish online ENM and assay data from several data sources, how the "representational state transfer" (REST) API enables building user-friendly interfaces and graphical summaries of the data, and how these resources facilitate the modelling of reproducible quantitative structure-activity relationships for nanomaterials (NanoQSAR).
Data Handling and Communication
NASA Astrophysics Data System (ADS)
Hemmer, Frédéric; Innocenti, Pier Giorgio
The following sections are included: * Introduction * Computing Clusters and Data Storage: The New Factory and Warehouse * Local Area Networks: Organizing Interconnection * High-Speed Worldwide Networking: Accelerating Protocols * Detector Simulation: Events Before the Event * Data Analysis and Programming Environment: Distilling Information * World Wide Web: Global Networking * References
Bringing Control System User Interfaces to the Web
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Xihui; Kasemir, Kay
With the evolution of web-based technologies, especially HTML5 [1], it becomes possible to create web-based control system user interfaces (UI) that are cross-browser and cross-device compatible. This article describes two technologies that facilitate this goal. The first one is the WebOPI [2], which can seamlessly display CSS BOY [3] Operator Interfaces (OPI) in web browsers without modification to the original OPI file. The WebOPI leverages the powerful graphical editing capabilities of BOY and provides the convenience of re-using existing OPI files. On the other hand, it uses generic JavaScript and a generic communication mechanism between the web browser and the web server. It is not optimized for a control system, which results in unnecessary network traffic and resource usage. Our second technology is the WebSocket-based Process Data Access (WebPDA) [4]. It is a protocol that provides efficient control system data communication using WebSocket [5], so that users can create web-based control system UIs using standard web page technologies such as HTML, CSS and JavaScript. WebPDA is control system independent, potentially supporting any type of control system.
Ten billion years of brightest cluster galaxy alignments
NASA Astrophysics Data System (ADS)
West, Michael J.; de Propris, Roberto; Bremer, Malcolm N.; Phillipps, Steven
2017-07-01
A galaxy's orientation is one of its most basic observable properties. Astronomers once assumed that galaxies are randomly oriented in space; however, it is now clear that some have preferred orientations with respect to their surroundings. Chief among these are giant elliptical galaxies found in the centres of rich galaxy clusters. Numerous studies have shown that the major axes of these galaxies often share the same orientation as the surrounding matter distribution on larger scales. Using Hubble Space Telescope observations of 65 distant galaxy clusters, we show that similar alignments are seen at earlier epochs when the Universe was only one-third of its current age. These results suggest that the brightest galaxies in clusters are the product of a special formation history, one influenced by development of the cosmic web over billions of years.
Applying Web Usage Mining for Personalizing Hyperlinks in Web-Based Adaptive Educational Systems
ERIC Educational Resources Information Center
Romero, Cristobal; Ventura, Sebastian; Zafra, Amelia; de Bra, Paul
2009-01-01
Nowadays, the application of Web mining techniques in e-learning and Web-based adaptive educational systems is increasing exponentially. In this paper, we propose an advanced architecture for a personalization system to facilitate Web mining. A specific Web mining tool is developed and a recommender engine is integrated into the AHA! system in…
Convex Clustering: An Attractive Alternative to Hierarchical Clustering
Chen, Gary K.; Chi, Eric C.; Ranola, John Michael O.; Lange, Kenneth
2015-01-01
The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominant clustering method in bioinformatics. Biologists find the trees constructed by hierarchical clustering visually appealing and in tune with their evolutionary perspective. Hierarchical clustering operates on multiple scales simultaneously. This is essential, for instance, in transcriptome data, where one may be interested in making qualitative inferences about how lower-order relationships like gene modules lead to higher-order relationships like pathways or biological processes. The recently developed method of convex clustering preserves the visual appeal of hierarchical clustering while ameliorating its propensity to make false inferences in the presence of outliers and noise. The solution paths generated by convex clustering reveal relationships between clusters that are hidden by static methods such as k-means clustering. The current paper derives and tests a novel proximal distance algorithm for minimizing the objective function of convex clustering. The algorithm separates parameters, accommodates missing data, and supports prior information on relationships. Our program CONVEXCLUSTER incorporating the algorithm is implemented on ATI and nVidia graphics processing units (GPUs) for maximal speed. Several biological examples illustrate the strengths of convex clustering and the ability of the proximal distance algorithm to handle high-dimensional problems. CONVEXCLUSTER can be freely downloaded from the UCLA Human Genetics web site at http://www.genetics.ucla.edu/software/ PMID:25965340
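For reference, the objective minimized in convex clustering, as it is usually written in this literature (each data point x_i receives its own centroid u_i; γ is the fusion penalty and w_{ij} are pairwise weights):

```latex
% Standard convex clustering objective: as gamma grows, the fusion
% penalty drives centroids together, tracing a hierarchical solution path.
\[
  E_\gamma(U) \;=\; \tfrac{1}{2}\sum_{i=1}^{n}\lVert x_i - u_i\rVert_2^{2}
  \;+\; \gamma \sum_{i<j} w_{ij}\,\lVert u_i - u_j\rVert
\]
```

As γ increases, centroids fuse, which yields the solution paths between clusters that the abstract describes.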
The Atlas of Chinese World Wide Web Ecosystem Shaped by the Collective Attention Flows.
Lou, Xiaodan; Li, Yong; Gu, Weiwei; Zhang, Jiang
2016-01-01
The web can be regarded as an ecosystem of digital resources connected and shaped by the collective successive behaviors of users. Knowing how people allocate limited attention across different resources is of great importance. To answer this, we embed the most popular Chinese web sites into a high-dimensional Euclidean space based on the open flow network model of a large number of Chinese users' collective attention flows, which considers both the hyperlink topology connecting the sites and the collective behaviors of the users. With these tools, we rank the web sites and compare their centralities based on flow distances with other metrics. We also study the patterns of attention flow allocation, and find that a large number of web sites concentrate in the central area of the embedding space, while only a small fraction disperse in the periphery. The entire embedding space can be separated into 3 regions (core, interim, and periphery). The sites in the core (1%) occupy a majority of the attention flows (40%), the sites in the interim (34%) attract another 40%, whereas the remaining sites (65%) take only 20% of the flows. Furthermore, we clustered the web sites into 4 groups according to their positions in the space, and found that web sites similar in content and topic are grouped together. In short, by incorporating the open flow network model, we can clearly see how collective attention is allocated and flows across different web sites, and how web sites connect to each other.
Thematic clustering of text documents using an EM-based approach
2012-01-01
Clustering textual content is an important step in mining useful information on the web and other text-based resources. The common task in text clustering is to handle text in a multi-dimensional space and to partition documents into groups, where each group contains documents that are similar to each other. However, this strategy offers no comprehensible overview for humans, since it cannot explain the main subject of each cluster. Utilizing semantic information can solve this problem, but it requires a well-defined ontology or a pre-labeled gold standard set. In this paper, we present a thematic clustering algorithm for text documents. Given text, subject terms are extracted and used for clustering documents in a probabilistic framework. An EM approach is used to ensure documents are assigned to the correct subjects, hence it converges to a locally optimal solution. The proposed method is distinctive because its results are sufficiently explanatory for human understanding as well as efficient in clustering performance. The experimental results show that the proposed method provides a competitive performance compared to other state-of-the-art approaches. We also show that the extracted themes from the MEDLINE® dataset represent the subjects of clusters reasonably well. PMID:23046528
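A compact EM loop for a mixture of multinomials over extracted subject-term counts illustrates the probabilistic framework described; this is a generic sketch, not the paper's exact model or smoothing choices.

```python
import numpy as np

def multinomial_mixture_em(X, k, n_iter=100, seed=0):
    """EM for a mixture of multinomials over term counts X (docs x terms)."""
    rng = np.random.default_rng(seed)
    n, v = X.shape
    pi = np.full(k, 1.0 / k)                   # mixing weights
    theta = rng.dirichlet(np.ones(v), size=k)  # per-cluster term distributions
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        log_r = np.log(pi) + X @ np.log(theta).T
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update weights and smoothed term distributions.
        pi = r.mean(axis=0)
        theta = (r.T @ X) + 1e-3               # Laplace smoothing
        theta /= theta.sum(axis=1, keepdims=True)
    return r.argmax(axis=1), theta

# Toy term-count matrix: two documents per "subject".
X = np.array([[5, 1, 0, 0], [4, 2, 0, 1], [0, 0, 6, 2], [1, 0, 5, 3]])
labels, _ = multinomial_mixture_em(X, k=2)
print(labels)  # documents grouped by their dominant subject terms
```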
Special issue on cluster algebras in mathematical physics
NASA Astrophysics Data System (ADS)
Di Francesco, Philippe; Gekhtman, Michael; Kuniba, Atsuo; Yamazaki, Masahito
2014-02-01
This is a call for contributions to a special issue of Journal of Physics A: Mathematical and Theoretical dedicated to cluster algebras in mathematical physics. Over the ten years since their introduction by Fomin and Zelevinsky, the theory of cluster algebras has witnessed a spectacular growth, first and foremost due to the many links that have been discovered with a wide range of subjects in mathematics and, increasingly, theoretical and mathematical physics. The main motivation of this special issue is to gather together reviews, recent developments and open problems, mainly from a mathematical physics viewpoint, into a single comprehensive issue. We expect that such a special issue will become a valuable reference for the broad scientific community working in mathematical and theoretical physics. The issue will consist of invited review articles and contributed papers containing new results on the interplays of cluster algebras with mathematical physics. Editorial policy The Guest Editors for this issue are Philippe Di Francesco, Michael Gekhtman, Atsuo Kuniba and Masahito Yamazaki. The areas and topics for this issue include, but are not limited to: discrete integrable systems arising from cluster mutations; cluster structures on Poisson varieties; cluster algebras and soliton interactions; the cluster positivity conjecture; Y-systems in the thermodynamic Bethe ansatz and Zamolodchikov's periodicity conjecture; T-systems of transfer matrices of integrable lattice models; dilogarithm identities in conformal field theory; wall crossing in 4d N = 2 supersymmetric gauge theories; 4d N = 1 quiver gauge theories described by networks; scattering amplitudes of 4d N = 4 theories; 3d N = 2 gauge theories described by flat connections on 3-manifolds; integrability of dimer/Ising models on graphs. All contributions will be refereed and processed according to the usual procedure of the journal. Guidelines for preparation of contributions The deadline for contributed papers is 31 March 2014. This deadline will allow the special issue to appear at the end of 2014. There is no strict regulation on article size, but as a guide the preferable size is 15-30 pages for contributed papers and 40-60 pages for reviews. Further advice on publishing your work in Journal of Physics A may be found at iopscience.iop.org/jphysa. Contributions to the special issue should be submitted by web upload via ScholarOne Manuscripts, quoting 'JPhysA special issue on cluster algebras in mathematical physics'. Submissions should ideally be in standard LaTeX form. Please see the website for further information on electronic submissions. All contributions should be accompanied by a read-me file or covering letter giving the postal and e-mail addresses for correspondence. The Publishing Office should be notified of any subsequent change of address. The special issue will be published in the print and online versions of the journal.
Paradigms, Citations, and Maps of Science: A Personal History.
ERIC Educational Resources Information Center
Small, Henry
2003-01-01
Discusses mapping science and Kuhn's theories of paradigms and scientific development. Highlights include cocitation clustering; bibliometric definition of a paradigm; specialty dynamics; pathways through science; a new Web tool called Essential Science Indicators (ESI) for studying the structure of science; and microrevolutions. (Author/LRW)
Cloud CPFP: a shotgun proteomics data analysis pipeline using cloud and high performance computing.
Trudgian, David C; Mirzaei, Hamid
2012-12-07
We have extended the functionality of the Central Proteomics Facilities Pipeline (CPFP) to allow use of remote cloud and high performance computing (HPC) resources for shotgun proteomics data processing. CPFP has been modified to include modular local and remote scheduling for data processing jobs. The pipeline can now be run on a single PC or server, a local cluster, a remote HPC cluster, and/or the Amazon Web Services (AWS) cloud. We provide public images that allow easy deployment of CPFP in its entirety in the AWS cloud. This significantly reduces the effort necessary to use the software, and allows proteomics laboratories to pay for compute time ad hoc, rather than obtaining and maintaining expensive local server clusters. Alternatively the Amazon cloud can be used to increase the throughput of a local installation of CPFP as necessary. We demonstrate that cloud CPFP allows users to process data at higher speed than local installations but with similar cost and lower staff requirements. In addition to the computational improvements, the web interface to CPFP is simplified, and other functionalities are enhanced. The software is under active development at two leading institutions and continues to be released under an open-source license at http://cpfp.sourceforge.net.
Extracting Related Words from Anchor Text Clusters by Focusing on the Page Designer's Intention
NASA Astrophysics Data System (ADS)
Liu, Jianquan; Chen, Hanxiong; Furuse, Kazutaka; Ohbo, Nobuo
Approaches for extracting related words (terms) by co-occurrence sometimes work poorly. Two words frequently co-occurring in the same documents are considered related; however, they may not be related at all, sharing neither common meanings nor similar semantics. We address this problem by considering the page designer's intention and propose a new model to extract related words. Our approach is based on the idea that web page designers usually place related hyperlinks in a close zone on the browser. We developed a browser-based crawler to collect "geographically" near hyperlinks, and then, by clustering these hyperlinks based on their pixel coordinates, we extract related words that reflect the designer's intention well. Experimental results show that our method can represent the intention of the web page designer with extremely high precision. Moreover, the experiments indicate that our extraction method can obtain related words with high average precision.
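The clustering step can be illustrated with DBSCAN over anchor pixel coordinates; the coordinates, anchor texts, and eps value below are invented for illustration, and DBSCAN is a stand-in for whatever clustering procedure the authors actually used.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# (x, y) pixel coordinates of each anchor on the rendered page, plus its
# text; in the paper these come from a browser-based crawler.
coords = np.array([[120, 340], [128, 362], [131, 384], [700, 90]])
anchors = ["gene expression", "microarray", "RNA-seq", "contact us"]

labels = DBSCAN(eps=40, min_samples=2).fit_predict(coords)
for c in set(labels) - {-1}:          # -1 marks isolated links (noise)
    group = [a for a, l in zip(anchors, labels) if l == c]
    print(f"cluster {c}: related words = {group}")
```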
SpatialEpiApp: A Shiny web application for the analysis of spatial and spatio-temporal disease data.
Moraga, Paula
2017-11-01
In recent years, public health surveillance has been facilitated by several packages implementing statistical methods for the analysis of spatial and spatio-temporal disease data. However, these methods remain inaccessible for many researchers lacking the programming skills to effectively use the required software. In this paper we present SpatialEpiApp, a Shiny web application that integrates two of the most common approaches in health surveillance: disease mapping and detection of clusters. SpatialEpiApp is easy to use and does not require any programming knowledge. Given information about the cases, population and optionally covariates for each of the areas and dates of study, the application allows users to fit Bayesian models to obtain disease risk estimates and their uncertainty by using R-INLA, and to detect disease clusters by using SaTScan. The application supports user interaction and the creation of interactive data visualizations and reports showing the analyses performed.
Piquette, Noella A.; Savage, Robert S.; Abrami, Philip C.
2014-01-01
The present paper reports a cluster randomized controlled trial evaluation of teaching using ABRACADABRA (ABRA), an evidence-based and web-based literacy intervention (http://abralite.concordia.ca), with 107 kindergarten and 96 grade 1 children in 24 classes (12 intervention, 12 control) from all 12 elementary schools in one school district in Canada. Children in the intervention condition received 10–12 h of whole-class instruction using ABRA between pre- and post-test. Hierarchical linear modeling of post-test results showed significant gains in letter-sound knowledge for intervention classrooms over control classrooms. In addition, medium effect sizes favoring the intervention were evident for three of five outcome measures relative to regular teaching: letter-sound knowledge (d = +0.66), phonological blending (d = +0.52), and word reading (d = +0.52). It is concluded that regular teaching with ABRA technology adds significantly to literacy in the early elementary years. PMID:25538663
Embedded Web Technology: Applying World Wide Web Standards to Embedded Systems
NASA Technical Reports Server (NTRS)
Ponyik, Joseph G.; York, David W.
2002-01-01
Embedded Systems have traditionally been developed in a highly customized manner. The user interface hardware and software, along with the interface to the embedded system, are typically unique to the system for which they are built, resulting in extra cost to the system in terms of development time and maintenance effort. World Wide Web standards have been developed in the past ten years with the goal of allowing servers and clients to interoperate seamlessly. The client and server systems can consist of differing hardware and software platforms, but the World Wide Web standards allow them to interface without knowing the details of the system at the other end of the interface. Embedded Web Technology is the merging of Embedded Systems with the World Wide Web. Embedded Web Technology decreases the cost of developing and maintaining the user interface by allowing the user to interface to the embedded system through a web browser running on a standard personal computer. Embedded Web Technology can also be used to simplify an Embedded System's internal network.
Analysis and Visualization of Relations in eLearning
NASA Astrophysics Data System (ADS)
Dráždilová, Pavla; Obadi, Gamila; Slaninová, Kateřina; Martinovič, Jan; Snášel, Václav
The popularity of eLearning systems is growing rapidly; this growth is enabled by consecutive developments in Internet and multimedia technologies. Web-based education has become widespread in the past few years, and various types of learning management systems facilitate the development of Web-based courses. Users of these courses form social networks through the different activities they perform. This chapter focuses on searching for latent social networks in eLearning systems data. These data consist of students' activity records in which latent ties among actors are embedded. The social network studied in this chapter is represented by groups of students who have similar contacts and interact in similar social circles. Different methods of cluster analysis can be applied to these groups, and the findings show the existence of latent ties among group members. The second part of this chapter focuses on social network visualization. Graphical representation of a social network can describe its structure very efficiently. It can enable social network analysts to determine the network's degree of connectivity. Analysts can easily identify individuals with a small or large number of relationships, as well as the number of independent groups in a given network. When applied to the field of eLearning, data visualization simplifies the monitoring of the study activities of individuals or groups, as well as the planning of educational curricula, the evaluation of study processes, etc.
New insights into the classification and nomenclature of cortical GABAergic interneurons.
DeFelipe, Javier; López-Cruz, Pedro L; Benavides-Piccione, Ruth; Bielza, Concha; Larrañaga, Pedro; Anderson, Stewart; Burkhalter, Andreas; Cauli, Bruno; Fairén, Alfonso; Feldmeyer, Dirk; Fishell, Gord; Fitzpatrick, David; Freund, Tamás F; González-Burgos, Guillermo; Hestrin, Shaul; Hill, Sean; Hof, Patrick R; Huang, Josh; Jones, Edward G; Kawaguchi, Yasuo; Kisvárday, Zoltán; Kubota, Yoshiyuki; Lewis, David A; Marín, Oscar; Markram, Henry; McBain, Chris J; Meyer, Hanno S; Monyer, Hannah; Nelson, Sacha B; Rockland, Kathleen; Rossier, Jean; Rubenstein, John L R; Rudy, Bernardo; Scanziani, Massimo; Shepherd, Gordon M; Sherwood, Chet C; Staiger, Jochen F; Tamás, Gábor; Thomson, Alex; Wang, Yun; Yuste, Rafael; Ascoli, Giorgio A
2013-03-01
A systematic classification and accepted nomenclature of neuron types is much needed but is currently lacking. This article describes a possible taxonomical solution for classifying GABAergic interneurons of the cerebral cortex based on a novel, web-based interactive system that allows experts to classify neurons with pre-determined criteria. Using Bayesian analysis and clustering algorithms on the resulting data, we investigated the suitability of several anatomical terms and neuron names for cortical GABAergic interneurons. Moreover, we show that supervised classification models could automatically categorize interneurons in agreement with experts' assignments. These results demonstrate a practical and objective approach to the naming, characterization and classification of neurons based on community consensus.
CyanoClust: comparative genome resources of cyanobacteria and plastids.
Sasaki, Naobumi V; Sato, Naoki
2010-01-01
Cyanobacteria, which perform oxygen-evolving photosynthesis as do chloroplasts of plants and algae, are one of the best-studied prokaryotic phyla and one from which many representative genomes have been sequenced. Lack of a suitable comparative genomic database has been a problem in cyanobacterial genomics because many proteins involved in physiological functions such as photosynthesis and nitrogen fixation are not catalogued in commonly used databases, such as Clusters of Orthologous Proteins (COG). CyanoClust is a database of homolog groups in cyanobacteria and plastids that are produced by the program Gclust. We have developed a web-server system for the protein homology database featuring cyanobacteria and plastids. Database URL: http://cyanoclust.c.u-tokyo.ac.jp/.
DelPhiPKa web server: predicting pKa of proteins, RNAs and DNAs.
Wang, Lin; Zhang, Min; Alexov, Emil
2016-02-15
A new pKa prediction web server is released, which implements the DelPhi Gaussian dielectric function to calculate electrostatic potentials generated by the charges of biomolecules. Topology parameters are extended to include atomic information for the nucleotides of RNA and DNA, which extends the capability of pKa calculations beyond proteins. The web server allows the end-user to protonate the biomolecule at a particular pH based on calculated pKa values and provides a downloadable file in PQR format. Several tests are performed to benchmark the accuracy and speed of the protocol. The web server follows a client-server architecture built on PHP and HTML and utilizes the DelPhiPKa program. The computation is performed on the Palmetto supercomputer cluster and results/download links are returned to the end-user via the HTTP protocol. The web server takes advantage of the MPI parallel implementation in DelPhiPKa and can run a single job on up to 24 CPUs. The DelPhiPKa web server is available at http://compbio.clemson.edu/pka_webserver.
Peels, Denise A; van Stralen, Maartje M; Bolman, Catherine; Golsteijn, Rianne Hj; de Vries, Hein; Mudde, Aart N; Lechner, Lilian
2012-03-02
The Active Plus project is a systematically developed theory- and evidence-based, computer-tailored intervention, which was found to be effective in changing physical activity behavior in people aged over 50 years. The process and effect outcomes of the first version of the Active Plus project were translated into an adapted intervention using the RE-AIM framework. The RE-AIM model is often used to evaluate the potential public health impact of an intervention and distinguishes five dimensions: reach, effectiveness, adoption, implementation, and maintenance. To gain insight into the systematic translation of the first print-delivered version of the Active Plus project into an adapted (Web-based) follow-up project. The focus of this study was on the reach and effectiveness dimensions, since these dimensions are most influenced by the results from the original Active Plus project. We optimized the potential reach and effect of the interventions by extending the delivery mode of the print-delivered intervention into an additional Web-based intervention. The interventions were adapted based on results of the process evaluation, analyses of effects within subgroups, and evaluation of the working mechanisms of the original intervention. We pretested the new intervention materials and the Web-based versions of the interventions. Subsequently, the new intervention conditions were implemented in a clustered randomized controlled trial. Adaptations resulted in four improved tailoring interventions: (1) a basic print-delivered intervention, (2) a basic Web-based intervention, (3) a print-delivered intervention with an additional environmental component, and (4) a Web-based version with an additional environmental component. Pretest results with participants showed that all new intervention materials had modest usability and relatively high appreciation, and that filling in an online questionnaire and performing the online tasks was not problematic. We used the pretest results to improve the usability of the different interventions. Implementation of the new interventions in a clustered randomized controlled trial showed that the print-delivered interventions had a higher response rate than the Web-based interventions. Participants of both low and high socioeconomic status were reached by both print-delivered and Web-based interventions. Translation of the (process) evaluation of an effective intervention into an adapted intervention is challenging and rarely reported. We discuss several major lessons learned from our experience. Nederlands Trial Register (NTR): 2297; http://www.trialregister.nl/trialreg/admin/rctview.asp?TC=2297 (Archived by WebCite at http://www.webcitation.org/65TkwoESp).
Liere, Heidi; Jackson, Doug; Vandermeer, John
2012-01-01
Background Spatial heterogeneity is essential for the persistence of many inherently unstable systems such as predator-prey and parasitoid-host interactions. Since biological interactions themselves can create heterogeneity in space, the heterogeneity necessary for the persistence of an unstable system could be the result of local interactions involving elements of the unstable system itself. Methodology/Principal Findings Here we report on a predatory ladybird beetle whose natural history suggests that the beetle requires the patchy distribution of the mutualism between its prey, the green coffee scale, and the arboreal ant, Azteca instabilis. Based on known ecological interactions and the natural history of the system, we constructed a spatially-explicit model and showed that the clustered spatial pattern of ant nests facilitates the persistence of the beetle populations. Furthermore, we show that the dynamics of the beetle consuming the scale insects can cause the clustered distribution of the mutualistic ants in the first place. Conclusions/Significance From a theoretical point of view, our model represents a novel situation in which a predator indirectly causes a spatial pattern of an organism other than its prey, and in doing so facilitates its own persistence. From a practical point of view, it is noteworthy that one of the elements in the system is a persistent pest of coffee, an important world commodity. This pest, we argue, is kept within limits of control through a complex web of ecological interactions that involves the emergent spatial pattern. PMID:23029061
Resource Management Scheme Based on Ubiquitous Data Analysis
Lee, Heung Ki; Jung, Jaehee
2014-01-01
Resource management of the main memory and process handler is critical to enhancing the system performance of a web server. Owing to the transaction delay affecting incoming requests from web clients, web server systems pre-spawn several web processes in anticipation of future requests. This decreases page generation time because enough processes are available to handle the incoming requests from web browsers. However, inefficient process management results in low service quality for the web server system, so proper pre-generated process mechanisms are required for dealing with clients' requests. Unfortunately, it is difficult to predict how many requests a web server system is going to receive. If a web server system builds too many web processes, it wastes a considerable amount of memory, and performance is reduced. We propose an adaptive web process manager scheme based on the analysis of web log mining. In the proposed scheme, the number of web processes is controlled through prediction of incoming requests, so that the web process management scheme consumes the fewest possible web transaction resources. In experiments, real web trace data were used to demonstrate the improved performance of the proposed scheme. PMID:25197692
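A toy sketch of the adaptive idea, sizing a pre-forked worker pool from a short-term forecast of request arrivals, is shown below; the moving-average predictor and all constants are illustrative assumptions, not the paper's log-mining scheme.

```python
# Illustrative pool sizing from predicted load; constants are hypothetical.
from collections import deque

REQS_PER_WORKER = 50      # assumed capacity of one pre-spawned web process
MIN_W, MAX_W = 4, 64      # bounds on the worker pool size

window = deque(maxlen=5)  # requests observed over the last 5 intervals

def workers_needed(observed_requests: int) -> int:
    window.append(observed_requests)
    predicted = sum(window) / len(window)      # moving-average forecast
    target = int(predicted / REQS_PER_WORKER) + 1
    return max(MIN_W, min(MAX_W, target))

for load in [120, 300, 800, 2400, 900, 200]:
    print(load, "requests ->", workers_needed(load), "workers")
```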
Content-based image retrieval with ontological ranking
NASA Astrophysics Data System (ADS)
Tsai, Shen-Fu; Tsai, Min-Hsuan; Huang, Thomas S.
2010-02-01
Images are a much more powerful medium of expression than text, as the adage says: "One picture is worth a thousand words." This is because, compared with text consisting of an array of words, an image has more degrees of freedom and therefore a more complicated structure. However, the less constrained structure of images presents researchers in the computer vision community with the tough task of teaching machines to understand and organize images, especially when only a limited number of learning examples and little background knowledge are given. The advance of internet and web technology in the past decade has changed the way humans gain knowledge. People can now exchange knowledge with others by discussing and contributing information on the web. As a result, web pages on the internet have become a living and growing source of information. One is therefore tempted to wonder whether machines can learn from this web knowledge base as well. Indeed, it is possible to make computers learn from the internet and provide humans with more meaningful knowledge. In this work, we explore this novel possibility for image understanding applied to semantic image search. We exploit web resources to obtain links from images to keywords and a semantic ontology constituting humans' general knowledge. The former maps visual content to related text, in contrast to the traditional way of associating images with surrounding text; the latter provides relations between concepts for machines to understand to what extent and in what sense an image is close to the image search query. With the aid of these two tools, the resulting image search system is content-based and, moreover, organized. The returned images are ranked and organized such that semantically similar images are grouped together and given a rank based on the semantic closeness to the input query. The novelty of the system is twofold: first, images are retrieved not only based on text cues but on their actual contents as well; second, the grouping is different from pure visual similarity clustering. More specifically, the inferred concepts of each image in the group are examined in the context of a huge concept ontology to determine their true relations with what people have in mind when doing image search.
The Food Web of Potter Cove (Antarctica): complexity, structure and function
NASA Astrophysics Data System (ADS)
Marina, Tomás I.; Salinas, Vanesa; Cordone, Georgina; Campana, Gabriela; Moreira, Eugenia; Deregibus, Dolores; Torre, Luciana; Sahade, Ricardo; Tatián, Marcos; Barrera Oro, Esteban; De Troch, Marleen; Doyle, Santiago; Quartino, María Liliana; Saravia, Leonardo A.; Momo, Fernando R.
2018-01-01
Knowledge of food web structure and complexity is central to a better understanding of ecosystem functioning. A food-web approach includes both species and the energy flows among them, providing a natural framework for characterizing species' ecological roles and the mechanisms through which biodiversity influences ecosystem dynamics. Here we present for the first time a high-resolution food web for a marine ecosystem at Potter Cove (northern Antarctic Peninsula). Eleven food web properties were analyzed in order to document network complexity, structure and topology. We found a low linkage density (3.4), connectance (0.04) and omnivory percentage (45), as well as a short path length (1.8) and a low clustering coefficient (0.08). Furthermore, relating the structure of the food web to its dynamics, an exponential degree distribution (in- and out-links) was found. This suggests that the Potter Cove food web may be vulnerable if the most connected species became locally extinct. For two of the three most connected functional groups, competition overlap graphs imply high trophic interaction between demersal fish and niche specialization according to feeding strategies in amphipods. On the other hand, the prey overlap graph also shows that multiple energy pathways of carbon flux exist across benthic and pelagic habitats in the Potter Cove ecosystem. Although alternative food sources might add robustness to the web, network properties (low linkage density, connectance and omnivory) suggest fragility and potential trophic cascade effects.
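Several of the listed properties are simple functions of the species count S and link count L; the sketch below computes them with networkx on a toy web (the toy edges are not the Potter Cove data, where linkage density is 3.4 and connectance 0.04).

```python
import networkx as nx

# Toy directed food web: edges point from prey to predator (illustrative).
G = nx.DiGraph()
G.add_edges_from([("diatoms", "amphipods"), ("amphipods", "demersal fish"),
                  ("macroalgae", "amphipods"), ("diatoms", "krill"),
                  ("krill", "demersal fish")])

S, L = G.number_of_nodes(), G.number_of_edges()
U = G.to_undirected()
print("linkage density L/S:", L / S)
print("connectance L/S^2:", L / S**2)
print("clustering coefficient:", nx.average_clustering(U))
print("characteristic path length:", nx.average_shortest_path_length(U))
```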
FloVis: Leveraging Visualization to Protect Sensitive Network Infrastructure
2010-11-01
In other words, we are clustering the hourly web surfing patterns of users on a small private network. The data in this case is filtered NetFlow records.
COGcollator: a web server for analysis of distant relationships between homologous protein families.
Dibrova, Daria V; Konovalov, Kirill A; Perekhvatov, Vadim V; Skulachev, Konstantin V; Mulkidjanian, Armen Y
2017-11-29
The Clusters of Orthologous Groups (COGs) of proteins systematize evolutionarily related proteins into specific groups with similar functions. However, the available databases do not provide means to assess the extent of similarity between COGs. We aimed to provide a method for the identification and visualization of evolutionary relationships between COGs, as well as a corresponding web server. Here we introduce COGcollator, a web tool for the identification of evolutionarily related COGs and their further analysis. We demonstrate the utility of this tool by identifying the COGs that contain distant homologs of (i) the catalytic subunit of bacterial rotary membrane ATP synthases and (ii) the DNA/RNA helicases of superfamily 1. This article was reviewed by Drs. Igor N. Berezovsky, Igor Zhulin and Yuri Wolf.
A brief introduction to web-based genome browsers.
Wang, Jun; Kong, Lei; Gao, Ge; Luo, Jingchu
2013-03-01
A genome browser provides a graphical interface for users to browse, search, retrieve and analyze genomic sequence and annotation data. Web-based genome browsers can be classified into general genome browsers supporting multiple species and species-specific genome browsers. In this review, we attempt to give an overview of the main functions and features of web-based genome browsers, covering data visualization, retrieval, analysis and customization. To introduce the multiple-species genome browsers, we describe the user interface and main functions of the Ensembl and UCSC genome browsers, using the human alpha-globin gene cluster as an example. We further use the MSU and Rice-Map genome browsers to show some special features of species-specific genome browsers, taking the rice transcription factor gene OsSPL14 as an example.
Duan, Qiaonan; Flynn, Corey; Niepel, Mario; Hafner, Marc; Muhlich, Jeremy L; Fernandez, Nicolas F; Rouillard, Andrew D; Tan, Christopher M; Chen, Edward Y; Golub, Todd R; Sorger, Peter K; Subramanian, Aravind; Ma'ayan, Avi
2014-07-01
For the Library of Integrated Network-based Cellular Signatures (LINCS) project, many gene expression signatures have been produced using the L1000 technology, a cost-effective method to profile gene expression at large scale. LINCS Canvas Browser (LCB) is an interactive HTML5 web-based software application that facilitates querying, browsing and interrogating much of the currently available LINCS L1000 data. LCB implements two compacted layered canvases, one to visualize clustered L1000 expression data, and the other to display enrichment analysis results using 30 different gene-set libraries. Clicking on an experimental condition highlights gene sets enriched for the differentially expressed genes from the selected experiment. A search interface allows users to input gene lists and query them against over 100 000 conditions to find the top matching experiments. The tool integrates many resources, offering unprecedented potential for new discoveries in systems biology and systems pharmacology. The LCB application is available at http://www.maayanlab.net/LINCS/LCB. Customized versions will be made part of the http://lincscloud.org and http://lincs.hms.harvard.edu websites. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
NASA Technical Reports Server (NTRS)
Nelson, Michael L.; Maly, Kurt; Shen, Stewart N. T.
1997-01-01
In this paper we describe NCSTRL+, a unified, canonical digital library for scientific and technical information (STI). NCSTRL+ is based on the Networked Computer Science Technical Report Library (NCSTRL), a World Wide Web (WWW) accessible digital library (DL) that provides access to over 80 university departments and laboratories. NCSTRL+ implements two new technologies: cluster functionality and publishing "buckets." We have extended the Dienst protocol, the protocol underlying NCSTRL, to provide the ability to "cluster" independent collections into a logically centralized digital library based upon subject category classification, type of organization, and genres of material. The concept of "buckets" provides a mechanism for publishing and managing logically linked entities with multiple data formats. The NCSTRL+ prototype DL contains the holdings of NCSTRL and the NASA Technical Report Server (NTRS). The prototype demonstrates the feasibility of publishing into a multi-cluster DL, searching across clusters, and storing and presenting buckets of information. We show that the overhead for these additional capabilities is minimal to both the author and the user when compared to the equivalent process within NCSTRL.
van Engen-Verheul, Mariëtte M; de Keizer, Nicolette F; van der Veer, Sabine N; Kemps, Hareld M C; Scholte op Reimer, Wilma J M; Jaspers, Monique W M; Peek, Niels
2014-12-31
Implementation of clinical practice guidelines in daily care is hampered by a variety of barriers related to professional knowledge and collaboration in teams and organizations. To improve guideline concordance by changing the clinical decision-making behavior of professionals, computerized decision support (CDS) has been shown to be one of the most effective instruments. However, to address barriers at the organizational level, additional interventions are needed. Continuous monitoring and systematic improvement of quality are increasingly used to achieve change at this level in complex health care systems. The study aims to assess the effectiveness of a web-based quality improvement (QI) system with indicator-based performance feedback and educational outreach visits to overcome organizational barriers to guideline concordance in multidisciplinary teams in the field of cardiac rehabilitation (CR). A multicenter cluster-randomized trial with a balanced incomplete block design will be conducted in 18 Dutch CR clinics using an electronic patient record with CDS at the point of care. The intervention consists of (i) periodic performance feedback on quality indicators for CR and (ii) educational outreach visits to support local multidisciplinary QI teams in systematically improving the care they provide. The intervention is supported by a web-based system which provides an overview of the feedback and facilitates the development and monitoring of local QI plans. The primary outcome will be concordance with national CR guidelines with respect to the CR needs assessment and therapy indication procedure. Secondary outcomes are changes in the performance of CR clinics as measured by structure, process and outcome indicators, and changes in practice variation on these indicators. We will also conduct a qualitative process evaluation (concept-mapping methodology) to assess experiences from participating CR clinics and to gain insight into factors which influence the implementation of the intervention. To our knowledge, this will be the first study to evaluate the effect of providing performance feedback with a web-based system that incorporates underlying QI concepts. The results may contribute to improving CR in the Netherlands, increasing knowledge on facilitators of guideline implementation in multidisciplinary health care teams and identifying success factors of multifaceted feedback interventions. NTR3251.
Distance-Learning, ADHD Quality Improvement in Primary Care: A Cluster-Randomized Trial.
Fiks, Alexander G; Mayne, Stephanie L; Michel, Jeremy J; Miller, Jeffrey; Abraham, Manju; Suh, Andrew; Jawad, Abbas F; Guevara, James P; Grundmeier, Robert W; Blum, Nathan J; Power, Thomas J
2017-10-01
To evaluate a distance-learning, quality improvement intervention to improve pediatric primary care provider use of attention-deficit/hyperactivity disorder (ADHD) rating scales. Primary care practices were cluster randomized to a 3-part distance-learning, quality improvement intervention (web-based education, collaborative consultation with ADHD experts, and performance feedback reports/calls), qualifying for Maintenance of Certification (MOC) Part IV credit, or wait-list control. We compared changes relative to a baseline period in rating scale use by study arm using logistic regression clustered by practice (primary analysis) and examined effect modification by level of clinician participation. An electronic health record-linked system for gathering ADHD rating scales from parents and teachers was implemented before the intervention period at all sites. Rating scale use was ascertained by manual chart review. One hundred five clinicians at 19 sites participated. Differences between arms were not significant. From the baseline to intervention period and after implementation of the electronic system, clinicians in both study arms were significantly more likely to administer and receive parent and teacher rating scales. Among intervention clinicians, those who participated in at least 1 feedback call or qualified for MOC credit were more likely to give parents rating scales with differences of 14.2 (95% confidence interval [CI], 0.6-27.7) and 18.8 (95% CI, 1.9-35.7) percentage points, respectively. A 3-part clinician-focused distance-learning, quality improvement intervention did not improve rating scale use. Complementary strategies that support workflows and more fully engage clinicians may be needed to bolster care. Electronic systems that gather rating scales may help achieve this goal. Index terms: ADHD, primary care, quality improvement, clinical decision support.
Gifford, Elizabeth V; Tavakoli, Sara; Weingardt, Kenneth R; Finney, John W; Pierson, Heather M; Rosen, Craig S; Hagedorn, Hildi J; Cook, Joan M; Curran, Geoff M
2012-01-01
Evidence-based psychological treatments (EBPTs) are clusters of interventions, but it is unclear how providers actually implement these clusters in practice. A disaggregated measure of EBPTs was developed to characterize clinicians' component-level evidence-based practices and to examine relationships among these practices. Survey items captured components of evidence-based treatments based on treatment integrity measures. The Web-based survey was conducted with 75 U.S. Department of Veterans Affairs (VA) substance use disorder (SUD) practitioners and 149 non-VA community-based SUD practitioners. Clinicians' self-designated treatment orientations were positively related to their endorsement of those EBPT components; however, clinicians used components from a variety of EBPTs. Hierarchical cluster analysis indicated that clinicians combined and organized interventions from cognitive-behavioral therapy, the community reinforcement approach, motivational interviewing, structured family and couples therapy, 12-step facilitation, and contingency management into clusters including empathy and support, treatment engagement and activation, abstinence initiation, and recovery maintenance. Understanding how clinicians use EBPT components may lead to improved evidence-based practice dissemination and implementation. Published by Elsevier Inc.
NASA Astrophysics Data System (ADS)
Steinberg, P. D.; Bednar, J. A.; Rudiger, P.; Stevens, J. L. R.; Ball, C. E.; Christensen, S. D.; Pothina, D.
2017-12-01
The rich variety of software libraries available in the Python scientific ecosystem provides a flexible and powerful alternative to traditional integrated GIS (geographic information system) programs. Each such library focuses on doing a certain set of general-purpose tasks well, and Python makes it relatively simple to glue the libraries together to solve a wide range of complex, open-ended problems in Earth science. However, choosing an appropriate set of libraries can be challenging, and it is difficult to predict how much "glue code" will be needed for any particular combination of libraries and tasks. Here we present a set of libraries that have been designed to work well together to build interactive analyses and visualizations of large geographic datasets, in standard web browsers. The resulting workflows run on ordinary laptops even for billions of data points, and easily scale up to larger compute clusters when available. The declarative top-level interface used in these libraries means that even complex, fully interactive applications can be built and deployed as web services using only a few dozen lines of code, making it simple to create and share custom interactive applications even for datasets too large for most traditional GIS systems. The libraries we will cover include GeoViews (HoloViews extended for geographic applications) for declaring visualizable/plottable objects, Bokeh for building visual web applications from GeoViews objects, Datashader for rendering arbitrarily large datasets faithfully as fixed-size images, Param for specifying user-modifiable parameters that model your domain, Xarray for computing with n-dimensional array data, Dask for flexibly dispatching computational tasks across processors, and Numba for compiling array-based Python code down to fast machine code. We will show how to use the resulting workflow with static datasets and with simulators such as GSSHA or AdH, allowing you to deploy flexible, high-performance web-based dashboards for your GIS data or simulations without needing major investments in code development or maintenance.
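As a minimal sketch of the kind of "glue code" this stack enables, the following assumes a set of CSV files with 'lon' and 'lat' columns (hypothetical names) and renders a large point dataset with Dask and Datashader; it illustrates the workflow the abstract describes, not code from the paper.

    # Sketch: render an arbitrarily large point dataset to a fixed-size image,
    # with Dask providing out-of-core parallelism and Datashader the rasterization.
    import dask.dataframe as dd
    import datashader as ds
    import datashader.transfer_functions as tf

    # Hypothetical input: CSV files of points with 'lon' and 'lat' columns.
    points = dd.read_csv("points-*.csv", usecols=["lon", "lat"])

    canvas = ds.Canvas(plot_width=800, plot_height=600)
    agg = canvas.points(points, "lon", "lat")   # per-pixel point counts
    img = tf.shade(agg, how="log")              # log-scaled shading
    tf.set_background(img, "black").to_pil().save("density.png")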
Accounting Data to Web Interface Using PERL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hargeaves, C
2001-08-13
This document will explain the process to create a web interface for the accounting information generated by the High Performance Storage System (HPSS) accounting report feature. The accounting report contains useful data but it is not easily accessed in a meaningful way. The accounting report is the only way to see summarized storage usage information. The first step is to take the accounting data, make it meaningful and store the modified data in persistent databases. The second step is to generate the various user interfaces, HTML pages, that will be used to access the data. The third step is to transfer all required files to the web server. The web pages pass parameters to Common Gateway Interface (CGI) scripts that generate dynamic web pages and graphs. The end result is a web page with specific information presented in text with or without graphs. The accounting report has a specific format that allows the use of regular expressions to verify whether a line is storage data. Each storage data line is stored in a detailed database file with a name that includes the run date. The detailed database is used to create a summarized database file that also uses the run date in its name. The summarized database is used to create the group.html web page that includes a list of all storage users. Scripts that query the database folder to build a list of available databases generate two additional web pages. A master script, run monthly as part of a cron job after the accounting report has completed, manages all of these individual scripts. All scripts are written in the PERL programming language. Whenever possible, data manipulation scripts are written as filters. All scripts are written to be single source, which means they will function properly on both the open and closed networks at LLNL. The master script handles the command line inputs for all scripts, file transfers to the web server and records run information in a log file. The rest of the scripts manipulate the accounting data or use the files created to generate HTML pages. Each script will be described in detail herein. The following is a brief description of HPSS taken directly from an HPSS web site: "HPSS is a major development project, which began in 1993 as a Cooperative Research and Development Agreement (CRADA) between government and industry. The primary objective of HPSS is to move very large data objects between high performance computers, workstation clusters, and storage libraries at speeds many times faster than is possible with today's software systems. For example, HPSS can manage parallel data transfers from multiple network-connected disk arrays at rates greater than 1 Gbyte per second, making it possible to access high definition digitized video in real time." The HPSS accounting report is a canned report whose format is controlled by the HPSS developers.
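The report does not include its scripts; purely as a hedged illustration of the regex-filter step (written in Python rather than the original PERL), the sketch below keeps only storage-data lines and appends them to a detail database file named with the run date. The line format and field names are hypothetical.

    # Sketch (Python, standing in for the original PERL): keep only storage-data
    # lines from the accounting report and append them to a dated detail database.
    import re
    from datetime import date

    # Hypothetical record layout: account id, storage class, bytes used.
    STORAGE_LINE = re.compile(r"^\s*(\w+)\s+(\w+)\s+(\d+)\s*$")

    def filter_report(report_path):
        detail_db = f"detail.{date.today():%Y%m%d}.db"  # run date in the file name
        with open(report_path) as report, open(detail_db, "a") as out:
            for line in report:
                m = STORAGE_LINE.match(line)
                if m:  # the regular expression verifies the line is storage data
                    account, storage_class, nbytes = m.groups()
                    out.write(f"{account}|{storage_class}|{nbytes}\n")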
Crawling the Cosmic Web: An Exploration of Filamentary Structure
NASA Astrophysics Data System (ADS)
Bond, Nicholas A.; Strauss, M. A.; Cen, R.
2006-12-01
By analyzing the smoothed density field and its derivatives on a variety of scales, we can select strands from the cosmic web in a way which is consistent with our common sense understanding of a "filament". We present results from a two- and three-dimensional filament finder, run on both CDM simulations and a section of the SDSS spectroscopic sample. In both data sets, we will analyze the length and width distribution of filamentary structure and discuss its relation to galaxy clusters. Sources of contamination and error, such as "fingers of god", will also be addressed.
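The abstract does not give the finder's algorithm; one standard way to select filaments from a smoothed density field and its derivatives is to classify the eigenvalues of the field's Hessian. The sketch below is a generic illustration under that assumption, for a gridded 3-D density array, with illustrative thresholds.

    # Sketch: flag filament-like cells from a 3-D density grid. Two strongly
    # negative Hessian eigenvalues plus one near zero suggest a filament
    # (collapse in two directions, extension along the third).
    import numpy as np
    from scipy.ndimage import gaussian_filter

    def filament_mask(density, sigma=2.0):
        smoothed = gaussian_filter(density, sigma)
        grads = np.gradient(smoothed)
        # Hessian: second derivatives of the smoothed field.
        hess = np.empty(smoothed.shape + (3, 3))
        for i in range(3):
            for j in range(3):
                hess[..., i, j] = np.gradient(grads[i], axis=j)
        eig = np.sort(np.linalg.eigvalsh(hess), axis=-1)  # ascending eigenvalues
        # Illustrative thresholds: two collapsing axes, one roughly flat axis.
        return (eig[..., 0] < 0) & (eig[..., 1] < 0) & \
               (np.abs(eig[..., 2]) < 0.1 * np.abs(eig[..., 0]))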
A Multi-Discipline, Multi-Genre Digital Library for Research and Education
NASA Technical Reports Server (NTRS)
Nelson, Michael L.; Maly, Kurt; Shen, Stewart N. T.
2004-01-01
We describe NCSTRL+, a unified, canonical digital library for educational and scientific and technical information (STI). NCSTRL+ is based on the Networked Computer Science Technical Report Library (NCSTRL), a World Wide Web (WWW) accessible digital library (DL) that provides access to over 100 university departments and laboratories. NCSTRL+ implements two new technologies: cluster functionality and publishing "buckets". We have extended the Dienst protocol, the protocol underlying NCSTRL, to provide the ability to "cluster" independent collections into a logically centralized digital library based upon subject category classification, type of organization, and genres of material. The concept of "buckets" provides a mechanism for publishing and managing logically linked entities with multiple data formats. The NCSTRL+ prototype DL contains the holdings of NCSTRL and the NASA Technical Report Server (NTRS). The prototype demonstrates the feasibility of publishing into a multi-cluster DL, searching across clusters, and storing and presenting buckets of information.
A Web-Based Development Environment for Collaborative Data Analysis
NASA Astrophysics Data System (ADS)
Erdmann, M.; Fischer, R.; Glaser, C.; Klingebiel, D.; Komm, M.; Müller, G.; Rieger, M.; Steggemann, J.; Urban, M.; Winchen, T.
2014-06-01
Visual Physics Analysis (VISPA) is a web-based development environment addressing high energy and astroparticle physics. It covers the entire analysis spectrum from the design and validation phase to the execution of analyses and the visualization of results. VISPA provides a graphical steering of the analysis flow, which consists of self-written, re-usable Python and C++ modules for more demanding tasks. All common operating systems are supported since a standard internet browser is the only software requirement for users. Even access via mobile and touch-compatible devices is possible. In this contribution, we present the most recent developments of our web application concerning technical, state-of-the-art approaches as well as practical experiences. One of the key features is the use of workspaces, i.e. user-configurable connections to remote machines supplying resources and local file access. Thereby, workspaces enable the management of data, computing resources (e.g. remote clusters or computing grids), and additional software either centralized or individually. We further report on the results of an application with more than 100 third-year students using VISPA for their regular particle physics exercises during the winter term 2012/13. Besides the ambition to support and simplify the development cycle of physics analyses, new use cases such as fast, location-independent status queries, the validation of results, and the ability to share analyses within worldwide collaborations with a single click become conceivable.
Low-energy collisions of helium clusters with size-selected cobalt cluster ions
NASA Astrophysics Data System (ADS)
Odaka, Hideho; Ichihashi, Masahiko
2017-04-01
Collisions of helium clusters with size-selected cobalt cluster ions, Co_m^+ (m ≤ 5), were studied experimentally by using a merging beam technique. The product ions, Co_m^+He_n (cluster complexes), were mass-analyzed, and this result indicates that more than 20 helium atoms can be attached onto Co_m^+ at relative velocities of 10^3 m/s. The measured size distributions of the cluster complexes indicate that there are relatively stable complexes: Co_2^+He_n (n = 2, 4, 6, and 12), Co_3^+He_n (n = 3, 6), Co_4^+He_4, and Co_5^+He_n (n = 3, 6, 8, and 10). These stabilities are explained in terms of their geometric structures. The yields of the cluster complexes were also measured as a function of the relative velocity (1 × 10^2 to 4 × 10^3 m/s), and this result demonstrates that the main interaction in the collision process changes, with increasing collision energy, from the electrostatic interaction, which includes the induced deformation of He_N, to the hard-sphere interaction. Supplementary material in the form of one pdf file is available from the Journal web page at https://doi.org/10.1140/epjd/e2017-80015-0
SCIMITAR: Scalable Stream-Processing for Sensor Information Brokering
2013-11-01
(IaaS) cloud frameworks including Amazon Web Services and Eucalyptus. For load testing, we used The Grinder [9], a Java load testing framework [...] internal Eucalyptus cluster, which we could not scale as large as the Amazon environment due to a lack of computation resources.
The Atlas of Chinese World Wide Web Ecosystem Shaped by the Collective Attention Flows
Lou, Xiaodan; Li, Yong; Gu, Weiwei; Zhang, Jiang
2016-01-01
The web can be regarded as an ecosystem of digital resources connected and shaped by the collective successive behaviors of users. Knowing how people allocate limited attention to different resources is of great importance. To answer this, we embed the most popular Chinese web sites into a high-dimensional Euclidean space based on the open flow network model of a large number of Chinese users' collective attention flows, which considers both the connection topology of hyperlinks between the sites and the collective behaviors of the users. With these tools, we rank the web sites and compare their centralities based on flow distances with other metrics. We also study the patterns of attention flow allocation, and find that a large number of web sites concentrate in the central area of the embedding space, and only a small fraction of web sites disperse in the periphery. The entire embedding space can be separated into 3 regions (core, interim, and periphery). The sites in the core (1%) occupy a majority of the attention flows (40%), the sites in the interim (34%) attract 40%, whereas the other sites (65%) take only 20% of the flows. What's more, we clustered the web sites into 4 groups according to their positions in the space, and found that web sites similar in content and topic are grouped together. In short, by incorporating the open flow network model, we can clearly see how collective attention allocates and flows across different web sites, and how web sites connect to each other. PMID:27812133
a Web-Based Platform for Visualizing Spatiotemporal Dynamics of Big Taxi Data
NASA Astrophysics Data System (ADS)
Xiong, H.; Chen, L.; Gui, Z.
2017-09-01
With more and more vehicles equipped with the Global Positioning System (GPS), access to large-scale taxi trajectory data has become increasingly easy. Taxis are valuable sensors, and information associated with taxi trajectories can provide unprecedented insight into many aspects of city life. But analysing these data presents many challenges. Visualization of taxi data is an efficient way to represent its distributions and structures and to reveal hidden patterns in the data. However, most existing visualization systems have shortcomings. On the one hand, passenger loading status and speed information cannot be expressed. On the other hand, a single visualization form limits the information that can be presented. In view of these problems, this paper designs and implements a visualization system in which we use colour and shape to indicate passenger loading status and speed information and integrate various forms of taxi visualization. The main work is as follows: 1. Pre-processing and storing the taxi data in a MongoDB database. 2. Visualizing hotspots of taxi pickup points: using the DBSCAN clustering algorithm, we cluster the extracted passenger pickup locations to produce passenger hotspots. 3. Visualizing the dynamics of moving taxi trajectories using interactive animation: we use a thinning algorithm to reduce the amount of data and design a preloading strategy to load the data smoothly. Colour and shape are used to visualize the taxi trajectory data.
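As a hedged sketch of the hotspot step (step 2), the following applies scikit-learn's DBSCAN with a haversine metric to a few illustrative pickup coordinates; the eps and min_samples values are placeholders, not the paper's settings.

    # Sketch: cluster taxi pickup points into hotspots with DBSCAN. The haversine
    # metric expects lat/lon in radians, so eps is expressed in radians here.
    import numpy as np
    from sklearn.cluster import DBSCAN

    pickups_deg = np.array([[30.657, 104.066],   # [lat, lon], illustrative points
                            [30.658, 104.067],
                            [30.540, 104.070]])
    pickups = np.radians(pickups_deg)

    eps_km = 0.3  # neighborhood radius, converted via Earth's radius (6371 km)
    db = DBSCAN(eps=eps_km / 6371.0, min_samples=2, metric="haversine").fit(pickups)
    print(db.labels_)  # -1 marks noise; other labels are hotspot ids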
PSAT: A web tool to compare genomic neighborhoods of multiple prokaryotic genomes
Fong, Christine; Rohmer, Laurence; Radey, Matthew; Wasnick, Michael; Brittnacher, Mitchell J
2008-01-01
Background The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT) is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. Results PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. Conclusion PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any web browser with no client side software setup or installation required. Source code is freely available to researchers interested in setting up a local version of PSAT for analysis of genomes not available through the public server. Access to the public web server and instructions for obtaining source code can be found at . PMID:18366802
Development and Evaluation of an Interactive WebQuest Environment: "Web Macerasi"
ERIC Educational Resources Information Center
Gulbahar, Yasemin; Madran, R. Orcun; Kalelioglu, Filiz
2010-01-01
This study was conducted to develop a web-based interactive system, Web Macerasi, for teaching-learning and evaluation purposes, and to find out the possible effects of this system. The study has two stages. In the first stage, a WebQuest site was designed as an interactive system in which various Internet and web technologies were used for…
Sloan Great Wall as a complex of superclusters with collapsing cores
NASA Astrophysics Data System (ADS)
Einasto, Maret; Lietzen, Heidi; Gramann, Mirt; Tempel, Elmo; Saar, Enn; Liivamägi, Lauri Juhan; Heinämäki, Pekka; Nurmi, Pasi; Einasto, Jaan
2016-10-01
Context. The formation and evolution of the cosmic web is governed by the gravitational attraction of dark matter and the antigravity of dark energy (cosmological constant). In the cosmic web, galaxy superclusters or their high-density cores are the largest objects that may collapse at present or during the future evolution. Aims: We study the dynamical state and possible future evolution of galaxy superclusters from the Sloan Great Wall (SGW), the richest galaxy system in the nearby Universe. Methods: We calculated supercluster masses using dynamical masses of galaxy groups and stellar masses of galaxies. We employed normal mixture modelling to study the structure of rich SGW superclusters and search for components (cores) in superclusters. We analysed the radial mass distribution in the high-density cores of superclusters centred approximately at rich clusters and used the spherical collapse model to study their dynamical state. Results: The lower limit of the total mass of the SGW is approximately M = 2.5 × 10^16 h^-1 M_⊙. Different mass estimators of superclusters agree well; the main uncertainties in masses of superclusters come from missing groups and clusters. We detected three high-density cores in the richest SGW supercluster (SCl 027) and two in the second richest supercluster (SCl 019). They have masses of 1.2-5.9 × 10^15 h^-1 M_⊙ and sizes of up to ≈60 h^-1 Mpc. The high-density cores of superclusters are very elongated, flattened perpendicularly to the line of sight. The comparison of the radial mass distribution in the high-density cores with the predictions of the spherical collapse model suggests that their central regions, with radii smaller than 8 h^-1 Mpc and masses of up to M = 2 × 10^15 h^-1 M_⊙, may be collapsing. Conclusions: The rich SGW superclusters with their high-density cores represent dynamically evolving environments for studies of the properties of galaxies and galaxy systems.
Alam, Zaid; Peddinti, Gopal
2017-01-01
The advent of the polypharmacology paradigm in drug discovery calls for novel chemoinformatic tools for analyzing compounds' multi-targeting activities. Such tools should provide an intuitive representation of the chemical space by capturing and visualizing underlying patterns of compound similarities linked to their polypharmacological effects. Most existing compound-centric chemoinformatics tools lack the interactive options and user interfaces that are critical for the real-time needs of chemical biologists carrying out compound screening experiments. Toward that end, we introduce C-SPADE, an open-source exploratory web-tool for interactive analysis and visualization of drug profiling assays (biochemical, cell-based or cell-free) using compound-centric similarity clustering. C-SPADE allows users to visually map the chemical diversity of a screening panel, explore investigational compounds in terms of their similarity to the screening panel, perform polypharmacological analyses and guide drug-target interaction predictions. C-SPADE requires only the raw drug profiling data as input, and it automatically retrieves the structural information and constructs the compound clusters in real-time, thereby reducing the time required for manual analysis in drug development or repurposing applications. The web-tool provides a customizable visual workspace that can either be downloaded as a figure or a Newick tree file or shared as a hyperlink with other users. C-SPADE is freely available at http://cspade.fimm.fi/. PMID:28472495
Massive gravity wrapped in the cosmic web
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shim, Junsup; Lee, Jounghun; Li, Baojiu, E-mail: jsshim@astro.snu.ac.kr, E-mail: jounghun@astro.snu.ac.kr
We study how the filamentary pattern of the cosmic web changes if the true gravity deviates from general relativity (GR) on large scales. The f(R) gravity, whose strength is controlled to satisfy the current observational constraints on the cluster scale, is adopted as our fiducial model, and a large, high-resolution N-body simulation is utilized for this study. By applying the minimal spanning tree algorithm to the halo catalogs from the simulation at various epochs, we identify the main stems of the rich superclusters located in the most prominent filamentary section of the cosmic web and determine their spatial extents per member cluster as the degree of their straightness. It is found that f(R) gravity has the effect of significantly bending the superclusters and that the effect becomes stronger as the universe evolves. Even in the case where the deviation from GR is too small to be detectable by any other observables, the degree of the supercluster straightness exhibits a conspicuous difference between the f(R) and the GR models. Our results also imply that the supercluster straightness could be a useful discriminator of f(R) gravity from coupled dark energy, since it is shown to evolve differently between the two models. As a final conclusion, the degree of straightness of the rich superclusters should provide a powerful cosmological test of large-scale gravity.
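As an illustration of the minimal-spanning-tree step described above (not the authors' code), the sketch below builds an MST over toy halo positions with SciPy; the positions and the linking-length idea are illustrative.

    # Sketch: minimal spanning tree over a toy halo catalog with SciPy.
    import numpy as np
    from scipy.sparse.csgraph import minimum_spanning_tree
    from scipy.spatial.distance import pdist, squareform

    halos = np.random.default_rng(0).uniform(0, 100, size=(50, 3))  # toy Mpc/h positions
    dist = squareform(pdist(halos))        # pairwise halo separations
    mst = minimum_spanning_tree(dist)      # sparse matrix of tree edges

    # Edges below a linking-length cut would define connected systems
    # (supercluster stems) whose straightness can then be measured.
    edges = np.array(mst.nonzero()).T
    lengths = mst[mst.nonzero()]
    print(f"{len(edges)} MST edges, longest = {lengths.max():.1f} Mpc/h")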
ClassLess: A Comprehensive Database of Young Stellar Objects
NASA Astrophysics Data System (ADS)
Hillenbrand, Lynne A.; Baliber, Nairn
2015-08-01
We have designed and constructed a database intended to house catalog and literature-published measurements of Young Stellar Objects (YSOs) within ~1 kpc of the Sun. ClassLess, so called because it includes YSOs in all stages of evolution, is a relational database in which user interaction is conducted via HTML web browsers, queries are performed in scientific language, and all data are linked to the sources of publication. Each star is associated with a cluster (or clusters), and both spatially resolved and unresolved measurements are stored, allowing proper use of data from multiple star systems. With this fully searchable tool, myriad ground- and space-based instruments and surveys across wavelength regimes can be exploited. In addition to primary measurements, the database self consistently calculates and serves higher level data products such as extinction, luminosity, and mass. As a result, searches for young stars with specific physical characteristics can be completed with just a few mouse clicks. We are in the database population phase now, and are eager to engage with interested experts worldwide on local galactic star formation and young stellar populations.
Scalable and cost-effective NGS genotyping in the cloud.
Souilmi, Yassine; Lancaster, Alex K; Jung, Jae-Yoon; Rizzo, Ettore; Hawkins, Jared B; Powles, Ryan; Amzazi, Saaïd; Ghazal, Hassan; Tonellato, Peter J; Wall, Dennis P
2015-10-15
While next-generation sequencing (NGS) costs have plummeted in recent years, the cost and complexity of computation remain substantial barriers to the use of NGS in routine clinical care. The clinical potential of NGS will not be realized until robust and routine whole genome sequencing data can be accurately rendered to medically actionable reports within a time window of hours and at scales of economy in the tens of dollars. We take a step towards addressing this challenge by using COSMOS, a cloud-enabled workflow management system, to develop GenomeKey, an NGS whole genome analysis workflow. COSMOS implements complex workflows making optimal use of high-performance compute clusters. Here we show that the Amazon Web Service (AWS) implementation of GenomeKey via COSMOS provides a fast, scalable, and cost-effective analysis of both public benchmarking and large-scale heterogeneous clinical NGS datasets. Our systematic benchmarking reveals important new insights and considerations for achieving clinical turn-around of whole genome analysis, including optimization of workflow management, strategic batching of individual genomes, and efficient cluster resource configuration.
antiSMASH 3.0—a comprehensive resource for the genome mining of biosynthetic gene clusters
Blin, Kai; Duddela, Srikanth; Krug, Daniel; Kim, Hyun Uk; Bruccoleri, Robert; Lee, Sang Yup; Fischbach, Michael A; Müller, Rolf; Wohlleben, Wolfgang; Breitling, Rainer; Takano, Eriko
2015-01-01
Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, in order for users to be able to organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software. PMID:25948579
Bao, Shunxing; Weitendorf, Frederick D; Plassard, Andrew J; Huo, Yuankai; Gokhale, Aniruddha; Landman, Bennett A
2017-02-11
The field of big data is generally concerned with the scale of processing at which traditional computational paradigms break down. In medical imaging, traditional large scale processing uses a cluster computer that combines a group of workstation nodes into a functional unit that is controlled by a job scheduler. Typically, a shared-storage network file system (NFS) is used to host imaging data. However, data transfer from storage to processing nodes can saturate network bandwidth when data is frequently uploaded/retrieved from the NFS, e.g., "short" processing times and/or "large" datasets. Recently, an alternative approach using Hadoop and HBase was presented for medical imaging to enable co-location of data storage and computation while minimizing data transfer. The benefits of using such a framework must be formally evaluated against a traditional approach to characterize the point at which simply "large scale" processing transitions into "big data" and necessitates alternative computational frameworks. The proposed Hadoop system was implemented on a production lab-cluster alongside a standard Sun Grid Engine (SGE). Theoretical models for wall-clock time and resource time for both approaches are introduced and validated. To provide real example data, three T1 image archives were retrieved from a university secure, shared web database and used to empirically assess computational performance under three configurations of cluster hardware (using 72, 109, or 209 CPU cores) with differing job lengths. Empirical results match the theoretical models. Based on these data, a comparative analysis is presented for when the Hadoop framework will be relevant and non-relevant for medical imaging.
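The abstract introduces but does not reproduce its theoretical wall-clock models; the sketch below is a generic back-of-the-envelope model of the same trade-off, with all parameter values hypothetical, showing how short jobs and large transfers favor data-local processing.

    # Sketch: a generic model (not the paper's exact one) of NFS versus
    # data-local (Hadoop-style) wall-clock time. All values are hypothetical.
    def wall_time_nfs(n_jobs, gb_per_job, compute_s, link_gbps=10.0, cores=100):
        # Shared storage: every job's input crosses one NFS link, serialized.
        transfer_s = n_jobs * gb_per_job * 8.0 / link_gbps
        waves = -(-n_jobs // cores)  # ceil(n_jobs / cores)
        return transfer_s + waves * compute_s

    def wall_time_colocated(n_jobs, compute_s, cores=100, overhead_s=30.0):
        # Data-local processing: no bulk transfer, but per-wave scheduling cost.
        waves = -(-n_jobs // cores)
        return waves * (compute_s + overhead_s)

    # The fixed transfer cost dominates short jobs and is amortized by long
    # ones, matching the "short processing times and/or large datasets" point.
    for compute_s in (60, 600, 6000):
        print(compute_s,
              wall_time_nfs(1000, 2.0, compute_s),
              wall_time_colocated(1000, compute_s))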
Helmer, Stefanie M; Muellmann, Saskia; Zeeb, Hajo; Pischke, Claudia R
2016-03-11
Previous research suggests that perceptions of peer substance use are associated with personal use. Specifically, overestimating use in the peer group is predictive of higher rates of personal substance use. 'Social norms'-interventions are based on the premise that changing these misperceived social norms regarding substance use by providing feedback on actual norms is associated with a reduction in personal substance use. Studies conducted in the U.S.A. suggest that 'social norms'-feedback is an effective strategy for reducing substance use among university students. It is unknown whether the effects of a 'social norms'-feedback on substance use can be replicated in a sample of German university students. The objective of this article is to describe the study design and aims of the 'INternet-based Social norms-Intervention for the prevention of substance use among Students' (INSIST)-study, a cluster-controlled trial examining the effects of a web-based 'social norms'- intervention in students enrolled at four intervention universities with those enrolled at four delayed intervention control universities. The INSIST-study is funded by the German Federal Ministry of Health. Eight universities in four regions in Germany will take part in the study, four serving as intervention and four as delayed intervention control universities (randomly selected within a geographic region). Six hundred students will be recruited at each university and will be asked to complete a web-based survey assessing personal and perceived substance use/attitudes towards substance use at baseline. These data will be used to develop the web-based 'social norms'-feedback tailored to gender and university. Three months after the baseline survey, students at intervention universities will receive the intervention. Two months after the launch of the intervention, students of all eight universities will be asked to complete the follow-up questionnaires to assess changes in perceptions of/attitudes toward peer substance use and rates of personal substance use. This study is the first German cluster-controlled trial investigating the influence of a web-based 'social norms'-intervention on perceptions of/attitudes towards substance use and substance use behavior in a large university student sample. This study will provide new information on the efficacy of this intervention strategy in the German university context. DRKS00007635 at the 'German Clinical Trials Register' (17.12.2014).
Personalization of Rule-based Web Services.
Choi, Okkyung; Han, Sang Yong
2008-04-04
Nowadays, Web users have clearly expressed their wish to receive personalized services. Personalization is the way to tailor services directly to the immediate requirements of the user. However, the current Web Services System does not provide any features supporting this, such as consideration of personalization of services and intelligent matchmaking. In this research, a flexible, personalized Rule-based Web Services System is proposed to address these problems and to enable efficient search, discovery and construction across general Web documents and Semantic Web documents in a Web Services System. This system utilizes matchmaking among service requesters', service providers' and users' preferences using a Rule-based Search Method, and subsequently ranks search results. A prototype of efficient Web Services search and construction for the suggested system is developed based on the current work.
NASA Astrophysics Data System (ADS)
Lucas, Ray A.; Rohde, David; Tamura, Takayuki; van Dyne, Jeffrey
At the first NVO Summer School in September 2004, a complete sample of Texas Radio Survey sources, first derived in 1989 and subsequently observed with the VLA in A-array snapshot mode in 1990, was revisited. The original investigators had never had occasion to reduce the A-array 5-minute snapshot data, nor to do any other significant follow-up, though the sample still seemed a possibly useful, if relatively small, study of radio galaxies, AGN, quasars, extragalactic sources, galaxy clusters, etc. At the time of the original sample definition in late 1989, the best optical material available for the region was the SRC-J plate from the UK Schmidt Telescope in Australia. Much more recently, the Sloan Digital Sky Survey has included the region in its DR2 data release, so good multicolor optical imaging in a number of standard bandpasses has finally become available. These data, along with other material in the radio and infrared (and X-ray, where available), were used to get a better preliminary idea of the nature of the objects in the 1989 sample. We also investigated one of the original questions: whether these radio sources with steeper (or at least non-flat) radio spectra were associated with galaxy clusters, and in some cases higher-redshift galaxy clusters and AGN. A rudimentary web service was created which allowed the user to perform simple cone searches and SIAP image extractions of specified field sizes for multiwavelength data across the electromagnetic spectrum, and a prototype web page was set up to display the resulting images in wavelength order across the page for sources in the sample. Finally, as an additional investigation, using radio and X-ray IDs as a proxy for AGN which might be associated with large, central cluster galaxies, positional matches of radio and X-ray sources from two much larger catalogs were done using the tool TOPCAT, in order to search for the degree of correlation between ID positions, radio luminosity, and cluster ID positions. It was hoped that cross-correlated matches could give some clue to the relationship of these radio sources to galaxy clusters. These preliminary results need more in-depth investigation and are currently being pursued via an NVO grant to the first author. The original VLA 5-minute A-array snapshots have also now been reduced and are complementary in nature to the VLA FIRST data. It is planned to eventually make these reduced VLA A-array data public as part of a web service via the NVO facilities, along with a table of multiwavelength properties for the sources in VOTable format.
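As a hedged sketch of the kind of positional cross-match TOPCAT performs, the following uses astropy's match_to_catalog_sky on two tiny hypothetical catalogs; the coordinates and the 5-arcsecond tolerance are illustrative.

    # Sketch: nearest-neighbor sky cross-match of radio and X-ray positions.
    import astropy.units as u
    from astropy.coordinates import SkyCoord

    radio = SkyCoord(ra=[150.10, 151.20] * u.deg, dec=[2.20, 2.50] * u.deg)
    xray = SkyCoord(ra=[150.11, 155.00] * u.deg, dec=[2.21, 1.00] * u.deg)

    idx, sep2d, _ = radio.match_to_catalog_sky(xray)  # nearest X-ray neighbor
    matched = sep2d < 5 * u.arcsec                    # tolerance is illustrative
    print(idx, sep2d.to(u.arcsec), matched)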
Organic dairy farmers put more emphasis on production traits than conventional farmers.
Slagboom, M; Kargo, M; Edwards, D; Sørensen, A C; Thomasen, J R; Hjortø, L
2016-12-01
The overall aim of this research was to characterize the preferences of Danish dairy farmers for improvements in breeding goal traits. The specific aims were (1) to investigate the presence of heterogeneity in farmers' preferences by means of cluster analysis, and (2) to associate these clusters with herd characteristics and production systems (organic or conventional). We established a web-based survey to characterize the preferences of farmers for improvements in 10 traits, by means of pairwise rankings. We also collected a considerable number of herd characteristics. Overall, 106 organic farmers and 290 conventional farmers answered the survey, all with Holstein cows. The most preferred trait improvement was cow fertility, and the least preferred was calving difficulty. By means of cluster analysis, we identified 4 distinct clusters of farmers and named them according to the trait improvements that were most preferred: Health and Fertility, Production and Udder Health, Survival, and Fertility and Production. Some herd characteristics differed between clusters; for example, farmers in the Survival cluster had twice the percentage of dead cows in their herds compared with the other clusters, and farmers that gave the highest ranking to cow and heifer fertility had the lowest conception rate in their herds. This finding suggests that farmers prefer to improve traits that are more problematic in their herd. The proportion of organic and conventional farmers also differed between clusters; we found a higher proportion of organic farmers in the production-based clusters. When we analyzed organic and conventional data separately, we found that organic farmers ranked production traits higher than conventional farmers. The herds of organic farmers had lower milk yields and lower disease incidences, which might explain the high ranking of milk production and the low ranking of disease traits. This study shows that heterogeneity exists in farmers' preferences for improvements in breeding goal traits, that organic and conventional farmers differ in their preferences, and that herd characteristics can be linked to different farmer clusters. The results of this study could be used for the future development of breeding goals in Danish Holstein cows and for the development of customized total merit indices based on farmer preferences. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
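As an illustration of the flavor of cluster analysis used here (the abstract does not state the exact method), the sketch below applies hierarchical Ward clustering to a toy matrix of per-farmer trait-preference scores and cuts the tree into four clusters.

    # Sketch: group respondents by their trait-preference profiles.
    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage

    rng = np.random.default_rng(1)
    preferences = rng.random((396, 10))  # 396 respondents, 10 traits (toy data)

    tree = linkage(preferences, method="ward")            # hierarchical clustering
    clusters = fcluster(tree, t=4, criterion="maxclust")  # cut into 4 clusters
    print(np.bincount(clusters)[1:])                      # farmers per cluster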
Abstracting application deployment on Cloud infrastructures
NASA Astrophysics Data System (ADS)
Aiftimiei, D. C.; Fattibene, E.; Gargana, R.; Panella, M.; Salomoni, D.
2017-10-01
Deploying a complex application on a Cloud-based infrastructure can be a challenging task. In this contribution we present an approach for the Cloud-based deployment of applications and its present or future implementation in the framework of several projects, such as "!CHAOS: a cloud of controls" [1], a project funded by MIUR (Italian Ministry of Research and Education) to create a Cloud-based deployment of a control system and data acquisition framework, "INDIGO-DataCloud" [2], an EC H2020 project targeting among other things high-level deployment of applications on hybrid Clouds, and "Open City Platform" [3], an Italian project aiming to provide open Cloud solutions for Italian Public Administrations. We chose to use an orchestration service to hide the complex deployment of the application components, and to build an abstraction layer on top of the orchestration one. Through the Heat [4] orchestration service, we prototyped a dynamic, on-demand, scalable platform of software components based on OpenStack infrastructures. On top of the orchestration service we developed a prototype of a web interface exploiting the Heat APIs. The user can start an instance of the application without having knowledge of the underlying Cloud infrastructure and services. Moreover, the platform instance can be customized by choosing parameters related to the application, such as the size of a file system or the number of instances in a NoSQL DB cluster. As soon as the desired platform is running, the web interface offers the possibility to scale some infrastructure components. We describe the solution design and implementation, based on the application requirements, the details of the development of both the Heat templates and the web interface, together with possible exploitation strategies of this work in Cloud data centers.
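As a hedged sketch of the abstraction layer's core call, the following assumes python-heatclient with an authenticated keystone session; the stack name, template contents, and parameter names are hypothetical and would be defined by the Heat template.

    # Sketch: launch a platform instance via the Heat API, exposing only
    # application-level parameters (as the web interface described above does).
    from heatclient.client import Client

    def launch_platform(session, template_yaml, fs_size_gb, db_nodes):
        heat = Client("1", session=session)
        # Heat resolves the underlying infrastructure from the template;
        # the user only supplies application-level choices.
        return heat.stacks.create(
            stack_name="app-platform",                       # hypothetical name
            template=template_yaml,
            parameters={"fs_size": fs_size_gb,               # hypothetical params
                        "db_cluster_size": db_nodes},
        )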
flexCloud: Deployment of the FLEXPART Atmospheric Transport Model as a Cloud SaaS Environment
NASA Astrophysics Data System (ADS)
Morton, Don; Arnold, Dèlia
2014-05-01
FLEXPART (FLEXible PARTicle dispersion model) is a Lagrangian transport and dispersion model used by a growing international community. We have used it to simulate and forecast the atmospheric transport of wildfire smoke, volcanic ash and radionuclides. Additionally, FLEXPART may be run in backwards mode to provide information for the determination of emission sources such as nuclear emissions and greenhouse gases. This open source software is distributed in source code form and has several compiler and library dependencies that users need to address. Although the model is well documented, getting it compiled, set up, running, and post-processed is often tedious, which makes it difficult for the inexperienced user. Our interest is in moving scientific modeling and simulation activities from site-specific clusters and supercomputers to a cloud model-as-a-service paradigm. Having chosen FLEXPART for our prototyping, our vision is to construct customised IaaS images containing fully-compiled and configured FLEXPART codes, including pre-processing, execution and post-processing components. In addition, with the inclusion of a small web server in the image, we introduce a web-accessible graphical user interface that drives the system. A further initiative being pursued is the deployment of multiple, simultaneous FLEXPART ensembles in the cloud. A single front-end web interface is used to define the ensemble members, and separate cloud instances are launched, on demand, to run the individual models and to conglomerate the outputs into a unified display. The outcome of this work is a Software as a Service (SaaS) deployment whereby the details of the underlying modeling systems are hidden, allowing modelers to perform their science activities without the burden of considering implementation details.
Wireless, Web-Based Interactive Control of Optical Coherence Tomography with Mobile Devices.
Mehta, Rajvi; Nankivil, Derek; Zielinski, David J; Waterman, Gar; Keller, Brenton; Limkakeng, Alexander T; Kopper, Regis; Izatt, Joseph A; Kuo, Anthony N
2017-01-01
Optical coherence tomography (OCT) is widely used in ophthalmology clinics and has potential for more general medical settings and remote diagnostics. In anticipation of remote applications, we developed wireless interactive control of an OCT system using mobile devices. A web-based user interface (WebUI) was developed to interact with a handheld OCT system. The WebUI consisted of key OCT displays and controls ported to a webpage using HTML and JavaScript. Client-server relationships were created between the WebUI and the OCT system computer. The WebUI was accessed on a cellular phone mounted to the handheld OCT probe to wirelessly control the OCT system. Twenty subjects were imaged using the WebUI to assess the system. System latency was measured using different connection types (wireless 802.11n only, wireless to remote virtual private network [VPN], and cellular). Using a cellular phone, the WebUI was successfully used to capture posterior eye OCT images in all subjects. Simultaneous interactivity by a remote user on a laptop was also demonstrated. On average, use of the WebUI added only 58, 95, and 170 ms to the system latency using wireless only, wireless to VPN, and cellular connections, respectively. Qualitatively, operator usage was not affected. Using a WebUI, we demonstrated wireless and remote control of an OCT system with mobile devices. The web and open source software tools used in this project make it possible for any mobile device to potentially control an OCT system through a WebUI. This platform can be a basis for remote, teleophthalmology applications using OCT.
Jaeger, Sébastien; Thieffry, Denis
2017-01-01
Abstract Transcription factor (TF) databases contain multitudes of binding motifs (TFBMs) from various sources, from which non-redundant collections are derived by manual curation. The advent of high-throughput methods stimulated the production of novel collections with increasing numbers of motifs. Meta-databases, built by merging these collections, contain redundant versions, because available tools are not suited to automatically identify and explore biologically relevant clusters among thousands of motifs. Motif discovery from genome-scale data sets (e.g. ChIP-seq) also produces redundant motifs, hampering the interpretation of results. We present matrix-clustering, a versatile tool that clusters similar TFBMs into multiple trees and automatically creates non-redundant TFBM collections. A feature unique to matrix-clustering is its dynamic visualisation of aligned TFBMs, and its capability to simultaneously treat multiple collections from various sources. We demonstrate that matrix-clustering considerably simplifies the interpretation of combined results from multiple motif discovery tools, and highlights biologically relevant variations of similar motifs. We also ran a large-scale application to cluster ∼11 000 motifs from 24 entire databases, showing that matrix-clustering correctly groups motifs belonging to the same TF families and drastically reduces motif redundancy. matrix-clustering is integrated within the RSAT suite (http://rsat.eu/), accessible through a user-friendly web interface or via the command line for integration in pipelines. PMID:28591841
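The core operation in such motif clustering is hierarchical clustering over a motif-motif similarity matrix, with trees cut at a threshold to form non-redundant groups. The sketch below illustrates only that generic step with a random stand-in similarity matrix; matrix-clustering's actual alignment-based similarity metrics are not reproduced here.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(0)
n_motifs = 12
# stand-in for a motif-motif similarity matrix (e.g., correlation of aligned PWM columns)
sim = rng.uniform(0.0, 1.0, size=(n_motifs, n_motifs))
sim = (sim + sim.T) / 2.0
np.fill_diagonal(sim, 1.0)

dist = squareform(1.0 - sim, checks=False)            # condensed distance form
tree = linkage(dist, method="average")                # one hierarchical tree
groups = fcluster(tree, t=0.4, criterion="distance")  # threshold cut -> motif clusters
print(groups)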
Special issue on cluster algebras in mathematical physics
NASA Astrophysics Data System (ADS)
Di Francesco, Philippe; Gekhtman, Michael; Kuniba, Atsuo; Yamazaki, Masahito
2013-10-01
This is a call for contributions to a special issue of Journal of Physics A: Mathematical and Theoretical dedicated to cluster algebras in mathematical physics. Over the ten years since their introduction by Fomin and Zelevinsky, the theory of cluster algebras has witnessed a spectacular growth, first and foremost due to the many links that have been discovered with a wide range of subjects in mathematics and, increasingly, theoretical and mathematical physics. The main motivation of this special issue is to gather together reviews, recent developments and open problems, mainly from a mathematical physics viewpoint, into a single comprehensive issue. We expect that such a special issue will become a valuable reference for the broad scientific community working in mathematical and theoretical physics. The issue will consist of invited review articles and contributed papers containing new results on the interplays of cluster algebras with mathematical physics. Editorial policy The Guest Editors for this issue are Philippe Di Francesco, Michael Gekhtman, Atsuo Kuniba and Masahito Yamazaki. The areas and topics for this issue include, but are not limited to: discrete integrable systems arising from cluster mutations; cluster structure on Poisson varieties; cluster algebras and soliton interactions; the cluster positivity conjecture; Y-systems in the thermodynamic Bethe ansatz and Zamolodchikov's periodicity conjecture; T-systems of transfer matrices of integrable lattice models; dilogarithm identities in conformal field theory; wall crossing in 4d N = 2 supersymmetric gauge theories; 4d N = 1 quiver gauge theories described by networks; scattering amplitudes of 4d N = 4 theories; 3d N = 2 gauge theories described by flat connections on 3-manifolds; and integrability of dimer/Ising models on graphs. All contributions will be refereed and processed according to the usual procedure of the journal. Guidelines for preparation of contributions The deadline for contributed papers is 31 March 2014. This deadline will allow the special issue to appear at the end of 2014. There is no strict regulation on article size, but as a guide the preferable size is 15-30 pages for contributed papers and 40-60 pages for reviews. Further advice on publishing your work in Journal of Physics A may be found at iopscience.iop.org/jphysa. Contributions to the special issue should be submitted by web upload via authors.iop.org/, or by email to jphysa@iop.org, quoting 'JPhysA special issue on cluster algebras in mathematical physics'. Submissions should ideally be in standard LaTeX form. Please see the website for further information on electronic submissions. All contributions should be accompanied by a read-me file or covering letter giving the postal and e-mail addresses for correspondence. The Publishing Office should be notified of any subsequent change of address. The special issue will be published in the print and online versions of the journal.
Future View: Web Navigation based on Learning User's Browsing Strategy
NASA Astrophysics Data System (ADS)
Nagino, Norikatsu; Yamada, Seiji
In this paper, we propose a Future View system that assists users' everyday Web browsing. The Future View prefetches Web pages based on a user's browsing strategies and presents them to the user in order to assist Web browsing. To learn a user's browsing strategy, the Future View uses two types of learning classifier systems: a content-based classifier system for content change patterns and an action-based classifier system for user action patterns. The results of learning are applied to crawling by Web robots, and the gathered Web pages are presented to the user through a Web browser interface. We experimentally show the effectiveness of navigation using the Future View.
Migration Related Socio-Cultural Changes and e-Learning in a European Globalising Society
ERIC Educational Resources Information Center
Leman, Johan; Trappers, Ann; Brandon, Emily; Ruppol, Xavier
2008-01-01
OECD figures (1998-2002) reveal a sharply increasing flow of foreign workers into European countries. Ethnic diversification has become a generalized matter of fact. At the same time, rapidly developing technology and "intellectual globalization" processes--the world wide web--have also become a reality. This complex cluster of changes…
Proclaimed Graduate Attributes of Australian Universities: Patterns, Problems and Prospects
ERIC Educational Resources Information Center
Donleavy, Gabriel
2012-01-01
Purpose: Graduate attributes are about to be policed by the Tertiary Education Quality and Standards Agency (TEQSA) in Australia. All universities proclaim them on their public web sites. The aim of this paper is to determine whether distinct patterns or clusters are apparent in the graduate attributes declared by Australian universities…
Beam Dynamics Simulation Platform and Studies of Beam Breakup in Dielectric Wakefield Structures
NASA Astrophysics Data System (ADS)
Schoessow, P.; Kanareykin, A.; Jing, C.; Kustov, A.; Altmark, A.; Gai, W.
2010-11-01
A particle-Green's function beam dynamics code (BBU-3000) to study beam breakup effects is incorporated into a parallel computing framework based on the Boinc software environment, and supports both task farming on a heterogeneous cluster and local grid computing. User access to the platform is through a web browser.
Shin, Sung Hee; Yun, Eun Kyoung
2011-06-01
This study was conducted to explore the profiles of online health information users in terms of certain psychological characteristics and to suggest guidelines for the provision of better user-oriented health information services. A cross-sectional design with convenience sampling was used, with data collected through a Web-based questionnaire survey in Korea. To analyze the profiles of health information users on the Internet, a two-step cluster analysis was conducted. The results reveal that online health information users can be classified into four groups according to their level of subjective knowledge and health concern, and that these four clusters exhibit distinct profile patterns. The findings of this study would be useful for health portal developers who want to understand users' characteristics and behaviors and to provide more user-oriented service in a satisfactory manner. To develop a full understanding of users' behaviors regarding Internet health information services, further research is needed to explore users' various needs, their preferences, and relevant factors across a variety of Web sites that address health problems at different professional levels.
Astronomy Learning Activities for Tablets
NASA Astrophysics Data System (ADS)
Pilachowski, Catherine A.; Morris, Frank
2015-08-01
Four web-based tools allow students to manipulate astronomical data to learn concepts in astronomy. The tools are HTML5, CSS3, and JavaScript-based applications that provide access to the content on iPad and Android tablets. The first tool, “Three Color,” allows students to combine monochrome astronomical images taken through different color filters or in different wavelength regions into a single color image. The second tool, “Star Clusters,” allows students to compare images of stars in clusters with a pre-defined template of colors and sizes in order to produce color-magnitude diagrams to determine cluster ages. The third tool adapts Travis Rector’s “NovaSearch” to allow students to examine images of the central regions of the Andromeda Galaxy to find novae. After students find a nova, they are able to measure the time over which the nova fades away. A fourth tool, “Proper Pair,” allows students to interact with Hipparcos data to evaluate whether close double stars are physical binaries or chance superpositions. Further information and access to these web-based tools are available at www.astro.indiana.edu/ala/.
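At its core, the “Three Color” exercise is a band-stacking operation: three monochrome exposures become the R, G and B channels of one image. A minimal NumPy sketch is below; the per-band min-max normalization and the toy Poisson "exposures" are our own assumptions, not the tool's actual implementation.

import numpy as np

def three_color(red_img, green_img, blue_img):
    # stack three monochrome exposures into one RGB image, normalizing each band
    def norm(band):
        band = band.astype(float)
        lo, hi = band.min(), band.max()
        return (band - lo) / (hi - lo) if hi > lo else np.zeros_like(band)
    return np.dstack([norm(red_img), norm(green_img), norm(blue_img)])

rng = np.random.default_rng(1)
rgb = three_color(*(rng.poisson(50, (64, 64)) for _ in range(3)))
print(rgb.shape, rgb.min(), rgb.max())  # (64, 64, 3), values in [0, 1]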
NASA Astrophysics Data System (ADS)
Tachibana, Aiko; Watanabe, Yuko; Moteki, Masato; Hosie, Graham W.; Ishimaru, Takashi
2017-06-01
Copepods are among the most important components of the Southern Ocean food web and are widely distributed from the surface to deeper waters. We conducted discrete-depth sampling to clarify the community structure of copepods from the epi- to bathypelagic layers of the oceanic and neritic waters off Adélie and George V Land, East Antarctica, in the austral summer of 2008. Notably high diversity and species numbers were observed in the meso- and bathypelagic layers. Cluster analysis based on the similarity of copepod communities identified seven cluster groups, which corresponded well with water masses. In the epi- and upper mesopelagic layers of the oceanic zone, the SB (Southern Boundary of the Antarctic Circumpolar Current) divided copepod communities. Conversely, in the lower meso- and bathypelagic layers (500-2000 m depth), communities were consistent across the SB; in these layers, the distributions of copepod species were separated by habitat depth range and feeding behaviour. Different food webs occur in the epipelagic layer, where zooplankton segregate their habitats by horizontal distribution range.
A new probe of the magnetic field power spectrum in cosmic web filaments
NASA Astrophysics Data System (ADS)
Hales, Christopher A.; Greiner, Maksim; Ensslin, Torsten A.
2015-08-01
Establishing the properties of magnetic fields on scales larger than galaxy clusters is critical for resolving the unknown origin and evolution of galactic and cluster magnetism. More generally, observations of magnetic fields on cosmic scales are needed for assessing the impacts of magnetism on cosmology, particle physics, and structure formation over the full history of the Universe. However, firm observational evidence for magnetic fields in large-scale structure remains elusive. In an effort to address this problem, we have developed a novel statistical method to infer the magnetic field power spectrum in cosmic web filaments using observations of the two-point correlation of Faraday rotation measures from a dense grid of extragalactic radio sources. Here we describe our approach, which embeds and extends the pioneering work of Kolatt (1998) within the context of Information Field Theory (a statistical theory for Bayesian inference on spatially distributed signals; Enßlin et al., 2009). We describe prospects for observation, for example with forthcoming data from the ultra-deep JVLA CHILES Con Pol survey and future surveys with the SKA.
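The observable underlying this method is the two-point correlation of rotation measures as a function of source separation. A toy estimator is sketched below; the synthetic positions and RM values, the flat-sky separation and the naive binning are all simplifying assumptions, not the Information Field Theory machinery itself.

import numpy as np

rng = np.random.default_rng(2)
n = 500
ra, dec = rng.uniform(0, 5, n), rng.uniform(0, 5, n)  # source positions (deg)
rm = rng.normal(0.0, 15.0, n)                         # rotation measures (rad/m^2)

# pairwise separations in the small-angle, flat-sky approximation
theta = np.hypot(ra[:, None] - ra[None, :], dec[:, None] - dec[None, :])
pairs = np.triu_indices(n, k=1)
prod = (rm[:, None] * rm[None, :])[pairs]
sep = theta[pairs]

bins = np.linspace(0.05, 3.0, 15)
idx = np.digitize(sep, bins)
xi = [prod[idx == i].mean() if np.any(idx == i) else np.nan
      for i in range(1, len(bins))]
print(xi)  # <RM_i RM_j> per separation bin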
Pathway Distiller - multisource biological pathway consolidation
2012-01-01
Background One method to understand and evaluate an experiment that produces a large set of genes, such as a gene expression microarray analysis, is to identify overrepresentation or enrichment for biological pathways. Because pathways are able to functionally describe the set of genes, much effort has been made to collect curated biological pathways into publicly accessible databases. When combining disparate databases, highly related or redundant pathways exist, making their consolidation into pathway concepts essential. This will facilitate unbiased, comprehensive yet streamlined analysis of experiments that result in large gene sets. Methods After gene set enrichment finds representative pathways for large gene sets, pathways are consolidated into representative pathway concepts. Three complementary, but different methods of pathway consolidation are explored. Enrichment Consolidation combines the set of the pathways enriched for the signature gene list through iterative combining of enriched pathways with other pathways with similar signature gene sets; Weighted Consolidation utilizes a Protein-Protein Interaction network based gene-weighting approach that finds clusters of both enriched and non-enriched pathways limited to the experiments' resultant gene list; and finally the de novo Consolidation method uses several measurements of pathway similarity to find static pathway clusters independent of any given experiment. Results We demonstrate that the three consolidation methods provide unified yet different functional insights of a resultant gene set derived from a genome-wide profiling experiment. Results from the methods are presented, demonstrating their applications in biological studies and comparing with a pathway web-based framework that also combines several pathway databases. Additionally, a web-based consolidation framework that encompasses all three methods discussed in this paper, Pathway Distiller (http://cbbiweb.uthscsa.edu/PathwayDistiller), is established to allow researchers access to the methods and example microarray data described in this manuscript, and the ability to analyze their own gene list by using our unique consolidation methods. Conclusions By combining several pathway systems, implementing different, but complementary pathway consolidation methods, and providing a user-friendly web-accessible tool, we have enabled users to extract functional explanations of their genome-wide experiments. PMID:23134636
Pathway Distiller - multisource biological pathway consolidation.
Doderer, Mark S; Anguiano, Zachry; Suresh, Uthra; Dashnamoorthy, Ravi; Bishop, Alexander J R; Chen, Yidong
2012-01-01
One method to understand and evaluate an experiment that produces a large set of genes, such as a gene expression microarray analysis, is to identify overrepresentation or enrichment for biological pathways. Because pathways are able to functionally describe the set of genes, much effort has been made to collect curated biological pathways into publicly accessible databases. When combining disparate databases, highly related or redundant pathways exist, making their consolidation into pathway concepts essential. This will facilitate unbiased, comprehensive yet streamlined analysis of experiments that result in large gene sets. After gene set enrichment finds representative pathways for large gene sets, pathways are consolidated into representative pathway concepts. Three complementary, but different methods of pathway consolidation are explored. Enrichment Consolidation combines the set of the pathways enriched for the signature gene list through iterative combining of enriched pathways with other pathways with similar signature gene sets; Weighted Consolidation utilizes a Protein-Protein Interaction network based gene-weighting approach that finds clusters of both enriched and non-enriched pathways limited to the experiments' resultant gene list; and finally the de novo Consolidation method uses several measurements of pathway similarity to find static pathway clusters independent of any given experiment. We demonstrate that the three consolidation methods provide unified yet different functional insights of a resultant gene set derived from a genome-wide profiling experiment. Results from the methods are presented, demonstrating their applications in biological studies and comparing with a pathway web-based framework that also combines several pathway databases. Additionally, a web-based consolidation framework that encompasses all three methods discussed in this paper, Pathway Distiller (http://cbbiweb.uthscsa.edu/PathwayDistiller), is established to allow researchers access to the methods and example microarray data described in this manuscript, and the ability to analyze their own gene list by using our unique consolidation methods. By combining several pathway systems, implementing different, but complementary pathway consolidation methods, and providing a user-friendly web-accessible tool, we have enabled users to extract functional explanations of their genome-wide experiments.
Goldszal, A F; Brown, G K; McDonald, H J; Vucich, J J; Staab, E V
2001-06-01
In this work, we describe the digital imaging network (DIN), picture archival and communication system (PACS), and radiology information system (RIS) currently being implemented at the Clinical Center, National Institutes of Health (NIH). These systems are presently in clinical operation. The DIN is a redundant meshed network designed to address gigabit density and expected high-bandwidth requirements for image transfer and server aggregation. The PACS projected workload is 5.0 TB of new imaging data per year. Its architecture consists of a central, high-throughput Digital Imaging and Communications in Medicine (DICOM) data repository and distributed redundant array of inexpensive disks (RAID) servers employing Fibre Channel technology for immediate delivery of imaging data. On-demand distribution of images and reports to clinicians and researchers is accomplished via a clustered web server. The RIS follows a client-server model and provides tools to order exams, schedule resources, retrieve and review results, and generate management reports. The RIS-hospital information system (HIS) interfaces include admissions, discharges, and transfers (ADT)/demographics, orders, appointment notifications, doctor updates, and results.
EasyLCMS: an asynchronous web application for the automated quantification of LC-MS data
2012-01-01
Background Downstream applications in metabolomics, as well as mathematical modelling, require data in a quantitative format, which may also necessitate the automated and simultaneous quantification of numerous metabolites. Although numerous applications have previously been developed for metabolomics data handling, automated calibration and calculation of concentrations in terms of μmol have not been carried out. Moreover, most metabolomics applications are designed for GC-MS and are not suitable for LC-MS, since in LC the deviation in retention time is not linear, which these applications do not take into account. In addition, only a few are web-based applications, which can improve on stand-alone software in terms of compatibility, sharing capabilities and hardware requirements, although sufficient bandwidth is required. Furthermore, none of these incorporate asynchronous communication to allow real-time interaction with pre-processed results. Findings Here, we present EasyLCMS (http://www.easylcms.es/), a new application for automated quantification which was validated using more than 1000 concentration comparisons in real samples with manual operation. The results showed that only 1% of the quantifications presented a relative error higher than 15%. Using clustering analysis, the metabolites with the highest relative error distributions were identified and studied to solve recurrent mistakes. Conclusions EasyLCMS is a new web application designed to quantify numerous metabolites simultaneously, integrating LC distortions and asynchronous web technology to present a visual interface with dynamic interaction which allows checking and correction of LC-MS raw data pre-processing results. Moreover, quantified data obtained with EasyLCMS are fully compatible with numerous downstream applications, as well as with mathematical modelling in the systems biology field. PMID:22884039
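The nonlinear retention-time drift noted above is commonly handled by warping each run onto a reference time scale using landmark compounds shared between runs. The sketch below shows that generic idea with piecewise-linear interpolation; the landmark values are invented and this is not EasyLCMS's actual algorithm.

import numpy as np

# retention times (min) of shared landmark compounds
rt_reference = np.array([1.2, 3.5, 6.8, 10.1, 14.0])  # reference run
rt_this_run = np.array([1.3, 3.9, 7.6, 11.2, 15.6])   # nonlinearly drifted run

def correct_rt(rt):
    # map retention times from this run onto the reference scale by
    # piecewise-linear interpolation between landmarks (a nonlinear warp)
    return np.interp(rt, rt_this_run, rt_reference)

print(correct_rt(np.array([2.0, 9.0, 13.0])))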
A bibliometric and visual analysis of global geo-ontology research
NASA Astrophysics Data System (ADS)
Li, Lin; Liu, Yu; Zhu, Haihong; Ying, Shen; Luo, Qinyao; Luo, Heng; Kuai, Xi; Xia, Hui; Shen, Hang
2017-02-01
In this paper, the results of a bibliometric and visual analysis of geo-ontology research articles collected from the Web of Science (WOS) database between 1999 and 2014 are presented. The numbers of national institutions and published papers are visualized and a global research heat map is drawn, giving an overview of global geo-ontology research. In addition, we present a chord diagram of countries and perform a visual cluster analysis of a knowledge co-citation network of references, disclosing potential academic communities and identifying key points, main research areas, and future research trends. The International Journal of Geographical Information Science, Progress in Human Geography, and Computers & Geosciences are the most active journals. The USA makes the largest contribution to geo-ontology research by virtue of its highest numbers of independent and collaborative papers, and its dominance is also confirmed in the country chord diagram. The majority of institutions are in the USA, Western Europe, and Eastern Asia. Wuhan University, the University of Münster, and the Chinese Academy of Sciences are notable geo-ontology institutions. Keywords such as "Semantic Web," "GIS," and "space" have attracted a great deal of attention. "Semantic granularity in ontology-driven geographic information systems," "Ontologies in support of activities in geographical space," and "A translation approach to portable ontology specifications" have the highest citation centrality. Geographical space, computer-human interaction, and ontology cognition are the three main research areas of geo-ontology. The semantic mismatch between the producers and users of ontology data, as well as error propagation in interdisciplinary and cross-linguistic data reuse, needs to be solved. In addition, the development of geo-ontology modeling primitives based on OWL (Web Ontology Language) and methods to automatically rework data in the Semantic Web are needed. Furthermore, the topological relations between geographical entities still require further study.
Carroll, Adam J; Badger, Murray R; Harvey Millar, A
2010-07-14
Standardization of analytical approaches and reporting methods via community-wide collaboration can work synergistically with web-tool development to result in rapid community-driven expansion of online data repositories suitable for data mining and meta-analysis. In metabolomics, the inter-laboratory reproducibility of gas chromatography/mass spectrometry (GC/MS) makes it an obvious target for such development. While a number of web-tools offer access to datasets and/or tools for raw data processing and statistical analysis, none of these systems are currently set up to act as a public repository by easily accepting, processing and presenting publicly submitted GC/MS metabolomics datasets for public re-analysis. Here, we present MetabolomeExpress, a new File Transfer Protocol (FTP) server and web-tool for the online storage, processing, visualisation and statistical re-analysis of publicly submitted GC/MS metabolomics datasets. Users may search a quality-controlled database of metabolite response statistics from publicly submitted datasets by a number of parameters (e.g., metabolite, species, organ/biofluid, etc.). Users may also perform meta-analysis comparisons of multiple independent experiments or re-analyse public primary datasets via user-friendly tools for t-test, principal components analysis, hierarchical cluster analysis and correlation analysis. They may interact with chromatograms, mass spectra and peak detection results via an integrated raw data viewer. Researchers who register for a free account may upload (via FTP) their own data to the server for online processing via a novel raw data processing pipeline. MetabolomeExpress https://www.metabolome-express.org provides a new opportunity for the general metabolomics community to transparently present online the raw and processed GC/MS data underlying their metabolomics publications. Transparent sharing of these data will allow researchers to assess data quality and draw their own insights from published metabolomics datasets.
DOORS to the semantic web and grid with a PORTAL for biomedical computing.
Taswell, Carl
2008-03-01
The semantic web remains in the early stages of development. It has not yet achieved the goals envisioned by its founders as a pervasive web of distributed knowledge and intelligence. Success will be attained when a dynamic synergism can be created between people and a sufficient number of infrastructure systems and tools for the semantic web in analogy with those for the original web. The domain name system (DNS), web browsers, and the benefits of publishing web pages motivated many people to register domain names and publish web sites on the original web. An analogous resource label system, semantic search applications, and the benefits of collaborative semantic networks will motivate people to register resource labels and publish resource descriptions on the semantic web. The Domain Ontology Oriented Resource System (DOORS) and Problem Oriented Registry of Tags and Labels (PORTAL) are proposed as infrastructure systems for resource metadata within a paradigm that can serve as a bridge between the original web and the semantic web. The Internet Registry Information Service (IRIS) registers domain names while DNS publishes domain addresses with mapping of names to addresses for the original web. Analogously, PORTAL registers resource labels and tags while DOORS publishes resource locations and descriptions with mapping of labels to locations for the semantic web. BioPORT is proposed as a prototype PORTAL registry specific for the problem domain of biomedical computing.
Data Mining Web Services for Science Data Repositories
NASA Astrophysics Data System (ADS)
Graves, S.; Ramachandran, R.; Keiser, K.; Maskey, M.; Lynnes, C.; Pham, L.
2006-12-01
The maturation of web services standards and technologies sets the stage for a distributed "Service-Oriented Architecture" (SOA) for NASA's next generation science data processing. This architecture will allow members of the scientific community to create and combine persistent distributed data processing services and make them available to other users over the Internet. NASA has initiated a project to create a suite of specialized data mining web services designed specifically for science data. The project leverages the Algorithm Development and Mining (ADaM) toolkit as its basis. The ADaM toolkit is a robust, mature and freely available science data mining toolkit that is being used by several research organizations and educational institutions worldwide. These mining services will give the scientific community a powerful and versatile data mining capability that can be used to create higher order products such as thematic maps from current and future NASA satellite data records with methods that are not currently available. The package of mining and related services are being developed using Web Services standards so that community-based measurement processing systems can access and interoperate with them. These standards-based services allow users different options for utilizing them, from direct remote invocation by a client application to deployment of a Business Process Execution Language (BPEL) solutions package where a complex data mining workflow is exposed to others as a single service. The ability to deploy and operate these services at a data archive allows the data mining algorithms to be run where the data are stored, a more efficient scenario than moving large amounts of data over the network. This will be demonstrated in a scenario in which a user uses a remote Web-Service-enabled clustering algorithm to create cloud masks from satellite imagery at the Goddard Earth Sciences Data and Information Services Center (GES DISC).
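As a concrete stand-in for the cloud-mask scenario, clustering pixel brightness into two groups and labeling the brighter cluster as cloud is the textbook version of the idea. The sketch below uses scikit-learn's k-means on synthetic pixels and is only an illustration; the actual ADaM algorithm used in the demonstration is not specified here.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
# toy "satellite image": bright cloud pixels mixed with darker surface pixels
surface = rng.normal(0.15, 0.05, (5000, 1))
cloud = rng.normal(0.75, 0.10, (1000, 1))
pixels = np.vstack([surface, cloud]).clip(0, 1)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(pixels)
cloud_label = int(np.argmax(km.cluster_centers_))  # brighter centroid = cloud
mask = km.labels_ == cloud_label
print(f"cloud fraction: {mask.mean():.2f}")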
Automatic Generation of Data Types for Classification of Deep Web Sources
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ngu, A H; Buttler, D J; Critchlow, T J
2005-02-14
A Service Class Description (SCD) is an effective metadata-based approach for discovering Deep Web sources whose data exhibit some regular patterns. However, it is tedious and error-prone to create an SCD description manually. Moreover, a manually created SCD is not adaptive to the frequent changes of Web sources, and requires its creator to identify all the possible input and output types of a service a priori. In many domains, it is impossible to exhaustively list all the possible input and output data types of a source in advance. In this paper, we describe machine learning approaches for automatic generation of the data types of an SCD. We propose two different approaches for learning the data types of a class of Web sources. The Brute-Force Learner is able to generate data types that achieve high recall, but with low precision. The Clustering-based Learner generates data types that have a high precision rate, but a lower recall rate. We demonstrate the feasibility of these two learning-based solutions for automatic generation of data types for citation Web sources and present a quantitative evaluation of the two solutions.
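To illustrate the flavor of the clustering-based approach: field values sampled from a source can be collapsed to coarse character-class signatures and grouped, so that recurring patterns (dates, numbers, names) emerge as candidate data types. This is our own toy illustration, not the learners described in the paper.

import re
from collections import defaultdict

def signature(value):
    # collapse a field value to a coarse character-class pattern
    s = re.sub(r"[A-Za-z]+", "A", value)
    return re.sub(r"\d+", "9", s)

samples = ["2005-02-14", "1999-11-30", "Ngu, A H", "Buttler, D J", "42", "7"]
clusters = defaultdict(list)
for v in samples:
    clusters[signature(v)].append(v)

for pattern, values in clusters.items():
    print(pattern, "->", values)  # e.g. '9-9-9' groups the date-like values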
Quality evaluation on an e-learning system in continuing professional education of nurses.
Lin, I-Chun; Chien, Yu-Mei; Chang, I-Chiu
2006-01-01
Maintaining high quality in Web-based learning is a powerful means of increasing the overall efficiency and effectiveness of distance learning. Many studies have evaluated Web-based learning, but seldom from the information systems (IS) perspective. This study applied the well-known IS Success model to measure the quality of a Web-based learning system, using a Web-based questionnaire for data collection. One hundred and fifty-four nurses participated in the survey. Based on confirmatory factor analysis, the variables of the research model were fit for measuring the quality of a Web-based learning system. As Web-based education continues to grow worldwide, the results of this study may assist system adopters (hospital executives), learners (nurses), and system designers in making reasonable and informed judgments with regard to the quality of Web-based learning systems in continuing professional education.
Zhang, Wangjian; Du, Zhicheng; Tang, Shaokai; Guo, Pi; Ye, Xingdong; Hao, Yuantao
2015-08-08
Guangzhou, the economic center of South China, is currently suffering an insidious re-emergence of syphilis. The syphilis epidemic in this area is a matter of serious concern because of the special economic position of Guangzhou and its large migrant population. Therefore, a comprehensive analysis of surveillance data is needed to provide further information for developing targeted control programs. Case-based surveillance data obtained from a real-time, web-based system were analyzed. A hierarchical clustering method was applied to classify the 12 districts of Guangzhou into several epidemiological regions. The district-level annual incidence and clustering results were displayed on the same map to show the spatial patterns of syphilis in Guangzhou. A total of 60,178 syphilis cases were reported during the period from 2005 to 2013, among which primary/secondary syphilis accounted for 15,864 cases (26.36 %), latent syphilis for 41,078 cases (68.26 %) and congenital syphilis for 2,090 cases (3.47 %). Moreover, the primary/secondary syphilis burden slightly decreased from 17.5-18.0 cases per 100,000 people in the first years to 10.6 cases per 100,000 in 2013, while latent syphilis increased markedly from 18.5 cases per 100,000 to 43.4 cases per 100,000. The districts of Guangzhou could be classified into 3 epidemiological regions according to the syphilis burden over the last 3 years of the study period. The burden of primary/secondary syphilis appears to be decreasing in recent years, whereas that of latent syphilis is increasing. Given the epidemiological features and the annual changes found in this study, future control programs should be more population-specific and spatially targeted.
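The district classification described here is, in essence, hierarchical clustering of per-district incidence profiles cut into three groups. A minimal sketch with synthetic numbers follows; the actual feature set and linkage used by the authors are not specified beyond the 3-year burden data.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(4)
# rows: 12 districts; columns: annual incidence per 100,000 over 3 years (synthetic)
incidence = rng.uniform(10, 60, size=(12, 3))

tree = linkage(incidence, method="ward")
regions = fcluster(tree, t=3, criterion="maxclust")  # cut into 3 epidemiological regions
for district, region in enumerate(regions, start=1):
    print(f"district {district}: region {region}")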
Federal Register 2010, 2011, 2012, 2013, 2014
2010-07-21
... Information Collection; Comment Request; NTIA/FCC Web-based Frequency Coordination System AGENCY: National.... Abstract The National Telecommunications and Information Administration (NTIA) hosts a Web-based system...) bands that are shared on a co-primary basis by federal and non-federal users. The Web-based system...
A filament of dark matter between two clusters of galaxies.
Dietrich, Jörg P; Werner, Norbert; Clowe, Douglas; Finoguenov, Alexis; Kitching, Tom; Miller, Lance; Simionescu, Aurora
2012-07-12
It is a firm prediction of the concordance cold-dark-matter cosmological model that galaxy clusters occur at the intersection of large-scale structure filaments. The thread-like structure of this 'cosmic web' has been traced by galaxy redshift surveys for decades. More recently, the warm–hot intergalactic medium (a sparse plasma with temperatures of 10^5 kelvin to 10^7 kelvin) residing in low-redshift filaments has been observed in emission and absorption. However, a reliable direct detection of the underlying dark-matter skeleton, which should contain more than half of all matter, has remained elusive, because earlier candidates for such detections were either falsified or suffered from low signal-to-noise ratios and unphysical misalignments of dark and luminous matter. Here we report the detection of a dark-matter filament connecting the two main components of the Abell 222/223 supercluster system from its weak gravitational lensing signal, both in a non-parametric mass reconstruction and in parametric model fits. This filament is coincident with an overdensity of galaxies and diffuse, soft-X-ray emission, and contributes a mass comparable to that of an additional galaxy cluster to the total mass of the supercluster. By combining this result with X-ray observations, we can place an upper limit of 0.09 on the hot gas fraction (the mass of X-ray-emitting gas divided by the total mass) in the filament.
Structural, electronic and magnetic properties of TinMo (n = 1-7) clusters
NASA Astrophysics Data System (ADS)
Zhang, Ge; Zhai, Zhongyuan; Sheng, Yong
2017-04-01
The ground-state structures of TinMo and Tin+1 (n = 1-7) clusters and their structural, electronic and magnetic properties are investigated with the density functional method at the B3LYP/LanL2DZ level. Substitution of one Mo atom into the Tin+1 structure is the dominant growth pattern, and the TinMo clusters exhibit enhanced structural stabilities according to the averaged binding energies. The electronic properties are also discussed by investigating chemical hardness and the HOMO-LUMO energy gap. The results reveal that Ti3Mo and Ti5Mo keep higher chemical stabilities compared with the other clusters. For all the studied clusters, the Mo atoms always gain electrons from Ti atoms and carry negative charges. Moreover, doping Mo into the bare titanium clusters can alter their magnetic moments. Ti3Mo and Ti5Mo show relatively large total magnetic moments, which may be related to the presence of exchange-splitting behavior in their densities of states. Supplementary material in the form of one pdf file is available from the Journal web page at https://doi.org/10.1140/epjd/e2017-70589-8
Wireless, Web-Based Interactive Control of Optical Coherence Tomography with Mobile Devices
Mehta, Rajvi; Nankivil, Derek; Zielinski, David J.; Waterman, Gar; Keller, Brenton; Limkakeng, Alexander T.; Kopper, Regis; Izatt, Joseph A.; Kuo, Anthony N.
2017-01-01
Purpose Optical coherence tomography (OCT) is widely used in ophthalmology clinics and has potential for more general medical settings and remote diagnostics. In anticipation of remote applications, we developed wireless interactive control of an OCT system using mobile devices. Methods A web-based user interface (WebUI) was developed to interact with a handheld OCT system. The WebUI consisted of key OCT displays and controls ported to a webpage using HTML and JavaScript. Client–server relationships were created between the WebUI and the OCT system computer. The WebUI was accessed on a cellular phone mounted to the handheld OCT probe to wirelessly control the OCT system. Twenty subjects were imaged using the WebUI to assess the system. System latency was measured using different connection types (wireless 802.11n only, wireless to remote virtual private network [VPN], and cellular). Results Using a cellular phone, the WebUI was successfully used to capture posterior eye OCT images in all subjects. Simultaneous interactivity by a remote user on a laptop was also demonstrated. On average, use of the WebUI added only 58, 95, and 170 ms to the system latency using wireless only, wireless to VPN, and cellular connections, respectively. Qualitatively, operator usage was not affected. Conclusions Using a WebUI, we demonstrated wireless and remote control of an OCT system with mobile devices. Translational Relevance The web and open source software tools used in this project make it possible for any mobile device to potentially control an OCT system through a WebUI. This platform can be a basis for remote, teleophthalmology applications using OCT. PMID:28138415
Aura in Cluster Headache: A Cross-Sectional Study.
de Coo, Ilse F; Wilbrink, Leopoldine A; Ie, Gaby D; Haan, Joost; Ferrari, Michel D
2018-06-22
Aura symptoms have been reported in up to 23% of cluster headache patients, but it is not known whether clinical characteristics are different in participants with and without aura. Using validated web-based questionnaires we assessed the presence and characteristics of attack-related aura and other clinical features in 629 subjects available for analysis from an initial cohort of 756 cluster headache subjects. Participants who screened positive for aura were contacted by telephone for confirmation of the ICHD-III criteria for aura. Typical aura symptoms before or during cluster headache attacks were found in 44/629 participants (7.0%) mainly involving visual symptoms (61.4%). Except for lower alcohol consumption and higher prevalence of frontal pain in participants with aura, no differences in clinical characteristics were found compared with participants without aura. At least 7.0% of the participants with cluster headache in our large cohort reported typical aura symptoms, which most often involved visual symptoms. No major clinical differences were found between participants with and without aura. © 2018 The Authors. Headache: The Journal of Head and Face Pain published by Wiley Periodicals, Inc. on behalf of American Headache Society.
Boson, Bertrand; Denolly, Solène; Turlure, Fanny; Chamot, Christophe; Dreux, Marlène; Cosset, François-Loïc
2017-03-01
Daclatasvir is a direct-acting antiviral agent and potent inhibitor of NS5A, which is involved in replication of the hepatitis C virus (HCV) genome, presumably via membranous web shaping, and assembly of new virions, likely via transfer of the HCV RNA genome to viral particle assembly sites. Daclatasvir inhibits the formation of new membranous web structures and, ultimately, of replication complex vesicles, but also inhibits an early assembly step. We investigated the relationship between daclatasvir-induced clustering of HCV proteins, intracellular localization of viral RNAs, and inhibition of viral particle assembly. Cell-culture-derived HCV particles were produced from Huh7.5 hepatocarcinoma cells in the presence of daclatasvir for short time periods. Infectivity and production of physical particles were quantified and producer cells were subjected to subcellular fractionation. Intracellular colocalization between core, E2, NS5A, NS4B proteins, and viral RNAs was quantitatively analyzed by confocal microscopy and by structured illumination microscopy. Short exposure of HCV-infected cells to daclatasvir reduced viral assembly and induced clustering of structural proteins with non-structural HCV proteins, including core, E2, NS4B, and NS5A. These clustered structures appeared to be inactive assembly platforms, likely owing to loss of functional connection with replication complexes. Daclatasvir greatly reduced delivery of viral genomes to these core clusters without altering HCV RNA colocalization with NS5A. In contrast, daclatasvir neither induced clustered structures nor inhibited HCV assembly in cells infected with a daclatasvir-resistant mutant (NS5A-Y93H), indicating that daclatasvir targets a specific function of NS5A that is common to both processes. In addition to inhibiting replication complex biogenesis, daclatasvir prevents viral assembly by blocking transfer of the viral genome to assembly sites. This leads to clustering of HCV proteins because viral particles and replication complex vesicles cannot form or egress. This dual mode of action of daclatasvir could explain its efficacy in blocking HCV replication in cultured cells and in treatment of patients with HCV infection. Copyright © 2017 AGA Institute. Published by Elsevier Inc. All rights reserved.
The stabilities and electron structures of Al-Mg clusters with 18 and 20 valence electrons
NASA Astrophysics Data System (ADS)
Yang, Huihui; Chen, Hongshan
2017-07-01
The spherical jellium model predicts that metal clusters having 18 and 20 valence electrons correspond to magic numbers and will show specific stabilities. We explore in detail the geometric structures, stabilities and electronic structures of Al-Mg clusters containing 18 and 20 valence electrons using a genetic algorithm combined with density functional theory. The stabilities of the clusters are governed by the electronic configurations and the Mg/Al ratios; clusters with lower Mg/Al ratios are more stable. The molecular orbitals accord with the shell structure predicted by the jellium model, but the 2S level interweaves with the 1D levels, and the 2S and 1D orbitals form a subgroup. The clusters having 20 valence electrons form closed 1S^2 1P^6 1D^10 2S^2 shells and show enhanced stability. The Al-Mg clusters with a valence electron count of 18 do not form closed shells because one 1D orbital is unoccupied. The ionization potential and electron affinity are closely related to the electronic configurations; their values are determined by the subgroups to which the HOMO or LUMO belong. Supplementary material in the form of one pdf file is available from the Journal web page at https://doi.org/10.1140/epjd/e2017-80042-9
The web system for operative description of air quality in the city
NASA Astrophysics Data System (ADS)
Barth, A. A.; Starchenko, A. V.; Fazliev, A. Z.
2009-04-01
Development and implementation of an information-computational system (ICS) is described. The system is oriented toward collective use of computing facilities to determine air quality on the basis of a photochemical model. The ICS has been implemented on the basis of the middleware of the ATMOS web portal [1, 2]. The data and calculation layer of this ICS includes: a mathematical model of pollutant transport based on transport differential equations, which describes the propagation, scattering and chemical transformation of pollutants in the atmosphere [3] and may use values averaged over the city or forecast results obtained with the CHASER model [4]; an atmospheric boundary layer model (ABLM) [3], used for operative numerical prediction of the meteorological parameters (wind speed and direction, air humidity and temperature) that the pollutant transport model requires, which may assimilate meteorological measurement data (including land-based observations and the results of remote sensing of the vertical structure of the atmosphere) or use weather forecast results obtained with the semi-Lagrangian model [5]; and applications for data manipulation: an application for downloading parameters of the atmospheric surface layer and remote-sensing profiles of the vertical structure of the atmosphere from the web sites http://meteo.infospace.ru and http://weather.uwyo.edu, an application for uploading these data into the ICS database, and an application for transforming the uploaded data into the internal data format of the system. At present this ICS is part of the "Climate" web site hosted on the ATMOS portal [6]. The database is based on data schemes that support the calculations in the ICS workflow, and the applications that manipulate the data run in automatic mode. The workflow oriented toward computation of physical parameters contains: an application for calculating geostrophic wind components on the basis of the Ekman equations; applications for solving the equations of the ABL and pollutant transport models; and an application for presenting calculation results in tabular and graphical form. The "Cyberia" cluster [7] located at Tomsk State University is used to solve the pollutant transport equations. References: [1] Gordov E.P., Lykosov V.N., Fazliev A.Z. Web portal on environmental sciences "ATMOS" // Advances in Geosciences, 2006, v. 8, p. 33-38. [2] ATMOS web portal, http://atmos.iao.ru/middleware/ [3] Belikov D.A., Starchenko A.V. Numerical investigation of secondary air pollution formation near an industrial center // Computational Technologies, 2005, v. 10, special issue (Proceedings of the International Conference and School of Young Scientists "Computational and Informational Technologies for Environmental Sciences", CITES 2005, Tomsk, 13-23 March 2005), part 2, p. 99-105. [4] Sudo K., Takahashi M., Kurokawa J., Akimoto H. CHASER: A global chemical model of the troposphere. Model description // J. Geophys. Res., 2002, v. 107(D17), p. 4339. [5] Tolstykh M.A., Fadeev R.Y. Semi-Lagrangian variable-resolution weather prediction model and its further development // Computational Technologies, 2006, v. 11, special issue, p. 176-184. [6] ATMOS web portal, http://climate.atmos.math.tsu.ru/ [7] Tomsk State University Interregional Computational Center, http://skif.tsu.ru
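The first step of the workflow above, recovering geostrophic wind components from a pressure field, follows from the balance relations u_g = -(1/(rho*f)) dp/dy and v_g = (1/(rho*f)) dp/dx. A minimal Python sketch on a toy pressure grid is below; the constant density, constant Coriolis parameter and synthetic pressure field are simplifying assumptions, not the ICS's actual Ekman-equation solver.

import numpy as np

RHO = 1.2            # air density, kg/m^3 (assumed constant)
F = 1.2e-4           # Coriolis parameter near 55 N, 1/s (assumed constant)
dx = dy = 10_000.0   # grid spacing, m

x = np.arange(50) * dx
y = np.arange(40) * dy
# toy surface-pressure field with smooth gradients (Pa)
p = 101_325.0 + 50.0 * np.sin(x[None, :] / 2e5) + 80.0 * np.cos(y[:, None] / 2e5)

dp_dy, dp_dx = np.gradient(p, dy, dx)  # gradients along y (axis 0) and x (axis 1)
u_g = -dp_dy / (RHO * F)
v_g = dp_dx / (RHO * F)
print(float(u_g.mean()), float(v_g.mean()))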
Jani, Saurin D; Argraves, Gary L; Barth, Jeremy L; Argraves, W Scott
2010-04-01
An important objective of DNA microarray-based gene expression experimentation is determining inter-relationships that exist between differentially expressed genes and biological processes, molecular functions, cellular components, signaling pathways, physiologic processes and diseases. Here we describe GeneMesh, a web-based program that facilitates analysis of DNA microarray gene expression data. GeneMesh relates genes in a query set to categories available in the Medical Subject Headings (MeSH) hierarchical index. The interface enables hypothesis-driven relational analysis to a specific MeSH subcategory (e.g., Cardiovascular System, Genetic Processes, Immune System Diseases, etc.) or unbiased relational analysis to broader MeSH categories (e.g., Anatomy, Biological Sciences, Disease, etc.). Genes found associated with a given MeSH category are dynamically linked to facilitate tabular and graphical depiction of Entrez Gene information, Gene Ontology information, KEGG metabolic pathway diagrams and intermolecular interaction information. Expression intensity values of groups of genes that cluster in relation to a given MeSH category, gene ontology or pathway can be displayed as heat maps of Z-score-normalized values. GeneMesh operates on gene expression data derived from a number of commercial microarray platforms including Affymetrix, Agilent and Illumina. GeneMesh is a versatile web-based tool for testing and developing new hypotheses through relating genes in a query set (e.g., differentially expressed genes from a DNA microarray experiment) to descriptors making up the hierarchical structure of the National Library of Medicine controlled vocabulary thesaurus, MeSH. The system further enhances the discovery process by providing links between sets of genes associated with a given MeSH category to a rich set of HTML-linked tabular and graphic information including Entrez Gene summaries, gene ontologies, intermolecular interactions, overlays of genes onto KEGG pathway diagrams and heatmaps of expression intensity values. GeneMesh is freely available online at http://proteogenomics.musc.edu/genemesh/.
Barton, G; Abbott, J; Chiba, N; Huang, DW; Huang, Y; Krznaric, M; Mack-Smith, J; Saleem, A; Sherman, BT; Tiwari, B; Tomlinson, C; Aitman, T; Darlington, J; Game, L; Sternberg, MJE; Butcher, SA
2008-01-01
Background Microarray experimentation requires the application of complex analysis methods as well as the use of non-trivial computer technologies to manage the resultant large data sets. This, together with the proliferation of tools and techniques for microarray data analysis, makes it very challenging for a laboratory scientist to keep up-to-date with the latest developments in this field. Our aim was to develop a distributed e-support system for microarray data analysis and management. Results EMAAS (Extensible MicroArray Analysis System) is a multi-user rich internet application (RIA) providing simple, robust access to up-to-date resources for microarray data storage and analysis, combined with integrated tools to optimise real time user support and training. The system leverages the power of distributed computing to perform microarray analyses, and provides seamless access to resources located at various remote facilities. The EMAAS framework allows users to import microarray data from several sources to an underlying database, to pre-process, quality assess and analyse the data, to perform functional analyses, and to track data analysis steps, all through a single easy to use web portal. This interface offers distance support to users both in the form of video tutorials and via live screen feeds using the web conferencing tool EVO. A number of analysis packages, including R-Bioconductor and Affymetrix Power Tools have been integrated on the server side and are available programmatically through the Postgres-PLR library or on grid compute clusters. Integrated distributed resources include the functional annotation tool DAVID, GeneCards and the microarray data repositories GEO, CELSIUS and MiMiR. EMAAS currently supports analysis of Affymetrix 3' and Exon expression arrays, and the system is extensible to cater for other microarray and transcriptomic platforms. Conclusion EMAAS enables users to track and perform microarray data management and analysis tasks through a single easy-to-use web application. The system architecture is flexible and scalable to allow new array types, analysis algorithms and tools to be added with relative ease and to cope with large increases in data volume. PMID:19032776
A Browser-Based Multi-User Working Environment for Physicists
NASA Astrophysics Data System (ADS)
Erdmann, M.; Fischer, R.; Glaser, C.; Klingebiel, D.; Komm, M.; Müller, G.; Rieger, M.; Steggemann, J.; Urban, M.; Winchen, T.
2014-06-01
Many programs in experimental particle physics do not yet have a graphical interface, or impose demanding platform and software requirements. With the most recent development of the VISPA project, we provide graphical interfaces to existing software programs and access to multiple computing clusters through standard web browsers. The scalable client-server system allows analyses to be performed in sizable teams, and disburdens the individual physicist from installing and maintaining a software environment. The VISPA graphical interfaces are implemented in HTML, JavaScript and extensions to the Python webserver. The webserver uses SSH and RPC to access user data, code and processes on remote sites. As example applications we present graphical interfaces for steering the reconstruction framework OFFLINE of the Pierre Auger experiment, and the analysis development toolkit PXL. The browser-based VISPA system was field-tested in the biweekly homework of a third-year physics course with more than 100 students. We discuss the system deployment and the evaluation by the students.
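The remote-access pattern described above (SSH used to reach user data, code and processes on a remote cluster) can be sketched with the paramiko library; the host, user and command below are placeholders, and this is not VISPA's actual server code.

import paramiko

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect("cluster.example.org", username="physicist")  # hypothetical site

# run one analysis process remotely and stream back its output
stdin, stdout, stderr = client.exec_command("python run_analysis.py --dataset toy")
print(stdout.read().decode())
client.close()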
Comparative case study between D3 and Highcharts on Lustre data visualization
NASA Astrophysics Data System (ADS)
ElTayeby, Omar; John, Dwayne; Patel, Pragnesh; Simmerman, Scott
2013-12-01
One of the challenging tasks in visual analytics is to target clustered time-series data sets, since it is important for data analysts to discover patterns changing over time while keeping their focus on particular subsets. In order to leverage the human ability to quickly perceive these patterns visually, multivariate features should be implemented according to the attributes available. A comparative case study was conducted using JavaScript libraries to demonstrate the differences in their capabilities. A web-based application to monitor the Lustre file system for systems administrators and operations teams has been developed using D3 and Highcharts. Lustre file systems are responsible for managing Remote Procedure Calls (RPCs), which include input/output (I/O) requests between clients and Object Storage Targets (OSTs). The objective of this application is to provide time-series visuals of these calls and of the storage patterns of users on Kraken, a University of Tennessee High Performance Computing (HPC) resource at Oak Ridge National Laboratory (ORNL).
Sequence harmony: detecting functional specificity from alignments
Feenstra, K. Anton; Pirovano, Walter; Krab, Klaas; Heringa, Jaap
2007-01-01
Multiple sequence alignments are often used for the identification of key specificity-determining residues within protein families. We present a web server implementation of the Sequence Harmony (SH) method previously introduced. SH accurately detects subfamily specific positions from a multiple alignment by scoring compositional differences between subfamilies, without imposing conservation. The SH web server allows a quick selection of subtype specific sites from a multiple alignment given a subfamily grouping. In addition, it allows the predicted sites to be directly mapped onto a protein structure and displayed. We demonstrate the use of the SH server using the family of plant mitochondrial alternative oxidases (AOX). In addition, we illustrate the usefulness of combining sequence and structural information by showing that the predicted sites are clustered into a few distinct regions in an AOX homology model. The SH web server can be accessed at www.ibi.vu.nl/programs/seqharmwww. PMID:17584793
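SH scores compositional differences between subfamilies at each alignment column without imposing conservation. As a rough illustration only, the sketch below computes a symmetrized information-theoretic divergence between two subfamilies' residue compositions; this is not the exact published SH formula.

from collections import Counter
from math import log2

def column_divergence(col_a, col_b):
    # symmetrized divergence between the residue compositions of two subfamilies
    # (0 means identical compositions; larger means more subtype-specific)
    pa, pb = Counter(col_a), Counter(col_b)
    na, nb = len(col_a), len(col_b)
    score = 0.0
    for aa in set(pa) | set(pb):
        fa, fb = pa[aa] / na, pb[aa] / nb
        m = (fa + fb) / 2.0
        if fa: score += 0.5 * fa * log2(fa / m)
        if fb: score += 0.5 * fb * log2(fb / m)
    return score

print(column_divergence("LLLLIV", "FFFYYW"))  # compositionally distinct column
print(column_divergence("AAAG", "AAGA"))      # near-identical compositions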
View of Arabella, one of two Skylab spiders and her web
1973-08-16
SL3-108-1307 (July-September 1973) --- A close-up view of Arabella, one of the two Skylab 3 common cross spiders "Araneus diadematus," and the web it had spun in the zero-gravity of space aboard the Skylab space station cluster in Earth orbit. This picture was taken with a hand-held 35mm Nikon camera. During the 59-day Skylab 3 mission the two spiders, Arabella and Anita, were housed in an enclosure onto which a motion picture camera and a still camera were mounted to record their attempts to build a web in the weightless environment. The spider experiment (ED52) was one of 25 experiments selected for Skylab by NASA from more than 3,400 experiment proposals submitted by high school students throughout the nation. ED52 was submitted by 17-year-old Judith S. Miles of Lexington, Massachusetts. Anita died during the last week of the mission. Photo credit: NASA
Efficient image data distribution and management with application to web caching architectures
NASA Astrophysics Data System (ADS)
Han, Keesook J.; Suter, Bruce W.
2003-03-01
We present compact image data structures and associated packet delivery techniques for effective Web caching architectures. Presently, images on a web page are stored inefficiently, using a single image per file. Our approach is to use clustering to merge similar images into a single file in order to exploit the redundancy between images. Our studies indicate that a 30-50% reduction in image data size can be achieved by eliminating the redundancies of color indexes. Attached to this file is new metadata to permit easy extraction of individual images. This approach permits more efficient use of the cache, since a shorter list of cache references is required. Packet and transmission delays can be reduced by 50% by eliminating redundant TCP/IP headers and connection time. Thus, this innovative paradigm for the elimination of redundancy may provide valuable benefits for optimizing packet delivery in IP networks by reducing latency and minimizing bandwidth requirements.
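The two ingredients described, grouping images by color redundancy and attaching metadata for easy extraction, can be sketched as follows. The histogram signature and the byte-packing format are illustrative choices of ours, not the paper's exact data structures.

import io
import json
import numpy as np

def histogram_signature(img, bins=16):
    # coarse color histogram used to decide which images are similar
    h, _ = np.histogramdd(img.reshape(-1, 3), bins=(bins,) * 3,
                          range=((0, 256),) * 3)
    h = h.ravel()
    return h / h.sum()

def merge_cluster(images):
    # pack one cluster of similar images into a single byte stream, with
    # metadata offsets so individual images remain easy to extract
    buf, meta = io.BytesIO(), []
    for name, img in images:
        meta.append({"name": name, "offset": buf.tell(),
                     "shape": img.shape, "dtype": str(img.dtype)})
        buf.write(img.tobytes())
    return json.dumps(meta).encode() + b"\n" + buf.getvalue()

rng = np.random.default_rng(6)
imgs = [("a.png", rng.integers(0, 256, (8, 8, 3), dtype=np.uint8)),
        ("b.png", rng.integers(0, 256, (8, 8, 3), dtype=np.uint8))]
blob = merge_cluster(imgs)
print(len(blob), histogram_signature(imgs[0][1]).shape)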
antiSMASH 3.0-a comprehensive resource for the genome mining of biosynthetic gene clusters.
Weber, Tilmann; Blin, Kai; Duddela, Srikanth; Krug, Daniel; Kim, Hyun Uk; Bruccoleri, Robert; Lee, Sang Yup; Fischbach, Michael A; Müller, Rolf; Wohlleben, Wolfgang; Breitling, Rainer; Takano, Eriko; Medema, Marnix H
2015-07-01
Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, in order for users to be able to organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Ligand cluster-based protein network and ePlatton, a multi-target ligand finder.
Du, Yu; Shi, Tieliu
2016-01-01
Small molecules are information carriers that make cells aware of external changes and couple internal metabolic and signalling pathway systems with each other. In specific physiological states, natural or artificial molecules are used to interact with selective biological targets to activate or inhibit their functions in order to achieve the expected biological and physiological output. Millions of years of evolution have optimized biological processes and pathways, and now the endocrine and immune systems cannot work properly without some key small molecules. Over the past thousands of years, the human race has managed to find many medicines against diseases through trial-and-error experience. In recent decades, with the deepening understanding of life and the progress of molecular biology, researchers have spared no effort designing molecules that target one or two key enzymes and receptors related to the corresponding diseases. But recent studies in pharmacogenomics have shown that polypharmacology may be necessary for the effects of drugs, which challenges the paradigm of 'one drug, one target, one disease'. Nowadays, cheminformatics and structural biology can help us reasonably take advantage of polypharmacology to design next-generation promiscuous drugs and drug combination therapies. 234,591 protein-ligand interactions were extracted from ChEMBL. By 2D structure similarity, 13,769 ligand clusters emerged from 156,151 distinct ligands recognized by 1477 proteins. Ligand cluster- and sequence-based protein networks (LCBN, SBN) were constructed, compared and analysed. For assisting compound design, exploring polypharmacology and finding possible drug combinations, we integrated pathways, diseases, drug adverse reactions and the relationships between targets and ligand clusters into the web platform ePlatton, which is available at http://www.megabionet.org/eplatton. Although there were some disagreements between the LCBN and SBN, communities in both networks were largely the same, with normalized mutual information at 0.9. The study of target and ligand cluster promiscuity underlying the LCBN showed that light ligand clusters were more promiscuous than heavy ones and that highly connected nodes tended to be protein kinases and involved in phosphorylation. ePlatton considerably reduced the redundancy of the ligand sets of targets and made it easy to deduce possible relationships between compounds and targets, pathways and side effects. ePlatton behaved reliably in validation experiments and was also fast in virtual screening and information retrieval. Graphical abstract: Cluster exemplars and ePlatton's mechanism.
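A hedged sketch of 2D-similarity ligand clustering with RDKit's Butina algorithm; the fingerprint type, radius and distance cutoff are illustrative choices, not the parameters used to build ePlatton.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem
from rdkit.ML.Cluster import Butina

def cluster_ligands(smiles_list, cutoff=0.6):
    """Cluster ligands by 2D fingerprint similarity (Butina algorithm)."""
    mols = [Chem.MolFromSmiles(s) for s in smiles_list]
    fps = [AllChem.GetMorganFingerprintAsBitVect(m, radius=2, nBits=2048)
           for m in mols if m is not None]
    # Butina expects a flattened lower-triangle distance matrix.
    dists = []
    for i in range(1, len(fps)):
        sims = DataStructs.BulkTanimotoSimilarity(fps[i], fps[:i])
        dists.extend(1.0 - s for s in sims)
    return Butina.ClusterData(dists, len(fps), cutoff, isDistData=True)

clusters = cluster_ligands(["CCO", "CCN", "c1ccccc1", "c1ccccc1C"])
print(clusters)  # tuples of ligand indices, one tuple per cluster
```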
Cooperation and Contagion in Web-Based, Networked Public Goods Experiments
Suri, Siddharth; Watts, Duncan J.
2011-01-01
A longstanding idea in the literature on human cooperation is that cooperation should be reinforced when conditional cooperators are more likely to interact. In the context of social networks, this idea implies that cooperation should fare better in highly clustered networks such as cliques than in networks with low clustering such as random networks. To test this hypothesis, we conducted a series of web-based experiments, in which 24 individuals played a local public goods game arranged on one of five network topologies that varied between disconnected cliques and a random regular graph. In contrast with previous theoretical work, we found that network topology had no significant effect on average contributions. This result implies either that individuals are not conditional cooperators, or else that cooperation does not benefit from positive reinforcement between connected neighbors. We then tested both of these possibilities in two subsequent series of experiments in which artificial seed players were introduced, making either full or zero contributions. First, we found that although players did generally behave like conditional cooperators, they were as likely to decrease their contributions in response to low contributing neighbors as they were to increase their contributions in response to high contributing neighbors. Second, we found that positive effects of cooperation were contagious only to direct neighbors in the network. In total we report on 113 human subjects experiments, highlighting the speed, flexibility, and cost-effectiveness of web-based experiments over those conducted in physical labs. PMID:21412431
Clusters, voids and reconstructions of the cosmic web
NASA Astrophysics Data System (ADS)
Bos, E. G. Patrick
2016-12-01
About 95% of the Universe consists of dark matter and dark energy that we cannot see, and of the remaining 5% of normal matter we can only see a small part. However, if we want to study the Universe as a whole, we have to get to know it completely: we have to uncover indirectly where dark matter is hiding and what the nature of dark energy is. In this thesis we explore two such methods. The first part describes how we can use the large empty regions between galaxies, "voids", to learn more about dark energy. We converted our theoretical simulations into a model of real galaxy observations and performed in this model the same measurements as we would in real observations. This way, we show that it is indeed possible to unravel the nature of dark energy. The second part is based on our computer code, BARCODE. It unites two models: a physical model of the formation of the Cosmic Web, and a description of the observational effects of (clusters of) galaxies, in particular the effect of redshift on distance measurements. It allows us to trace our observations back to the primordial conditions, which enables us to trace all (dark) matter, including that which we did not directly observe. The result is a reconstruction of the complete Cosmic Web. In these reconstructions we studied "filaments", objects that have not yet been extensively studied. BARCODE will enable further study, e.g. by using it to find observable filaments.
Mobile Visualization and Analysis Tools for Spatial Time-Series Data
NASA Astrophysics Data System (ADS)
Eberle, J.; Hüttich, C.; Schmullius, C.
2013-12-01
The Siberian Earth System Science Cluster (SIB-ESS-C) provides access and analysis services for spatial time-series data built on products from the Moderate Resolution Imaging Spectroradiometer (MODIS) and climate data from meteorological stations. A web portal for data access, visualization and analysis with standards-compliant web services had previously been developed for SIB-ESS-C. As a further enhancement, a mobile app was developed to provide easy access to these time-series data during field campaigns. The app sends the current position from the GPS receiver and a dataset selected by the user (such as land surface temperature or vegetation indices) to the SIB-ESS-C web service and receives the requested time-series data for the identified pixel in real time. The data are then plotted directly in the app. Furthermore, the user can analyze the time-series data for breakpoints and other phenological values. These analyses are executed on demand on the SIB-ESS-C web server, and the results are transferred to the app. Any processing can also be done through the SIB-ESS-C web portal. The aim of this work is to make spatial time-series data and analysis functions available to end users without the need for local data processing. In this presentation the author gives an overview of this new mobile app, its functionality, the technical infrastructure, and how the app was developed and the experience gained.
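The request/response pattern might look like the following Python sketch; the endpoint URL, parameter names and JSON shape are assumptions, since the abstract does not specify the service interface.

```python
import requests

# Hypothetical endpoint and parameter names -- the actual SIB-ESS-C
# service interface is not published in the abstract.
SERVICE_URL = "https://example.org/sib-ess-c/timeseries"

def fetch_pixel_timeseries(lat, lon, dataset="land_surface_temperature"):
    """Request the time series for the pixel containing (lat, lon)."""
    resp = requests.get(
        SERVICE_URL,
        params={"lat": lat, "lon": lon, "dataset": dataset},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()  # e.g. [{"date": "2013-07-01", "value": 291.4}, ...]

series = fetch_pixel_timeseries(61.5, 89.2)
for point in series:
    print(point["date"], point["value"])
```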
75 FR 27182 - Energy Conservation Program: Web-Based Compliance and Certification Management System
Federal Register 2010, 2011, 2012, 2013, 2014
2010-05-14
... Conservation Program: Web-Based Compliance and Certification Management System AGENCY: Office of Energy... certification reports to the Department of Energy (DOE) through an electronic Web-based tool, the Compliance and... following means: 1. Compliance and Certification Management System (CCMS)--via the Web portal: http...
Information Retrieval System for Japanese Standard Disease-Code Master Using XML Web Service
Hatano, Kenji; Ohe, Kazuhiko
2003-01-01
An information retrieval system for the Japanese Standard Disease-Code Master using XML Web Services has been developed. XML Web Services are a new distributed processing approach built on standard Internet technologies. Through the seamless remote method invocation that XML Web Services provide, users can obtain the latest disease-code master information from rich desktop applications or Internet web sites that refer to this service. PMID:14728364
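As an illustration of the access pattern such a service enables, here is a hedged Python sketch using the zeep SOAP library; the WSDL URL and the GetDiseaseByCode operation are placeholders, not the actual service contract.

```python
from zeep import Client

# Hypothetical WSDL location and operation name -- the abstract does not
# publish the real service contract; this only shows the remote-method style.
WSDL_URL = "https://example.jp/disease-master/service?wsdl"

client = Client(WSDL_URL)
# An XML Web Service exposes operations as remote methods; here we assume
# a lookup of a disease-master record by its code.
result = client.service.GetDiseaseByCode("8842167")
print(result)
```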
Web Mining: Machine Learning for Web Applications.
ERIC Educational Resources Information Center
Chen, Hsinchun; Chau, Michael
2004-01-01
Presents an overview of machine learning research and reviews methods used for evaluating machine learning systems. Ways that machine-learning algorithms were used in traditional information retrieval systems in the "pre-Web" era are described, and the field of Web mining and how machine learning has been used in different Web mining…
NASA Astrophysics Data System (ADS)
Friberg, P. A.; Luis, R. S.; Quintiliani, M.; Lisowski, S.; Hunter, S.
2014-12-01
Recently, a novel set of modules has been included in the Open Source Earthworm seismic data processing system, supporting the use of web applications. These include the Mole sub-system, for storing relevant event data in a MySQL database (see M. Quintiliani and S. Pintore, SRL, 2013), and an embedded web server, Moleserv, for serving such data to web clients in QuakeML format. These modules have enabled, for the first time using Earthworm, the use of web applications for seismic data processing. Such applications can greatly simplify the operation and maintenance of seismic data processing centers by having one or more servers provide both the relevant data and the data processing applications themselves to client machines running arbitrary operating systems. Web applications with secure online access allow operators to work anywhere, without the often cumbersome and bandwidth-hungry use of secure shell or virtual private networks. Furthermore, web applications can seamlessly access third-party data repositories to acquire additional information, such as maps. Finally, HTML email has brought the possibility of specialized web applications to be used in email clients. This is the case of EWHTMLEmail, which produces event notification emails that are in fact simple web applications for plotting relevant seismic data. Providing web services as part of Earthworm has enabled a number of other tools as well. One is ISTI's EZ Earthworm, a web-based command and control system for an otherwise command-line driven system; another is a waveform web service. The waveform web service serves Earthworm data to additional web clients for plotting, picking, and other web-based processing tools. The current Earthworm waveform web service hosts an advanced plotting capability for providing views of event-based waveforms from a Mole database served by Moleserv. The current trend towards the usage of cloud services supported by web applications is driving improvements in JavaScript, CSS and HTML, as well as faster and more efficient web browsers, including mobile ones. It is foreseeable that in the near future web applications will be as powerful and efficient as native applications. Hence, the work described here is a first step towards bringing the Open Source Earthworm seismic data processing system to this new paradigm.
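To illustrate the kind of client this enables, here is a hedged Python sketch that fetches QuakeML from a Moleserv-style endpoint and lists event origins; the URL and query parameters are placeholders, and only the QuakeML 1.2 BED namespace is taken as given.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical Moleserv endpoint -- the actual URL layout is deployment-specific.
QUAKEML_URL = "http://earthworm.example.org:8080/quakeml?starttime=2014-01-01"

with urllib.request.urlopen(QUAKEML_URL, timeout=30) as resp:
    tree = ET.parse(resp)

# QuakeML 1.2 namespace for the basic event description elements.
NS = {"q": "http://quakeml.org/xmlns/bed/1.2"}
for event in tree.getroot().iter("{http://quakeml.org/xmlns/bed/1.2}event"):
    origin = event.find("q:origin", NS)
    if origin is None:
        continue
    time = origin.findtext("q:time/q:value", default="?", namespaces=NS)
    lat = origin.findtext("q:latitude/q:value", default="?", namespaces=NS)
    lon = origin.findtext("q:longitude/q:value", default="?", namespaces=NS)
    print(time, lat, lon)
```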
Accessing the SEED genome databases via Web services API: tools for programmers.
Disz, Terry; Akhter, Sajia; Cuevas, Daniel; Olson, Robert; Overbeek, Ross; Vonstein, Veronika; Stevens, Rick; Edwards, Robert A
2010-06-14
The SEED integrates many publicly available genome sequences into a single resource. The database contains accurate and up-to-date annotations based on the subsystems concept that leverages clustering between genomes and other clues to accurately and efficiently annotate microbial genomes. The backend is used as the foundation for many genome annotation tools, such as the Rapid Annotation using Subsystems Technology (RAST) server for whole genome annotation, the metagenomics RAST server for random community genome annotations, and the annotation clearinghouse for exchanging annotations from different resources. In addition to a web user interface, the SEED also provides a Web services based API for programmatic access to the data in the SEED, allowing the development of third-party tools and mash-ups. The currently exposed Web services encompass over forty different methods for accessing data related to microbial genome annotations. The Web services provide comprehensive access to the database back end, allowing any programmer access to the most consistent and accurate genome annotations available. The Web services are deployed using a platform-independent, service-oriented approach that allows the user to choose the most suitable programming platform for their application. Example code demonstrates that the Web services can be used to access the SEED using common bioinformatics programming languages such as Perl, Python, and Java. We present a novel approach to access the SEED database. Using Web services, a robust API for access to genomics data is provided, without requiring large-volume downloads all at once. The API ensures timely access to the most current datasets available, including new genomes as soon as they come online.
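In the spirit of the example code the abstract mentions, the sketch below shows the general access pattern from Python; the WSDL location and operation names (ListGenomes, GetFeatures) are placeholders for illustration, not the published SEED API — consult the SEED documentation for the real service contract.

```python
from zeep import Client

# Placeholder WSDL and operation names -- this only illustrates the
# service-oriented access pattern, not the actual SEED method set.
client = Client("https://example.org/seed/wsdl")

# 1. Enumerate available genomes.
genomes = client.service.ListGenomes()

# 2. Pull the annotated features for a few genomes and print a summary.
for genome_id in genomes[:5]:
    features = client.service.GetFeatures(genome_id)
    print(genome_id, "->", len(features), "features")
```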
Integration of Simulation into Pre-Laboratory Chemical Course: Computer Cluster versus WebCT
ERIC Educational Resources Information Center
Limniou, Maria; Papadopoulos, Nikos; Whitehead, Christopher
2009-01-01
Pre-laboratory activities have been known to improve students' preparation before their practical work as they assist students to make available more working memory capacity for actual learning during the laboratory. The aim of this investigation was to compare two different teaching approaches which supported a pre-laboratory session by using the…
ERIC Educational Resources Information Center
DuRant, Robert; Champion, Heather; Wolfson, Mark; Omli, Morrow; McCoy, Thomas; D'Agostino, Ralph B., Jr.; Wagoner, Kim; Mitra, Ananda
2007-01-01
Objective: The authors examined the clustering of health-risk behaviors among college students who reported date fight involvement. Participants and Methods: The authors administered a Web-based survey to a stratified random sample of 3,920 college students from 10 universities in North Carolina. Results: Among men, 5.6% reported date fight…
Application of Learning Analytics Using Clustering Data Mining for Students' Disposition Analysis
ERIC Educational Resources Information Center
Bharara, Sanyam; Sabitha, Sai; Bansal, Abhay
2018-01-01
Learning Analytics (LA) is an emerging field in which sophisticated analytic tools are used to improve learning and education. It draws from, and is closely tied to, a series of other fields of study like business intelligence, web analytics, academic analytics, educational data mining, and action analytics. The main objective of this research…
Rot, Gregor; Parikh, Anup; Curk, Tomaz; Kuspa, Adam; Shaulsky, Gad; Zupan, Blaz
2009-08-25
Bioinformatics often leverages on recent advancements in computer science to support biologists in their scientific discovery process. Such efforts include the development of easy-to-use web interfaces to biomedical databases. Recent advancements in interactive web technologies require us to rethink the standard submit-and-wait paradigm, and craft bioinformatics web applications that share analytical and interactive power with their desktop relatives, while retaining simplicity and availability. We have developed dictyExpress, a web application that features a graphical, highly interactive explorative interface to our database that consists of more than 1000 Dictyostelium discoideum gene expression experiments. In dictyExpress, the user can select experiments and genes, perform gene clustering, view gene expression profiles across time, view gene co-expression networks, perform analyses of Gene Ontology term enrichment, and simultaneously display expression profiles for a selected gene in various experiments. Most importantly, these tasks are achieved through web applications whose components are seamlessly interlinked and immediately respond to events triggered by the user, thus providing a powerful explorative data analysis environment. dictyExpress is a precursor for a new generation of web-based bioinformatics applications with simple but powerful interactive interfaces that resemble that of the modern desktop. While dictyExpress serves mainly the Dictyostelium research community, it is relatively easy to adapt it to other datasets. We propose that the design ideas behind dictyExpress will influence the development of similar applications for other model organisms.
Deep convolutional neural networks for pan-specific peptide-MHC class I binding prediction.
Han, Youngmahn; Kim, Dongsup
2017-12-28
Computational scanning of peptide candidates that bind to a specific major histocompatibility complex (MHC) can speed up the peptide-based vaccine development process, and therefore various methods are being actively developed. Recently, machine-learning-based methods have generated successful results by training on large amounts of experimental data. However, many machine-learning-based methods are generally less sensitive in recognizing locally clustered interactions, which can synergistically stabilize peptide binding. The deep convolutional neural network (DCNN) is a deep learning method inspired by the visual recognition process of the animal brain and is known to be able to capture meaningful local patterns from 2D images. Once the peptide-MHC interactions can be encoded into image-like array (ILA) data, a DCNN can be employed to build a predictive model for peptide-MHC binding prediction. In this study, we demonstrated that a DCNN is able not only to reliably predict peptide-MHC binding, but also to sensitively detect locally clustered interactions. Nonapeptide-HLA-A and -B binding data were encoded into ILA data. A DCNN, as a pan-specific prediction model, was trained on the ILA data. The DCNN showed higher performance than other prediction tools for the latest benchmark datasets, which consist of 43 datasets for 15 HLA-A alleles and 25 datasets for 10 HLA-B alleles. In particular, the DCNN outperformed other tools for alleles belonging to the HLA-A3 supertype. The F1 scores of the DCNN were 0.86, 0.94, and 0.67 for the HLA-A*31:01, HLA-A*03:01, and HLA-A*68:01 alleles, respectively, which were significantly higher than those of other tools. We found that the DCNN was able to recognize locally clustered interactions that could synergistically stabilize peptide binding. We developed ConvMHC, a web server to provide user-friendly web interfaces for peptide-MHC class I binding predictions using the DCNN. The ConvMHC web server can be accessed via http://jumong.kaist.ac.kr:8080/convmhc . We developed a novel method for peptide-HLA-I binding predictions using a DCNN trained on ILA data that encode peptide binding data, and demonstrated the reliable performance of the DCNN in nonapeptide binding predictions through independent evaluation on the latest IEDB benchmark datasets. Our approaches can be applied to characterize locally clustered patterns in molecular interactions, such as protein/DNA, protein/RNA, and drug/protein interactions.
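A minimal PyTorch sketch of the idea, assuming an invented 9 x 34 image-like encoding (the real ILA layout is defined in the paper): stacked convolutions capture locally clustered interaction patterns, and a small dense head scores binding.

```python
import torch
import torch.nn as nn

# Sketch of a DCNN over an image-like peptide-MHC encoding. The input shape
# (1 channel, 9 peptide positions, 34 contact features) is an assumption
# for illustration, not the ConvMHC encoding.
class BindingCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),  # local interaction patterns
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((3, 3)),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 3 * 3, 64),
            nn.ReLU(),
            nn.Linear(64, 1),  # binding score (binder vs. non-binder)
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = BindingCNN()
batch = torch.randn(8, 1, 9, 34)  # 8 encoded peptide-MHC pairs
print(model(batch).shape)  # torch.Size([8, 1])
```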
deFUME: Dynamic exploration of functional metagenomic sequencing data.
van der Helm, Eric; Geertz-Hansen, Henrik Marcus; Genee, Hans Jasper; Malla, Sailesh; Sommer, Morten Otto Alexander
2015-07-31
Functional metagenomic selections represent a powerful technique that is widely applied for identification of novel genes from complex metagenomic sources. However, whereas hundreds to thousands of clones can be easily generated and sequenced over a few days of experiments, analyzing the data is time consuming and constitutes a major bottleneck for experimental researchers in the field. Here we present the deFUME web server, an easy-to-use web-based interface for processing, annotation and visualization of functional metagenomics sequencing data, tailored to meet the requirements of non-bioinformaticians. The web server integrates multiple analysis steps into one single workflow: read assembly, open reading frame prediction, and annotation with BLAST, InterPro and GO classifiers. Analysis results are visualized in an online dynamic web interface. The deFUME web server provides a fast track from raw sequence to a comprehensive visual data overview that facilitates effortless inspection of gene function, clustering and distribution. The web server is available at cbs.dtu.dk/services/deFUME/ and the source code is distributed at github.com/EvdH0/deFUME.
VAAPA: a web platform for visualization and analysis of alternative polyadenylation.
Guan, Jinting; Fu, Jingyi; Wu, Mingcheng; Chen, Longteng; Ji, Guoli; Quinn Li, Qingshun; Wu, Xiaohui
2015-02-01
Polyadenylation [poly(A)] is an essential process during the maturation of most mRNAs in eukaryotes. Alternative polyadenylation (APA) as an important layer of gene expression regulation has been increasingly recognized in various species. Here, a web platform for visualization and analysis of alternative polyadenylation (VAAPA) was developed. This platform can visualize the distribution of poly(A) sites and poly(A) clusters of a gene or a section of a chromosome. It can also highlight genes with switched APA sites among different conditions. VAAPA is an easy-to-use web-based tool that provides functions of poly(A) site query, data uploading, downloading, and APA sites visualization. It was designed in a multi-tier architecture and developed based on Smart GWT (Google Web Toolkit) using Java as the development language. VAAPA will be a valuable addition to the community for the comprehensive study of APA, not only by making the high quality poly(A) site data more accessible, but also by providing users with numerous valuable functions for poly(A) site analysis and visualization. Copyright © 2014 Elsevier Ltd. All rights reserved.
The Fabric of the Universe: Exploring the Cosmic Web in 3D Prints and Woven Textiles
NASA Astrophysics Data System (ADS)
Diemer, Benedikt; Facio, Isaac
2017-05-01
We introduce The Fabric of the Universe, an art and science collaboration focused on exploring the cosmic web of dark matter with unconventional techniques and materials. We discuss two of our projects in detail. First, we describe a pipeline for translating three-dimensional (3D) density structures from N-body simulations into solid surfaces suitable for 3D printing, and present prints of a cosmological volume and of the infall region around a massive cluster halo. In these models, we discover wall-like features that are invisible in two-dimensional projections. Going beyond the sheer visualization of simulation data, we undertake an exploration of the cosmic web as a three-dimensional woven textile. To this end, we develop experimental 3D weaving techniques to create sphere-like and filamentary shapes and radically simplify a region of the cosmic web into a set of filaments and halos. We translate the resulting tree structure into a series of commands that can be executed by a digital weaving machine, and present a large-scale textile installation.
Web-based support as an adjunct to group-based smoking cessation for adolescents
Mermelstein, Robin; Turner, Lindsey
2008-01-01
Although group-based programs remain the most common treatment approach for adolescent smoking cessation, success rates for these programs have been relatively modest, and their reach may be limited. Web-based adjuncts may be one way to boost the efficacy and reach of group-based approaches. The purpose of this study was to evaluate the effectiveness of enhancing the American Lung Association’s Not on Tobacco program (NOT) with a Web-based adjunct (NOT Plus). Twenty-nine high schools were randomly assigned to either the NOT program alone or to the NOT Plus condition, which included access to a specially designed Web site for teens, along with proactive phone calls from the group facilitator to the participant. Self-reported smoking behavior was obtained at end-of-program and at a 3-month follow-up. Using hierarchical linear modeling, accounting for the clustering of students in schools, and controlling for student gender, grade, race, and baseline smoking rate, there was a marginally significant (p = .06) condition effect at end-of-treatment and a significant effect at 3-month follow-up (p < .05) favoring the NOT Plus condition. Approximately 57% of adolescents reported visiting the Web site, and among the NOT Plus condition, use of the Web site was associated with cessation significantly at end-of-program (p < .05), but not at 3 months. Adolescents in urban schools were more likely to access the Web site than those in rural schools. Participants who visited the Web site rated it positively on several dimensions. Reasons for not using the Web site will be discussed, as well as its value as an adjunct. PMID:17491173
Generic Space Science Visualization in 2D/3D using SDDAS
NASA Astrophysics Data System (ADS)
Mukherjee, J.; Murphy, Z. B.; Gonzalez, C. A.; Muller, M.; Ybarra, S.
2017-12-01
The Southwest Data Display and Analysis System (SDDAS) is a flexible multi-mission / multi-instrument software system intended to support space physics data analysis, and has been in active development for over 20 years. For the Magnetospheric Multi-Scale (MMS), Juno, Cluster, and Mars Express missions, we have modified these generic tools for visualizing data in two and three dimensions. The SDDAS software is open source and makes use of various other open source packages, including VTK and Qwt. The software offers interactive plotting as well as Python and Lua modules to modify the data before plotting. In theory, by writing a Lua or Python module to read the data, any data could be used. Currently, the software can natively read data in IDFS, CEF, CDF, FITS, SEG-Y, ASCII, and XLS formats. We have integrated the software with other Python packages such as SPICE and SpacePy. Included with the visualization software are a database application and other utilities for managing data that can retrieve data from the Cluster Active Archive and the Space Physics Data Facility at Goddard, as well as other local archives. Line plots, spectrograms, geographic plots, volume plots, strip charts, and more can be generated with SDDAS. Furthermore, due to its design, output is not limited strictly to visualization, as SDDAS can also be used to generate stand-alone IDL or Python visualization code. Lastly, SDDAS has been successfully used as a backend for several web-based analysis systems.
Blaya, Joaquín A.; Shin, Sonya S.; Yagui, Martin; Contreras, Carmen; Cegielski, Peter; Yale, Gloria; Suarez, Carmen; Asencios, Luis; Bayona, Jaime; Kim, Jihoon; Fraser, Hamish S. F.
2014-01-01
Background Lost, delayed or incorrect laboratory results are associated with delays in initiating treatment. Delays in treatment for Multi-Drug Resistant Tuberculosis (MDR-TB) can worsen patient outcomes and increase transmission. The objective of this study was to evaluate the impact of a laboratory information system in reducing delays and the time for MDR-TB patients to culture convert (stop transmitting). Methods Setting: 78 primary Health Centers (HCs) in Lima, Peru. Participants lived within the catchment area of participating HCs and had at least one MDR-TB risk factor. The study design was a cluster randomized controlled trial with baseline data. The intervention was the e-Chasqui web-based laboratory information system. Main outcome measures were: times to communicate a result; to start or change a patient's treatment; and for that patient to culture convert. Results 1671 patients were enrolled. Intervention HCs took significantly less time to receive drug susceptibility test (DST) (median 11 vs. 17 days, Hazard Ratio 0.67 [0.62–0.72]) and culture (5 vs. 8 days, 0.68 [0.65–0.72]) results. The time to treatment was not significantly different, but patients in intervention HCs took 16 days (20%) less time to culture convert (p = 0.047). Conclusions The eChasqui system reduced the time to communicate results between laboratories and HCs and time to culture conversion. It is now used in over 259 HCs covering 4.1 million people. This is the first randomized controlled trial of a laboratory information system in a developing country for any disease and the only study worldwide to show clinical impact of such a system. Trial Registration ClinicalTrials.gov NCT01201941 PMID:24721980
A clustering approach to segmenting users of internet-based risk calculators.
Harle, C A; Downs, J S; Padman, R
2011-01-01
Risk calculators are widely available Internet applications that deliver quantitative health risk estimates to consumers. Although these tools are known to have varying effects on risk perceptions, little is known about who will be more likely to accept objective risk estimates. To identify clusters of online health consumers that help explain variation in individual improvement in risk perceptions from web-based quantitative disease risk information. A secondary analysis was performed on data collected in a field experiment that measured people's pre-diabetes risk perceptions before and after visiting a realistic health promotion website that provided quantitative risk information. K-means clustering was performed on numerous candidate variable sets, and the different segmentations were evaluated based on between-cluster variation in risk perception improvement. Variation in responses to risk information was best explained by clustering on pre-intervention absolute pre-diabetes risk perceptions and an objective estimate of personal risk. Members of a high-risk overestimator cluster showed large improvements in their risk perceptions, but clusters of both moderate-risk and high-risk underestimators were much more muted in improving their optimistically biased perceptions. Cluster analysis provided a unique approach for segmenting health consumers and predicting their acceptance of quantitative disease risk information. These clusters suggest that health consumers were very responsive to good news, but tended not to incorporate bad news into their self-perceptions much. These findings help to quantify variation among online health consumers and may inform the targeted marketing of and improvements to risk communication tools on the Internet.
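A hedged sketch of this segmentation approach with scikit-learn, using synthetic data and illustrative variable names (the study's actual feature set was pre-intervention absolute risk perception plus an objective risk estimate):

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-ins for the two clustering variables.
rng = np.random.default_rng(0)
perceived_risk = rng.uniform(0, 100, size=200)  # self-rated chance of pre-diabetes (%)
objective_risk = rng.uniform(0, 100, size=200)  # calculator-derived estimate (%)
X = np.column_stack([perceived_risk, objective_risk])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Inspect each segment, e.g. to spot over- vs. under-estimators.
for label in range(3):
    members = X[kmeans.labels_ == label]
    bias = (members[:, 0] - members[:, 1]).mean()  # perceived minus objective
    print(f"cluster {label}: n={len(members)}, mean bias={bias:+.1f}")
```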
Castro-Mondragon, Jaime Abraham; Jaeger, Sébastien; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques
2017-07-27
Transcription factor (TF) databases contain multitudes of binding motifs (TFBMs) from various sources, from which non-redundant collections are derived by manual curation. The advent of high-throughput methods stimulated the production of novel collections with increasing numbers of motifs. Meta-databases, built by merging these collections, contain redundant versions, because available tools are not suited to automatically identify and explore biologically relevant clusters among thousands of motifs. Motif discovery from genome-scale data sets (e.g. ChIP-seq) also produces redundant motifs, hampering the interpretation of results. We present matrix-clustering, a versatile tool that clusters similar TFBMs into multiple trees, and automatically creates non-redundant TFBM collections. A feature unique to matrix-clustering is its dynamic visualisation of aligned TFBMs, and its capability to simultaneously treat multiple collections from various sources. We demonstrate that matrix-clustering considerably simplifies the interpretation of combined results from multiple motif discovery tools, and highlights biologically relevant variations of similar motifs. We also ran a large-scale application to cluster ∼11 000 motifs from 24 entire databases, showing that matrix-clustering correctly groups motifs belonging to the same TF families, and drastically reduced motif redundancy. matrix-clustering is integrated within the RSAT suite (http://rsat.eu/), accessible through a user-friendly web interface or command-line for its integration in pipelines. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
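A toy sketch of motif clustering in Python, assuming motifs can be compared as flattened, padded frequency matrices; real tools such as matrix-clustering additionally align motifs over offsets and reverse complements before scoring similarity, which this sketch omits.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

def pad_motif(pfm, width):
    """Pad a 4 x w frequency matrix with uniform columns to a common width."""
    pad = width - pfm.shape[1]
    uniform = np.full((4, pad), 0.25)
    return np.hstack([pfm, uniform])

# Random stand-ins for position frequency matrices (4 rows, 8 columns each).
motifs = [np.random.default_rng(i).dirichlet(np.ones(4), size=8).T for i in range(50)]
width = max(m.shape[1] for m in motifs)
X = np.array([pad_motif(m, width).ravel() for m in motifs])

# Average-linkage clustering on correlation distance between motifs.
Z = linkage(pdist(X, metric="correlation"), method="average")
labels = fcluster(Z, t=0.5, criterion="distance")
print(f"{len(motifs)} motifs grouped into {labels.max()} clusters")
```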
OPTICAL COLORS OF INTRACLUSTER LIGHT IN THE VIRGO CLUSTER CORE
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rudick, Craig S.; Mihos, J. Christopher; Harding, Paul
2010-09-01
We continue our deep optical imaging survey of the Virgo cluster using the CWRU Burrell Schmidt telescope by presenting B-band surface photometry of the core of the Virgo cluster in order to study the cluster's intracluster light (ICL). We find ICL features down to μB ≈ 29 mag arcsec⁻², confirming the results of Mihos et al., who saw a vast web of low surface brightness streams, arcs, plumes, and diffuse light in the Virgo cluster core using V-band imaging. By combining these two data sets, we are able to measure the optical colors of many of the cluster's low surface brightness features. While much of our imaging area is contaminated by galactic cirrus, the cluster core near the cD galaxy, M87, is unobscured. We trace the color profile of M87 out to over 2000'', and find a blueing trend with radius, continuing out to the largest radii. Moreover, we have measured the colors of several ICL features which extend beyond M87's outermost reaches and find that they have similar colors to M87's halo itself, B − V ≈ 0.8. The common colors of these features suggest that the extended outer envelopes of cD galaxies, such as M87, may be formed from similar streams, created by tidal interactions within the cluster, that have since dissolved into a smooth background in the cluster potential.
Web-GIS approach for integrated analysis of heterogeneous georeferenced data
NASA Astrophysics Data System (ADS)
Okladnikov, Igor; Gordov, Evgeny; Titov, Alexander; Shulgina, Tamara
2014-05-01
Georeferenced datasets are currently actively used for modeling, interpretation and forecasting of climatic and ecosystem changes on different spatial and temporal scales [1]. Due to the inherent heterogeneity of environmental datasets as well as their huge size (up to tens of terabytes for a single dataset), special software supporting studies in the climate and environmental change areas is required [2]. A dedicated information-computational system for integrated analysis of heterogeneous georeferenced climatological and meteorological data is presented. It is based on a combination of Web and GIS technologies following Open Geospatial Consortium (OGC) standards, and involves many modern solutions such as an object-oriented programming model, modular composition, and JavaScript libraries based on the GeoExt library (http://www.geoext.org), the ExtJS framework (http://www.sencha.com/products/extjs) and OpenLayers software (http://openlayers.org). The main advantage of the system lies in its capability to perform integrated analysis of time series of georeferenced data obtained from different sources (in-situ observations, model results, remote sensing data) and to combine the results in a single map [3, 4] as WMS and WFS layers in a web-GIS application. Analysis results are also available for download as binary files from the graphical user interface, or can be accessed directly through web mapping (WMS) and web feature (WFS) services for further processing by the user. Data processing is performed on a geographically distributed computational cluster comprising data storage systems and corresponding computational nodes. Several geophysical datasets are available for processing by the system, including the NCEP/NCAR Reanalysis II, JMA/CRIEPI JRA-25 Reanalysis, ECMWF ERA-40 Reanalysis, ECMWF ERA Interim Reanalysis, MRI/JMA APHRODITE's Water Resources Project Reanalysis, DWD Global Precipitation Climatology Centre data, GMAO Modern Era-Retrospective Analysis for Research and Applications, the reanalysis of the Monitoring Atmospheric Composition and Climate (MACC) collaborated project, the NOAA-CIRES Twentieth Century Global Reanalysis Version II, the NCEP Climate Forecast System Reanalysis (CFSR), meteorological observational data for the territory of the former USSR for the 20th century, and results of modeling by global and regional climatological models. The Web-GIS information-computational system for heterogeneous geophysical data analysis provides specialists involved in multidisciplinary research projects with reliable and practical instruments for integrated research of climate and ecosystem changes on global and regional scales. With its help, even a user unskilled in programming is able to process and visualize multidimensional observational and model data through a unified web interface using a common graphical web browser. This work is partially supported by SB RAS project VIII.80.2.1, RFBR grants #13-05-12034 and #14-05-00502, and SB RAS integrated project #131. References: 1. Gordov E.P., Lykosov V.N., Krupchatnikov V.N., Okladnikov I.G., Titov A.G., Shulgina T.M. Computational and information technologies for monitoring and modeling of climate changes and their consequences. Novosibirsk: Nauka, Siberian Branch, 2013. 195 p. (in Russian). 2. Frankel F., Reid R. Big data: Distilling meaning from data // Nature. Vol. 455, N. 7209. P. 30. 3. Shulgina T.M., Gordov E.P., Okladnikov I.G., Titov A.G., Genina E.Yu., Gorbatenko N.P., Kuzhevskaya I.V., Akhmetshina A.S. Software complex for a regional climate change analysis // Vestnik NGU. Series: Information Technologies. 2013. Vol. 11, Issue 1. P. 124-131 (in Russian). 4. Okladnikov I.G., Titov A.G., Shulgina T.M., Gordov E.P., Bogomolov V.Yu., Martynova Yu.V., Suschenko S.P., Skvortsov A.V. Software for analysis and visualization of climate change monitoring and forecasting data // Numerical Methods and Programming, 2013. Vol. 14. P. 123-131 (in Russian).
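Since results are published as OGC services, a client can retrieve them programmatically; a hedged sketch with the OWSLib library follows, where the server URL and layer name are placeholders.

```python
from owslib.wms import WebMapService

# Placeholder server URL and layer name -- any OGC-compliant WMS endpoint
# exposing the analysis results would follow the same pattern.
wms = WebMapService("http://example.org/geoserver/wms", version="1.1.1")
print(list(wms.contents))  # available layers

img = wms.getmap(
    layers=["climate:temperature_trend"],
    srs="EPSG:4326",
    bbox=(60.0, 50.0, 120.0, 75.0),  # lon/lat window over Siberia
    size=(800, 400),
    format="image/png",
)
with open("temperature_trend.png", "wb") as f:
    f.write(img.read())
```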
Federal Register 2010, 2011, 2012, 2013, 2014
2013-11-05
...-0392] Proposed Enhancements to the Motor Carrier Safety Measurement System (SMS) Public Web Site AGENCY... on the Agency's Safety Measurement System (SMS) public Web site. FMCSA first announced the... public Web site that are the direct result of feedback from stakeholders regarding the information...
ERIC Educational Resources Information Center
Mitsuhara, Hiroyuki; Kurose, Yoshinobu; Ochi, Youji; Yano, Yoneo
The authors developed a Web-based Adaptive Educational System (Web-based AES) named ITMS (Individualized Teaching Material System). ITMS adaptively integrates knowledge on the distributed Web pages and generates individualized teaching material that has various contents. ITMS also presumes the learners' knowledge levels from the states of their…
NASA Astrophysics Data System (ADS)
Madiraju, Praveen; Zhang, Yanqing
2002-03-01
When a user logs in to a website, behind the scenes the user leaves his or her impressions, usage patterns and access patterns in the web server's log file. A web usage mining agent can analyze these web logs to help web developers improve the organization and presentation of their websites, and to help system administrators improve system performance. Web logs also provide invaluable help in creating adaptive web sites and in analyzing network traffic. This paper presents the design and implementation of a web usage mining agent for digging into web log files.
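A minimal sketch of the underlying log-mining step, assuming the Apache combined log format; it tallies the most requested pages and the most active clients, the raw ingredients for the usage and access patterns described above.

```python
import re
from collections import Counter

# Apache "combined" format: host ident user [time] "METHOD path proto" status size ...
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) \S+'
)

def mine_access_log(path):
    pages = Counter()
    visitors = Counter()
    with open(path) as f:
        for line in f:
            m = LOG_PATTERN.match(line)
            if m and m.group("status").startswith("2"):  # successful requests only
                pages[m.group("path")] += 1
                visitors[m.group("host")] += 1
    return pages.most_common(10), visitors.most_common(10)

top_pages, top_visitors = mine_access_log("access.log")
print("Most requested pages:", top_pages)
print("Most active clients:", top_visitors)
```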
Bao, Shunxing; Weitendorf, Frederick D.; Plassard, Andrew J.; Huo, Yuankai; Gokhale, Aniruddha; Landman, Bennett A.
2016-01-01
The field of big data is generally concerned with the scale of processing at which traditional computational paradigms break down. In medical imaging, traditional large-scale processing uses a cluster computer that combines a group of workstation nodes into a functional unit controlled by a job scheduler. Typically, a shared-storage network file system (NFS) is used to host imaging data. However, data transfer from storage to processing nodes can saturate network bandwidth when data is frequently uploaded to or retrieved from the NFS, e.g., with "short" processing times and/or "large" datasets. Recently, an alternative approach using Hadoop and HBase was presented for medical imaging to enable co-location of data storage and computation while minimizing data transfer. The benefits of using such a framework must be formally evaluated against a traditional approach to characterize the point at which simply "large scale" processing transitions into "big data" and necessitates alternative computational frameworks. The proposed Hadoop system was implemented on a production lab-cluster alongside a standard Sun Grid Engine (SGE). Theoretical models for wall-clock time and resource time for both approaches are introduced and validated. To provide real example data, three T1 image archives were retrieved from a university secure, shared web database and used to empirically assess computational performance under three configurations of cluster hardware (using 72, 109, or 209 CPU cores) with differing job lengths. Empirical results match the theoretical models. Based on these data, a comparative analysis is presented for when the Hadoop framework will be relevant and non-relevant for medical imaging. PMID:28736473
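A toy version of the kind of wall-clock model the study describes, under invented parameters: the shared-NFS variant pays a serialized transfer cost for every job, while a data-local (Hadoop-style) variant pays it only for the non-local fraction of tasks. All numbers are illustrative, not the paper's fitted model.

```python
def wallclock_nfs(n_jobs, gb_per_job, compute_s, bandwidth_gbps=1.0, workers=64):
    """Transfer is serialized over the shared link; compute runs in waves."""
    transfer_s = n_jobs * gb_per_job * 8 / bandwidth_gbps
    waves = -(-n_jobs // workers)  # ceiling division
    return transfer_s + waves * compute_s

def wallclock_datalocal(n_jobs, compute_s, workers=64, locality=0.9,
                        gb_per_job=0.0, bandwidth_gbps=1.0):
    """Only the non-local fraction of tasks pays the transfer cost."""
    remote_jobs = n_jobs * (1 - locality)
    transfer_s = remote_jobs * gb_per_job * 8 / bandwidth_gbps
    waves = -(-n_jobs // workers)
    return transfer_s + waves * compute_s

# Short jobs over a large archive: transfer dominates the shared-NFS variant.
print(wallclock_nfs(1000, gb_per_job=0.5, compute_s=60))
print(wallclock_datalocal(1000, compute_s=60, gb_per_job=0.5))
```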
Global change in the trophic functioning of marine food webs.
Maureaud, Aurore; Gascuel, Didier; Colléter, Mathieu; Palomares, Maria L D; Du Pontavice, Hubert; Pauly, Daniel; Cheung, William W L
2017-01-01
The development of fisheries in the oceans, and other human drivers such as climate warming, have led to changes in species abundance, assemblages, trophic interactions, and ultimately in the functioning of marine food webs. Here, using a trophodynamic approach and global databases of catches and life history traits of marine species, we tested the hypothesis that anthropogenic ecological impacts may have led to changes in the global parameters defining the transfers of biomass within the food web. First, we developed two indicators to assess such changes: the Time Cumulated Indicator (TCI), measuring the residence time of biomass within the food web, and the Efficiency Cumulated Indicator (ECI), quantifying the fraction of secondary production reaching the top of the trophic chain. Then, we assessed, at the large marine ecosystem scale, the worldwide change of these two indicators over the 1950-2010 period. Global trends were identified and cluster analyses were used to characterize the variability of trends between ecosystems. Results showed that the most common pattern over the study period is a global decrease in TCI, while the ECI indicator tends to increase. Thus, changes in species assemblages would induce faster and apparently more efficient biomass transfers in marine food webs. Results also suggested that the main driver of change over that period had been the large increase in fishing pressure. The largest changes occurred in ecosystems where 'fishing down the marine food web' is most intensive.
Web Mining for Web Image Retrieval.
ERIC Educational Resources Information Center
Chen, Zheng; Wenyin, Liu; Zhang, Feng; Li, Mingjing; Zhang, Hongjiang
2001-01-01
Presents a prototype system for image retrieval from the Internet using Web mining. Discusses the architecture of the Web image retrieval prototype; document space modeling; user log mining; and image retrieval experiments to evaluate the proposed system. (AEF)
WebCIS: large scale deployment of a Web-based clinical information system.
Hripcsak, G; Cimino, J J; Sengupta, S
1999-01-01
WebCIS is a Web-based clinical information system. It sits atop the existing Columbia University clinical information system architecture, which includes a clinical repository, the Medical Entities Dictionary, an HL7 interface engine, and an Arden Syntax based clinical event monitor. WebCIS security features include authentication with secure tokens, authorization maintained in an LDAP server, SSL encryption, permanent audit logs, and application timeouts. WebCIS is currently used by 810 physicians at the Columbia-Presbyterian center of New York Presbyterian Healthcare to review and enter data into the electronic medical record. Current deployment challenges include maintaining adequate database performance despite complex queries, replacing large numbers of computers that cannot run modern Web browsers, and training users who have never logged onto the Web. Although the raised expectations and higher goals have increased deployment costs, the end result is a far more functional, far more available system.
Birkhofer, Klaus; Henschel, Joh; Lubin, Yael
2012-11-01
Individuals of most animal species are non-randomly distributed in space. Extreme climatic events are often ignored as potential drivers of distribution patterns, and the role of such events is difficult to assess. Seothyra henscheli (Araneae, Eresidae) is a sedentary spider found in the Namib dunes in Namibia. The spider constructs a sticky-edged silk web on the sand surface, connected to a vertical, silk-lined burrow. Above-ground web structures can be damaged by strong winds or heavy rainfall, and during dispersal spiders are susceptible to environmental extremes. Locations of burrows were mapped in three field sites in 16 out of 20 years from 1987 to 2007, and these grid-based data were used to identify the relationship between spatial patterns, climatic extremes and sampling year. According to Morisita's index, individuals had an aggregated distribution in most years and field sites, and Geary's C suggests clustering up to scales of 2 m. Individuals were more aggregated in years with high maximum wind speed and low annual precipitation. Our results suggest that clustering is a temporally stable property of populations that holds even under fluctuating burrow densities. Climatic extremes, however, affect the intensity of clustering behaviour: individuals seem to be better protected in field sites with many conspecific neighbours. We suggest that burrow-site selection is driven at least partly by conspecific cuing, and this behaviour may protect populations from collapse during extreme climatic events.
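For reference, Morisita's index of dispersion is straightforward to compute from quadrat counts; the sketch below uses synthetic counts standing in for burrows per grid cell. Values near 1 indicate a random pattern, values above 1 aggregation, and values below 1 regularity.

```python
import numpy as np

def morisita_index(counts):
    """Morisita's index: I = q * sum(n_i * (n_i - 1)) / (N * (N - 1))."""
    counts = np.asarray(counts, dtype=float)
    q = counts.size                 # number of quadrats
    n = counts.sum()                # total individuals
    return q * (counts * (counts - 1)).sum() / (n * (n - 1))

clustered = [0, 0, 12, 1, 0, 9, 0, 0, 14, 0]
uniform = [3, 4, 3, 4, 4, 3, 4, 4, 3, 4]
print(morisita_index(clustered))  # about 3.1, well above 1: aggregated
print(morisita_index(uniform))    # about 0.76, below 1: regular
```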
NASA Astrophysics Data System (ADS)
Yuizono, Takaya; Hara, Kousuke; Nakayama, Shigeru
A web-based distributed cooperative development environment for a sign-language animation system has been developed. We have extended the previous animation system, which was constructed as a three-tiered system consisting of a sign-language animation interface layer, a sign-language data processing layer, and a sign-language animation database. Two components, a web client using a VRML plug-in and a web servlet, have been added to the previous system. The system supports a humanoid-model avatar for interoperability and can use the stored sign-language animation data shared in the database. The evaluation of this system notes that the inverse kinematics function of the web client improves the creation of sign-language animations.
A Bibliometric Analysis of U.S.-Based Research on the Behavioral Risk Factor Surveillance System
Khalil, George M.; Gotway Crawford, Carol A.
2017-01-01
Background Since Alan Pritchard defined bibliometrics as “the application of statistical methods to media of communication” in 1969, bibliometric analyses have become widespread. To date, however, bibliometrics has not been used to analyze publications related to the U.S. Behavioral Risk Factor Surveillance System (BRFSS). Purpose To determine the most frequently cited BRFSS-related topical areas, institutions, and journals. Methods A search of the Web of Knowledge database in 2013 identified U.S.-published studies related to BRFSS, from its start in 1984 through 2012. Search terms were BRFSS, Behavioral Risk Factor Surveillance System, or Behavioral Risk Survey. The resulting 1,387 articles were analyzed descriptively and produced data for VOSviewer, a computer program that plotted a relevance distance–based map and clustered keywords from text in titles and abstracts. Results Topics, journals, and publishing institutions ranged widely. Most research was clustered by content area, such as cancer screening, access to care, heart health, and quality of life. The American Journal of Preventive Medicine and American Journal of Public Health published the most BRFSS-related papers (95 and 70, respectively). Conclusions Bibliometrics can help identify the most frequently published BRFSS-related topics, publishing journals, and publishing institutions. BRFSS data are widely used, particularly by CDC and academic institutions such as the University of Washington and other universities hosting top-ranked schools of public health. Bibliometric analysis and mapping provides an innovative way of quantifying and visualizing the plethora of research conducted using BRFSS data and summarizing the contribution of this surveillance system to public health. PMID:25442231
High-throughput bioinformatics with the Cyrille2 pipeline system
Fiers, Mark WEJ; van der Burgt, Ate; Datema, Erwin; de Groot, Joost CW; van Ham, Roeland CHJ
2008-01-01
Background Modern omics research involves the application of high-throughput technologies that generate vast volumes of data. These data need to be pre-processed, analyzed and integrated with existing knowledge through the use of diverse sets of software tools, models and databases. The analyses are often interdependent and chained together to form complex workflows or pipelines. Given the volume of the data used and the multitude of computational resources available, specialized pipeline software is required to make high-throughput analysis of large-scale omics datasets feasible. Results We have developed a generic pipeline system called Cyrille2. The system is modular in design and consists of three functionally distinct parts: 1) a web-based, graphical user interface (GUI) that enables a pipeline operator to manage the system; 2) the Scheduler, which forms the functional core of the system and which tracks what data enters the system and determines what jobs must be scheduled for execution; and 3) the Executor, which searches for scheduled jobs and executes them on a compute cluster. Conclusion The Cyrille2 system is an extensible, modular system, implementing the stated requirements. Cyrille2 enables easy creation and execution of high-throughput, flexible bioinformatics pipelines. PMID:18269742
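A minimal sketch of the Scheduler/Executor division of labour, with an invented three-job pipeline; a production system like Cyrille2 persists job state in a database and dispatches to a compute cluster, rather than running local subprocesses as this sketch does.

```python
import subprocess

pipeline = {
    # job name -> (shell command, list of jobs it depends on)
    "assemble":  ("echo assemble reads", []),
    "annotate":  ("echo annotate contigs", ["assemble"]),
    "summarize": ("echo summarize results", ["annotate"]),
}

def schedule(pipeline, done):
    """Scheduler role: return jobs whose dependencies are satisfied and not yet run."""
    return [name for name, (_, deps) in pipeline.items()
            if name not in done and all(d in done for d in deps)]

def execute(pipeline):
    """Executor role: run whatever the scheduler has made runnable."""
    done = set()
    while len(done) < len(pipeline):
        runnable = schedule(pipeline, done)
        if not runnable:
            raise RuntimeError("cyclic or unsatisfiable dependencies")
        for name in runnable:
            cmd, _ = pipeline[name]
            subprocess.run(cmd, shell=True, check=True)
            done.add(name)

execute(pipeline)
```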
Mehring, Michael; Haag, Max; Linde, Klaus; Wagenpfeil, Stefan; Schneider, Antonius
2014-09-24
Preliminary findings suggest that Web-based interventions may be effective in achieving significant smoking cessation. To date, very few findings are available for primary care patients, and especially for the involvement of general practitioners. Our goal was to examine the short-term effectiveness of a fully automated Web-based coaching program combined with accompanying telephone counseling for smoking cessation in a primary care setting. The study was an unblinded cluster-randomized trial with an observation period of 12 weeks. Individuals recruited by general practitioners who had been randomized to the intervention group participated in a Web-based coaching program based on education, motivation, exercise guidance, daily short message service (SMS) reminders, weekly feedback via the Internet, and active monitoring by general practitioners. All components of the program are fully automated. Participants in the control group received usual care and advice from their practitioner without the Web-based coaching program. The main outcome was biochemically confirmed smoking status after 12 weeks. We recruited 168 participants (86 intervention group, 82 control group) into the study. For 51 participants from the intervention group and 70 participants from the control group, follow-up data were available both at baseline and 12 weeks. Very few patients in the intervention group (9.8%, 5/51) and in the control group (8.6%, 6/70) successfully achieved smoking cessation (OR 0.86, 95% CI 0.25-3.0; P=.816). Similar results were found within the intent-to-treat analysis: 5.8% (5/86) of the intervention group and 7.3% (6/82) of the control group (OR 1.28, 95% CI 0.38-4.36; P=.694). The number of smoked cigarettes per day decreased on average by 9.3 in the intervention group and by 6.6 in the control group (mean difference 2.7; 95% CI -5.33 to -0.58; P=.045). After adjustment for the baseline value, age, gender, and height, this difference was no longer significant (mean difference 2.2; 95% CI -4.7 to 0.3; P=.080). This trial did not show that the tested Web-based intervention was effective for achieving smoking cessation compared to usual care. The limited statistical power and the high drop-out rate may have reduced the study's ability to detect significant differences between the groups. Further randomized controlled trials in larger populations are needed to investigate the long-term outcome. German Register for Clinical Trials, registration number DRKS00003067; http://drks-neu.uniklinik-freiburg.de/drks_web/navigate.do?navigationId=trial.HTML&TRIAL_ID=DRKS00003067 (Archived by WebCite at http://www.webcitation.org/6Sff1YZpx).
Patterns of usage for a Web-based clinical information system.
Chen, Elizabeth S; Cimino, James J
2004-01-01
Understanding how clinicians are using clinical information systems to assist with their everyday tasks is valuable to the system design and development process. Developers of such systems are interested in monitoring usage in order to make enhancements. System log files are rich resources for gaining knowledge about how the system is being used. We have analyzed the log files of our Web-based clinical information system (WebCIS) to obtain various usage statistics including which WebCIS features are frequently being used. We have also identified usage patterns, which convey how the user is traversing the system. We present our method and these results as well as describe how the results can be used to customize menus, shortcut lists, and patient reports in WebCIS and similar systems.
Muroff, Jordana; Amodeo, Maryann; Larson, Mary Jo; Carey, Margaret; Loftin, Ralph D
2011-01-01
This article describes a data management system (DMS) developed to support a large-scale randomized study of an innovative web-course that was designed to improve substance abuse counselors' knowledge and skills in applying a substance abuse treatment method (i.e., cognitive behavioral therapy; CBT). The randomized trial compared the performance of web-course-trained participants (intervention group) and printed-manual-trained participants (comparison group) to determine the effectiveness of the web-course in teaching CBT skills. A single DMS was needed to support all aspects of the study: web-course delivery and management, as well as randomized trial management. The authors briefly reviewed several other systems that were described as built either to handle randomized trials or to deliver and evaluate web-based training. However it was clear that these systems fell short of meeting our needs for simultaneous, coordinated management of the web-course and the randomized trial. New England Research Institute's (NERI) proprietary Advanced Data Entry and Protocol Tracking (ADEPT) system was coupled with the web-programmed course and customized for our purposes. This article highlights the requirements for a DMS that operates at the intersection of web-based course management systems and randomized clinical trial systems, and the extent to which the coupled, customized ADEPT satisfied those requirements. Recommendations are included for institutions and individuals considering conducting randomized trials and web-based training programs, and seeking a DMS that can meet similar requirements.
Wikipedias: Collaborative web-based encyclopedias as complex networks
NASA Astrophysics Data System (ADS)
Zlatić, V.; Božičević, M.; Štefančić, H.; Domazet, M.
2006-07-01
Wikipedia is a popular web-based encyclopedia edited freely and collaboratively by its users. In this paper we present an analysis of Wikipedias in several languages as complex networks. The hyperlinks pointing from one Wikipedia article to another are treated as directed links while the articles represent the nodes of the network. We show that many network characteristics are common to different language versions of Wikipedia, such as their degree distributions, growth, topology, reciprocity, clustering, assortativity, path lengths, and triad significance profiles. These regularities, found in the ensemble of Wikipedias in different languages and of different sizes, point to the existence of a unique growth process. We also compare Wikipedias to other previously studied networks.
Achieving Better Buying Power through Acquisition of Open Architecture Software Systems for Web-Based and Mobile Devices
Walt Scacchi and Thomas...
2015-05-01
This report addresses the acquisition of open architecture (OA) software systems and emerging challenges in achieving Better Buying Power (BBP) via OA software systems for Web-based and mobile devices.
Inference from clustering with application to gene-expression microarrays.
Dougherty, Edward R; Barrera, Junior; Brun, Marcel; Kim, Seungchan; Cesar, Roberto M; Chen, Yidong; Bittner, Michael; Trent, Jeffrey M
2002-01-01
There are many algorithms to cluster sample data points based on nearness or a similarity measure. Often the implication is that points in different clusters come from different underlying classes, whereas those in the same cluster come from the same class. Stochastically, the underlying classes represent different random processes. The inference is that clusters represent a partition of the sample points according to which process they belong. This paper discusses a model-based clustering toolbox that evaluates cluster accuracy. Each random process is modeled as its mean plus independent noise, sample points are generated, the points are clustered, and the clustering error is the number of points clustered incorrectly according to the generating random processes. Various clustering algorithms are evaluated based on process variance and the key issue of the rate at which algorithmic performance improves with increasing numbers of experimental replications. The model means can be selected by hand to test the separability of expected types of biological expression patterns. Alternatively, the model can be seeded by real data to test the expected precision of that output or the extent of improvement in precision that replication could provide. In the latter case, a clustering algorithm is used to form clusters, and the model is seeded with the means and variances of these clusters. Other algorithms are then tested relative to the seeding algorithm. Results are averaged over various seeds. Output includes error tables and graphs, confusion matrices, principal-component plots, and validation measures. Five algorithms are studied in detail: K-means, fuzzy C-means, self-organizing maps, hierarchical Euclidean-distance-based and correlation-based clustering. The toolbox is applied to gene-expression clustering based on cDNA microarrays using real data. Expression profile graphics are generated and error analysis is displayed within the context of these profile graphics. A large amount of generated output is available over the web.
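The evaluation loop described above is simple enough to sketch: draw points from each class model (mean plus independent noise), cluster them, and count points assigned inconsistently with the generating process. A minimal illustration, assuming scikit-learn's KMeans in place of the toolbox's five algorithms; the function name, class means, and noise level are illustrative choices, not details of the published toolbox.

```python
# Sketch of model-based clustering-error estimation: each class is its
# mean plus independent Gaussian noise; error is counted against the
# generating process after optimally matching cluster labels to classes.
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans

def clustering_error(class_means, noise_sd=1.0, n_per_class=50, seed=0):
    rng = np.random.default_rng(seed)
    k, dim = class_means.shape
    points = np.vstack([m + noise_sd * rng.standard_normal((n_per_class, dim))
                        for m in class_means])
    truth = np.repeat(np.arange(k), n_per_class)
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(points)
    confusion = np.zeros((k, k), dtype=int)   # rows: true class, cols: cluster
    for t, c in zip(truth, labels):
        confusion[t, c] += 1
    rows, cols = linear_sum_assignment(-confusion)  # best label-to-class matching
    return 1.0 - confusion[rows, cols].sum() / len(truth)

# Two well-separated "expression profiles" in 10 dimensions -> low error.
means = np.vstack([np.zeros(10), 3.0 * np.ones(10)])
print(clustering_error(means))
```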
EPA's Web Taxonomy is a faceted hierarchical vocabulary used to tag web pages with terms from a controlled vocabulary. Tagging enables search and discovery of EPA's Web-based information assets. EPA's Web Taxonomy is being provided in Simple Knowledge Organization System (SKOS) format. SKOS is a standard for sharing and linking knowledge organization systems that promises to make Federal terminology resources more interoperable.
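As an illustration of what SKOS delivery enables programmatically, the sketch below reads a SKOS vocabulary with the rdflib library and lists broader/narrower term pairs; the file name epa_taxonomy.ttl and the Turtle serialization are assumptions made for the example, not details of the EPA release.

```python
# Sketch: walking a SKOS vocabulary's broader/narrower relations with rdflib.
from rdflib import Graph
from rdflib.namespace import SKOS

g = Graph()
g.parse("epa_taxonomy.ttl", format="turtle")  # hypothetical file name/format

# Each skos:broader triple links a narrower concept to its broader parent.
for narrower, broader in g.subject_objects(SKOS.broader):
    print(g.value(narrower, SKOS.prefLabel), "->", g.value(broader, SKOS.prefLabel))
```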
Sensor Webs as Virtual Data Systems for Earth Science
NASA Astrophysics Data System (ADS)
Moe, K. L.; Sherwood, R.
2008-05-01
The NASA Earth Science Technology Office established a 3-year Advanced Information Systems Technology (AIST) development program in late 2006 to explore the technical challenges associated with integrating sensors, sensor networks, data assimilation and modeling components into virtual data systems called "sensor webs". The AIST sensor web program was initiated in response to a renewed emphasis on the sensor web concepts. In 2004, NASA proposed an Earth science vision for a more robust Earth observing system, coupled with remote sensing data analysis tools and advances in Earth system models. The AIST program is conducting the research and developing components to explore the technology infrastructure that will enable the visionary goals. A working statement for a NASA Earth science sensor web vision is the following: On-demand sensing of a broad array of environmental and ecological phenomena across a wide range of spatial and temporal scales, from a heterogeneous suite of sensors both in-situ and in orbit. Sensor webs will be dynamically organized to collect data, extract information from it, accept input from other sensor / forecast / tasking systems, interact with the environment based on what they detect or are tasked to perform, and communicate observations and results in real time. The focus on sensor webs is to develop the technology and prototypes to demonstrate the evolving sensor web capabilities. There are 35 AIST projects ranging from 1 to 3 years in duration addressing various aspects of sensor webs involving space sensors such as Earth Observing-1, in situ sensor networks such as the southern California earthquake network, and various modeling and forecasting systems. Some of these projects build on proof-of-concept demonstrations of sensor web capabilities like the EO-1 rapid fire response initially implemented in 2003. Other projects simulate future sensor web configurations to evaluate the effectiveness of sensor-model interactions for producing improved science predictions. Still other projects are maturing technology to support autonomous operations, communications and system interoperability. This paper will highlight lessons learned by various projects during the first half of the AIST program. Several sensor web demonstrations have been implemented and resulting experience with evolving standards, such as the Open Geospatial Consortium (OGC) Sensor Web Enablement (SWE) among others, will be featured. The role of sensor webs in support of the intergovernmental Group on Earth Observations' Global Earth Observation System of Systems (GEOSS) will also be discussed. The GEOSS vision is a distributed system of systems that builds on international components to supply observing and processing systems that are, in the whole, comprehensive, coordinated and sustained. Sensor web prototypes are under development to demonstrate how remote sensing satellite data, in situ sensor networks and decision support systems collaborate in applications of interest to GEO, such as flood monitoring. Furthermore, the international Committee on Earth Observation Satellites (CEOS) has stepped up to the challenge to provide the space-based systems component for GEOSS. CEOS has proposed "virtual constellations" to address emerging data gaps in environmental monitoring, avoid overlap among observing systems, and make maximum use of existing space and ground assets. Exploratory applications that support the objectives of virtual constellations will also be discussed as a future role for sensor webs.
Empirical analysis of online social networks in the age of Web 2.0
NASA Astrophysics Data System (ADS)
Fu, Feng; Liu, Lianghuan; Wang, Long
2008-01-01
Today the World Wide Web is undergoing a subtle but profound shift to Web 2.0, to become more of a social web. The use of collaborative technologies such as blogs and social networking sites (SNS) leads to instant online communities in which people communicate rapidly and conveniently with each other. Moreover, there is growing interest and concern regarding the topological structure of these new online social networks. In this paper, we present an empirical analysis of the statistical properties of two important Chinese online social networks: a blogging network and an SNS open to college students. They are both emerging in the age of Web 2.0. We demonstrate that both networks possess small-world and scale-free features already observed in real-world and artificial networks. In addition, we investigate the distribution of topological distance. Furthermore, we study the correlations between degree (in/out) and degree (in/out), clustering coefficient and degree, and popularity (in terms of number of page views) and in-degree (for the blogging network), respectively. We find that the blogging network shows a disassortative mixing pattern, whereas the SNS network is an assortative one. Our research may help us to elucidate the self-organizing structural characteristics of these online social networks embedded in technical forms.
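The structural quantities studied in this paper (clustering coefficient, reciprocity, degree assortativity) are standard and straightforward to reproduce on any directed graph. A minimal sketch with networkx, using a synthetic scale-free graph as a stand-in for the (unavailable) blog and SNS data:

```python
# Sketch: the network statistics discussed above, computed with networkx
# on a synthetic directed scale-free graph.
import networkx as nx

G = nx.DiGraph(nx.scale_free_graph(1000, seed=42))  # collapse parallel edges

print("average clustering:", nx.average_clustering(G.to_undirected()))
print("degree assortativity:", nx.degree_assortativity_coefficient(G))
print("reciprocity:", nx.reciprocity(G))
```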
Sibling cannibalism in a web-building spider: effects of density and shared environment.
Modanu, Maria; Li, Lucy Dong Xuan; Said, Hosay; Rathitharan, Nizanthan; Andrade, Maydianne C B
2014-07-01
Sibling cannibalism occurs across diverse taxa and can affect population size and structure, as well as the fitness of parents and the cannibal, via density effects and variation in individual propensity to cannibalize. We examined these effects on sibling cannibalism in juveniles of a web-building spider (Latrodectus hasselti, Australian redbacks). Adult redbacks are solitary, but juveniles live in clusters of variable density for a week after hatching. We confined newly hatched siblings from a singly-mated female to a low or high density treatment in a split-clutch design, then left spiderlings unfed for a week. Our results showed no effect of density on overall cannibalism levels, but a strong correlation between cannibalism counts from the same maternal lines across densities. Unlike web-bound sit-and-wait predators, wandering spiders that are active hunters have been shown to experience density-dependent cannibalism. In contrast, we suggest sibling cannibalism in web-building spiders may be density independent because early cohabitation on the web selects for elevated tolerance of conspecifics. We conclude that, rather than being linked to density, cannibalism of siblings in these species may be controlled more strongly by variation in individual propensity to cannibalize. Copyright © 2014 Elsevier B.V. All rights reserved.
MALINA: a web service for visual analytics of human gut microbiota whole-genome metagenomic reads.
Tyakht, Alexander V; Popenko, Anna S; Belenikin, Maxim S; Altukhov, Ilya A; Pavlenko, Alexander V; Kostryukova, Elena S; Selezneva, Oksana V; Larin, Andrei K; Karpova, Irina Y; Alexeev, Dmitry G
2012-12-07
MALINA is a web service for bioinformatic analysis of whole-genome metagenomic data obtained from human gut microbiota sequencing. As input data, it accepts metagenomic reads from various sequencing technologies, including long reads (such as Sanger and 454 sequencing) and next-generation reads (including SOLiD and Illumina). To the authors' knowledge, it is the first metagenomic web service capable of processing SOLiD color-space reads. The web service allows phylogenetic and functional profiling of metagenomic samples using coverage depth resulting from the alignment of the reads to the catalogue of reference sequences which are built into the pipeline and contain prevalent microbial genomes and genes of human gut microbiota. The obtained metagenomic composition vectors are processed by the statistical analysis and visualization module containing methods for clustering, dimension reduction and group comparison. Additionally, the MALINA database includes vectors of bacterial and functional composition for human gut microbiota samples from a large number of existing studies, allowing their comparative analysis together with user samples, namely datasets from the Russian Metagenome project, MetaHIT and the Human Microbiome Project (downloaded from http://hmpdacc.org). MALINA is made freely available on the web at http://malina.metagenome.ru. The website is implemented in JavaScript (using Ext JS), Microsoft .NET Framework, MS SQL and Python, with all major browsers supported.
Functional Genomics Assistant (FUGA): a toolbox for the analysis of complex biological networks
2011-01-01
Background Cellular constituents such as proteins, DNA, and RNA form a complex web of interactions that regulate biochemical homeostasis and determine the dynamic cellular response to external stimuli. It follows that detailed understanding of these patterns is critical for the assessment of fundamental processes in cell biology and pathology. Representation and analysis of cellular constituents through network principles is a promising and popular analytical avenue towards a deeper understanding of molecular mechanisms in a system-wide context. Findings We present Functional Genomics Assistant (FUGA) - an extensible and portable MATLAB toolbox for the inference of biological relationships, graph topology analysis, random network simulation, network clustering, and functional enrichment statistics. In contrast to conventional differential expression analysis of individual genes, FUGA offers a framework for the study of system-wide properties of biological networks and highlights putative molecular targets using concepts of systems biology. Conclusion FUGA offers a simple and customizable framework for network analysis in a variety of systems biology applications. It is freely available for individual or academic use at http://code.google.com/p/fuga. PMID:22035155
Surveillance for human Salmonella infections in the United States.
Swaminathan, Bala; Barrett, Timothy J; Fields, Patricia
2006-01-01
Surveillance for human Salmonella infections plays a critical role in understanding and controlling foodborne illness due to Salmonella. Along with its public health partners, the Centers for Disease Control and Prevention (CDC) has several surveillance systems that collect information on Salmonella infections in the United States. The National Salmonella Surveillance System, begun in 1962, receives reports of laboratory-confirmed Salmonella infections through state public health laboratories. Salmonella outbreaks are reported by state and local health departments through the Foodborne Disease Outbreak Reporting System, which became a Web-based, electronic system (eFORS) in 2001. PulseNet facilitates the detection of clusters of Salmonella infections through standardized molecular subtyping (DNA "fingerprinting") of isolates and maintenance of "fingerprint" databases. The National Antimicrobial Resistance Monitoring System for Enteric Bacteria (NARMS) monitors antimicrobial resistance in Salmonella by susceptibility testing of every 20th Salmonella isolate received by state and local public health laboratories. FoodNet is an active surveillance system that monitors Salmonella infections in sentinel areas, providing population-based estimates of infection rates. Efforts are underway to electronically link all of the Salmonella surveillance systems at CDC to facilitate optimum use of available data and minimize duplication.
The PhytoClust tool for metabolic gene clusters discovery in plant genomes.
Töpfer, Nadine; Fuchs, Lisa-Maria; Aharoni, Asaph
2017-07-07
The existence of Metabolic Gene Clusters (MGCs) in plant genomes has recently raised increased interest. Thus far, MGCs were commonly identified for pathways of specialized metabolism, mostly those associated with terpene type products. For efficient identification of novel MGCs, computational approaches are essential. Here, we present PhytoClust, a tool for the detection of candidate MGCs in plant genomes. The algorithm employs a collection of enzyme families related to plant specialized metabolism, translated into hidden Markov models, to mine given genome sequences for physically co-localized metabolic enzymes. Our tool accurately identifies previously characterized plant MGCs. An exhaustive search of 31 plant genomes detected 1232 and 5531 putative gene cluster types and candidates, respectively. Clustering analysis of putative MGC types by species reflected plant taxonomy. Furthermore, enrichment analysis revealed taxa- and species-specific enrichment of certain enzyme families in MGCs. When operating through our web-interface, PhytoClust users can mine a genome either based on a list of known cluster types or by defining new cluster rules. Moreover, for selected plant species, the output can be complemented by co-expression analysis. Altogether, we envisage that PhytoClust will enhance novel MGC discovery, which will in turn impact the exploration of plant metabolism. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
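The core idea, detecting physically co-localized metabolic enzymes, can be illustrated independently of the HMM machinery. A toy sketch, assuming genes arrive as (start, end, is_metabolic) tuples sorted by position; the gap and size thresholds are arbitrary illustrative values, not PhytoClust's actual cluster rules.

```python
# Toy sketch: candidate metabolic gene clusters as runs of >= min_genes
# metabolic-enzyme genes whose neighbours lie within max_gap base pairs.
def candidate_clusters(genes, max_gap=20_000, min_genes=3):
    """genes: list of (start, end, is_metabolic) tuples, sorted by start."""
    clusters, current = [], []
    for start, end, is_metabolic in genes:
        if not is_metabolic:
            continue                      # only enzyme genes can join a cluster
        if current and start - current[-1][1] > max_gap:
            if len(current) >= min_genes:
                clusters.append(current)  # close the previous run
            current = []
        current.append((start, end))
    if len(current) >= min_genes:
        clusters.append(current)
    return clusters

genes = [(1_000, 2_000, True), (3_000, 4_200, True), (5_000, 6_100, True),
         (90_000, 91_000, True)]          # last gene lies too far away
print(candidate_clusters(genes))          # one cluster of the first three genes
```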
Dias, Claudia; Mendes, Luís
2018-01-01
Despite the importance of the literature on food quality labels in the European Union (PDO, PGI and TSG), our search did not find any review bringing together the various research topics on this subject. This study therefore aims to consolidate the state of academic research in this field, and so the methodological option was to carry out a bibliometric analysis using the term co-occurrence technique. Analysis was made of 501 articles on the ISI Web of Science database, covering publications up to 2016. The results of the bibliometric analysis allowed identification of four clusters: "Protected Geographical Indication", "Certification of Olive Oil and Cultivars", "Certification of Cheese and Milk" and "Certification and Chemical Composition". Unlike the other clusters, where the PDO label predominates, the "Protected Geographical Indication" cluster covers the study of PGI products, highlighting analysis of consumer behaviour in relation to this type of product. The focus of studies in the "Certification of Olive Oil and Cultivars" cluster and the "Certification of Cheese and Milk" cluster is the development of authentication methods for certified traditional products. In the "Certification and Chemical Composition" cluster, the analysis of the profiles of fatty acids present in this type of product stands out. Copyright © 2017 Elsevier Ltd. All rights reserved.
The GBT Dynamic Scheduling System: Powered by the Web
NASA Astrophysics Data System (ADS)
Marganian, P.; Clark, M.; McCarty, M.; Sessoms, E.; Shelton, A.
2009-09-01
The web technologies utilized for the Robert C. Byrd Green Bank Telescope's (GBT) new Dynamic Scheduling System are discussed, focusing on languages, frameworks, and tools. We use a popular Python web framework, TurboGears, to take advantage of the extensive web services the system provides. TurboGears is a model-view-controller framework that aggregates SQLAlchemy, Genshi, and CherryPy for the model, view, and controller layers, respectively. On top of this framework, JavaScript (Prototype, script.aculo.us, and jQuery) and cascading style sheets (Blueprint) are used for desktop-quality web pages.
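For flavor, a controller in this framework exposes methods as URLs and serializes return values through a declared template or JSON encoder. A minimal sketch in the spirit of TurboGears 2; the class and method names are invented for illustration and are not taken from the GBT Dynamic Scheduling System.

```python
# Minimal TurboGears-2-style controller sketch (names are illustrative).
from tg import expose, TGController

class ScheduleController(TGController):
    @expose('json')                 # render the returned dict as JSON
    def periods(self, days=1):
        # A real controller would query the scheduling core here.
        return dict(days=int(days), periods=[])
```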
Förster, Frank; Liang, Chunguang; Shkumatov, Alexander; Beisser, Daniela; Engelmann, Julia C; Schnölzer, Martina; Frohme, Marcus; Müller, Tobias; Schill, Ralph O; Dandekar, Thomas
2009-10-12
Tardigrades represent an animal phylum with extraordinary resistance to environmental stress. To gain insights into their stress-specific adaptation potential, major clusters of related and similar proteins are identified, as well as specific functional clusters delineated comparing all tardigrades and individual species (Milnesium tardigradum, Hypsibius dujardini, Echiniscus testudo, Tulinus stephaniae, Richtersius coronifer), and functional elements in tardigrade mRNAs are analysed. We find that 39.3% of the total sequences clustered in 58 clusters of more than 20 proteins. Among these are ten tardigrade-specific as well as a number of stress-specific protein clusters. Tardigrade-specific functional adaptations include strong protein, DNA- and redox protection, maintenance and protein recycling. Specific regulatory elements, such as lox P DICE elements, regulate tardigrade mRNA stability, whereas 14 other RNA elements of higher eukaryotes are not found. Further features of tardigrade-specific adaptation can be rapidly identified by sequence and/or pattern search on the web tool tardigrade analyzer http://waterbear.bioapps.biozentrum.uni-wuerzburg.de. The workbench offers nucleotide pattern analysis for promoter and regulatory element detection (tardigrade-specific; nrdb) as well as rapid COG search for function assignments, including species-specific repositories of all analysed data. Different protein clusters and regulatory elements implicated in tardigrade stress adaptations are analysed, including unpublished tardigrade sequences.
Evaluating the Efficacy of the Cloud for Cluster Computation
NASA Technical Reports Server (NTRS)
Knight, David; Shams, Khawaja; Chang, George; Soderstrom, Tom
2012-01-01
Computing requirements vary by industry, and it follows that NASA and other research organizations have computing demands that fall outside the mainstream. While cloud computing made rapid inroads for tasks such as powering web applications, performance issues on highly distributed tasks hindered early adoption for scientific computation. One venture to address this problem is Nebula, NASA's homegrown cloud project tasked with delivering science-quality cloud computing resources. However, another industry development is Amazon's high-performance computing (HPC) instances on Elastic Cloud Compute (EC2) that promises improved performance for cluster computation. This paper presents results from a series of benchmarks run on Amazon EC2 and discusses the efficacy of current commercial cloud technology for running scientific applications across a cluster. In particular, a 240-core cluster of cloud instances achieved 2 TFLOPS on High-Performance Linpack (HPL) at 70% of theoretical computational performance. The cluster's local network also demonstrated sub-100 μs inter-process latency with sustained inter-node throughput in excess of 8 Gbps. Beyond HPL, a real-world Hadoop image processing task from NASA's Lunar Mapping and Modeling Project (LMMP) was run on a 29 instance cluster to process lunar and Martian surface images with sizes on the order of tens of gigapixels. These results demonstrate that while not a rival of dedicated supercomputing clusters, commercial cloud technology is now a feasible option for moderately demanding scientific workloads.
Liu, Chao; Abu-Jamous, Basel; Brattico, Elvira; Nandi, Asoke K
2017-03-01
In the past decades, neuroimaging of humans has gained a position of status within neuroscience, and data-driven approaches and functional connectivity analyses of functional magnetic resonance imaging (fMRI) data are increasingly favored to depict the complex architecture of human brains. However, the reliability of these findings is jeopardized by too many analysis methods and sometimes too few samples used, which leads to discord among researchers. We propose a tunable consensus clustering paradigm that aims at overcoming the clustering methods selection problem as well as reliability issues in neuroimaging by means of first applying several analysis methods (three in this study) on multiple datasets and then integrating the clustering results. To validate the method, we applied it to a complex fMRI experiment involving affective processing of hundreds of music clips. We found that brain structures related to visual, reward, and auditory processing have intrinsic spatial patterns of coherent neuroactivity during affective processing. The comparisons between the results obtained from our method and those from each individual clustering algorithm demonstrate that our paradigm has notable advantages over traditional single clustering algorithms in being able to evidence robust connectivity patterns even with complex neuroimaging data involving a variety of stimuli and affective evaluations of them. The consensus clustering method is implemented in the R package "UNCLES" available on http://cran.r-project.org/web/packages/UNCLES/index.html .
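One standard way to integrate several clustering results, given here purely as a generic illustration and not as the exact scheme of the UNCLES package, is a co-association matrix: count how often each pair of samples is co-clustered across runs, then cluster that matrix.

```python
# Sketch: consensus clustering via a co-association matrix.
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans

def consensus_labels(X, k, n_runs=20):
    n = X.shape[0]
    coassoc = np.zeros((n, n))
    for seed in range(n_runs):                # many base clusterings
        labels = KMeans(n_clusters=k, n_init=5, random_state=seed).fit_predict(X)
        coassoc += labels[:, None] == labels[None, :]
    coassoc /= n_runs
    # Treat (1 - co-association) as a distance and cluster it.
    # (scikit-learn >= 1.2 calls this parameter `metric`; older versions `affinity`.)
    model = AgglomerativeClustering(n_clusters=k, metric='precomputed',
                                    linkage='average')
    return model.fit_predict(1.0 - coassoc)
```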
ERIC Educational Resources Information Center
Osipov, Ilya V.; Volinsky, Alex A.; Nikulchev, Evgeny; Prasikova, Anna Y.
2016-01-01
The paper describes development of the educational online web communication platform for teaching and learning foreign languages. The main objective was to develop a web application for teaching foreigners to understand casual fluent speech. The system is based on the time bank principle, allowing users to teach others their native language along…
Brooker, Simon J; Mwandawiro, Charles S; Halliday, Katherine E; Njenga, Sammy M; Mcharo, Carlos; Gichuki, Paul M; Wasunna, Beatrice; Kihara, Jimmy H; Njomo, Doris; Alusala, Dorcas; Chiguzo, Athuman; Turner, Hugo C; Teti, Caroline; Gwayi-Chore, Claire; Nikolay, Birgit; Truscott, James E; Hollingsworth, T Déirdre; Balabanova, Dina; Griffiths, Ulla K; Freeman, Matthew C; Allen, Elizabeth; Pullan, Rachel L; Anderson, Roy M
2015-01-01
Introduction In recent years, an unprecedented emphasis has been given to the control of neglected tropical diseases, including soil-transmitted helminths (STHs). The mainstay of STH control is school-based deworming (SBD), but mathematical modelling has shown that in all but very low transmission settings, SBD is unlikely to interrupt transmission, and that new treatment strategies are required. This study seeks to answer the question: is it possible to interrupt the transmission of STH, and, if so, what is the most cost-effective treatment strategy and delivery system to achieve this goal? Methods and analysis Two cluster randomised trials are being implemented in contrasting settings in Kenya. The interventions are annual mass anthelmintic treatment delivered to preschool- and school-aged children, as part of a national SBD programme, or to entire communities, delivered by community health workers. Allocation to study group is by cluster, using predefined units used in public health provision—termed community units (CUs). CUs are randomised to one of three groups: receiving either (1) annual SBD; (2) annual community-based deworming (CBD); or (3) biannual CBD. The primary outcome measure is the prevalence of hookworm infection, assessed by four cross-sectional surveys. Secondary outcomes are prevalence of Ascaris lumbricoides and Trichuris trichiura, intensity of species infections and treatment coverage. Costs and cost-effectiveness will be evaluated. Among a random subsample of participants, worm burden and proportion of unfertilised eggs will be assessed longitudinally. A nested process evaluation, using semistructured interviews, focus group discussions and a stakeholder analysis, will investigate the community acceptability, feasibility and scale-up of each delivery system. Ethics and dissemination Study protocols have been reviewed and approved by the ethics committees of the Kenya Medical Research Institute and National Ethics Review Committee, and London School of Hygiene and Tropical Medicine. The study has a dedicated web site. Trial registration number NCT02397772. PMID:26482774
Grokker, KartOO, Addict-o-Matic and More: Really Different Search Engines
ERIC Educational Resources Information Center
Descy, Don E.
2009-01-01
There are hundreds of unique search engines in the United States and thousands of unique search engines around the world. If people get into search engines designed just to search particular web sites, the number is in the hundreds of thousands. This article looks at: (1) clustering search engines, such as KartOO (www.kartoo.com) and Grokker…
Identifying Meta-Clusters of Students' Interest in Science and Their Change with Age
ERIC Educational Resources Information Center
Baram-Tsabari, Ayelet; Yarden, Anat
2009-01-01
Nearly 6,000 science questions collected from five different web-based, TV-based and school-based sources were rigorously analyzed in order to identify profiles of K-12 students' interest in science, and how these profiles change with age. The questions were analyzed according to their topic, thinking level, motivation for and level of autonomy in…
[Informational analysis of global health equity studies based on database of Web of Science].
Zhao, Bo; Cui, Lei; Guo, Yan
2011-06-18
To review the history of global health equity studies and provide a reference for selecting topics for health equity research in China. In this article, citations on the subject of health equity from Web of Science (WOS) were analyzed, and 60 papers that were cited more than 30 times were selected. Through co-citation cluster analysis combined with content analysis of the highly cited papers, this article attempted to cluster them into several significant categories. We then analyzed their strategic importance in the field of health equity by drawing citation strategic diagrams. Six hot topics in health equity studies emerged: health service equity; the relationship between health service demand and utilization; definitions of health equity; socioeconomic status and mortality; income distribution and health; and the measurement of health inequity. Income distribution and health was the biggest concern, and the measurement of health inequity was of the greatest novelty. Conducting empirical analyses of the effect of social determinants (including socioeconomic status, social network and psychosocial status, etc.) on health by means of health equity measurements marks the development trend of health equity study.
The Rényi divergence enables accurate and precise cluster analysis for localisation microscopy.
Staszowska, Adela D; Fox-Roberts, Patrick; Hirvonen, Liisa M; Peddie, Christopher J; Collinson, Lucy M; Jones, Gareth E; Cox, Susan
2018-06-01
Clustering analysis is a key technique for quantitatively characterising structures in localisation microscopy images. To build up accurate information about biological structures, it is critical that the quantification is both accurate (close to the ground truth) and precise (has small scatter and is reproducible). Here we describe how the Rényi divergence can be used for cluster radius measurements in localisation microscopy data. We demonstrate that the Rényi divergence can operate with high levels of background and provides results which are more accurate than Ripley's functions, Voronoi tessellation or DBSCAN. Data supporting this research will be made accessible via a web link. Software codes developed for this work can be accessed via http://coxphysics.com/Renyi_divergence_software.zip. Implemented in C++. Correspondence and requests for materials can also be addressed to the corresponding author: adela.staszowska@gmail.com or susan.cox@kcl.ac.uk. Supplementary data are available at Bioinformatics online.
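For reference, the Rényi divergence of order α between discrete distributions P and Q, the quantity underlying the cluster radius measurement above (the continuous case replaces the sum with an integral), is

```latex
D_{\alpha}(P \,\|\, Q) = \frac{1}{\alpha - 1} \log \sum_{i} p_i^{\alpha} \, q_i^{1-\alpha},
\qquad \alpha > 0,\ \alpha \neq 1,
```

and the limit α → 1 recovers the Kullback-Leibler divergence.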
Gravitational redshift of galaxies in clusters as predicted by general relativity.
Wojtak, Radosław; Hansen, Steen H; Hjorth, Jens
2011-09-28
The theoretical framework of cosmology is mainly defined by gravity, of which general relativity is the current model. Recent tests of general relativity within the Lambda Cold Dark Matter (ΛCDM) model have found a concordance between predictions and the observations of the growth rate and clustering of the cosmic web. General relativity has not hitherto been tested on cosmological scales independently of the assumptions of the ΛCDM model. Here we report an observation of the gravitational redshift of light coming from galaxies in clusters at the 99 per cent confidence level, based on archival data. Our measurement agrees with the predictions of general relativity and its modification created to explain cosmic acceleration without the need for dark energy (the f(R) theory), but is inconsistent with alternative models designed to avoid the presence of dark matter. © 2011 Macmillan Publishers Limited. All rights reserved
Kowiel, Marcin; Brzezinski, Dariusz; Jaskolski, Mariusz
2016-01-01
The refinement of macromolecular structures is usually aided by prior stereochemical knowledge in the form of geometrical restraints. Such restraints are also used for the flexible sugar-phosphate backbones of nucleic acids. However, recent highly accurate structural studies of DNA suggest that the phosphate bond angles may have inadequate description in the existing stereochemical dictionaries. In this paper, we analyze the bonding deformations of the phosphodiester groups in the Cambridge Structural Database, cluster the studied fragments into six conformation-related categories and propose a revised set of restraints for the O-P-O bond angles and distances. The proposed restraints have been positively validated against data from the Nucleic Acid Database and an ultrahigh-resolution Z-DNA structure in the Protein Data Bank. Additionally, the manual classification of PO4 geometry is compared with geometrical clusters automatically discovered by machine learning methods. The machine learning cluster analysis provides useful insights and a practical example for general applications of clustering algorithms for automatic discovery of hidden patterns of molecular geometry. Finally, we describe the implementation and application of a public-domain web server for automatic generation of the proposed restraints. PMID:27521371
Choi, Jihye; Cho, Youngtae; Shim, Eunyoung; Woo, Hyekyung
2016-12-08
Emerging and re-emerging infectious diseases are a significant public health concern, and early detection and immediate response is crucial for disease control. These challenges have led to the need for new approaches and technologies to reinforce the capacity of traditional surveillance systems for detecting emerging infectious diseases. In the last few years, the availability of novel web-based data sources has contributed substantially to infectious disease surveillance. This study explores the burgeoning field of web-based infectious disease surveillance systems by examining their current status, importance, and potential challenges. A systematic review framework was applied to the search, screening, and analysis of web-based infectious disease surveillance systems. We searched PubMed, Web of Science, and Embase databases to extensively review the English literature published between 2000 and 2015. Eleven surveillance systems were chosen for evaluation according to their high frequency of application. Relevant terms, including newly coined terms, development and classification of the surveillance systems, and various characteristics associated with the systems were studied. Based on a detailed and informative review of the 11 web-based infectious disease surveillance systems, it was evident that these systems exhibited clear strengths, as compared to traditional surveillance systems, but with some limitations yet to be overcome. The major strengths of the newly emerging surveillance systems are that they are intuitive, adaptable, low-cost, and operated in real-time, all of which are necessary features of an effective public health tool. The most apparent potential challenges of the web-based systems are those of inaccurate interpretation and prediction of health status, and privacy issues, based on an individual's internet activity. Despite being in a nascent stage with further modification needed, web-based surveillance systems have evolved to complement traditional national surveillance systems. This review highlights ways in which the strengths of existing systems can be maintained and weaknesses alleviated to implement optimal web surveillance systems.
Validation of Web-Based Physical Activity Measurement Systems Using Doubly Labeled Water
Yamaguchi, Yukio; Yamada, Yosuke; Tokushima, Satoru; Hatamoto, Yoichi; Sagayama, Hiroyuki; Kimura, Misaka; Higaki, Yasuki; Tanaka, Hiroaki
2012-01-01
Background Online or Web-based measurement systems have been proposed as convenient methods for collecting physical activity data. We developed two Web-based physical activity systems—the 24-hour Physical Activity Record Web (24hPAR WEB) and 7 days Recall Web (7daysRecall WEB). Objective To examine the validity of two Web-based physical activity measurement systems using the doubly labeled water (DLW) method. Methods We assessed the validity of the 24hPAR WEB and 7daysRecall WEB in 20 individuals, aged 25 to 61 years. The order of email distribution and subsequent completion of the two Web-based measurements systems was randomized. Each measurement tool was used for a week. The participants’ activity energy expenditure (AEE) and total energy expenditure (TEE) were assessed over each week using the DLW method and compared with the respective energy expenditures estimated using the Web-based systems. Results The mean AEE was 3.90 (SD 1.43) MJ estimated using the 24hPAR WEB and 3.67 (SD 1.48) MJ measured by the DLW method. The Pearson correlation for AEE between the two methods was r = .679 (P < .001). The Bland-Altman 95% limits of agreement ranged from –2.10 to 2.57 MJ between the two methods. The Pearson correlation for TEE between the two methods was r = .874 (P < .001). The mean AEE was 4.29 (SD 1.94) MJ using the 7daysRecall WEB and 3.80 (SD 1.36) MJ by the DLW method. The Pearson correlation for AEE between the two methods was r = .144 (P = .54). The Bland-Altman 95% limits of agreement ranged from –3.83 to 4.81 MJ between the two methods. The Pearson correlation for TEE between the two methods was r = .590 (P = .006). The average input times using terminal devices were 8 minutes and 10 seconds for the 24hPAR WEB and 6 minutes and 38 seconds for the 7daysRecall WEB. Conclusions Both Web-based systems were found to be effective methods for collecting physical activity data and are appropriate for use in epidemiological studies. Because the measurement accuracy of the 24hPAR WEB was moderate to high, it could be suitable for evaluating the effect of interventions on individuals as well as for examining physical activity behavior. PMID:23010345
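The two agreement statistics used above are simple to compute; a sketch with small illustrative arrays standing in for the study's AEE data:

```python
# Sketch: Pearson correlation and Bland-Altman 95% limits of agreement
# between a web-based estimate and the DLW reference (illustrative data).
import numpy as np
from scipy.stats import pearsonr

web = np.array([3.1, 4.0, 5.2, 2.8, 4.6])   # AEE from a web system (MJ)
dlw = np.array([2.9, 4.4, 4.8, 3.0, 4.1])   # matched DLW measurements (MJ)

r, p = pearsonr(web, dlw)
diff = web - dlw
lo = diff.mean() - 1.96 * diff.std(ddof=1)
hi = diff.mean() + 1.96 * diff.std(ddof=1)
print(f"r = {r:.3f} (P = {p:.3f}); limits of agreement: [{lo:.2f}, {hi:.2f}] MJ")
```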
Mears, Jessica; Abubakar, Ibrahim; Cohen, Theodore; McHugh, Timothy D; Sonnenberg, Pam
2015-01-21
To systematically review the evidence for the impact of study design and setting on the interpretation of tuberculosis (TB) transmission using clustering derived from Mycobacterial Interspersed Repetitive Units-Variable Number Tandem Repeats (MIRU-VNTR) strain typing. MEDLINE, EMBASE, CINAHL, Web of Science and Scopus were searched for articles published before 21st October 2014. Studies in humans that reported the proportion of clustering of TB isolates by MIRU-VNTR were included in the analysis. Univariable meta-regression analyses were conducted to assess the influence of study design and setting on the proportion of clustering. The search identified 27 eligible articles reporting clustering between 0% and 63%. The number of MIRU-VNTR loci typed, requiring consent to type patient isolates (as a proxy for sampling fraction), the TB incidence and the maximum cluster size explained 14%, 14%, 27% and 48% of between-study variation, respectively, and had a significant association with the proportion of clustering. Although MIRU-VNTR typing is being adopted worldwide there is a paucity of data on how study design and setting may influence estimates of clustering. We have highlighted study design variables for consideration in the design and interpretation of future studies. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
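Univariable meta-regression of the clustering proportion on a study-level covariate can be sketched as a weighted regression; this is a generic illustration with made-up numbers, not the authors' model or data:

```python
# Sketch: univariable meta-regression of clustering proportion on TB
# incidence, weighting each study by its sample size (illustrative data).
import numpy as np
import statsmodels.api as sm

clustering = np.array([0.10, 0.25, 0.40, 0.55])   # proportion clustered
incidence = np.array([5.0, 20.0, 80.0, 150.0])    # TB incidence per 100,000
n_isolates = np.array([200, 150, 300, 120])       # study sizes as weights

X = sm.add_constant(np.log(incidence))
fit = sm.WLS(clustering, X, weights=n_isolates).fit()
print(fit.params)      # intercept and slope of the covariate
print(fit.rsquared)    # share of between-study variation explained
```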
ST-analyzer: a web-based user interface for simulation trajectory analysis.
Jeong, Jong Cheol; Jo, Sunhwan; Wu, Emilia L; Qi, Yifei; Monje-Galvan, Viviana; Yeom, Min Sun; Gorenstein, Lev; Chen, Feng; Klauda, Jeffery B; Im, Wonpil
2014-05-05
Molecular dynamics (MD) simulation has become one of the key tools to obtain deeper insights into biological systems using various levels of descriptions such as all-atom, united-atom, and coarse-grained models. Recent advances in computing resources and MD programs have significantly accelerated the simulation time and thus increased the amount of trajectory data. Although many laboratories routinely perform MD simulations, analyzing MD trajectories is still time consuming and often a difficult task. ST-analyzer, http://im.bioinformatics.ku.edu/st-analyzer, is a standalone graphical user interface (GUI) toolset to perform various trajectory analyses. ST-analyzer has several outstanding features compared to other existing analysis tools: (i) handling various formats of trajectory files from MD programs, such as CHARMM, NAMD, GROMACS, and Amber, (ii) intuitive web-based GUI environment--minimizing administrative load and reducing burdens on the user from adapting new software environments, (iii) platform independent design--working with any existing operating system, (iv) easy integration into job queuing systems--providing options of batch processing either on the cluster or in an interactive mode, and (v) providing independence between foreground GUI and background modules--making it easier to add personal modules or to recycle/integrate pre-existing scripts utilizing other analysis tools. The current ST-analyzer contains nine main analysis modules that together contain 18 options, including density profile, lipid deuterium order parameters, surface area per lipid, and membrane hydrophobic thickness. This article introduces ST-analyzer with its design, implementation, and features, and also illustrates practical analysis of lipid bilayer simulations. Copyright © 2014 Wiley Periodicals, Inc.
Opal web services for biomedical applications.
Ren, Jingyuan; Williams, Nadya; Clementi, Luca; Krishnan, Sriram; Li, Wilfred W
2010-07-01
Biomedical applications have become increasingly complex, and they often require large-scale high-performance computing resources with a large number of processors and memory. The complexity of application deployment and the advances in cluster, grid and cloud computing require new modes of support for biomedical research. Scientific Software as a Service (sSaaS) enables scalable and transparent access to biomedical applications through simple standards-based Web interfaces. Towards this end, we built a production web server (http://ws.nbcr.net) in August 2007 to support the bioinformatics application called MEME. The server has grown since to include docking analysis with AutoDock and AutoDock Vina, electrostatic calculations using PDB2PQR and APBS, and off-target analysis using SMAP. All the applications on the servers are powered by Opal, a toolkit that allows users to wrap scientific applications easily as web services without any modification to the scientific codes, by writing simple XML configuration files. Opal allows both web forms-based access and programmatic access of all our applications. The Opal toolkit currently supports SOAP-based Web service access to a number of popular applications from the National Biomedical Computation Resource (NBCR) and affiliated collaborative and service projects. In addition, Opal's programmatic access capability allows our applications to be accessed through many workflow tools, including Vision, Kepler, Nimrod/K and VisTrails. From mid-August 2007 to the end of 2009, we have successfully executed 239,814 jobs. The number of successfully executed jobs more than doubled from 205 to 411 per day between 2008 and 2009. The Opal-enabled service model is useful for a wide range of applications. It provides for interoperation with other applications with Web Service interfaces, and allows application developers to focus on the scientific tool and workflow development. Web server availability: http://ws.nbcr.net.
NemaPath: online exploration of KEGG-based metabolic pathways for nematodes
Wylie, Todd; Martin, John; Abubucker, Sahar; Yin, Yong; Messina, David; Wang, Zhengyuan; McCarter, James P; Mitreva, Makedonka
2008-01-01
Background Nematode.net is a web-accessible resource for investigating gene sequences from parasitic and free-living nematode genomes. Beyond the well-characterized model nematode C. elegans, over 500,000 expressed sequence tags (ESTs) and nearly 600,000 genome survey sequences (GSSs) have been generated from 36 nematode species as part of the Parasitic Nematode Genomics Program undertaken by the Genome Center at Washington University School of Medicine. However, these sequencing data are not present in most publicly available protein databases, which only include sequences in Swiss-Prot. Swiss-Prot, in turn, relies on GenBank/Embl/DDJP for predicted proteins from complete genomes or full-length proteins. Description Here we present the NemaPath pathway server, a web-based pathway-level visualization tool for navigating putative metabolic pathways for over 30 nematode species, including 27 parasites. The NemaPath approach consists of two parts: 1) a backend tool to align and evaluate nematode genomic sequences (curated EST contigs) against the annotated Kyoto Encyclopedia of Genes and Genomes (KEGG) protein database; 2) a web viewing application that displays annotated KEGG pathway maps based on desired confidence levels of primary sequence similarity as defined by a user. NemaPath also provides cross-referenced access to nematode genome information provided by other tools available on Nematode.net, including: detailed NemaGene EST cluster information; putative translations; GBrowse EST cluster views; links from nematode data to external databases for corresponding synonymous C. elegans counterparts, subject matches in KEGG's gene database, and also KEGG Ontology (KO) identification. Conclusion The NemaPath server hosts metabolic pathway mappings for 30 nematode species and is available on the World Wide Web at . The nematode source sequences used for the metabolic pathway mappings are available via FTP , as provided by the Genome Center at Washington University School of Medicine. PMID:18983679
Gude, Wouter T; van Engen-Verheul, Mariëtte M; van der Veer, Sabine N; Kemps, Hareld M C; Jaspers, Monique W M; de Keizer, Nicolette F; Peek, Niels
2016-12-09
The objective of this study was to assess the effect of a web-based audit and feedback (A&F) intervention with outreach visits to support decision-making by multidisciplinary teams. We performed a multicentre cluster-randomized trial within the field of comprehensive cardiac rehabilitation (CR) in the Netherlands. Our participants were multidisciplinary teams in Dutch CR centres who were enrolled in the study between July 2012 and December 2013 and received the intervention for at least 1 year. The intervention included web-based A&F with feedback on clinical performance, facilities for goal setting and action planning, and educational outreach visits. Teams were randomized either to receive feedback that was limited to psychosocial rehabilitation (study group A) or to physical rehabilitation (study group B). The main outcome measure was the difference in performance between study groups in 11 care processes and six patient outcomes, measured at patient level. Secondary outcomes included effects on guideline concordance for the four main CR therapies. Data from 18 centres (14,847 patients) were analysed, of which 12 centres (9353 patients) were assigned to group A and six (5494 patients) to group B. During the intervention, a total of 233 quality improvement goals was identified by participating teams, of which 49 (21%) were achieved during the study period. Except for a modest improvement in data completeness (4.5% improvement per year; 95% CI 0.65 to 8.36), we found no effect of our intervention on any of our primary or secondary outcome measures. Within a multidisciplinary setting, our web-based A&F intervention engaged teams to define local performance improvement goals but failed to support them in actually completing the improvement actions that were needed to achieve those goals. Future research should focus on improving the actionability of feedback on clinical performance and on addressing the socio-technical perspective of the implementation process. NTR3251.
ERIC Educational Resources Information Center
Money, William H.
Instructors should be concerned with how to incorporate the World Wide Web into an information systems (IS) curriculum organized across three areas of knowledge: information technology, organizational and management concepts, and theory and development of systems. The Web fits broadly into the information technology component. For the Web to be…
Web-Based Intelligent E-Learning Systems: Technologies and Applications
ERIC Educational Resources Information Center
Ma, Zongmin
2006-01-01
Collecting and presenting the latest research and development results from the leading researchers in the field of e-learning systems, Web-Based Intelligent E-Learning Systems: Technologies and Applications provides a single record of current research and practical applications in Web-based intelligent e-learning systems. This book includes major…
Observing Interstellar and Intergalactic Magnetic Fields
NASA Astrophysics Data System (ADS)
Han, J. L.
2017-08-01
Observational results of interstellar and intergalactic magnetic fields are reviewed, including the fields in supernova remnants and loops, interstellar filaments and clouds, H II regions and bubbles, the Milky Way and nearby galaxies, galaxy clusters, and the cosmic web. A variety of approaches are used to investigate these fields. The orientations of magnetic fields in interstellar filaments and molecular clouds are traced by polarized thermal dust emission and starlight polarization. The field strengths and directions along the line of sight in dense clouds and cores are measured by Zeeman splitting of emission or absorption lines. The large-scale magnetic fields in the Milky Way have been best probed by Faraday rotation measures of a large number of pulsars and extragalactic radio sources. The coherent Galactic magnetic fields are found to follow the spiral arms and have direction reversals between arm and interarm regions in the disk. The azimuthal fields in the halo reverse their directions below and above the Galactic plane. The orientations of organized magnetic fields in nearby galaxies have been observed through polarized synchrotron emission. Magnetic fields in the intracluster medium have been indicated by diffuse radio halos, polarized radio relics, and Faraday rotations of embedded radio galaxies and background sources. Sparse evidence for very weak magnetic fields in the cosmic web is the detection of the faint radio bridge between the Coma cluster and A1367. Future observations should aim at the 3D tomography of the large-scale coherent magnetic fields in our Galaxy and nearby galaxies, a better description of intracluster field properties, and firm detections of intergalactic magnetic fields in the cosmic web.
Imprints of the large-scale structure on AGN formation and evolution
NASA Astrophysics Data System (ADS)
Porqueres, Natàlia; Jasche, Jens; Enßlin, Torsten A.; Lavaux, Guilhem
2018-04-01
Black hole masses are found to correlate with several global properties of their host galaxies, suggesting that black holes and galaxies have an intertwined evolution and that active galactic nuclei (AGN) have a significant impact on galaxy evolution. Since the large-scale environment can also affect AGN, this work studies how their formation and properties depend on the environment. We have used a reconstructed three-dimensional high-resolution density field obtained from a Bayesian large-scale structure reconstruction method applied to the 2M++ galaxy sample. A web-type classification relying on the shear tensor is used to identify different structures of the cosmic web, defining voids, sheets, filaments, and clusters. We confirm that the environmental density affects AGN formation and properties. We found that the AGN abundance is equivalent to the galaxy abundance, indicating that active and inactive galaxies reside in similar dark matter halos. However, occurrence rates are different for each spectral type and accretion rate. These differences are consistent with the AGN evolutionary sequence suggested by previous authors, in which Seyferts and Transition objects transform into low-ionization nuclear emission-line regions (LINERs), the weaker counterparts of Seyferts. We conclude that AGN properties depend more on the environmental density than on the web type. More powerful starbursts and younger stellar populations are found at high densities, where interactions and mergers are more likely. For Seyferts and Transition objects, AGN hosts in clusters show smaller masses, which might be due to gas stripping. In voids, the AGN population is dominated by the most massive galaxy hosts.
NASA Astrophysics Data System (ADS)
Wibonele, Kasanda J.; Zhang, Yanqing
2002-03-01
A web data mining system using granular computing and ASP programming is proposed. This is a web-based application that allows web users to submit survey data for many different companies. The survey is a collection of questions that will help these companies develop and improve their business and customer service with their clients through analysis of the survey data. The application allows users to submit data from anywhere. All the survey data are collected into a database for further analysis. An administrator of the web application can log in to the system and view all the data submitted. The application resides on a web server, and the database resides on an MS SQL server.
Data Intensive Computing on Amazon Web Services
DOE Office of Scientific and Technical Information (OSTI.GOV)
Magana-Zook, S. A.
The Geophysical Monitoring Program (GMP) has spent the past few years building up the capability to perform data intensive computing using what have been referred to as “big data” tools. These big data tools would be used against massive archives of seismic signals (>300 TB) to conduct research not previously possible. Examples of such tools include Hadoop (HDFS, MapReduce), HBase, Hive, Storm, Spark, Solr, and many more by the day. These tools are useful for performing data analytics on datasets that exceed the resources of traditional analytic approaches. To this end, a research big data cluster (“Cluster A”) was set up as a collaboration between GMP and Livermore Computing (LC).
SCHIP: Statistics for Chromosome Interphase Positioning Based on Interchange Data
NASA Technical Reports Server (NTRS)
Vives, Sergi; Loucas, Bradford; Vazquez, Mariel; Brenner, David J.; Sachs, Rainer K.; Hlatky, Lynn; Cornforth, Michael; Arsuaga, Javier
2005-01-01
The position of chromosomes in the interphase nucleus is believed to be associated with a number of biological processes. Here, we present a web-based application that helps analyze the relative position of chromosomes during interphase in human cells, based on observed radiogenic chromosome aberrations. The inputs of the program are a table of yields of pairwise chromosome interchanges and a proposed chromosome geometric cluster. Each can either be uploaded or selected from provided datasets. The main outputs are P-values for the proposed chromosome clusters. SCHIP is designed to be used by a number of scientific communities interested in nuclear architecture, including cancer and cell biologists, radiation biologists and mathematical/computational biologists.
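The abstract does not spell out SCHIP's test statistic, so the sketch below is only a plausible reading: it assumes the statistic is the total interchange yield among the proposed chromosomes and estimates a P-value by comparing against randomly drawn chromosome sets of the same size.

```python
import numpy as np

rng = np.random.default_rng(0)

def cluster_p_value(yields, cluster, n_perm=10000):
    """P-value for a proposed chromosome cluster given a symmetric
    matrix of pairwise interchange yields: how often does a random
    chromosome set of the same size show at least as much within-set
    interchange as the proposed cluster?"""
    n = yields.shape[0]

    def within(idx):
        sub = yields[np.ix_(idx, idx)]
        return sub[np.triu_indices(len(idx), k=1)].sum()

    observed = within(np.asarray(cluster))
    hits = 0
    for _ in range(n_perm):
        perm = rng.choice(n, size=len(cluster), replace=False)
        if within(perm) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction
```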
Large area sheet task: Advanced Dendritic Web Growth Development
NASA Technical Reports Server (NTRS)
Duncan, C. S.; Seidensticker, R. G.; Mchugh, J. P.; Hopkins, R. H.; Meier, D.; Schruben, J.
1981-01-01
A melt level control system was implemented to provide stepless silicon feed rates from zero to rates exactly matching the silicon consumed during web growth. Bench tests of the unit were successfully completed and the system mounted in a web furnace for operational verification. Tests of long term temperature drift correction techniques were made; web width monitoring seems most appropriate for feedback purposes. A system to program the initiation of the web growth cycle was successfully tested. A low cost temperature controller was tested which functions as well as units four times as expensive.
Sensor system for web inspection
Sleefe, Gerard E.; Rudnick, Thomas J.; Novak, James L.
2002-01-01
A system for electrically measuring variations over a flexible web has a capacitive sensor including spaced electrically conductive, transmit and receive electrodes mounted on a flexible substrate. The sensor is held against a flexible web with sufficient force to deflect the path of the web, which moves relative to the sensor.
Visual Based Retrieval Systems and Web Mining--Introduction.
ERIC Educational Resources Information Center
Iyengar, S. S.
2001-01-01
Briefly discusses Web mining and image retrieval techniques, and then presents a summary of articles in this special issue. Articles focus on Web content mining, artificial neural networks as tools for image retrieval, content-based image retrieval systems, and personalizing the Web browsing experience using media agents. (AEF)
Development of a Web-based financial application System
NASA Astrophysics Data System (ADS)
Hasan, M. R.; Ibrahimy, M. I.; Motakabber, S. M. A.; Ferdaus, M. M.; Khan, M. N. H.; Mostafa, M. G.
2013-12-01
The paper describes a technique to develop a web-based financial system, following the latest technology and business needs. In the development of a web-based application, both user friendliness and technology are very important. The ASP.NET MVC 4 platform and SQL Server 2008 were used for development of the web-based financial system. The entry and report-monitoring functions of the application are shown to be user friendly. This paper also highlights critical situations encountered during development, which will help in developing a quality product.
Leysen, Bert; Van den Eynden, Bart; Gielen, Birgit; Bastiaens, Hilde; Wens, Johan
2015-09-28
Starting with early identification of palliative care patients by general practitioners (GPs), the Care Pathway for Primary Palliative Care (CPPPC) is believed to help primary health care workers to deliver patient- and family-centered care in the last year of life. The care pathway has been pilot-tested, and will now be implemented in 5 Belgian regions: 2 Dutch-speaking regions, 2 French-speaking regions and the bilingual capital region of Brussels. The overall aim of the CPPPC is to provide better quality of primary palliative care, and in the end to reduce the hospital death rate. The aim of this article is to describe the quantitative design and innovative data collection strategy used in the evaluation of this complex intervention. A quasi-experimental stepped wedge cluster design is set up with the 5 regions being 5 non-randomized clusters. The primary outcome is reduced hospital death rate per GPs' patient population. Secondary outcomes are increased death at home and health care consumption patterns suggesting high quality palliative care. Per research cluster, GPs will be recruited via convenience sampling. These GPs, volunteering to be involved, will recruit people with reduced life expectancy and their informal care givers. Health care consumption data in the last year of life, available for all deceased people having lived in the research clusters in the study period, will be used for comparison between patient populations of participating GPs and patient populations of non-participating GPs. Description of baseline characteristics of participating GPs and patients and monitoring of the level of involvement by GPs, patients and informal care givers will happen through regular, privacy-secured web-surveys. Web-survey data and health consumption data are linked in a secure way, respecting Belgian privacy laws. To evaluate this complex intervention, a quasi-experimental stepped wedge cluster design has been set up. Context characteristics and involvement level of participants are important parameters in evaluating complex interventions. It is possible to securely link survey data with health consumption data. By appealing to IT solutions we hope to be able to partly reduce respondent burden, a known problem in palliative care research. ClinicalTrials.gov Identifier: NCT02266069.
2016-02-22
SPONSORED REPORT SERIES (Acquisition Research Program, Naval Postgraduate School): Achieving Better Buying Power through Acquisition of Open Architecture Software Systems for Web and Mobile Devices. Executive Summary: Many people within large enterprises rely on up to four Web-based or mobile devices for their…
Detection and clustering of features in aerial images by neuron network-based algorithm
NASA Astrophysics Data System (ADS)
Vozenilek, Vit
2015-12-01
The paper presents the algorithm for detection and clustering of features in aerial photographs based on artificial neural networks. The presented approach is not focused on the detection of specific topographic features, but on the combination of general feature analysis and its use for clustering and backward projection of clusters to the aerial image. The basis of the algorithm is the calculation of the total error of the network and the adjustment of network weights to minimize that error. A classic bipolar sigmoid was used as the activation function of the neurons, and the basic method of backpropagation was used for learning. To verify that a set of features is able to represent the image content from the user's perspective, a web application was compiled (ASP.NET on the Microsoft .NET platform). The main achievements include the finding that man-made objects in aerial images can be successfully identified by detection of shapes and anomalies. It was also found that an appropriate combination of comprehensive features describing the colors and selected shapes of individual areas can be useful for image analysis.
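The training loop described, bipolar sigmoid activations with weight updates by backpropagation of the total error, can be sketched in a few lines. This is a minimal illustrative reconstruction with synthetic data, not the paper's implementation; the layer sizes, learning rate, and targets are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def bipolar_sigmoid(x):
    return 2.0 / (1.0 + np.exp(-x)) - 1.0      # output in (-1, 1)

def d_bipolar_sigmoid(y):
    return 0.5 * (1.0 - y * y)                 # derivative expressed via the output

# Toy network: patch descriptors -> hidden layer -> one "feature present" score.
X = rng.normal(size=(64, 8))                        # 64 image patches, 8 descriptors
t = np.sign(X[:, :2].sum(axis=1, keepdims=True))    # synthetic targets in {-1, 1}
W1 = rng.normal(scale=0.3, size=(8, 6))
W2 = rng.normal(scale=0.3, size=(6, 1))

for epoch in range(500):
    h = bipolar_sigmoid(X @ W1)
    y = bipolar_sigmoid(h @ W2)
    err = y - t                                # total error E = 0.5 * sum(err**2)
    g2 = err * d_bipolar_sigmoid(y)            # backpropagate the error signal...
    g1 = (g2 @ W2.T) * d_bipolar_sigmoid(h)
    W2 -= 0.05 * h.T @ g2                      # ...and descend the error gradient
    W1 -= 0.05 * X.T @ g1
```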
A Granular Self-Organizing Map for Clustering and Gene Selection in Microarray Data.
Ray, Shubhra Sankar; Ganivada, Avatharam; Pal, Sankar K
2016-09-01
A new granular self-organizing map (GSOM) is developed by integrating the concept of a fuzzy rough set with the SOM. While training the GSOM, the weights of a winning neuron and the neighborhood neurons are updated through a modified learning procedure. The neighborhood is newly defined using fuzzy rough sets. The clusters (granules) evolved by the GSOM are presented to a decision table as its decision classes. Based on the decision table, a method of gene selection is developed. The effectiveness of the GSOM is shown both in clustering samples and in developing an unsupervised fuzzy rough feature selection (UFRFS) method for gene selection in microarray data. While the superior results of the GSOM, as compared with related clustering methods, are provided in terms of the β-index, DB-index, Dunn-index, and fuzzy rough entropy, the genes selected by the UFRFS are not only better in terms of classification accuracy and a feature evaluation index, but also statistically more significant than those from related unsupervised methods. The C-codes of the GSOM and UFRFS are available online at http://avatharamg.webs.com/software-code.
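For readers unfamiliar with the underlying SOM update, the sketch below shows the standard winner-plus-neighborhood weight update that the GSOM modifies. The Gaussian neighborhood here is a stand-in for the paper's fuzzy-rough neighborhood, and the grid size and learning schedules are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def train_som(data, grid=(5, 5), epochs=20, lr0=0.5, sigma0=2.0):
    """Plain SOM training loop; the GSOM replaces this Gaussian
    neighborhood with one defined via fuzzy rough sets."""
    h, w = grid
    weights = rng.normal(size=(h * w, data.shape[1]))
    coords = np.array([(i, j) for i in range(h) for j in range(w)], float)
    for e in range(epochs):
        lr = lr0 * (1 - e / epochs)                  # decaying learning rate
        sigma = sigma0 * (1 - e / epochs) + 1e-3     # shrinking neighborhood
        for x in data:
            winner = np.argmin(((weights - x) ** 2).sum(axis=1))
            d2 = ((coords - coords[winner]) ** 2).sum(axis=1)
            nbh = np.exp(-d2 / (2 * sigma ** 2))     # neighborhood strength
            weights += lr * nbh[:, None] * (x - weights)
    return weights
```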
A novel architecture for information retrieval system based on semantic web
NASA Astrophysics Data System (ADS)
Zhang, Hui
2011-12-01
Nowadays, the web has enabled an explosive growth of information sharing (there are currently over 4 billion pages covering most areas of human endeavor), so that the web faces a new challenge of information overload. The challenge now before us is not only to help people locate relevant information precisely but also to access and aggregate a variety of information from different resources automatically. Current web documents are in human-oriented formats; they are suitable for presentation, but machines cannot understand their meaning. To address this issue, Berners-Lee proposed the concept of the semantic web. With semantic web technology, web information can be understood and processed by machines, providing new possibilities for automatic web information processing. A main problem of semantic web information retrieval is that when there is not enough knowledge in the retrieval system, it returns a large number of meaningless results to users because of the huge volume of information. In this paper, we present the architecture of an information retrieval system based on the semantic web. In addition, our system employs an inference engine to check whether a query should be posed to the keyword-based search engine or to the semantic search engine.
ERIC Educational Resources Information Center
Ke, Chih-Horng; Sun, Huey-Min; Yang, Yuan-Chi
2012-01-01
This study explores the effect of user and system characteristics on our proposed web-based classroom response system (CRS) using a longitudinal design. The results are expected to identify the important factors among user and system characteristics in the web-based CRS. The proposed system can supply interactive teaching contents,…
Li, Eldon Y; Tung, Chen-Yuan; Chang, Shu-Hsun
2016-08-01
The quest for an effective system capable of monitoring and predicting the trends of epidemic diseases is a critical issue for communities worldwide. With the prevalence of Internet access, more and more researchers today are using data from both search engines and social media to improve prediction accuracy. In particular, a prediction market system (PMS) exploits the wisdom of crowds on the Internet to achieve relatively high accuracy. This study presents the architecture of a PMS and demonstrates the matching mechanism of logarithmic market scoring rules. The system was implemented to predict infectious diseases in Taiwan with the wisdom of crowds in order to improve the accuracy of epidemic forecasting. The PMS architecture contains three design components: database clusters, a market engine, and Web applications. The system accumulated knowledge from 126 health professionals for 31 weeks to predict five disease indicators: the confirmed cases of dengue fever, the confirmed cases of severe and complicated influenza, the rate of enterovirus infections, the rate of influenza-like illnesses, and the confirmed cases of severe and complicated enterovirus infection. Based on the winning ratio, the PMS predicts the trends of three out of five disease indicators more accurately than does the existing system that uses the five-year average values of historical data for the same weeks. In addition, the PMS with the matching mechanism of logarithmic market scoring rules is easy for health professionals to understand and applicable to predicting all five disease indicators. The PMS architecture of this study allows organizations and individuals to implement it for various purposes in our society. The system can continuously update the data and improve prediction accuracy in monitoring and forecasting the trends of epidemic diseases. Future researchers could replicate and apply the PMS demonstrated in this study to more infectious diseases and wider geographical areas, especially the under-developed countries across Asia and Africa. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
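The logarithmic market scoring rule (LMSR) mentioned here has a standard closed form, which the sketch below implements. The liquidity parameter b and the two-outcome example are illustrative assumptions, not values from the study.

```python
import numpy as np

def lmsr_cost(q, b=100.0):
    """LMSR cost function C(q) = b * log(sum_i exp(q_i / b))."""
    return b * np.log(np.exp(np.asarray(q) / b).sum())

def lmsr_prices(q, b=100.0):
    """Instantaneous prices, interpretable as crowd probabilities."""
    e = np.exp(np.asarray(q) / b)
    return e / e.sum()

def trade_cost(q, delta, b=100.0):
    """What a trader pays to move outstanding shares from q to q + delta."""
    return lmsr_cost(np.asarray(q) + np.asarray(delta), b) - lmsr_cost(q, b)

# Example: two outcomes ("indicator rises" vs "indicator falls").
q = [0.0, 0.0]
print(lmsr_prices(q))            # [0.5, 0.5] before any trades
print(trade_cost(q, [20, 0]))    # cost of buying 20 shares of outcome 0
```

The matching mechanism follows from trade_cost: every order is priced automatically against the market maker, so trades always match without a counterparty.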
Tardiole Kuehne, Bruno; Estrella, Julio Cezar; Nunes, Luiz Henrique; Martins de Oliveira, Edvard; Hideo Nakamura, Luis; Gomes Ferreira, Carlos Henrique; Carlucci Santana, Regina Helena; Reiff-Marganiec, Stephan; Santana, Marcos José
2015-01-01
This paper proposes a system named AWSCS (Automatic Web Service Composition System) to evaluate different approaches for automatic composition of Web services, based on QoS parameters that are measured at execution time. The AWSCS is a system to implement different approaches for automatic composition of Web services and also to execute the resulting flows from these approaches. To demonstrate the results of this paper, a scenario was developed in which empirical flows were built to show the operation of AWSCS, since algorithms for automatic composition are not readily available to test. The results allow us to study the behaviour of running composite Web services when flows with the same functionality but different problem-solving strategies are compared. Furthermore, we observed that both the load applied on the running system and the type of load submitted to the system are important factors in determining which approach to Web service composition can achieve the best performance in production. PMID:26068216
RSAT 2018: regulatory sequence analysis tools 20th anniversary.
Nguyen, Nga Thi Thuy; Contreras-Moreira, Bruno; Castro-Mondragon, Jaime A; Santana-Garcia, Walter; Ossio, Raul; Robles-Espinoza, Carla Daniela; Bahin, Mathieu; Collombet, Samuel; Vincens, Pierre; Thieffry, Denis; van Helden, Jacques; Medina-Rivera, Alejandra; Thomas-Chollier, Morgane
2018-05-02
RSAT (Regulatory Sequence Analysis Tools) is a suite of modular tools for the detection and the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, including from genome-wide datasets like ChIP-seq/ATAC-seq, (ii) motif scanning, (iii) motif analysis (quality assessment, comparisons and clustering), (iv) analysis of regulatory variations, (v) comparative genomics. Six public servers jointly support 10 000 genomes from all kingdoms. Six novel or refactored programs have been added since the 2015 NAR Web Software Issue, including updated programs to analyse regulatory variants (retrieve-variation-seq, variation-scan, convert-variations), along with tools to extract sequences from a list of coordinates (retrieve-seq-bed), to select motifs from motif collections (retrieve-matrix), and to extract orthologs based on Ensembl Compara (get-orthologs-compara). Three use cases illustrate the integration of new and refactored tools to the suite. This Anniversary update gives a 20-year perspective on the software suite. RSAT is well-documented and available through Web sites, SOAP/WSDL (Simple Object Access Protocol/Web Services Description Language) web services, virtual machines and stand-alone programs at http://www.rsat.eu/.
NASA Astrophysics Data System (ADS)
Pomarède, Daniel; Hoffman, Yehuda; Courtois, Hélène M.; Tully, R. Brent
2017-08-01
The network of filaments with embedded clusters surrounding voids, which has been seen in maps derived from redshift surveys and reproduced in simulations, has been referred to as the cosmic web. A complementary description is provided by considering the shear in the velocity field of galaxies. The eigenvalues of the shear provide information regarding whether or not a region is collapsing in three dimensions, which is the condition for a knot, expanding in three dimensions, which is the condition for a void, or in the intermediate condition of a filament or sheet. The structures that are quantitatively defined by the eigenvalues can be approximated by iso-contours that provide a visual representation of the cosmic velocity (V) web. The current application is based on radial peculiar velocities from the Cosmicflows-2 collection of distances. The three-dimensional velocity field is constructed using the Wiener filter methodology in the linear approximation. Eigenvalues of the velocity shear are calculated at each point on a grid. Here, knots and filaments are visualized across a local domain of diameter ~0.1c.
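The eigenvalue-counting rule described here translates directly into code. A minimal sketch, assuming the shear eigenvalues are already computed on the grid and that eigenvalues above a threshold mark collapsing directions; the threshold value (often a small positive number rather than zero) is an assumption.

```python
import numpy as np

def classify_vweb(shear_eigenvalues, threshold=0.0):
    """Count eigenvalues above a threshold: 3 collapsing directions -> knot,
    2 -> filament, 1 -> sheet, 0 (expanding in 3D) -> void.
    `shear_eigenvalues` has shape (..., 3), one triple per grid point."""
    n_collapsing = (np.asarray(shear_eigenvalues) > threshold).sum(axis=-1)
    labels = np.array(["void", "sheet", "filament", "knot"])
    return labels[n_collapsing]

# Example: one grid point with two positive eigenvalues -> "filament".
print(classify_vweb([0.8, 0.3, -0.5]))
```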
Featured Image: The Cosmic Velocity Web
NASA Astrophysics Data System (ADS)
Kohler, Susanna
2017-09-01
You may have heard of the cosmic web, a network of filaments, clusters and voids that describes the three-dimensional distribution of matter in our universe. But have you ever considered the idea of a cosmic velocity web? In a new study led by Daniel Pomarède (IRFU CEA-Saclay, France), a team of scientists has built a detailed 3D view of the flows in our universe, showing in particular motions along filaments and in collapsing knots. In the image above (click for the full view), surfaces of knots (red) are embedded within surfaces of filaments (grey). The rainbow lines show the flow motion, revealing acceleration (redder tones) toward knots and retardation (bluer tones) beyond them. You can learn more about Pomarède and collaborators' work and see their unusual and intriguing visualizations in the video they produced, below. Check out the original paper for more information. Citation: Daniel Pomarède et al 2017 ApJ 845 55. doi:10.3847/1538-4357/aa7f78
The dependence of galaxy clustering on tidal environment in the Sloan Digital Sky Survey
NASA Astrophysics Data System (ADS)
Paranjape, Aseem; Hahn, Oliver; Sheth, Ravi K.
2018-06-01
The influence of the Cosmic Web on galaxy formation and evolution is of great observational and theoretical interest. We investigate whether the Cosmic Web leaves an imprint in the spatial clustering of galaxies in the Sloan Digital Sky Survey (SDSS), using the group catalogue of Yang et al. and tidal field estimates at ~2 h^-1 Mpc scales from the mass-tides-velocity data set of Wang et al. We use the tidal anisotropy α (Paranjape et al.) to characterize the tidal environment of groups, and measure the redshift-space 2-point correlation function (2pcf) of group positions and the luminosity- and colour-dependent clustering of group galaxies using samples segregated by α. We find that all the 2pcf measurements depend strongly on α, with factors of ~20 between the large-scale 2pcf of objects in the most and least isotropic environments. To test whether these strong trends imply 'beyond halo mass' effects for galaxy evolution, we compare our results with corresponding 2pcf measurements in mock catalogues constructed using a halo occupation distribution that uses only halo mass as an input. We find that this prescription qualitatively reproduces all observed trends, and also quantitatively matches many of the observed results. Although there are some statistically significant differences between our 'halo mass only' mocks and the data - in the most and least isotropic environments - which deserve further investigation, our results suggest that if the tidal environment induces additional effects on galaxy properties other than those inherited from their host haloes, then these must be weak.
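The 'halo mass only' mocks rest on a halo occupation distribution of the kind sketched below. This is the standard five-parameter form with illustrative parameter values, not necessarily the parametrization calibrated in the paper.

```python
import numpy as np
from scipy.special import erf

def mean_occupation(log_m, log_mmin=12.0, sigma=0.2, log_m0=12.0,
                    log_m1=13.3, alpha=1.0):
    """Mean central and satellite galaxy counts as a function of halo
    mass only (log10 of M in solar masses); parameter values here are
    purely illustrative."""
    n_cen = 0.5 * (1.0 + erf((log_m - log_mmin) / sigma))   # smooth step
    m, m0, m1 = 10.0 ** log_m, 10.0 ** log_m0, 10.0 ** log_m1
    n_sat = n_cen * np.clip(m - m0, 0, None) ** alpha / m1 ** alpha
    return n_cen, n_sat

# Mean occupations for halos of 10^12 and 10^14 solar masses.
print(mean_occupation(np.array([12.0, 14.0])))
```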
Clyne, Barbara; Bradley, Marie C; Smith, Susan M; Hughes, Carmel M; Motterlini, Nicola; Clear, Daniel; McDonnell, Ronan; Williams, David; Fahey, Tom
2013-03-13
Potentially inappropriate prescribing in older people is common in primary care and can result in increased morbidity, adverse drug events, hospitalizations and mortality. In Ireland, 36% of those aged 70 years or over received at least one potentially inappropriate medication, with an associated expenditure of over €45 million. The main objective of this study is to determine the effectiveness and acceptability of a complex, multifaceted intervention in reducing the level of potentially inappropriate prescribing in primary care. This study is a pragmatic cluster randomized controlled trial, conducted in primary care (OPTI-SCRIPT trial), involving 22 practices (clusters) and 220 patients. Practices will be allocated to intervention or control arms using minimization, with intervention participants receiving a complex multifaceted intervention incorporating academic detailing, medicines review with web-based pharmaceutical treatment algorithms that provide recommended alternative treatment options, and tailored patient information leaflets. Control practices will deliver usual care and receive simple patient-level feedback on potentially inappropriate prescribing. Routinely collected national prescribing data will also be analyzed for nonparticipating practices, acting as a contemporary national control. The primary outcomes are the proportion of participant patients with potentially inappropriate prescribing and the mean number of potentially inappropriate prescriptions per patient. In addition, economic and qualitative evaluations will be conducted. This study will establish the effectiveness of a multifaceted intervention in reducing potentially inappropriate prescribing in older people in Irish primary care that is generalizable to countries with similar prescribing challenges. Current controlled trials ISRCTN41694007.
ERIC Educational Resources Information Center
Mattord, Herbert J.
2012-01-01
Organizations continue to rely on password-based authentication methods to control access to many Web-based systems. This research study developed a benchmarking instrument intended to assess authentication methods used in Web-based information systems (IS). It developed an Authentication Method System Index (AMSI) to analyze collected data from…
InterProSurf: a web server for predicting interacting sites on protein surfaces
Negi, Surendra S.; Schein, Catherine H.; Oezguen, Numan; Power, Trevor D.; Braun, Werner
2009-01-01
Summary A new web server, InterProSurf, predicts interacting amino acid residues in proteins that are most likely to interact with other proteins, given the 3D structures of subunits of a protein complex. The prediction method is based on solvent accessible surface area of residues in the isolated subunits, a propensity scale for interface residues and a clustering algorithm to identify surface regions with residues of high interface propensities. Here we illustrate the application of InterProSurf to determine which areas of Bacillus anthracis toxins and measles virus hemagglutinin protein interact with their respective cell surface receptors. The computationally predicted regions overlap with those regions previously identified as interface regions by sequence analysis and mutagenesis experiments. PMID:17933856
Design and Applications of Rapid Image Tile Producing Software Based on Mosaic Dataset
NASA Astrophysics Data System (ADS)
Zha, Z.; Huang, W.; Wang, C.; Tang, D.; Zhu, L.
2018-04-01
Map tile technology is widely used in web geographic information services. How to efficiently produce map tiles is a key technology for rapid serving of images on the web. In this paper, rapid production software for image tile data based on a mosaic dataset is designed, and the tile production workflow is given. Key technologies such as cluster processing, map representation, tile checking, tile conversion and in-memory compression are discussed. Implemented in software and tested on actual image data, the results show that this software has a high degree of automation, effectively reduces the number of I/O operations, and improves tile production efficiency. Moreover, manual operations are reduced significantly.
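For orientation, the arithmetic behind map tiling is compact. Below is a minimal sketch of the usual slippy-map (Web-Mercator) tile indexing, which the paper's mosaic-dataset pipeline may or may not follow exactly.

```python
import math

def deg2tile(lat_deg, lon_deg, zoom):
    """Web-Mercator tile indices (x, y) for a WGS84 coordinate at a
    given zoom level; the tile grid is 2^zoom tiles on a side."""
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    return x, y

print(deg2tile(39.9, 116.4, 10))   # tile indices over Beijing at zoom 10
```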
Federal Register 2010, 2011, 2012, 2013, 2014
2013-08-14
... Information Collection; Comment Request; NTIA/FCC Web-based Frequency Coordination System AGENCY: National... INFORMATION: I. Abstract The National Telecommunications and Information Administration (NTIA) hosts a web... (RF) bands that are shared on a co-primary basis by federal and non-federal users. The web-based...
75 FR 29307 - Web Based Supply Chain Management Commodity Offer Form, Paperwork Collection Notice
Federal Register 2010, 2011, 2012, 2013, 2014
2010-05-25
... DEPARTMENT OF AGRICULTURE Agricultural Marketing Service [Doc. No FV10-CP-01, AMS-FV-10-0041] Web... collection request is required for the implementation of a new system named Web Based Supply Chain Management...-2782. Mail: David Tuckwiller, Project Manager, Web Based Supply Chain Management System, Agricultural...
Information Diversity in Web Search
ERIC Educational Resources Information Center
Liu, Jiahui
2009-01-01
The web is a rich and diverse information source with incredible amounts of information about all kinds of subjects in various forms. This information source affords great opportunity to build systems that support users in their work and everyday lives. To help users explore information on the web, web search systems should find information that…
Web information retrieval for health professionals.
Ting, S L; See-To, Eric W K; Tse, Y K
2013-06-01
This paper presents a Web Information Retrieval System (WebIRS), which is designed to assist healthcare professionals in obtaining up-to-date medical knowledge and information via the World Wide Web (WWW). The system leverages document classification and text summarization techniques to deliver highly correlated medical information to physicians. The system architecture of the proposed WebIRS is first discussed, and then a case study on an application of the proposed system in a Hong Kong medical organization is presented to illustrate the adoption process, with a questionnaire administered to collect feedback on the operation and performance of WebIRS in comparison with conventional information retrieval on the WWW. A prototype system has been constructed and implemented on a trial basis in a medical organization. It has proven to be of benefit to healthcare professionals through its automatic functions for classifying and summarizing the medical information that physicians need and are interested in. The results of the case study show that with the use of the proposed WebIRS, significant reductions in searching time and effort can be attained, with retrieval of highly relevant materials.
Coordinated Science Campaign Scheduling for Sensor Webs
NASA Technical Reports Server (NTRS)
Edgington, Will; Morris, Robert; Dungan, Jennifer; Williams, Jenny; Carlson, Jean; Fleming, Damian; Wood, Terri; Yorke-Smith, Neil
2005-01-01
Future Earth observing missions will study different aspects and interacting pieces of the Earth's ecosystem. Scientists are designing increasingly complex, interdisciplinary campaigns to exploit the diverse capabilities of multiple Earth sensing assets. In addition, spacecraft platforms are being configured into clusters, trains, or other distributed organizations in order to improve either the quality or the coverage of observations. These simultaneous advances in the design of science campaigns and in the missions that will provide the sensing resources to support them offer new challenges in the coordination of data and operations that are not addressed by current practice. For example, the scheduling of scientific observations for satellites in low Earth orbit is currently conducted independently by each mission operations center. The absence of an information infrastructure to enable the scheduling of coordinated observations involving multiple sensors makes it difficult to execute campaigns involving multiple assets. This paper proposes a software architecture and describes a prototype system called DESOPS (Distributed Earth Science Observation Planning and Scheduling) that will address this deficiency.
Interactive Machine Learning at Scale with CHISSL
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arendt, Dustin L.; Grace, Emily A.; Volkova, Svitlana
We demonstrate CHISSL, a scalable client-server system for real-time interactive machine learning. Our system is capable of incorporating user feedback incrementally and immediately without a structured or pre-defined prediction task. Computation is partitioned between a lightweight web client and a heavyweight server. The server relies on representation learning and agglomerative clustering to learn a dendrogram, a hierarchical approximation of a representation space. The client uses only this dendrogram to incorporate user feedback into the model via transduction. Distances and predictions for each unlabeled instance are updated incrementally and deterministically, with O(n) space and time complexity. Our algorithm is implemented in a functional prototype, designed to be easy to use by non-experts. The prototype organizes the large amounts of data into recommendations. This allows the user to interact with actual instances by dragging and dropping to provide feedback in an intuitive manner. We applied CHISSL to several domains including cyber, social media, and geo-temporal analysis.
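A rough flavor of the server/client split can be given in a few lines. This is a simplified stand-in, not the CHISSL algorithm itself: the server's dendrogram is built with off-the-shelf agglomerative clustering, and transduction is approximated by nearest-labeled-neighbor assignment; the data and labels are synthetic.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import cdist

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 16))            # stand-in for learned representations

# Server side: learn a dendrogram over the representation space once;
# the flat cut into groups is what the client browses as recommendations.
Z = linkage(X, method="average")
groups = fcluster(Z, t=10, criterion="maxclust")

# Client side: the user labels a handful of instances (drag and drop);
# every other instance inherits the label of its nearest labeled
# neighbor, a crude form of transduction.
labeled_idx = np.array([0, 50, 120])
labels = np.array(["cyber", "social", "geo"])
nearest = cdist(X, X[labeled_idx]).argmin(axis=1)
predictions = labels[nearest]
print(groups[:10], predictions[:10])
```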
Protecting clinical data on Web client computers: the PCASSO approach.
Masys, D. R.; Baker, D. B.
1998-01-01
The ubiquity and ease of use of the Web have made it an increasingly popular medium for communication of health-related information. Web interfaces to commercially available clinical information systems are now available or under development by most major vendors. To the extent that such interfaces involve the use of unprotected operating systems, they are vulnerable to security limitations of Web client software environments. The Patient Centered Access to Secure Systems Online (PCASSO) project extends the protections for person-identifiable health data on Web client computers. PCASSO uses several approaches, including physical protection of authentication information, execution containment, graphical displays, and monitoring the client system for intrusions and co-existing programs that may compromise security. PMID:9929243
Evaluation Criteria for the Educational Web-Information System
ERIC Educational Resources Information Center
Seok, Soonhwa; Meyen, Edward; Poggio, John C.; Semon, Sarah; Tillberg-Webb, Heather
2008-01-01
This article addresses how evaluation criteria improve educational Web-information system design, and the tangible and intangible benefits of using evaluation criteria, when implemented in an educational Web-information system design. The evaluation criteria were developed by the authors through a content validation study applicable to…
Federal Register 2010, 2011, 2012, 2013, 2014
2013-12-17
...-0392] Proposed Enhancements to the Motor Carrier Safety Measurement System (SMS) Public Web Site AGENCY... proposed enhancements to the display of information on the Agency's Safety Measurement System (SMS) public Web site. On December 6, 2013, Advocates
Web-services-based spatial decision support system to facilitate nuclear waste siting
NASA Astrophysics Data System (ADS)
Huang, L. Xinglai; Sheng, Grant
2006-10-01
The availability of spatial web services enables data sharing among managers, decision and policy makers and other stakeholders in much simpler ways than before and has subsequently created completely new opportunities in the process of spatial decision making. Though generally designed for a certain problem domain, web-services-based spatial decision support systems (WSDSS) can provide a flexible problem-solving environment in which to explore a decision problem, understand and refine its definition, and generate and evaluate multiple decision alternatives. This paper presents a new framework for the development of a web-services-based spatial decision support system. The WSDSS comprises distributed web services that either have their own functions or provide different geospatial data and may reside in different computers and locations. The WSDSS includes six key components, namely: a database management system, a catalog, analysis functions and models, GIS viewers and editors, report generators, and graphical user interfaces. In this study, the architecture of a web-services-based spatial decision support system to facilitate nuclear waste siting is described as an example. The theoretical, conceptual and methodological challenges and issues associated with developing web-services-based spatial decision support systems are described.
NASA Astrophysics Data System (ADS)
Wang, Ruikang K.; Priezzhev, Alexander; Fantini, Sergio
2004-07-01
To honour Professor Valery Tuchin, one of the pioneers in biomedical optics, Journal of Physics D: Applied Physics invites manuscript submissions on topics in biomedical optics, for publication in a Special section in May 2005. Papers may cover a variety of topics related to photon propagation in turbid media, spectroscopy and imaging. This Special cluster will reflect the diversity, breadth and impact of Professor Tuchin's contributions to the field of biomedical optics over the course of his distinguished career. Biomedical optics is a recently emerged discipline providing a broad variety of optical techniques and instruments for diagnostic, therapeutic and basic science applications. Together with contributions from other pioneers in the field, Professor Tuchin's work on fundamental and experimental aspects in tissue optics contributed enormously to the formation of this exciting field. Although general submissions in biomedical optics are invited, the Special cluster Editors especially encourage submissions in areas that are explicitly or implicitly influenced by Professor Tuchin's contributions to the field of biomedical optics. Manuscripts submitted to this Special cluster of Journal of Physics D: Applied Physics will be refereed according to the normal criteria and procedures of the journal, in accordance with the following schedule: Deadline for receipt of contributed papers: 31 November 2004 Deadline for acceptance and completion of refereeing process: 28 February 2005 Publication of special issue: May 2005 Please submit your manuscript electronically to jphysd@iop.org or via the Web site at www.iop.org/Journals. Otherwise, please send a copy of your typescript, a set of original figures and a cover letter to: The Publishing Administrator, Journal of Physics D: Applied Physics, Institute of Physics Publishing, Dirac House, Temple Back, Bristol BS1 6BE, United Kingdom. Further information on how to submit may be obtained upon request by e-mailing the journal at the above address. Alternatively, visit the homepage of the journal on the World Wide Web (http://www.iop.org/journals/jphysd)
Goekoop, Rutger; Goekoop, Jaap G; Scholte, H Steven
2012-01-01
Human personality is described preferentially in terms of factors (dimensions) found using factor analysis. An alternative and highly related method is network analysis, which may have several advantages over factor analytic methods. Our aim was to directly compare the ability of network community detection (NCD) and principal component factor analysis (PCA) to examine modularity in multidimensional datasets such as the Neuroticism-Extraversion-Openness Personality Inventory Revised (NEO-PI-R). 434 healthy subjects were tested on the NEO-PI-R. PCA was performed to extract factor structures (FS) of the current dataset using both item scores and facet scores. Correlational network graphs were constructed from univariate correlation matrices of interactions between both items and facets. These networks were pruned in a link-by-link fashion while calculating the network community structure (NCS) of each resulting network using the Wakita-Tsurumi clustering algorithm. NCSs were matched against FS, and networks of best matches were kept for further analysis. At facet level, the NCS showed a best match (96.2%) with a 'confirmatory' 5-FS. At item level, the NCS showed a best match (80%) with the standard 5-FS and involved a total of 6 network clusters. Lesser matches were found with the 'confirmatory' 5-FS and 'exploratory' 6-FS of the current dataset. Network analysis did not identify facets as a separate level of organization in between items and clusters. A small-world network structure was found in both item- and facet-level networks. We present the first optimized network graph of personality traits according to the NEO-PI-R: a 'Personality Web'. Such a web may represent the possible routes that subjects can take during personality development. NCD outperforms PCA by producing plausible modularity at item level in non-standard datasets and can identify the key roles of individual items and clusters in the network.
System Testing of Desktop and Web Applications
ERIC Educational Resources Information Center
Slack, James M.
2011-01-01
We want our students to experience system testing of both desktop and web applications, but the cost of professional system-testing tools is far too high. We evaluate several free tools and find that AutoIt makes an ideal educational system-testing tool. We show several examples of desktop and web testing with AutoIt, starting with simple…
WebAlchemist: a Web transcoding system for mobile Web access in handheld devices
NASA Astrophysics Data System (ADS)
Whang, Yonghyun; Jung, Changwoo; Kim, Jihong; Chung, Sungkwon
2001-11-01
In this paper, we describe the design and implementation of WebAlchemist, a prototype web transcoding system, which automatically converts a given HTML page into a sequence of equivalent HTML pages that can be properly displayed on a hand-held device. The WebAlchemist system is based on a set of HTML transcoding heuristics managed by the Transcoding Manager (TM) module. In order to tackle difficult-to-transcode pages, such as ones with large or complex table structures, we have developed several new transcoding heuristics that extract partial semantics from syntactic information such as table width, font size and cascading style sheets. Subjective evaluation results using popular HTML pages (such as the CNN home page) show that WebAlchemist generates readable, structure-preserving transcoded pages that can be properly displayed on hand-held devices.
A cloud-based framework for large-scale traditional Chinese medical record retrieval.
Liu, Lijun; Liu, Li; Fu, Xiaodong; Huang, Qingsong; Zhang, Xianwen; Zhang, Yin
2018-01-01
Electronic medical records are increasingly common in medical practice. The secondary use of medical records has become increasingly important. It relies on the ability to retrieve complete information about desired patient populations. How to effectively and accurately retrieve relevant medical records from large-scale medical big data is becoming a big challenge. Therefore, we propose an efficient and robust cloud-based framework for large-scale Traditional Chinese Medical Records (TCMRs) retrieval. We propose a parallel index building method and build a distributed search cluster; the former is used to improve the performance of index building, and the latter is used to provide highly concurrent online TCMRs retrieval. Then, a real-time multi-indexing model is proposed to ensure the latest relevant TCMRs are indexed and retrieved in real time, and a semantics-based query expansion method and a multi-factor ranking model are proposed to improve retrieval quality. Third, we implement a template-based visualization method for displaying medical reports. The proposed parallel indexing method and distributed search cluster can improve the performance of index building and provide highly concurrent online TCMRs retrieval. The multi-indexing model can ensure the latest relevant TCMRs are indexed and retrieved in real time. The semantics expansion method and the multi-factor ranking model can enhance retrieval quality. The template-based visualization method can enhance availability and universality, with the medical reports displayed via a friendly web interface. In conclusion, compared with current medical record retrieval systems, our system provides some advantages that are useful in improving the secondary use of large-scale traditional Chinese medical records in a cloud environment. The proposed system is more easily integrated with existing clinical systems and can be used in various scenarios. Copyright © 2017. Published by Elsevier Inc.
Adapting NBODY4 with a GRAPE-6a Supercomputer for Web Access, Using NBodyLab
NASA Astrophysics Data System (ADS)
Johnson, V.; Aarseth, S.
2006-07-01
A demonstration site has been developed by the authors that enables researchers and students to experiment with the capabilities and performance of NBODY4 running on a GRAPE-6a over the web. NBODY4 is a sophisticated open-source N-body code for high accuracy simulations of dense stellar systems (Aarseth 2003). In 2004, NBODY4 was successfully tested with a GRAPE-6a, yielding an unprecedentedly low-cost tool for astrophysical research. The GRAPE-6a is a supercomputer card developed by astrophysicists to accelerate high accuracy N-body simulations with a cluster or a desktop PC (Fukushige et al. 2005, Makino & Taiji 1998). The GRAPE-6a card became commercially available in 2004, runs at 125 Gflops peak, has a standard PCI interface and costs less than $10,000. Researchers running the widely used NBODY6 (which does not require GRAPE hardware) can compare their own PC or laptop performance with simulations run on http://www.NbodyLab.org. Such comparisons may help justify acquisition of a GRAPE-6a. For workgroups such as university physics or astronomy departments, the demonstration site may be replicated or serve as a model for a shared computing resource. The site was constructed using an NBodyLab server-side framework.
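What the GRAPE board accelerates is the O(N^2) force loop of a direct-summation N-body code. Below is a minimal softened-gravity sketch of that inner loop, purely for illustration (G = 1 and the softening length are arbitrary choices, not NBODY4's scheme, which adds regularization and individual time steps on top).

```python
import numpy as np

def accelerations(pos, mass, eps=1e-3):
    """O(N^2) direct-summation gravitational accelerations: the inner
    loop that GRAPE hardware evaluates. pos has shape (N, 3)."""
    d = pos[None, :, :] - pos[:, None, :]          # pairwise separations r_j - r_i
    r2 = (d ** 2).sum(axis=-1) + eps ** 2          # softened squared distances
    np.fill_diagonal(r2, np.inf)                   # exclude self-interaction
    inv_r3 = r2 ** -1.5
    return (d * (mass[None, :, None] * inv_r3[:, :, None])).sum(axis=1)
```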
Research progress and hotspot analysis of spatial interpolation
NASA Astrophysics Data System (ADS)
Jia, Li-juan; Zheng, Xin-qi; Miao, Jin-li
2018-02-01
In this paper, the literature on spatial interpolation published between 1982 and 2017 and indexed in the Web of Science core database is used as the data source, and visualization analysis is carried out on the co-country, co-category, co-citation and keyword co-occurrence networks. It is found that spatial interpolation research has experienced three stages: slow development, steady development and rapid development. Eleven clustering groups show cross effects, converging mainly on spatial interpolation theory, practical applications and case studies, and the accuracy and efficiency of spatial interpolation. Finding the optimal spatial interpolation method is the frontier and hot spot of the research. Spatial interpolation research has formed a theoretical basis and a research system framework, is strongly interdisciplinary, and is widely used in various fields.
Development of a Web Based Simulating System for Earthquake Modeling on the Grid
NASA Astrophysics Data System (ADS)
Seber, D.; Youn, C.; Kaiser, T.
2007-12-01
Existing cyberinfrastructure-based information, data and computational networks now allow development of state-of-the-art, user-friendly simulation environments that democratize access to high-end computational resources and provide new research opportunities for many research and educational communities. Within the Geosciences cyberinfrastructure network, GEON, we have developed the SYNSEIS (SYNthetic SEISmogram) toolkit to enable efficient computation of 2D and 3D seismic waveforms for a variety of research purposes, especially for helping to analyze EarthScope's USArray seismic data in a speedy and efficient manner. The underlying simulation software in SYNSEIS is a finite difference code, E3D, developed by LLNL (S. Larsen). The code is embedded within the SYNSEIS portlet environment and is used by our toolkit to simulate seismic waveforms of earthquakes at regional distances (<1000 km). Architecturally, SYNSEIS uses both Web Service and Grid computing resources in a portal-based work environment and has a built-in access mechanism to connect to national supercomputer centers as well as to a dedicated, small-scale compute cluster for its runs. Even though Grid computing is well established in many computing communities, its use among domain scientists is still not trivial because of the multiple levels of complexity encountered. We grid-enabled E3D using our own XML input dialect; the inputs include geological models that are accessible through standard Web services within the GEON network. The XML inputs for this application contain structural geometries, source parameters, seismic velocity, density, attenuation values, the number of time steps to compute, and the number of stations. By enabling portal-based access to such a computational environment, coupled with its dynamic user interface, we enable a large user community to take advantage of such high-end calculations in their research and educational activities. Our system can be used to promote an efficient and effective modeling environment that helps scientists as well as educators in their daily activities and speeds up the scientific discovery process.
On Building a Search Interface Discovery System
NASA Astrophysics Data System (ADS)
Shestakov, Denis
A huge portion of the Web known as the deep Web is accessible via search interfaces to myriads of databases on the Web. While relatively good approaches for querying the contents of web databases have recently been proposed, one cannot fully utilize them while most search interfaces remain unlocated. Thus, the automatic recognition of search interfaces to online databases is crucial for any application accessing the deep Web. This paper describes the architecture of the I-Crawler, a system for finding and classifying search interfaces. The I-Crawler is intentionally designed to be used in deep web characterization surveys and for constructing directories of deep web resources.
Intelligent Web-Based Learning System with Personalized Learning Path Guidance
ERIC Educational Resources Information Center
Chen, C. M.
2008-01-01
Personalized curriculum sequencing is an important research issue for web-based learning systems because no fixed learning paths will be appropriate for all learners. Therefore, many researchers focused on developing e-learning systems with personalized learning mechanisms to assist on-line web-based learning and adaptively provide learning paths…
Optimizing Decision Support for Tailored Health Behavior Change Applications.
Kukafka, Rita; Jeong, In cheol; Finkelstein, Joseph
2015-01-01
The Tailored Lifestyle Change Decision Aid (TLC DA) system was designed to provide support for a person to make an informed choice about which behavior change to work on when multiple unhealthy behaviors are present. TLC DA can be delivered via web, smartphones and tablets. The system collects a significant amount of information that is used to generate tailored messages that persuade consumers toward certain healthy lifestyles. One limitation is the necessity to collect vast amounts of information from users, who must enter it manually. By identifying an optimal set of self-reported parameters we will be able to minimize the data entry burden of the app users. The aim of this study was to identify the primary determinants of health behavior choices made by patients after using the system. Using discriminant analysis, an optimal set of predictors was identified. The resulting set included smoking status, smoking cessation success estimate, self-efficacy, body mass index and diet status. Predicting the smoking cessation choice was the most accurate, followed by weight management. Physical activity and diet choices were better identified in a combined cluster.
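The predictor-subset selection described here can be emulated with standard tools. A hedged sketch, assuming discriminant analysis is combined with a cross-validated search over small feature subsets; the data, subset sizes, and scoring below are synthetic stand-ins for the study's self-report items.

```python
import numpy as np
from itertools import combinations
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 8))                   # 8 candidate self-report items
y = (X[:, 0] + 0.8 * X[:, 3]                    # synthetic behavior choice
     + rng.normal(scale=0.5, size=300) > 0).astype(int)

best = (None, -np.inf)
for k in (2, 3):                                # small subsets = less data entry
    for subset in combinations(range(X.shape[1]), k):
        score = cross_val_score(LinearDiscriminantAnalysis(),
                                X[:, subset], y, cv=5).mean()
        if score > best[1]:
            best = (subset, score)
print(best)                                     # most predictive small item set
```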
CRISPR/Cas9-Based Multiplex Genome Editing in Monocot and Dicot Plants.
Ma, Xingliang; Liu, Yao-Guang
2016-07-01
The clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9-mediated genome targeting system has been applied to a variety of organisms, including plants. Compared to other genome-targeting technologies such as zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs), the CRISPR/Cas9 system is easier to use and has much higher editing efficiency. In addition, multiple "single guide RNAs" (sgRNAs) with different target sequences can be designed to direct the Cas9 protein to multiple genomic sites for simultaneous multiplex editing. Here, we present a procedure for highly efficient multiplex genome targeting in monocot and dicot plants using a versatile and robust CRISPR/Cas9 vector system, emphasizing the construction of binary constructs with multiple sgRNA expression cassettes in one round of cloning using Golden Gate ligation. We also describe the genotyping of targeted mutations in transgenic plants by direct Sanger sequencing followed by decoding of superimposed sequencing chromatograms containing biallelic or heterozygous mutations using the Web-based tool DSDecode. © 2016 by John Wiley & Sons, Inc.
A web-based biosignal data management system for U-health data integration.
Ro, Dongwoo; Yoo, Sooyoung; Choi, Jinwook
2008-11-06
In the ubiquitous healthcare environment, biosignal data should be easily accessible and properly maintained. This paper describes a web-based data management system consisting of a device interface, a data upload control, a central repository, and a web server. For user-specific web services, an MFER Upload ActiveX Control was developed.
The Evolution of Globular Cluster Systems In Early-Type Galaxies
NASA Astrophysics Data System (ADS)
Grillmair, Carl
1999-07-01
We will measure structural parameters {core radii and concentrations} of globular clusters in three early-type galaxies using deep, four-point dithered observations. We have chosen globular cluster systems which have young, medium-age, and old cluster populations, as indicated by cluster colors and luminosities. Our primary goal is to test the hypothesis that globular cluster luminosity functions evolve towards a "universal" form. Previous observations have shown that young cluster systems have exponential luminosity functions rather than the characteristic log-normal luminosity function of old cluster systems. We will test whether young systems exhibit a wider range of structural parameters than old systems, and whether and at what rate plausible disruption mechanisms will cause the luminosity function to evolve towards a log-normal form. A simple observational comparison of structural parameters between cluster populations of different ages, and between different sub-populations within the same galaxy, will also provide clues concerning both the formation and destruction mechanisms of star clusters, the distinction between open and globular clusters, and the advisability of using globular cluster luminosity functions as distance indicators.
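The hypothesis being tested can be made concrete with a toy likelihood comparison between the two functional forms. The sketch below uses synthetic magnitudes and standard maximum-likelihood fits; it is not the program's data or methodology.

```python
# Minimal sketch (synthetic data): comparing an exponential luminosity
# function with the log-normal form (a Gaussian in magnitudes) that old
# globular cluster systems exhibit.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mags = rng.normal(loc=-7.4, scale=1.2, size=500)  # fake old-system GCLF

# Gaussian (log-normal in luminosity) fit in magnitude space
mu, sigma = stats.norm.fit(mags)
logl_gauss = stats.norm.logpdf(mags, mu, sigma).sum()

# Exponential fit to luminosities L ~ 10^(-0.4 M)
lum = 10 ** (-0.4 * mags)
loc, scale = stats.expon.fit(lum, floc=0.0)
# Transform the exponential density to magnitude space (|dL/dM| Jacobian)
jac = 0.4 * np.log(10) * lum
logl_expon = (stats.expon.logpdf(lum, 0.0, scale) + np.log(jac)).sum()

print(f"Gaussian-in-magnitude logL: {logl_gauss:.1f}")
print(f"Exponential-in-luminosity logL: {logl_expon:.1f}")
```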
NASA Astrophysics Data System (ADS)
Choy, Eun Jung; An, Soonmo; Kang, Chang-Keun
2008-06-01
The benthic macroinvertebrates of the Nakdong River estuary were sampled at three different habitats: two salt marsh (Scirpus triqueter and Phragmites australis) beds and a bare intertidal flat. Fishes were sampled in the main channel. The trophic importance of marsh vascular plants, microphytobenthos, and riverine and channel particulate organic matter (POM) to macroinvertebrate and fish production was studied using stable carbon and nitrogen isotope tracers. There was a dramatic change in coverage of macrophytes (salt marshes and seagrass) after the construction of an estuarine barrage in 1987 in the Nakdong River estuary, with the S. triqueter bed increasing, the P. australis bed decreasing, and Zostera marina habitats being nearly lost. Although the invertebrate δ13C values were within a narrower range than those of the primary producers, the values varied considerably among consumers in these habitats. However, the isotope signatures of consumers showed similarities among different habitats. Cluster analysis based on isotopic similarity suggested that the isotope variability among species was related more to functional feeding groups than to habitats or taxonomic groups. While δ13C values of suspension feeders were close to that of the channel POM (mainly phytoplankton), other benthic feeders and predators had δ13C values similar to that of microphytobenthos. Isotopic mixing model estimates suggest that algal sources, including microphytobenthos and phytoplankton, play an important role in supporting the benthic food web. Despite the huge productivity of emergent salt marshes, the contribution of marsh-derived organic matter to the estuarine food webs appears to be limited to some nutrition for some invertebrates within marsh habitats, with little contribution on the bare intertidal flats or in the channel fish communities. Isotope signatures of the channel fishes also confirm that algal sources are important in supporting fish nutrition. Our findings suggest that benthic and pelagic microalgae made a large contribution to consumer diets, while marsh plants may not have a large role in supporting food webs in this estuarine system.
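The mixing-model estimate mentioned above reduces, in its simplest form, to a two-source mass balance on a single isotope. A minimal sketch with invented δ13C values follows; the study itself would have used a multi-source model, and the trophic-shift value here is an assumption.

```python
# Minimal sketch: a standard two-source, single-isotope mixing model,
# the kind of calculation behind source-contribution estimates.
# All numbers below are invented for illustration.
def two_source_mixing(delta_consumer, delta_a, delta_b, trophic_shift=1.0):
    """Fraction of source A in the diet, after correcting the consumer
    signature for one trophic level of fractionation (assumed value)."""
    corrected = delta_consumer - trophic_shift
    f_a = (corrected - delta_b) / (delta_a - delta_b)
    return min(max(f_a, 0.0), 1.0)  # clamp to the physical range

# Hypothetical d13C: microphytobenthos vs channel POM (phytoplankton)
f_mpb = two_source_mixing(delta_consumer=-16.5, delta_a=-15.0, delta_b=-22.0)
print(f"Estimated microphytobenthos contribution: {f_mpb:.0%}")
```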
Developing Distributed Collaboration Systems at NASA: A Report from the Field
NASA Technical Reports Server (NTRS)
Becerra-Fernandez, Irma; Stewart, Helen; Knight, Chris; Norvig, Peter (Technical Monitor)
2001-01-01
Web-based collaborative systems have assumed a pivotal role in the information systems development arena. While business-to-customer (B-to-C) and business-to-business (B-to-B) electronic commerce systems, search engines, and chat sites are the focus of attention, web-based systems span the gamut of information systems that were traditionally confined to internal organizational client-server networks. For example, the Domino Application Server allows Lotus Notes (trademarked) users to build collaborative intranet applications, and mySAP.com (trademarked) enables web portals and e-commerce applications for SAP users. This paper presents the experiences gained in the development of one such system: Postdoc, a government off-the-shelf web-based collaborative environment. Issues related to the design of web-based collaborative information systems are presented, including lessons learned from the development and deployment of the system as well as measured performance. Finally, the limitations of the implementation approach and future plans are discussed.
System and method for merging clusters of wireless nodes in a wireless network
Budampati, Ramakrishna S [Maple Grove, MN; Gonia, Patrick S [Maplewood, MN; Kolavennu, Soumitri N [Blaine, MN; Mahasenan, Arun V [Kerala, IN
2012-05-29
A system includes a first cluster having multiple first wireless nodes. One first node is configured to act as a first cluster master, and the other first nodes are configured to receive time synchronization information provided by the first cluster master. The system also includes a second cluster having one or more second wireless nodes. One second node is configured to act as a second cluster master, and any other second nodes are configured to receive time synchronization information provided by the second cluster master. The system further includes a manager configured to merge the clusters into a combined cluster. One of the nodes is configured to act as a single cluster master for the combined cluster, and the other nodes are configured to receive time synchronization information provided by the single cluster master.
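The claimed merge step can be sketched in a few lines. The master-selection rule below (larger cluster wins, ties broken by node id) is an illustrative assumption; the abstract does not specify the selection policy.

```python
# Minimal sketch: merging two wireless clusters so a single master serves
# the combined cluster. Selection rule is an assumption for illustration.
from dataclasses import dataclass, field

@dataclass
class Cluster:
    master: str
    members: set = field(default_factory=set)

    def sync_all(self, now: float):
        # Master supplies time-sync info to every node in the cluster
        # (modeled here as simply recording the reference time).
        return {node: now for node in self.members | {self.master}}

def merge(a: Cluster, b: Cluster) -> Cluster:
    # Larger cluster's master becomes the single master; ties by node id.
    winner, loser = sorted([a, b], key=lambda c: (-len(c.members), c.master))
    return Cluster(master=winner.master,
                   members=winner.members | loser.members | {loser.master})

c1 = Cluster("n1", {"n2", "n3"})
c2 = Cluster("n7", {"n8"})
combined = merge(c1, c2)
print(combined.master, sorted(combined.members))  # n1 ['n2', 'n3', 'n7', 'n8']
```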
Bricker, Jonathan B; Sridharan, Vasundhara; Zhu, Yifan; Mull, Kristin E; Heffner, Jaimee L; Watson, Noreen L; McClure, Jennifer B; Di, Chongzhi
2018-04-20
Little is known about how individuals engage with electronic health (eHealth) interventions over time and whether this engagement predicts health outcomes. The objectives of this study, by using the example of a specific type of eHealth intervention (ie, websites for smoking cessation), were to determine (1) distinct groups of log-in trajectories over a 12-month period, (2) their association with smoking cessation, and (3) baseline user characteristics that predict trajectory group membership. We conducted a functional clustering analysis of 365 consecutive days of log-in data from both arms of a large (N=2637) randomized trial of 2 website interventions for smoking cessation (WebQuit and Smokefree), with a primary outcome of 30-day point prevalence smoking abstinence at 12 months. We conducted analyses for each website separately. A total of 3 distinct trajectory groups emerged for each website. For WebQuit, participants were clustered into 3 groups: 1-week users (682/1240, 55.00% of the sample), 5-week users (399/1240, 32.18%), and 52-week users (159/1240, 12.82%). Compared with the 1-week users, the 5- and 52-week users had 57% higher odds (odds ratio [OR] 1.57, 95% CI 1.13-2.17; P=.007) and 124% higher odds (OR 2.24, 95% CI 1.45-3.43; P<.001), respectively, of being abstinent at 12 months. Smokefree users were clustered into 3 groups: 1-week users (645/1309, 49.27% of the sample), 4-week users (395/1309, 30.18%), and 5-week users (269/1309, 20.55%). Compared with the 1-week users, 5-week users (but not 4-week users; P=.99) had 48% higher odds (OR 1.48, 95% CI 1.05-2.07; P=.02) of being abstinent at 12 months. In general, the WebQuit intervention had a greater number of weekly log-ins within each of the 3 trajectory groups as compared with those of the Smokefree intervention. Baseline characteristics associated with trajectory group membership varied between websites. Patterns of 1-, 4-, and 5-week usage of websites may be common for how people engage in eHealth interventions. The 5-week usage of either website, and 52-week usage only of WebQuit, predicted a higher odds of quitting smoking. Strategies to increase eHealth intervention engagement for 4 more weeks (ie, from 1 week to 5 weeks) could be highly cost effective. ClinicalTrials.gov NCT01812278; https://www.clinicaltrials.gov/ct2/show/NCT01812278 (Archived by WebCite at http://www.webcitation.org/6yPO2OIKR). ©Jonathan B Bricker, Vasundhara Sridharan, Yifan Zhu, Kristin E Mull, Jaimee L Heffner, Noreen L Watson, Jennifer B McClure, Chongzhi Di. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 20.04.2018.
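The trajectory-grouping idea can be illustrated with a simpler stand-in for the paper's method: k-means over weekly log-in counts, applied to synthetic users with 1-, 5-, and 52-week engagement patterns. The study used a functional clustering analysis, which this sketch approximates only loosely.

```python
# Minimal sketch (synthetic data): clustering 365-day log-in trajectories
# with k-means as a rough analogue of functional clustering.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
n_users, n_days = 300, 365

def simulate(active_weeks):
    # Each of 100 users logs in on ~40% of days during the active period.
    x = np.zeros((n_users // 3, n_days))
    x[:, : active_weeks * 7] = rng.random((n_users // 3, active_weeks * 7)) < 0.4
    return x

logins = np.vstack([simulate(1), simulate(5), simulate(52)])

# Aggregate daily indicators into weekly counts before clustering.
weekly = logins[:, : 52 * 7].reshape(n_users, 52, 7).sum(axis=2)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(weekly)
for k in range(3):
    print(f"cluster {k}: n={np.sum(labels == k)}, "
          f"mean active weeks={np.mean((weekly[labels == k] > 0).sum(axis=1)):.1f}")
```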
CoP Sensing Framework on Web-Based Environment
NASA Astrophysics Data System (ADS)
Mustapha, S. M. F. D. Syed
Web technologies and Web applications have shown similarly high growth rates in terms of daily usage and user acceptance. Web applications have not only penetrated traditional domains such as education and business but have also encroached into areas such as politics, social life, lifestyle, and culture. The emergence of Web technologies has enabled Web access even for the person on the move, through PDAs or mobile phones connected via Wi-Fi, HSDPA, or other communication protocols. These two phenomena are driving the need to build Web-based systems as supporting tools for many everyday activities. Accordingly, one focus of research has been the implementation challenges of building Web-based support systems in different types of environment. This chapter describes the implementation issues in building a community learning framework that can be supported on a Web-based platform. The Community of Practice (CoP) has been chosen as the community learning theory for the case study and analysis, as it challenges the creativity of the architectural design of the Web system in capturing the presence of learning activities. The chapter describes the characteristics of the CoP, to expose the inherent intricacies of modeling it in a Web-based environment; the evidences of CoP that need to be traced automatically and unobtrusively; and the technologies needed to embrace a full adoption of a Web-based support system for the community learning framework.
HUBBLE VIEWS DISTANT GALAXIES THROUGH A COSMIC LENS
NASA Technical Reports Server (NTRS)
2002-01-01
This NASA Hubble Space Telescope image of the rich galaxy cluster Abell 2218 is a spectacular example of gravitational lensing. The arc-like pattern spread across the picture like a spider web is an illusion caused by the gravitational field of the cluster. The cluster is so massive and compact that light rays passing through it are deflected by its enormous gravitational field, much as an optical lens bends light to form an image. The process magnifies, brightens and distorts images of objects that lie far beyond the cluster. This provides a powerful 'zoom lens' for viewing galaxies that are so far away they could not normally be observed with the largest available telescopes. Hubble's high resolution reveals numerous arcs which are difficult to detect with ground-based telescopes because they appear to be so thin. The arcs are the distorted images of a very distant galaxy population extending 5-10 times farther than the lensing cluster. This population existed when the universe was just one quarter of its present age. The arcs provide a direct glimpse of how star forming regions are distributed in remote galaxies, and other clues to the early evolution of galaxies. Hubble also reveals multiple imaging, a rarer lensing event that happens when the distortion is large enough to produce more than one image of the same galaxy. Abell 2218 has an unprecedented total of seven multiple systems. The abundance of lensing features in Abell 2218 has been used to make a detailed map of the distribution of matter in the cluster's center. From this, distances can be calculated for a sample of 120 faint arclets found on the Hubble image. These arclets represent galaxies that are 50 times fainter than objects that can be seen with ground-based telescopes. Studies of remote galaxies viewed through well-studied lenses like Abell 2218 promise to reveal the nature of normal galaxies at much earlier epochs than was previously possible. The technique is a powerful combination of Hubble's superlative capabilities and the 'natural' focusing properties of massive clusters like Abell 2218. The image was taken with the Wide Field Planetary Camera 2. Credits: W. Couch (University of New South Wales), R. Ellis (Cambridge University), and NASA
Building a Snow Data Management System using Open Source Software (and IDL)
NASA Astrophysics Data System (ADS)
Goodale, C. E.; Mattmann, C. A.; Ramirez, P.; Hart, A. F.; Painter, T.; Zimdars, P. A.; Bryant, A.; Brodzik, M.; Skiles, M.; Seidel, F. C.; Rittger, K. E.
2012-12-01
At NASA's Jet Propulsion Laboratory, free and open source software is used every day to support a wide range of projects, from planetary to climate to research and development. In this abstract I will discuss the key role that open source software has played in building a robust science data processing pipeline for snow hydrology research, and how the system is also able to leverage programs written in IDL, making JPL's Snow Data System a hybrid of open source and proprietary software. Main points: - The design of the Snow Data System (illustrating how the collection of sub-systems is combined to create a complete data processing pipeline) - The challenges of moving from a single algorithm on a laptop to running hundreds of parallel algorithms on a cluster of servers (lessons learned): code changes, software-license-related challenges, and storage requirements - System evolution (from data archiving, to data processing, to data on a map, to near-real-time products and maps) - Road map for the next 6 months (including how easily we re-used the snowDS code base to support the Airborne Snow Observatory Mission). Software in use and software licenses: IDL - used for pre- and post-processing of data; licensed under a proprietary license held by Exelis. Apache OODT - used for data management and workflow processing; licensed under the Apache License Version 2. GDAL - geospatial data processing library, currently used for data re-projection; licensed under the X/MIT license. GeoServer - WMS server; licensed under the General Public License Version 2.0. Leaflet.js - JavaScript web mapping library; licensed under the Berkeley Software Distribution License. Python - glue code and miscellaneous data processing support; licensed under the Python Software Foundation License. Perl - script wrapper for running the SCAG algorithm; licensed under the General Public License Version 3. PHP - front-end web application programming; licensed under the PHP License Version 3.01.
ERIC Educational Resources Information Center
Hung, Yen-Chu
2012-01-01
The instructional value of web-based education systems has been an important area of research in information systems education. This study investigates the effect of various teaching methods on program design learning for students with specific learning styles in web-based education systems. The study takes first-year Computer Science and…
Web-Based Triage in a College Health Setting
ERIC Educational Resources Information Center
Sole, Mary Lou; Stuart, Patricia L.; Deichen, Michael
2006-01-01
The authors describe the initiation and use of a Web-based triage system in a college health setting. During the first 4 months of implementation, the system recorded 1,290 encounters. More women accessed the system (70%); the average age was 21.8 years. The Web-based triage system advised the majority of students to seek care within 24 hours;…
ERIC Educational Resources Information Center
Wang, Tzu-Hua; Wang, Wei-Lung; Wang, Kuo-Hua; Huang, Shih-Chieh
The study attempted to adapt two web tools, FFS system (Frontpage Feedback System) and WATA system (Web-based Assessment and Test Analysis System), to construct a Hi-FAME (High Feedback-Assessment-Multimedia-Environment) Model in WBI (Web-based Instruction) to facilitate pre-service teacher training. Participants were 30 junior pre-service…
ERIC Educational Resources Information Center
Pujayanto, Pujayanto; Budiharti, Rini; Adhitama, Egy; Nuraini, Niken Rizky Amalia; Putri, Hanung Vernanda
2018-01-01
This research proposes the development of a web-based assessment system to identify students' misconception. The system, named WAS (web-based assessment system), can identify students' misconception profile on linear kinematics automatically after the student has finished the test. The test instrument was developed and validated. Items were…
Kong, Xiangli; Liu, Xin; Tu, Hong; Xu, Yan; Niu, Jianbing; Wang, Yongbin; Zhao, Changlei; Kou, Jingxuan; Feng, Jun
2017-01-31
Shandong Province has experienced a declining trend of locally acquired malaria transmission, but increasing imported malaria remains a challenge. Therefore, understanding the epidemiological characteristics of malaria and the control and elimination strategies and interventions is needed for better planning to achieve the overall elimination goal in Shandong Province. A retrospective study was conducted, and all individual cases from a web-based reporting system were reviewed and analysed to explore malaria-endemic characteristics in Shandong from 2005 to 2015. Annual malaria incidences reported in 2005-2015 were geo-coded and matched to the county level. Spatial cluster analysis was performed to evaluate identified spatial disease clusters for statistical significance. Space-time clusters with high rates were detected through retrospective space-time scan analysis using the discrete Poisson model. The overall malaria incidence decreased to a low level during 2005-2015. In total, 1564 confirmed malaria cases were reported, 27.1% of which (n = 424) were indigenous cases. Most of the indigenous cases (n = 339, 80.0%) occurred from June to October. The number and scale of imported cases have increased, but no significant difference was observed across months. Shandong is endemic for both Plasmodium vivax (n = 730) and Plasmodium falciparum (n = 674). The disease is mainly distributed in the southern (n = 710) and eastern (n = 424) regions of Shandong, such as Jinning (n = 214 [13.7%]), Weihai (n = 151 [9.7%]), and Yantai (n = 107 [6.8%]). Furthermore, the spatial cluster analysis of malaria cases from 2005 to 2015 indicated that the disease was not randomly distributed. For indigenous cases, 15 and 2 high-risk counties were identified from 2005 to 2009 (control phase) and from 2010 to 2015 (elimination phase), respectively. For imported cases, 26 and 29 high-risk counties were identified for the control and elimination phases, respectively. Spatial scan statistics identified 13 distinct significant spatial clusters between 2005 and 2015. The space-time clustering analysis determined that the most likely clusters included 14 and 19 counties for indigenous and imported cases, respectively. To meet the requirements of the malaria elimination phase, the surveillance system should be strengthened, particularly in regions with frequent migration, alongside effective multisectoral cooperation and coordination mechanisms. Specific response packages should be tailored to different types of cities, and capacity building should be improved, focusing mainly on emergency response and case management. Funding guarantees for scientific research should be maintained during both the elimination and post-elimination phases to consolidate the achievements of malaria elimination.
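The cluster detection step rests on a scan statistic whose core is a Poisson likelihood ratio scoring how far a candidate area's case count exceeds expectation. A minimal sketch of that ratio with invented county counts follows; real scan analyses additionally sweep circular windows over space and time and assess significance by Monte Carlo simulation.

```python
# Minimal sketch: the Poisson likelihood-ratio statistic used by spatial
# scan methods to score a candidate cluster. Counts are invented.
import numpy as np

def poisson_llr(c_in, e_in, c_tot, e_tot):
    """Log-likelihood ratio for a candidate cluster with c_in observed and
    e_in expected cases, out of c_tot/e_tot overall; 0 if not elevated."""
    if c_in <= e_in:
        return 0.0
    c_out, e_out = c_tot - c_in, e_tot - e_in
    llr = c_in * np.log(c_in / e_in)
    if c_out > 0:
        llr += c_out * np.log(c_out / e_out)
    return llr

cases = np.array([30, 4, 6, 2, 8])            # observed cases per county
expected = np.array([12.0, 5.0, 6.5, 2.5, 9.0])  # expected under H0
C, E = cases.sum(), expected.sum()
scores = [poisson_llr(c, e, C, E) for c, e in zip(cases, expected)]
best = int(np.argmax(scores))
print(f"most likely cluster: county {best}, LLR={scores[best]:.2f}")
# In practice, significance comes from Monte Carlo replication under H0.
```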
Development of grid-like applications for public health using Web 2.0 mashup techniques.
Scotch, Matthew; Yip, Kevin Y; Cheung, Kei-Hoi
2008-01-01
Development of public health informatics applications often requires the integration of multiple data sources. This process can be challenging due to issues such as different file formats, schemas, naming systems, and having to scrape the content of web pages. A potential solution to these system development challenges is the use of Web 2.0 technologies. In general, Web 2.0 technologies are new internet services that encourage and value information sharing and collaboration among individuals. In this case report, we describe the development and use of Web 2.0 technologies, including Yahoo! Pipes, within a public health application that integrates animal, human, and temperature data to assess the risk of West Nile Virus (WNV) outbreaks. The results of development and testing suggest that while Web 2.0 applications are reasonable environments for rapid prototyping, they are not mature enough for large-scale public health data applications. The application, in fact a "system of systems," often failed due to varied timeouts for application response across web sites and services, internal caching errors, and software added to web sites by administrators to manage the load on their servers. In spite of these concerns, the results of this study demonstrate the potential value of grid computing and Web 2.0 approaches in public health informatics.
Bayesian analysis of the dynamic cosmic web in the SDSS galaxy survey
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leclercq, Florent; Wandelt, Benjamin; Jasche, Jens, E-mail: florent.leclercq@polytechnique.org, E-mail: jasche@iap.fr, E-mail: wandelt@iap.fr
Recent application of the Bayesian algorithm BORG to the Sloan Digital Sky Survey (SDSS) main sample galaxies resulted in the physical inference of the formation history of the observed large-scale structure from its origin to the present epoch. In this work, we use these inferences as inputs for a detailed probabilistic cosmic web-type analysis. To do so, we generate a large set of data-constrained realizations of the large-scale structure using a fast, fully non-linear gravitational model. We then perform a dynamic classification of the cosmic web into four distinct components (voids, sheets, filaments, and clusters) on the basis of the tidal field. Our inference framework automatically and self-consistently propagates typical observational uncertainties to web-type classification. As a result, this study produces accurate cosmographic classification of large-scale structure elements in the SDSS volume. By also providing the history of these structure maps, the approach allows an analysis of the origin and growth of the early traces of the cosmic web present in the initial density field and of the evolution of global quantities such as the volume and mass filling fractions of different structures. For the problem of web-type classification, the results described in this work constitute the first connection between theory and observations at non-linear scales including a physical model of structure formation and the demonstrated capability of uncertainty quantification. A connection between cosmology and information theory using real data also naturally emerges from our probabilistic approach. Our results constitute quantitative chrono-cosmography of the complex web-like patterns underlying the observed galaxy distribution.
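The four-way classification is the standard T-web rule: count the eigenvalues of the tidal tensor (the Hessian of the gravitational potential) that exceed a threshold, giving void (0), sheet (1), filament (2), or cluster (3). A minimal sketch on a toy periodic density field follows; the paper embeds this rule in a full Bayesian inference, which is not reproduced here.

```python
# Minimal sketch: T-web classification of a toy periodic density field.
import numpy as np

def tweb_classify(delta, threshold=0.0):
    n = delta.shape[0]
    k = np.fft.fftfreq(n) * 2 * np.pi
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    k2[0, 0, 0] = 1.0                      # avoid division by zero
    phi_k = -np.fft.fftn(delta) / k2       # Poisson: phi_k = -delta_k / k^2
    phi_k[0, 0, 0] = 0.0
    kvecs = (kx, ky, kz)
    # Tidal tensor T_ij = d^2 phi / dx_i dx_j via Fourier derivatives
    T = np.empty((3, 3, n, n, n))
    for i in range(3):
        for j in range(3):
            T[i, j] = np.fft.ifftn(-kvecs[i] * kvecs[j] * phi_k).real
    eigvals = np.linalg.eigvalsh(np.moveaxis(T, (0, 1), (-2, -1)))
    return (eigvals > threshold).sum(axis=-1)  # 0..3 per voxel

rng = np.random.default_rng(3)
delta = rng.normal(size=(16, 16, 16))  # toy density contrast field
web = tweb_classify(delta)
for t, name in enumerate(["void", "sheet", "filament", "cluster"]):
    print(f"{name}: {(web == t).mean():.1%} of volume")
```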
Edelstein, Michael; Wallensten, Anders; Zetterqvist, Inga; Hulth, Anette
2014-01-01
Norovirus outbreaks severely disrupt healthcare systems. We evaluated whether Websök, an internet-based surveillance system using search engine data, improved norovirus surveillance and response in Sweden. We compared Websök users' characteristics with the general population, cross-correlated weekly Websök searches with laboratory notifications between 2006 and 2013, compared the time Websök and laboratory data crossed the epidemic threshold and surveyed infection control teams about their perception and use of Websök. Users of Websök were not representative of the general population. Websök correlated with laboratory data (b = 0.88-0.89) and gave an earlier signal to the onset of the norovirus season compared with laboratory-based surveillance. 17/21 (81%) infection control teams answered the survey, of which 11 (65%) believed Websök could help with infection control plans. Websök is a low-resource, easily replicable system that detects the norovirus season as reliably as laboratory data, but earlier. Using Websök in routine surveillance can help infection control teams prepare for the yearly norovirus season. PMID:24955857
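The comparison behind the reported correlation is a lagged cross-correlation of two weekly time series. A minimal sketch with synthetic data follows; it is illustrative only, not the actual 2006-2013 notification data.

```python
# Minimal sketch (synthetic data): cross-correlating weekly search-engine
# query counts with laboratory notifications at different lags, the kind of
# comparison used to show that searches signal the season earlier.
import numpy as np

rng = np.random.default_rng(4)
weeks = np.arange(104)
season = np.exp(-0.5 * ((weeks % 52 - 5) / 3.0) ** 2)   # winter peaks
searches = season + 0.1 * rng.normal(size=weeks.size)
labs = np.roll(season, 2) + 0.1 * rng.normal(size=weeks.size)  # lab data lag

def corr_at_lag(x, y, lag):
    """Pearson correlation of x shifted forward by `lag` weeks against y."""
    if lag > 0:
        x, y = x[:-lag], y[lag:]
    return np.corrcoef(x, y)[0, 1]

for lag in range(0, 5):
    print(f"lag {lag} weeks: r = {corr_at_lag(searches, labs, lag):.2f}")
# The lag with the highest correlation estimates the lead time of searches.
```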
NASA Astrophysics Data System (ADS)
Manuaba, I. B. P.; Rudiastini, E.
2018-01-01
Lecturer assessment is a tool used to measure lecturer performance. Lecturer performance can be measured from three aspects: teaching activities, research, and community service. The broad scope of lecturer performance measurement requires a special framework so that the system can be developed sustainably. The aim of this research is to create an API web service data tool so that the lecturer assessment system can be developed in various frameworks. The system was developed as a web service in the PHP programming language with JSON output. The conclusion of this research is that the API web service application can be consumed by several platforms, such as web and mobile applications.
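As an illustration of the pattern described, a minimal JSON endpoint follows. The original system was built in PHP; this sketch uses only Python's standard library for self-containment, and the route and field names (teaching, research, community_service) are hypothetical.

```python
# Minimal sketch of a JSON web-service endpoint for lecturer-assessment
# data. Route and field names are invented for illustration.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

ASSESSMENTS = {
    "L001": {"teaching": 3.8, "research": 3.2, "community_service": 3.5},
}

class ApiHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # e.g. GET /api/lecturers/L001
        parts = self.path.strip("/").split("/")
        if len(parts) == 3 and parts[:2] == ["api", "lecturers"]:
            record = ASSESSMENTS.get(parts[2])
            status, body = (200, record) if record else (404, {"error": "not found"})
        else:
            status, body = (400, {"error": "bad request"})
        payload = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

if __name__ == "__main__":
    HTTPServer(("", 8000), ApiHandler).serve_forever()
```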
Sensor Webs with a Service-Oriented Architecture for On-demand Science Products
NASA Technical Reports Server (NTRS)
Mandl, Daniel; Ungar, Stephen; Ames, Troy; Justice, Chris; Frye, Stuart; Chien, Steve; Tran, Daniel; Cappelaere, Patrice; Derezinsfi, Linda; Paules, Granville;
2007-01-01
This paper describes the work being managed by the NASA Goddard Space Flight Center (GSFC) Information System Division (ISD) under a NASA Earth Science Technology Office (ESTO) Advanced Information System Technology (AIST) grant to develop a modular sensor web architecture which enables discovery of sensors and workflows that can create customized science products via a high-level service-oriented architecture based on Open Geospatial Consortium (OGC) Sensor Web Enablement (SWE) web service standards. These capabilities serve as a prototype of a user-centric architecture for the Global Earth Observing System of Systems (GEOSS). This work builds on and extends previous sensor web efforts conducted at NASA/GSFC using the Earth Observing 1 (EO-1) satellite and other low-earth-orbiting satellites.
Spatial clustering of pixels of a multispectral image
Conger, James Lynn
2014-08-19
A method and system for clustering the pixels of a multispectral image is provided. A clustering system computes a maximum spectral similarity score for each pixel that indicates the similarity between that pixel and its most similar neighboring pixel. To determine the maximum similarity score for a pixel, the clustering system generates a similarity score between that pixel and each of its neighboring pixels and then selects the score representing the highest similarity as the maximum similarity score. The clustering system may apply a filtering criterion based on the maximum similarity score so that pixels with similarity scores below a minimum threshold are not clustered. The clustering system changes the current pixel values of the pixels in a cluster based on an averaging of the original pixel values of the pixels in the cluster.
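The per-pixel score can be sketched directly. The version below uses cosine similarity over 4-connected neighbors as the similarity measure, which is an assumption; the patent does not fix a particular metric.

```python
# Minimal sketch: maximum spectral similarity per pixel, computed as the
# best cosine similarity between each pixel's spectrum and its 4-connected
# neighbors, with a filtering threshold. Metric choice is an assumption.
import numpy as np

def max_neighbor_similarity(img):
    """img: (rows, cols, bands) multispectral image."""
    norm = img / (np.linalg.norm(img, axis=2, keepdims=True) + 1e-12)
    best = np.full(img.shape[:2], -1.0)
    for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        # np.roll wraps at the edges; acceptable for a brief illustration.
        shifted = np.roll(norm, (dr, dc), axis=(0, 1))
        sim = (norm * shifted).sum(axis=2)  # cosine similarity per pixel
        best = np.maximum(best, sim)
    return best

rng = np.random.default_rng(5)
img = rng.random((64, 64, 6))
scores = max_neighbor_similarity(img)
clusterable = scores >= 0.9  # filtering criterion: skip low-similarity pixels
print(f"{clusterable.mean():.0%} of pixels pass the similarity filter")
```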
DotMapper: an open source tool for creating interactive disease point maps.
Smith, Catherine M; Hayward, Andrew C
2016-04-12
Molecular strain typing of tuberculosis isolates has led to increased understanding of the epidemiological characteristics of the disease and improvements in its control, diagnosis and treatment. However, molecular cluster investigations, which aim to detect previously unidentified cases, remain challenging. Interactive dot mapping is a simple approach which could aid investigations by highlighting cases likely to share epidemiological links. Current tools generally require technical expertise or lack interactivity. We designed a flexible application for producing disease dot maps using Shiny, a web application framework for the statistical software R. The application displays locations of cases on an interactive map, colour coded according to levels of categorical variables such as demographics and risk factors. Cases can be filtered by selecting combinations of these characteristics and by notification date. It can be used to rapidly identify geographic patterns amongst cases in molecular clusters of tuberculosis in space and time; generate hypotheses about disease transmission; identify outliers; and guide targeted control measures. DotMapper is a user-friendly application which enables rapid production of maps displaying locations of cases and their epidemiological characteristics without the need for specialist training in geographic information systems. Enhanced understanding of tuberculosis transmission using this application could facilitate improved detection of cases with epidemiological links and therefore lessen the public health impacts of the disease. It is a flexible system and also has broad potential for international application to other investigations using geo-coded health information.
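DotMapper itself is an R/Shiny application; as a language-neutral illustration of the core idea (a dot map colour-coded by a categorical variable), here is a minimal sketch using the folium mapping library in Python. The coordinates and risk-group labels are invented.

```python
# Minimal sketch: an interactive dot map colour-coded by a categorical
# variable, analogous to (but not part of) DotMapper. Data are invented.
import folium

cases = [
    {"lat": 51.52, "lon": -0.10, "risk_group": "contact"},
    {"lat": 51.51, "lon": -0.12, "risk_group": "imported"},
    {"lat": 51.53, "lon": -0.09, "risk_group": "contact"},
]
colours = {"contact": "red", "imported": "blue"}

m = folium.Map(location=[51.52, -0.10], zoom_start=13)
for case in cases:
    folium.CircleMarker(
        location=[case["lat"], case["lon"]],
        radius=6,
        color=colours[case["risk_group"]],
        fill=True,
        popup=case["risk_group"],
    ).add_to(m)
m.save("dotmap.html")  # open in a browser for the interactive map
```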