The CSB Incident Screening Database: description, summary statistics and uses.
Gomez, Manuel R; Casper, Susan; Smith, E Allen
2008-11-15
This paper briefly describes the Chemical Incident Screening Database currently used by the CSB to identify and evaluate chemical incidents for possible investigations, and summarizes descriptive statistics from this database that can potentially help to estimate the number, character, and consequences of chemical incidents in the US. The report compares some of the information in the CSB database to roughly similar information available from databases operated by EPA and the Agency for Toxic Substances and Disease Registry (ATSDR), and explores the possible implications of these comparisons with regard to the scale of the chemical incident problem. Finally, the report explores in a preliminary way whether a system modeled after the existing CSB screening database could be developed to serve as a national surveillance tool for chemical incidents.
Chen, R S; Nadkarni, P; Marenco, L; Levin, F; Erdos, J; Miller, P L
2000-01-01
The entity-attribute-value representation with classes and relationships (EAV/CR) provides a flexible and simple database schema to store heterogeneous biomedical data. In certain circumstances, however, the EAV/CR model is known to retrieve data less efficiently than conventional database schemas. Our objective was to perform a pilot study that systematically quantifies performance differences for database queries directed at real-world microbiology data modeled with EAV/CR and conventional representations, and to explore the relative merits of different EAV/CR query implementation strategies. Clinical microbiology data obtained over a ten-year period were stored using both database models. Query execution times were compared for four clinically oriented attribute-centered and entity-centered queries operating under varying conditions of database size and system memory. The performance characteristics of three different EAV/CR query strategies were also examined. Performance was similar for entity-centered queries in the two database models. Performance in the EAV/CR model was approximately three to five times less efficient than its conventional counterpart for attribute-centered queries. The differences in query efficiency became slightly greater as database size increased, although they were reduced with the addition of system memory. The authors found that EAV/CR queries formulated using multiple, simple SQL statements executed in batch were more efficient than single, large SQL statements. This paper describes a pilot project to explore issues in and compare query performance for EAV/CR and conventional database representations. Although attribute-centered queries were less efficient in the EAV/CR model, these inefficiencies may be addressable, at least in part, by the use of more powerful hardware or more memory, or both.
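The attribute-centered overhead the study measures follows from the shape of the two schemas, and can be sketched with a toy microbiology table (the table names, columns, and data here are illustrative assumptions, not the study's actual schema):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Conventional schema: one column per attribute.
cur.execute("CREATE TABLE culture (id INTEGER, organism TEXT, specimen TEXT)")
# EAV schema: one row per (entity, attribute, value) triple.
cur.execute("CREATE TABLE eav (entity_id INTEGER, attribute TEXT, value TEXT)")

rows = [(1, "E. coli", "urine"), (2, "S. aureus", "blood"), (3, "E. coli", "blood")]
cur.executemany("INSERT INTO culture VALUES (?, ?, ?)", rows)
for rid, org, spec in rows:
    cur.executemany("INSERT INTO eav VALUES (?, ?, ?)",
                    [(rid, "organism", org), (rid, "specimen", spec)])

# Attribute-centered query, conventional form: a single WHERE clause.
conv = cur.execute(
    "SELECT id FROM culture WHERE organism = 'E. coli' AND specimen = 'blood'"
).fetchall()

# The same query against the EAV table needs one self-join per attribute
# tested, which is the main structural source of the measured overhead.
eav = cur.execute("""
    SELECT a.entity_id
    FROM eav a JOIN eav b ON a.entity_id = b.entity_id
    WHERE a.attribute = 'organism' AND a.value = 'E. coli'
      AND b.attribute = 'specimen' AND b.value = 'blood'
""").fetchall()

print(conv, eav)
```

Each additional attribute in the predicate adds another self-join to the EAV form, which is consistent with the authors' finding that batches of simple statements were easier to execute efficiently than one large statement.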
Subject searching of monographs online in the medical literature.
Brahmi, F A
1988-01-01
Searching by subject for monographic information online in the medical literature is a challenging task. The NLM database of choice is CATLINE. Other NLM databases of interest are BIOETHICSLINE, CANCERLIT, HEALTH, POPLINE, and TOXLINE. Ten BRS databases are also discussed. Of these, Books in Print, Bookinfo, and OCLC are explored further. The databases are compared as to number of total records and number and percentage of monographs. Three topics were searched on CROSS to compare hits on BBIP, BOOK, and OCLC. The same searches were run on CATLINE. The parameters of time coverage and language were equalized and the resulting citations were compared and analyzed for duplication and uniqueness. With the input of CATLINE tapes into OCLC, OCLC has become the database of choice for searching by subject for medical monographs.
Witkowska, Anna M; Zujko, Małgorzata E; Waśkiewicz, Anna; Terlikowska, Katarzyna M; Piotrowski, Walerian
2015-11-11
The primary aim of the study was to estimate the consumption of polyphenols in a population of 6661 subjects aged between 20 and 74 years representing a cross-section of Polish society, and the second objective was to compare the intakes of flavonoids calculated on the basis of two commonly used databases. Daily food consumption data were collected in 2003-2005 using a single 24-hour dietary recall. Intake of total polyphenols was estimated using the online Phenol-Explorer database, and flavonoid intake was determined using the following data sources: the United States Department of Agriculture (USDA) database, which combines flavonoid and isoflavone databases, and the Phenol-Explorer database. Total polyphenol intake, as calculated with the Phenol-Explorer database, was 989 mg/day, with major contributions from phenolic acids (556 mg/day) and flavonoids (403.5 mg/day). The flavonoid intake calculated on the basis of the USDA databases was 525 mg/day. This study found that tea is the primary source of polyphenols and flavonoids for the studied population, including mainly flavanols, while coffee is the most important contributor of phenolic acids, mostly hydroxycinnamic acids. Our study also demonstrated that flavonoid intakes estimated according to various databases may substantially differ. Further work should be undertaken to expand polyphenol databases to better reflect their food contents.
Petherick, Emily S; Pickett, Kate E; Cullum, Nicky A
2015-08-01
Primary care databases from the UK have been widely used to produce evidence on the epidemiology and health service usage of a wide range of conditions. To date there have been few evaluations of the comparability of estimates between different sources of these data. Our aim was to estimate the comparability of two widely used primary care databases, the Health Improvement Network Database (THIN) and the General Practice Research Database (GPRD), using venous leg ulceration as an exemplar condition. The design was a comparison of two prospective cohorts drawn from the GPRD and THIN databases using data from 1998 to 2006. A data set was extracted from both databases containing all cases of persons aged 20 years or greater with a database diagnosis of venous leg ulceration recorded in the databases for the period 1998-2006. Annual rates of incidence and prevalence of venous leg ulceration were calculated within each database and standardized to the European standard population and compared using standardized rate ratios. Comparable estimates of venous leg ulcer incidence from the GPRD and THIN databases could be obtained using data from 2000 to 2006 and of prevalence using data from 2001 to 2006. Recent data collected by these two databases are more likely to produce comparable results on the burden of venous leg ulceration. These results require confirmation in other disease areas to enable researchers to have confidence in the comparability of findings from these two widely used primary care research resources. © The Author 2015. Published by Oxford University Press. All rights reserved.
Using SQL Databases for Sequence Similarity Searching and Analysis.
Pearson, William R; Mackey, Aaron J
2017-09-13
Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 John Wiley & Sons, Inc.
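The kind of roll-up the unit describes, loading similarity search results into a table and summarizing homology across taxonomic groups, can be sketched with a toy hits table (the schema, accessions, and E-value cutoff are illustrative assumptions; seqdb_demo and search_demo define their own layouts):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("""CREATE TABLE hits (
    query TEXT, subject TEXT, taxon TEXT, evalue REAL)""")
cur.executemany("INSERT INTO hits VALUES (?, ?, ?, ?)", [
    ("thrA", "sp|P1", "Bacteria", 1e-80),
    ("thrA", "sp|P2", "Fungi",    1e-12),
    ("lacZ", "sp|P3", "Bacteria", 1e-150),
    ("lacZ", "sp|P4", "Metazoa",  2e-5),
])

# Summarize, per taxonomic group, how many query proteins have at least
# one significant homolog (E-value < 1e-6): the genome-scale summary
# that is awkward to produce from flat search output.
rows_out = cur.execute("""
    SELECT taxon, COUNT(DISTINCT query) FROM hits
    WHERE evalue < 1e-6 GROUP BY taxon ORDER BY taxon""").fetchall()
for taxon, n in rows_out:
    print(taxon, n)
```

The same GROUP BY pattern scales to millions of hits once real search output is bulk-loaded, which is the workflow the unit's protocols walk through.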
InterRett, a Model for International Data Collection in a Rare Genetic Disorder
ERIC Educational Resources Information Center
Louise, Sandra; Fyfe, Sue; Bebbington, Ami; Bahi-Buisson, Nadia; Anderson, Alison; Pineda, Merce; Percy, Alan; Zeev, Bruria Ben; Wu, Xi Ru; Bao, Xinhua; MacLeod, Patrick; Armstrong, Judith; Leonard, Helen
2009-01-01
Rett syndrome (RTT) is a rare genetic disorder within the autistic spectrum. This study compared socio-demographic, clinical and genetic characteristics of the international database, InterRett, and the population-based Australian Rett syndrome database (ARSD). It also explored the strengths and limitations of InterRett in comparison with other…
ERIC Educational Resources Information Center
Caison, Amy L.
2007-01-01
This study empirically explores the comparability of traditional survey-based retention research methodology with an alternative approach that relies on data commonly available in institutional student databases. Drawing on Tinto's [Tinto, V. (1993). "Leaving College: Rethinking the Causes and Cures of Student Attrition" (2nd Ed.), The University…
A web-based data visualization tool for the MIMIC-II database.
Lee, Joon; Ribey, Evan; Wallace, James R
2016-02-04
Although MIMIC-II, a public intensive care database, has been recognized as an invaluable resource for many medical researchers worldwide, becoming a proficient MIMIC-II researcher requires knowledge of SQL programming and an understanding of the MIMIC-II database schema. These are challenging requirements, especially for health researchers and clinicians who may have limited computer proficiency. In order to overcome this challenge, our objective was to create an interactive, web-based MIMIC-II data visualization tool that first-time MIMIC-II users can easily use to explore the database. The tool offers two main features: Explore and Compare. The Explore feature enables the user to select a patient cohort within MIMIC-II and visualize the distributions of various administrative, demographic, and clinical variables within the selected cohort. The Compare feature enables the user to select two patient cohorts and visually compare them with respect to a variety of variables. The tool is also helpful to experienced MIMIC-II researchers who can use it to substantially accelerate the cumbersome and time-consuming steps of writing SQL queries and manually visualizing extracted data. Any interested researcher can use the MIMIC-II data visualization tool for free to quickly and conveniently conduct a preliminary investigation on MIMIC-II with a few mouse clicks. Researchers can also use the tool to learn the characteristics of the MIMIC-II patients. Since the tool does not currently support multivariable regression, future work includes adding analytics capabilities. Also, the next version of the tool will aim to utilize MIMIC-III, which contains more data.
Comparison of flavonoid intake assessment methods.
Ivey, Kerry L; Croft, Kevin; Prince, Richard L; Hodgson, Jonathan M
2016-09-14
Flavonoids are a diverse group of polyphenolic compounds found in high concentrations in many plant foods and beverages. High flavonoid intake has been associated with reduced risk of chronic disease. To date, population-based studies have used the United States Department of Agriculture (USDA) food content database to determine habitual flavonoid intake. More recently, a new flavonoid food content database, Phenol-Explorer (PE), has been developed. However, the level of agreement between the two databases is yet to be explored. To compare the methods used to create each database, and to explore the level of agreement between the flavonoid intake estimates derived from USDA and PE data. The study population included 1063 randomly selected women aged over 75 years. Two separate intake estimates were determined using food composition data from the USDA and the PE databases. There were many similarities in methods used to create each database; however, there are several methodological differences that manifest themselves in differences in flavonoid intake estimates between the two databases. Despite differences in net estimates, there was a strong level of agreement between total-flavonoid, flavanol, flavanone and anthocyanidin intake estimates derived from each database. Intake estimates for flavanol monomers showed greater agreement than flavanol polymers. The level of agreement between the two databases was the weakest for the flavonol and flavone intake estimates. In this population, the application of USDA and PE source data yielded highly correlated intake estimates for total-flavonoids, flavanols, flavanones and anthocyanidins. For these sub-classes, the USDA and PE databases may be used interchangeably in epidemiological investigations. There was poorer correlation between intake estimates for flavonols and flavones due to differences in USDA and PE methodologies.
Individual flavonoid compound groups that comprise flavonoid sub-classes had varying levels of agreement. As such, when determining the appropriate database to calculate flavonoid intake variables, it is important to consider methodologies underpinning database creation and which foods are important contributors to dietary intake in the population of interest.
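The agreement statistic underlying this comparison is a correlation between paired per-subject intake estimates from the two databases. A minimal sketch, with illustrative stand-in values rather than the study's actual data, computes Pearson's r by hand:

```python
import math

# Hypothetical paired intake estimates (mg/day) for five subjects:
# one value per subject from each food-composition database.
usda = [120.0, 250.0, 80.0, 310.0, 150.0]
pe   = [110.0, 270.0, 95.0, 290.0, 160.0]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(usda, pe)
print(round(r, 3))
```

A value of r near 1 corresponds to the sub-classes (total flavonoids, flavanols, flavanones, anthocyanidins) the authors found interchangeable; lower values correspond to the flavonol and flavone estimates where the methodologies diverge.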
Industry ties in otolaryngology: initial insights from the physician payment sunshine act.
Rathi, Vinay K; Samuel, Andre M; Mehra, Saral
2015-06-01
To characterize nonresearch payments made by industry to otolaryngologists in order to explore how the potential for conflicts of interest varies among otolaryngologists and how it compares with other surgical specialists. The study was a retrospective cross-sectional analysis of the Open Payments program database recently released by the Centers for Medicare and Medicaid Services, covering surgeons nationwide who were identified as receiving nonresearch payment from industry in accordance with the Physician Payment Sunshine Act. The proportion of otolaryngologists receiving payment, the mean payment per otolaryngologist, and the standard deviation thereof were determined using the Open Payments database and compared to other surgical specialties. Otolaryngologists were further compared by specialization, census region, sponsor, and payment amount. Less than half of otolaryngologists (48.1%) were reported as receiving payments over the study period, the second smallest proportion among surgical specialties. Otolaryngologists received the lowest mean payment per compensated individual ($573) compared to other surgical specialties. Although otolaryngology had the smallest variance in payment among surgical specialties (SD, $2806), the distribution was skewed by top earners; the top 10% of earners accounted for 87% ($2,199,254) of all payment to otolaryngologists. Otolaryngologists in the West census region were less likely to receive payments (38.6%, P < .001). Over the study period, otolaryngologists appeared to have more limited financial ties with industry compared to other surgeons, though variation exists within otolaryngology. Further refinement of the Open Payments database is needed to explore differences between otolaryngologists and leverage payment information as a tool for self-regulation. © American Academy of Otolaryngology—Head and Neck Surgery Foundation 2015.
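The summary statistics reported here (proportion paid, mean payment per compensated individual, and the top-decile share that captures the skew) can be reproduced on a toy payment list. The figures below are invented for illustration, not drawn from the Open Payments data:

```python
from statistics import mean, stdev

# Hypothetical per-physician payment totals in dollars; zeros are
# physicians with no reported industry payment.
payments = [0, 0, 120, 85, 300, 0, 45, 9000, 60, 0]

paid = [p for p in payments if p > 0]
share_paid = len(paid) / len(payments)   # proportion receiving any payment
mean_paid = mean(paid)                   # mean payment per compensated individual
sd_paid = stdev(paid)                    # spread among those paid

# Share of all dollars going to the top 10% of earners: the skew
# statistic the study reports (87% for otolaryngology).
top = sorted(payments, reverse=True)[: max(1, len(payments) // 10)]
top_share = sum(top) / sum(payments)

print(share_paid, round(mean_paid, 2), round(sd_paid, 2), round(top_share, 3))
```

Even in this tiny example a single large earner dominates the total, which is why the paper reports the top-decile share alongside the mean and standard deviation.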
The Comparative Toxicogenomics Database (CTD): A Resource for Comparative Toxicological Studies
CJ, Mattingly; MC, Rosenstein; GT, Colby; JN, Forrest; JL, Boyer
2006-01-01
The etiology of most chronic diseases involves interactions between environmental factors and genes that modulate important biological processes (Olden and Wilson, 2000). We are developing the publicly available Comparative Toxicogenomics Database (CTD) to promote understanding about the effects of environmental chemicals on human health. CTD identifies interactions between chemicals and genes and facilitates cross-species comparative studies of these genes. The use of diverse animal models and cross-species comparative sequence studies has been critical for understanding basic physiological mechanisms and gene and protein functions. Similarly, these approaches will be valuable for exploring the molecular mechanisms of action of environmental chemicals and the genetic basis of differential susceptibility. PMID:16902965
Scott, J; Botsis, T; Ball, R
2014-01-01
Spontaneous Reporting Systems (SRS) are critical tools in the post-licensure evaluation of medical product safety. Regulatory authorities use a variety of data mining techniques to detect potential safety signals in SRS databases. Assessing the performance of such signal detection procedures requires simulated SRS databases, but simulation strategies proposed to date each have limitations. We sought to develop a novel SRS simulation strategy based on plausible mechanisms for the growth of databases over time. We developed a simulation strategy based on the network principle of preferential attachment. We demonstrated how this strategy can be used to create simulations based on specific databases of interest, and provided an example of using such simulations to compare signal detection thresholds for a popular data mining algorithm. The preferential attachment simulations were generally structurally similar to our targeted SRS database, although they had fewer nodes of very high degree. The approach was able to generate signal-free SRS simulations, as well as mimicking specific known true signals. Explorations of different reporting thresholds for the FDA Vaccine Adverse Event Reporting System suggested that using proportional reporting ratio (PRR) > 3.0 may yield better signal detection operating characteristics than the more commonly used PRR > 2.0 threshold. The network analytic approach to SRS simulation based on the principle of preferential attachment provides an attractive framework for exploring the performance of safety signal detection algorithms. This approach is potentially more principled and versatile than existing simulation approaches. The utility of network-based SRS simulations needs to be further explored by evaluating other types of simulated signals with a broader range of data mining approaches, and comparing network-based simulations with other simulation strategies where applicable.
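The thresholds being compared apply to the proportional reporting ratio, a standard disproportionality statistic computed from a 2x2 table of report counts. A minimal sketch with invented counts (not VAERS data) shows how the two cutoffs can disagree:

```python
# Proportional reporting ratio for a product/event 2x2 table:
#                 event of interest   all other events
# product               a                   b
# all other products    c                   d
def prr(a, b, c, d):
    """PRR = proportion of the product's reports mentioning the event,
    divided by the same proportion among all other products' reports."""
    return (a / (a + b)) / (c / (c + d))

# Hypothetical counts for one product/event pair in an SRS database.
a, b, c, d = 18, 982, 60, 8940
signal = prr(a, b, c, d)
print(round(signal, 2))

# The common PRR > 2.0 rule flags this pair; the stricter PRR > 3.0
# threshold explored in the paper does not.
print(signal > 2.0, signal > 3.0)
```

Raising the threshold trades sensitivity for specificity, which is exactly the operating-characteristic comparison the simulated databases are built to support.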
Tissue Molecular Anatomy Project (TMAP): an expression database for comparative cancer proteomics.
Medjahed, Djamel; Luke, Brian T; Tontesh, Tawady S; Smythers, Gary W; Munroe, David J; Lemkin, Peter F
2003-08-01
By mining publicly accessible databases, we have developed a collection of tissue-specific predictive protein expression maps as a function of cancer histological state. Data analysis is applied to the differential expression of gene products in pooled libraries from the normal to the altered state(s). We wish to report the initial results of our survey across different tissues and explore the extent to which this comparative approach may help uncover panels of potential biomarkers of tumorigenesis which would warrant further examination in the laboratory.
PIGD: a database for intronless genes in the Poaceae.
Yan, Hanwei; Jiang, Cuiping; Li, Xiaoyu; Sheng, Lei; Dong, Qing; Peng, Xiaojian; Li, Qian; Zhao, Yang; Jiang, Haiyang; Cheng, Beijiu
2014-10-01
Intronless genes are a feature of prokaryotes; however, they are widespread and unequally distributed among eukaryotes and represent an important resource to study the evolution of gene architecture. Although many databases on exons and introns exist, there is currently no cohesive database that collects intronless genes in plants into a single database. In this study, we present the Poaceae Intronless Genes Database (PIGD), a user-friendly web interface to explore information on intronless genes from different plants. Five Poaceae species, Sorghum bicolor, Zea mays, Setaria italica, Panicum virgatum and Brachypodium distachyon, are included in the current release of PIGD. Gene annotations and sequence data were collected and integrated from different databases. The primary focus of this study was to provide gene descriptions and gene product records. In addition, functional annotations, subcellular localization prediction and taxonomic distribution are reported. PIGD allows users to readily browse, search and download data. BLAST and comparative analyses are also provided through this online database, which is available at http://pigd.ahau.edu.cn/. PIGD provides a solid platform for the collection, integration and analysis of intronless genes in the Poaceae. As such, this database will be useful for subsequent bio-computational analysis in comparative genomics and evolutionary studies.
Bridging the Qualitative/Quantitative Software Divide
Annechino, Rachelle; Antin, Tamar M. J.; Lee, Juliet P.
2011-01-01
To compare and combine qualitative and quantitative data collected from respondents in a mixed methods study, the research team developed a relational database to merge survey responses stored and analyzed in SPSS and semistructured interview responses stored and analyzed in the qualitative software package ATLAS.ti. The process of developing the database, as well as practical considerations for researchers who may wish to use similar methods, are explored. PMID:22003318
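The core of the merge the team describes is a relational join of survey responses and interview codes on a shared respondent ID. A minimal sketch, with illustrative table and column names rather than the study's actual SPSS and ATLAS.ti export formats:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
# Quantitative side: one row per respondent (as exported from SPSS).
cur.execute("CREATE TABLE survey (resp_id INTEGER PRIMARY KEY, drinks_per_week INTEGER)")
# Qualitative side: one row per respondent/code pair (as exported from ATLAS.ti).
cur.execute("CREATE TABLE interview_codes (resp_id INTEGER, code TEXT)")
cur.executemany("INSERT INTO survey VALUES (?, ?)", [(1, 2), (2, 14), (3, 0)])
cur.executemany("INSERT INTO interview_codes VALUES (?, ?)",
                [(1, "social drinking"), (2, "coping"), (2, "social drinking")])

# Join on respondent ID so each qualitative theme carries the
# quantitative measure, letting the two data types be compared directly.
merged = cur.execute("""
    SELECT s.resp_id, s.drinks_per_week, c.code
    FROM survey s JOIN interview_codes c ON s.resp_id = c.resp_id
    ORDER BY s.resp_id, c.code""").fetchall()
print(merged)
```

Respondents with no coded interview material (here, respondent 3) drop out of the inner join; a LEFT JOIN would retain them, a practical consideration of the kind the authors discuss.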
ERIC Educational Resources Information Center
Kurtz, Michael J.; Eichorn, Guenther; Accomazzi, Alberto; Grant, Carolyn S.; Demleitner, Markus; Murray, Stephen S.; Jones, Michael L. W.; Gay, Geri K.; Rieger, Robert H.; Millman, David; Bruggemann-Klein, Anne; Klein, Rolf; Landgraf, Britta; Wang, James Ze; Li, Jia; Chan, Desmond; Wiederhold, Gio; Pitti, Daniel V.
1999-01-01
Includes six articles that discuss a digital library for astronomy; comparing evaluations of digital collection efforts; cross-organizational access management of Web-based resources; searching scientific bibliographic databases based on content-based relations between documents; semantics-sensitive retrieval for digital picture libraries; and…
Pérez-Jiménez, J; Neveu, V; Vos, F; Scalbert, A
2010-11-01
The diversity of the chemical structures of dietary polyphenols makes it difficult to estimate their total content in foods, and also to understand the role of polyphenols in health and the prevention of diseases. Global redox colorimetric assays have commonly been used to estimate the total polyphenol content in foods. However, these assays lack specificity. Contents of individual polyphenols have been determined by chromatography. These data, scattered in several hundred publications, have been compiled in the Phenol-Explorer database. The aim of this paper is to identify the 100 richest dietary sources of polyphenols using this database. Advanced queries in the Phenol-Explorer database (www.phenol-explorer.eu) allowed retrieval of information on the content of 502 polyphenol glycosides, esters and aglycones in 452 foods. Total polyphenol content was calculated as the sum of the contents of all individual polyphenols. These content values were compared with the content of antioxidants estimated using the Folin assay method in the same foods. These values were also extracted from the same database. Amounts per serving were calculated using common serving sizes. A list of the 100 richest dietary sources of polyphenols was produced, with contents varying from 15,000 mg per 100 g in cloves to 10 mg per 100 ml in rosé wine. The richest sources were various spices and dried herbs, cocoa products, some darkly coloured berries, some seeds (flaxseed) and nuts (chestnut, hazelnut) and some vegetables, including olive and globe artichoke heads. A list of the 89 foods and beverages providing more than 1 mg of total polyphenols per serving was established. A comparison of total polyphenol contents with antioxidant contents, as determined by the Folin assay, also showed that Folin values systematically exceed the total polyphenol content values. 
The comprehensive Phenol-Explorer data were used for the first time to identify the richest dietary sources of polyphenols and the foods contributing most significantly to polyphenol intake as inferred from their content per serving.
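The two quantities the ranking rests on, total polyphenol content as the sum of individual compound contents and the amount per common serving, are simple to compute. A sketch with invented per-100 g values (not Phenol-Explorer's actual figures):

```python
# Hypothetical contents (mg per 100 g) of individual polyphenols in a
# brewed coffee; total content is the sum over individual compounds,
# as in the paper's method.
contents_per_100g = {
    "5-caffeoylquinic acid": 43.0,
    "4-caffeoylquinic acid": 18.0,
    "ferulic acid": 1.0,
}
serving_g = 190.0  # assumed serving size for one cup, in grams

total_per_100g = sum(contents_per_100g.values())   # mg per 100 g
per_serving = total_per_100g * serving_g / 100.0   # mg per serving

print(total_per_100g, round(per_serving, 1))
```

Ranking by content per 100 g and by content per serving can reorder foods substantially (spices are concentrated but eaten in small amounts), which is why the paper reports both lists.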
Exploring the CAESAR database using dimensionality reduction techniques
NASA Astrophysics Data System (ADS)
Mendoza-Schrock, Olga; Raymer, Michael L.
2012-06-01
The Civilian American and European Surface Anthropometry Resource (CAESAR) database containing over 40 anthropometric measurements on over 4000 humans has been extensively explored for pattern recognition and classification purposes using the raw, original data [1-4]. However, some of the anthropometric variables would be impossible to collect in an uncontrolled environment. Here, we explore the use of dimensionality reduction methods in concert with a variety of classification algorithms for gender classification using only those variables that are readily observable in an uncontrolled environment. Several dimensionality reduction techniques are employed to learn the underlying structure of the data. These techniques include linear projections such as the classical Principal Components Analysis (PCA) and non-linear (manifold learning) techniques, such as Diffusion Maps and the Isomap technique. This paper briefly describes all three techniques, and compares three different classifiers, Naïve Bayes, Adaboost, and Support Vector Machines (SVM), for gender classification in conjunction with each of these three dimensionality reduction approaches.
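Of the three techniques, PCA is the one with a simple closed form, and for two variables the first principal axis can be computed directly from the covariance matrix. A self-contained sketch on toy stand-ins for two readily observable CAESAR-style measurements (the values are invented, not CAESAR data):

```python
import math

# Toy data: (stature, shoulder breadth) in cm for six subjects.
X = [(150.0, 38.0), (155.0, 40.0), (160.0, 41.0),
     (170.0, 46.0), (175.0, 47.0), (180.0, 50.0)]

n = len(X)
mx = sum(x for x, _ in X) / n
my = sum(y for _, y in X) / n

# Entries of the 2x2 covariance matrix of the centered data.
sxx = sum((x - mx) ** 2 for x, _ in X) / n
syy = sum((y - my) ** 2 for _, y in X) / n
sxy = sum((x - mx) * (y - my) for x, y in X) / n

# Orientation of the first principal axis of a 2x2 covariance matrix:
# theta satisfies tan(2*theta) = 2*sxy / (sxx - syy).
theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
u = (math.cos(theta), math.sin(theta))

# Project each centered subject onto the first principal component.
scores = [(x - mx) * u[0] + (y - my) * u[1] for x, y in X]
print([round(s, 2) for s in scores])
```

The one-dimensional scores preserve the ordering of the correlated measurements, and in a full pipeline they (or their Isomap/Diffusion Map analogues) would be fed to the downstream classifier.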
Using the Proteomics Identifications Database (PRIDE).
Martens, Lennart; Jones, Phil; Côté, Richard
2008-03-01
The Proteomics Identifications Database (PRIDE) is a public data repository designed to store, disseminate, and analyze mass spectrometry based proteomics datasets. The PRIDE database can accommodate any level of detailed metadata about the submitted results, which can be queried, explored, viewed, or downloaded via the PRIDE Web interface. The PRIDE database also provides a simple, yet powerful, access control mechanism that fully supports confidential peer-reviewing of data related to a manuscript, ensuring that these results remain invisible to the general public while allowing referees and journal editors anonymized access to the data. This unit describes in detail the functionality that PRIDE provides with regards to searching, viewing, and comparing the available data, as well as different options for submitting data to PRIDE.
PGSB/MIPS PlantsDB Database Framework for the Integration and Analysis of Plant Genome Data.
Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai; Gundlach, Heidrun; Mayer, Klaus F X
2017-01-01
Plant Genome and Systems Biology (PGSB), formerly Munich Institute for Protein Sequences (MIPS) PlantsDB, is a database framework for the integration and analysis of plant genome data, developed and maintained for more than a decade. Major components of that framework are genome databases and analysis resources focusing on individual (reference) genomes providing flexible and intuitive access to data. Another main focus is the integration of genomes from both model and crop plants to form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny). Data exchange and integrated search functionality with, and across, many plant genome databases are provided within the transPLANT project.
Selecting Full-Text Undergraduate Periodicals Databases.
ERIC Educational Resources Information Center
Still, Julie M.; Kassabian, Vibiana
1999-01-01
Examines how libraries and librarians can compare full-text general periodical indices, using ProQuest Direct, Periodical Abstracts (via Ovid), and EBSCOhost as examples. Explores breadth and depth of coverage; manipulation of results (email/download/print); ease of use (searching); and indexing quirks. (AEF)
Exploring the feasibility of traditional image querying tasks for industrial radiographs
NASA Astrophysics Data System (ADS)
Bray, Iliana E.; Tsai, Stephany J.; Jimenez, Edward S.
2015-08-01
Although there have been great strides in object recognition with optical images (photographs), there has been comparatively little research into object recognition for X-ray radiographs. Our exploratory work contributes to this area by creating an object recognition system designed to recognize components from a related database of radiographs. Object recognition for radiographs must be approached differently than for optical images, because radiographs have much less color-based information to distinguish objects, and they exhibit transmission overlap that alters perceived object shapes. The dataset used in this work contained more than 55,000 intermixed radiographs and photographs, all in a compressed JPEG form and with multiple ways of describing pixel information. For this work, a robust and efficient system is needed to combat problems presented by properties of the X-ray imaging modality, the large size of the given database, and the quality of the images contained in said database. We have explored various pre-processing techniques to clean the cluttered and low-quality images in the database, and we have developed our object recognition system by combining multiple object detection and feature extraction methods. We present the preliminary results of the still-evolving hybrid object recognition system.
E. Freeman; G. Moisen; J. Coulston; B. Wilson
2014-01-01
Random forests (RF) and stochastic gradient boosting (SGB), both involving an ensemble of classification and regression trees, are compared for modeling tree canopy cover for the 2011 National Land Cover Database (NLCD). The objectives of this study were twofold. First, sensitivity of RF and SGB to choices in tuning parameters was explored. Second, performance of the...
Prototype Development of a Tradespace Analysis Tool for Spaceflight Medical Resources.
Antonsen, Erik L; Mulcahy, Robert A; Rubin, David; Blue, Rebecca S; Canga, Michael A; Shah, Ronak
2018-02-01
The provision of medical care in exploration-class spaceflight is limited by mass, volume, and power constraints, as well as limitations of available skillsets of crewmembers. A quantitative means of exploring the risks and benefits of inclusion or exclusion of onboard medical capabilities may help to inform the development of an appropriate medical system. A pilot project was designed to demonstrate the utility of an early tradespace analysis tool for identifying high-priority resources geared toward properly equipping an exploration mission medical system. Physician subject matter experts identified resources, tools, and skillsets required, as well as associated criticality scores of the same, to meet terrestrial, U.S.-specific ideal medical solutions for conditions concerning for exploration-class spaceflight. A database of diagnostic and treatment actions and resources was created based on this input and weighed against the probabilities of mission-specific medical events to help identify common and critical elements needed in a future exploration medical capability. Analysis of repository data demonstrates the utility of a quantitative method of comparing various medical resources and skillsets for future missions. Directed database queries can provide detailed comparative estimates concerning likelihood of resource utilization within a given mission and the weighted utility of tangible and intangible resources. This prototype tool demonstrates one quantitative approach to the complex needs and limitations of an exploration medical system. While this early version identified areas for refinement in future version development, more robust analysis tools may help to inform the development of a comprehensive medical system for future exploration missions. Antonsen EL, Mulcahy RA, Rubin D, Blue RS, Canga MA, Shah R. Prototype development of a tradespace analysis tool for spaceflight medical resources. Aerosp Med Hum Perform. 2018; 89(2):108-114.
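The weighting the abstract describes, event probability combined with expert criticality scores, can be sketched as a query that ranks candidate resources. All condition names, probabilities, and scores below are illustrative assumptions, not values from the actual database:

```python
# Hypothetical inputs: per-mission event probabilities, expert
# criticality scores, and which conditions each resource addresses.
events = {"dental abscess": 0.04, "corneal abrasion": 0.10, "kidney stone": 0.01}
criticality = {"dental abscess": 3, "corneal abrasion": 1, "kidney stone": 5}
treats = {
    "dental kit": ["dental abscess"],
    "ophthalmic kit": ["corneal abrasion"],
    "ultrasound": ["kidney stone", "dental abscess"],
}

def weighted_utility(resource):
    """Sum of (event probability x criticality) over conditions addressed."""
    return sum(events[c] * criticality[c] for c in treats[resource])

# Rank resources by weighted utility, the kind of directed comparative
# query the prototype tool supports.
ranked = sorted(treats, key=weighted_utility, reverse=True)
for r in ranked:
    print(r, round(weighted_utility(r), 3))
```

Note how a multi-purpose resource (here, the ultrasound) outranks single-purpose kits despite addressing a rarer condition, which is the tradespace effect the tool is meant to surface under mass and volume constraints.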
Data exploration systems for databases
NASA Technical Reports Server (NTRS)
Greene, Richard J.; Hield, Christopher
1992-01-01
Data exploration systems apply machine learning techniques, multivariate statistical methods, information theory, and database theory to databases to identify significant relationships among the data and summarize information. The result of applying data exploration systems should be a better understanding of the structure of the data and a perspective of the data enabling an analyst to form hypotheses for interpreting the data. This paper argues that data exploration systems need a minimum amount of domain knowledge to guide both the statistical strategy and the interpretation of the resulting patterns discovered by these systems.
Exploring DNA Structure with Cn3D
ERIC Educational Resources Information Center
Porter, Sandra G.; Day, Joseph; McCarty, Richard E.; Shearn, Allen; Shingles, Richard; Fletcher, Linnea; Murphy, Stephanie; Pearlman, Rebecca
2007-01-01
Researchers in the field of bioinformatics have developed a number of analytical programs and databases that are increasingly important for advancing biological research. Because bioinformatics programs are used to analyze, visualize, and/or compare biological data, it is likely that the use of these programs will have a positive impact on biology…
ExplorEnz: a MySQL database of the IUBMB enzyme nomenclature.
McDonald, Andrew G; Boyce, Sinéad; Moss, Gerard P; Dixon, Henry B F; Tipton, Keith F
2007-07-27
We describe the database ExplorEnz, which is the primary repository for EC numbers and enzyme data that are being curated on behalf of the IUBMB. The enzyme nomenclature is incorporated into many other resources, including the ExPASy-ENZYME, BRENDA and KEGG bioinformatics databases. The data, which are stored in a MySQL database, preserve the formatting of chemical and enzyme names. A simple, easy to use, web-based query interface is provided, along with an advanced search engine for more complex queries. The database is publicly available at http://www.enzyme-database.org. The data are available for download as SQL and XML files via FTP. ExplorEnz has powerful and flexible search capabilities and provides the scientific community with the most up-to-date version of the IUBMB Enzyme List.
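A lookup against an enzyme-nomenclature store of this kind can be illustrated with a minimal table; the schema below is invented for the sketch and does not reproduce the real ExplorEnz design.

```python
import sqlite3

# Illustrative in-memory table in the spirit of an EC-number repository.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE entry (
        ec_num   TEXT PRIMARY KEY,  -- EC number, e.g. '1.1.1.1'
        accepted TEXT               -- accepted enzyme name
    )
""")
conn.executemany(
    "INSERT INTO entry VALUES (?, ?)",
    [("1.1.1.1", "alcohol dehydrogenase"),
     ("2.7.1.1", "hexokinase")],
)

def by_ec(ec):
    """Simple query interface: look up an accepted enzyme name by EC number."""
    row = conn.execute(
        "SELECT accepted FROM entry WHERE ec_num = ?", (ec,)
    ).fetchone()
    return row[0] if row else None
```

The parameterized query mirrors the kind of simple, attribute-keyed lookup a web query interface would issue against the MySQL backend.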
Aerothermal Testing for Project Orion Crew Exploration Vehicle
NASA Technical Reports Server (NTRS)
Berry, Scott A.; Horvath, Thomas J.; Lillard, Randolph P.; Kirk, Benjamin S.; Fischer-Cassady, Amy
2009-01-01
The Project Orion Crew Exploration Vehicle aerothermodynamic experimentation strategy, as it relates to flight database development, is reviewed. Experimental data has been obtained to both validate the computational predictions utilized as part of the database and support the development of engineering models for issues not adequately addressed with computations. An outline is provided of the working groups formed to address the key deficiencies in data and knowledge for blunt reentry vehicles. The facilities utilized to address these deficiencies are reviewed, along with some of the important results obtained thus far. For smooth wall comparisons of computational convective heating predictions against experimental data from several facilities, confidence was gained with the use of algebraic turbulence model solutions as part of the database. For cavities and protuberances, experimental data is being used for screening various designs, plus providing support to the development of engineering models. With the reaction-control system testing, experimental data were acquired on the surface in combination with off-body flow visualization of the jet plumes and interactions. These results are being compared against predictions for improved understanding of aftbody thermal environments and uncertainties.
Hirano, Yoko; Asami, Yuko; Kuribayashi, Kazuhiko; Kitazaki, Shigeru; Yamamoto, Yuji; Fujimoto, Yoko
2018-05-01
Many pharmacoepidemiologic studies using large-scale databases have recently been utilized to evaluate the safety and effectiveness of drugs in Western countries. In Japan, however, conventional methodology has been applied to postmarketing surveillance (PMS) to collect safety and effectiveness information on new drugs to meet regulatory requirements. Conventional PMS entails enormous costs and resources despite being an uncontrolled observational study method. This study is aimed at examining the possibility of database research as a more efficient pharmacovigilance approach by comparing a health care claims database and PMS with regard to the characteristics and safety profiles of sertraline-prescribed patients. The characteristics of sertraline-prescribed patients recorded in a large-scale Japanese health insurance claims database developed by MinaCare Co. Ltd. were scanned and compared with the PMS results. We also explored the possibility of detecting signals indicative of adverse reactions based on the claims database by using sequence symmetry analysis. Diabetes mellitus, hyperlipidemia, and hyperthyroidism served as exploratory events, and their detection criteria for the claims database were reported by the Pharmaceuticals and Medical Devices Agency in Japan. Most of the characteristics of sertraline-prescribed patients in the claims database did not differ markedly from those in the PMS. There was no tendency for higher risks of the exploratory events after exposure to sertraline, and this was consistent with sertraline's known safety profile. Our results support the concept of using database research as a cost-effective pharmacovigilance tool that is free of selection bias. Further investigation using database research is required to confirm our preliminary observations.
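The sequence symmetry analysis mentioned above has a simple core: among patients exposed to both the index drug and the outcome event, compare how many experienced the event after the drug versus before it. A crude sequence ratio near 1 suggests no signal. The sketch below uses invented dates and omits the usual adjustment for background prescribing trends.

```python
# Hedged sketch of a crude sequence symmetry analysis.
# Dates are illustrative (year, month) tuples, which compare chronologically.
patients = [
    {"drug": (2016, 3), "event": (2016, 8)},   # event after drug
    {"drug": (2016, 5), "event": (2016, 1)},   # event before drug
    {"drug": (2017, 2), "event": (2017, 9)},   # event after drug
]

after = sum(1 for p in patients if p["event"] > p["drug"])
before = sum(1 for p in patients if p["event"] < p["drug"])

# A ratio well above 1 would hint at a possible adverse-event signal.
crude_sequence_ratio = after / before
```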
Factors Affecting Volunteering among Older Rural and City Dwelling Adults in Australia
ERIC Educational Resources Information Center
Warburton, Jeni; Stirling, Christine
2007-01-01
In the absence of large scale Australian studies of volunteering among older adults, this study compared the relevance of two theoretical approaches--social capital theory and sociostructural resources theory--to predict voluntary activity in relation to a large national database. The paper explores volunteering by older people (aged 55+) in order…
Alignment of high-throughput sequencing data inside in-memory databases.
Firnkorn, Daniel; Knaup-Gregori, Petra; Lorenzo Bermejo, Justo; Ganzinger, Matthias
2014-01-01
In times of high-throughput DNA sequencing techniques, performance-capable analysis of DNA sequences is of high importance. Computer-supported DNA analysis is still an intensive, time-consuming task. In this paper we explore the potential of a new In-Memory database technology by using SAP's High Performance Analytic Appliance (HANA). We focus on read alignment as one of the first steps in DNA sequence analysis. In particular, we examined the widely used Burrows-Wheeler Aligner (BWA) and implemented stored procedures in both HANA and the free database system MySQL to compare execution time and memory management. To ensure that the results are comparable, MySQL has been running in memory as well, utilizing its integrated memory engine for database table creation. We implemented stored procedures containing exact and inexact searching of DNA reads within the reference genome GRCh37. Due to technical restrictions in SAP HANA concerning recursion, the inexact matching problem could not be implemented on this platform. Hence, performance analysis between HANA and MySQL was made by comparing the execution time of the exact search procedures. Here, HANA was approximately 27 times faster than MySQL, which means that there is high potential within the new In-Memory concepts, leading to further developments of DNA analysis procedures in the future.
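The exact-matching subproblem benchmarked above has a clear input/output contract: report every position at which a read occurs verbatim in the reference. Real aligners such as BWA do this with an FM-index over the Burrows-Wheeler transform; the naive scan below is only a stand-in to make the contract concrete, with a toy reference string.

```python
# Naive exact read matching against a reference (illustrative only;
# production aligners use a BWT/FM-index rather than repeated scanning).
reference = "ACGTACGTGACCA"
reads = ["ACGT", "GACC", "TTTT"]

def exact_hits(ref, read):
    """Return all 0-based start positions where read occurs exactly in ref."""
    hits, pos = [], ref.find(read)
    while pos != -1:
        hits.append(pos)
        pos = ref.find(read, pos + 1)
    return hits

alignments = {read: exact_hits(reference, read) for read in reads}
```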
Corwin, John; Silberschatz, Avi; Miller, Perry L; Marenco, Luis
2007-01-01
Data sparsity and schema evolution issues affecting clinical informatics and bioinformatics communities have led to the adoption of vertical or object-attribute-value-based database schemas to overcome limitations posed when using conventional relational database technology. This paper explores these issues and discusses why biomedical data are difficult to model using conventional relational techniques. The authors propose a solution to these obstacles based on a relational database engine using a sparse, column-store architecture. The authors provide benchmarks comparing the performance of queries and schema-modification operations using three different strategies: (1) the standard conventional relational design; (2) past approaches used by biomedical informatics researchers; and (3) their sparse, column-store architecture. The performance results show that their architecture is a promising technique for storing and processing many types of data that are not handled well by the other two semantic data models.
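The vertical layout that this line of work contrasts with conventional wide tables stores each fact as an entity-attribute-value triple, so sparse data costs nothing and new attributes need no schema change; the price is pivoting triples back into records at query time. The data below are invented for illustration.

```python
# Sketch of an entity-attribute-value (vertical) store and an on-demand pivot.
eav_rows = [
    ("patient1", "organism", "E. coli"),
    ("patient1", "colony_count", 10000),
    ("patient2", "organism", "S. aureus"),
    # patient2 has no colony_count: sparsity costs nothing in this layout
]

def pivot(entity):
    """Gather one entity's attribute-value pairs into a wide-style record."""
    return {attr: value for ent, attr, value in eav_rows if ent == entity}
```

Pivoting like this for attribute-centered queries is exactly where vertical schemas pay their performance penalty, which is what a sparse column-store engine aims to avoid.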
Food Composition Database Format and Structure: A User Focused Approach
Clancy, Annabel K.; Woods, Kaitlyn; McMahon, Anne; Probst, Yasmine
2015-01-01
This study aimed to investigate the needs of Australian food composition database users regarding database format and relate this to the format of databases available globally. Three semi-structured synchronous online focus groups (M = 3, F = 11) and n = 6 female key informant interviews were recorded. Beliefs surrounding the use, training, understanding, benefits and limitations of food composition data and databases were explored. Verbatim transcriptions underwent preliminary coding followed by thematic analysis with NVivo qualitative analysis software to extract the final themes. Schematic analysis was applied to the final themes related to database format. Desktop analysis also examined the format of six key globally available databases. 24 dominant themes were established, of which five related to format: database use, food classification, framework, accessibility and availability, and data derivation. Desktop analysis revealed that food classification systems varied considerably between databases. Microsoft Excel was a common file format used in all databases, and available software varied between countries. Users also recognised that a food composition database's format should ideally be designed specifically for the intended use, have a user-friendly food classification system, incorporate accurate data with clear explanation of data derivation and feature user input. However, such databases are limited by data availability and resources. Further exploration of data sharing options should be considered. Furthermore, users' understanding of the limitations of food composition data and databases is inherent to the correct application of non-specific databases. Therefore, further exploration of user FCDB training should also be considered. PMID:26554836
BμG@Sbase—a microbial gene expression and comparative genomic database
Witney, Adam A.; Waldron, Denise E.; Brooks, Lucy A.; Tyler, Richard H.; Withers, Michael; Stoker, Neil G.; Wren, Brendan W.; Butcher, Philip D.; Hinds, Jason
2012-01-01
The reducing cost of high-throughput functional genomic technologies is creating a deluge of high-volume, complex data, placing the burden on bioinformatics resources and tool development. The Bacterial Microarray Group at St George's (BμG@S) has been at the forefront of bacterial microarray design and analysis for over a decade and, while serving as a hub of a global network of microbial research groups, has developed BμG@Sbase, a microbial gene expression and comparative genomic database. BμG@Sbase (http://bugs.sgul.ac.uk/bugsbase/) is a web-browsable, expertly curated, MIAME-compliant database that stores comprehensive experimental annotation and multiple raw and analysed data formats. Consistent annotation is enabled through a structured set of web forms, which guide the user through the process following a set of best practices and controlled vocabulary. The database currently contains 86 expertly curated publicly available data sets (with a further 124 not yet published) and full annotation information for 59 bacterial microarray designs. The data can be browsed and queried using an explorer-like interface, integrating intuitive tree diagrams to present complex experimental details clearly and concisely. Furthermore, the modular design of the database will provide a robust platform for integrating other data types beyond microarrays into a more systems-analysis-based future. PMID:21948792
ExplorEnz: a MySQL database of the IUBMB enzyme nomenclature
McDonald, Andrew G; Boyce, Sinéad; Moss, Gerard P; Dixon, Henry BF; Tipton, Keith F
2007-01-01
Background We describe the database ExplorEnz, which is the primary repository for EC numbers and enzyme data that are being curated on behalf of the IUBMB. The enzyme nomenclature is incorporated into many other resources, including the ExPASy-ENZYME, BRENDA and KEGG bioinformatics databases. Description The data, which are stored in a MySQL database, preserve the formatting of chemical and enzyme names. A simple, easy to use, web-based query interface is provided, along with an advanced search engine for more complex queries. The database is publicly available at http://www.enzyme-database.org. The data are available for download as SQL and XML files via FTP. Conclusion ExplorEnz has powerful and flexible search capabilities and provides the scientific community with the most up-to-date version of the IUBMB Enzyme List. PMID:17662133
ALDB: a domestic-animal long noncoding RNA database.
Li, Aimin; Zhang, Junying; Zhou, Zhongyin; Wang, Lei; Liu, Yujuan; Liu, Yajun
2015-01-01
Long noncoding RNAs (lncRNAs) have attracted significant attention in recent years due to their important roles in many biological processes. Domestic animals constitute a unique resource for understanding the genetic basis of phenotypic variation and are ideal models relevant to diverse areas of biomedical research. With improving sequencing technologies, numerous domestic-animal lncRNAs are now available. Thus, there is an immediate need for a database resource that can assist researchers to store, organize, analyze and visualize domestic-animal lncRNAs. The domestic-animal lncRNA database, named ALDB, is the first comprehensive database with a focus on the domestic-animal lncRNAs. It currently archives 12,103 pig intergenic lncRNAs (lincRNAs), 8,923 chicken lincRNAs and 8,250 cow lincRNAs. In addition to the annotations of lincRNAs, it offers related data that is not available yet in existing lncRNA databases (lncRNAdb and NONCODE), such as genome-wide expression profiles and animal quantitative trait loci (QTLs) of domestic animals. Moreover, a collection of interfaces and applications, such as the Basic Local Alignment Search Tool (BLAST), the Generic Genome Browser (GBrowse) and flexible search functionalities, are available to help users effectively explore, analyze and download data related to domestic-animal lncRNAs. ALDB enables the exploration and comparative analysis of lncRNAs in domestic animals. A user-friendly web interface, integrated information and tools make it valuable to researchers in their studies. ALDB is freely available from http://res.xaut.edu.cn/aldb/index.jsp.
MBGD update 2013: the microbial genome database for exploring the diversity of microbial world.
Uchiyama, Ikuo; Mihara, Motohiro; Nishide, Hiroyo; Chiba, Hirokazu
2013-01-01
The microbial genome database for comparative analysis (MBGD, available at http://mbgd.genome.ad.jp/) is a platform for microbial genome comparison based on orthology analysis. As its unique feature, MBGD allows users to conduct orthology analysis among any specified set of organisms; this flexibility allows MBGD to adapt to a variety of microbial genomic studies. Reflecting the huge diversity of the microbial world, the number of microbial genome projects has now grown to several thousand. To support efficient exploration of this diversity, MBGD now provides summary pages for pre-calculated ortholog tables among various taxonomic groups. For some closely related taxa, MBGD also provides conserved synteny information (core genome alignments) pre-calculated using the CoreAligner program. In addition, an efficient incremental updating procedure can create an extended ortholog table by adding genomes to the default ortholog table generated from the representative set of genomes. Combined with the dynamic orthology calculation for any specified set of organisms, MBGD is an efficient and flexible tool for exploring microbial genome diversity.
A study of the Immune Epitope Database for some fungi species using network topological indices.
Vázquez-Prieto, Severo; Paniagua, Esperanza; Solana, Hugo; Ubeira, Florencio M; González-Díaz, Humberto
2017-08-01
In recent years, the encoding of system structure information with different network topological indices has been a very active field of research. In the present study, we assembled for the first time a complex network using data obtained from the Immune Epitope Database for fungi species, and we then considered the general topology, the node degree distribution, and the local structure of this network. We also calculated eight node centrality measures for the observed network and compared it with three theoretical models. In view of the results obtained, we may expect that the present approach can become a valuable tool to explore the complexity of this database, as well as for the storage, manipulation, comparison, and retrieval of information contained therein.
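Two of the simpler measures named above, node degree and the degree distribution, can be computed directly from an edge list. The tiny undirected graph below is invented; the real epitope network is far larger.

```python
from collections import Counter

# Toy undirected edge list (illustrative, not epitope data).
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]

# Node degree: number of edges incident on each node.
degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

# Degree distribution: how many nodes have each degree value.
distribution = Counter(degree.values())
```

Comparing such a distribution against theoretical models (random, small-world, scale-free) is the standard way to characterize an observed network's topology.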
Rothwell, Joseph A.; Perez-Jimenez, Jara; Neveu, Vanessa; Medina-Remón, Alexander; M'Hiri, Nouha; García-Lobato, Paula; Manach, Claudine; Knox, Craig; Eisner, Roman; Wishart, David S.; Scalbert, Augustin
2013-01-01
Polyphenols are a major class of bioactive phytochemicals whose consumption may play a role in the prevention of a number of chronic diseases such as cardiovascular diseases, type II diabetes and cancers. Phenol-Explorer, launched in 2009, is the only freely available web-based database on the content of polyphenols in food and their in vivo metabolism and pharmacokinetics. Here we report the third release of the database (Phenol-Explorer 3.0), which adds data on the effects of food processing on polyphenol contents in foods. Data on >100 foods, covering 161 polyphenols or groups of polyphenols before and after processing, were collected from 129 peer-reviewed publications and entered into new tables linked to the existing relational design. The effect of processing on polyphenol content is expressed in the form of retention factor coefficients, or the proportion of a given polyphenol retained after processing, adjusted for change in water content. The result is the first database on the effects of food processing on polyphenol content and, following the model initially defined for Phenol-Explorer, all data may be traced back to original sources. The new update will allow polyphenol scientists to more accurately estimate polyphenol exposure from dietary surveys. Database URL: http://www.phenol-explorer.eu PMID:24103452
The Advanced Composition Explorer Shock Database and Application to Particle Acceleration Theory
NASA Technical Reports Server (NTRS)
Parker, L. Neergaard; Zank, G. P.
2015-01-01
The theory of particle acceleration via diffusive shock acceleration (DSA) has been studied in depth by Gosling et al. (1981), van Nes et al. (1984), Mason (2000), Desai et al. (2003), Zank et al. (2006), among many others. Recently, Parker and Zank (2012, 2014) and Parker et al. (2014) using the Advanced Composition Explorer (ACE) shock database at 1 AU explored two questions: does the upstream distribution alone have enough particles to account for the accelerated downstream distribution and can the slope of the downstream accelerated spectrum be explained using DSA? As was shown in this research, diffusive shock acceleration can account for a large population of the shocks. However, Parker and Zank (2012, 2014) and Parker et al. (2014) used a subset of the larger ACE database. Recently, work has successfully been completed that allows for the entire ACE database to be considered in a larger statistical analysis. We explain DSA as it applies to single and multiple shocks and the shock criteria used in this statistical analysis. We calculate the expected injection energy via diffusive shock acceleration given upstream parameters defined from the ACE Solar Wind Electron, Proton, and Alpha Monitor (SWEPAM) data to construct the theoretical upstream distribution. We show the comparison of shock strength derived from diffusive shock acceleration theory to observations in the 50 keV to 5 MeV range from an instrument on ACE. Parameters such as shock velocity, shock obliquity, particle number, and time between shocks are considered. This study is further divided into single and multiple shock categories, with an additional emphasis on forward-forward multiple shock pairs. Finally with regard to forward-forward shock pairs, results comparing injection energies of the first shock, second shock, and second shock with previous energetic population will be given.
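The test-particle DSA prediction that such studies compare against observation is compact enough to state directly: for a shock with compression ratio r, the downstream accelerated distribution follows f(p) ~ p^(-q) with q = 3r/(r - 1). The compression ratios below are illustrative, not ACE measurements.

```python
# Standard test-particle diffusive-shock-acceleration spectral index.
def dsa_spectral_index(r):
    """Power-law index q = 3r/(r-1) for shock compression ratio r (1 < r <= 4)."""
    return 3.0 * r / (r - 1.0)

strong_shock_index = dsa_spectral_index(4.0)   # strongest gas-dynamic shock: q = 4
weaker_shock_index = dsa_spectral_index(2.5)   # weaker shock: steeper spectrum
```

Comparing the index inferred from the observed 50 keV to 5 MeV spectra against q(r) computed from measured upstream/downstream plasma parameters is the essence of the shock-strength comparison described above.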
NASA Astrophysics Data System (ADS)
Weatherill, G. A.; Pagani, M.; Garcia, J.
2016-09-01
The creation of a magnitude-homogenized catalogue is often one of the most fundamental steps in seismic hazard analysis. The process of homogenizing multiple catalogues of earthquakes into a single unified catalogue typically requires careful appraisal of available bulletins, identification of common events within multiple bulletins and the development and application of empirical models to convert from each catalogue's native scale into the required target. The database of the International Seismological Center (ISC) provides the most exhaustive compilation of records from local bulletins, in addition to its reviewed global bulletin. New open-source tools are developed that can utilize this, or any other compiled database, to explore the relations between earthquake solutions provided by different recording networks, and to build and apply empirical models in order to harmonize magnitude scales for the purpose of creating magnitude-homogeneous earthquake catalogues. These tools are described and their application illustrated in two different contexts. The first is a simple application in the Sub-Saharan Africa region where the spatial coverage and magnitude scales for different local recording networks are compared, and their relation to global magnitude scales explored. In the second application the tools are used on a global scale for the purpose of creating an extended magnitude-homogeneous global earthquake catalogue. Several existing high-quality earthquake databases, such as the ISC-GEM and the ISC Reviewed Bulletins, are harmonized into moment magnitude to form a catalogue of more than 562 840 events. This extended catalogue, while not an appropriate substitute for a locally calibrated analysis, can help in studying global patterns in seismicity and hazard, and is therefore released with the accompanying software.
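The harmonization step at the heart of such a workflow can be sketched simply: fit an empirical model Mw = a·m + b from events reported in both a native scale (e.g. mb) and moment magnitude, then apply it to convert the rest. The pairs below are invented, and ordinary least squares is used for brevity; real tools typically prefer orthogonal regression and propagate the conversion uncertainty.

```python
# Hedged sketch of empirical magnitude conversion via least squares.
pairs = [(4.0, 4.2), (5.0, 5.1), (6.0, 6.0)]  # (native magnitude, Mw), illustrative

n = len(pairs)
sx = sum(m for m, _ in pairs)
sy = sum(w for _, w in pairs)
sxx = sum(m * m for m, _ in pairs)
sxy = sum(m * w for m, w in pairs)

# Closed-form ordinary-least-squares slope and intercept.
a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
b = (sy - a * sx) / n

def to_mw(native_magnitude):
    """Convert a native-scale magnitude into the homogenized Mw scale."""
    return a * native_magnitude + b
```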
Bhasi, Ashwini; Philip, Philge; Manikandan, Vinu; Senapathy, Periannan
2009-01-01
We have developed ExDom, a unique database for the comparative analysis of the exon–intron structures of 96 680 protein domains from seven eukaryotic organisms (Homo sapiens, Mus musculus, Bos taurus, Rattus norvegicus, Danio rerio, Gallus gallus and Arabidopsis thaliana). ExDom provides integrated access to exon-domain data through a sophisticated web interface which has the following analytical capabilities: (i) intergenomic and intragenomic comparative analysis of exon–intron structure of domains; (ii) color-coded graphical display of the domain architecture of proteins correlated with their corresponding exon-intron structures; (iii) graphical analysis of multiple sequence alignments of amino acid and coding nucleotide sequences of homologous protein domains from seven organisms; (iv) comparative graphical display of exon distributions within the tertiary structures of protein domains; and (v) visualization of exon–intron structures of alternative transcripts of a gene correlated to variations in the domain architecture of corresponding protein isoforms. These novel analytical features are highly suited for detailed investigations on the exon–intron structure of domains and make ExDom a powerful tool for exploring several key questions concerning the function, origin and evolution of genes and proteins. ExDom database is freely accessible at: http://66.170.16.154/ExDom/. PMID:18984624
Effects of food processing on polyphenol contents: a systematic analysis using Phenol-Explorer data.
Rothwell, Joseph A; Medina-Remón, Alexander; Pérez-Jiménez, Jara; Neveu, Vanessa; Knaze, Viktoria; Slimani, Nadia; Scalbert, Augustin
2015-01-01
The Phenol-Explorer web database (http://www.phenol-explorer.eu) was recently updated with new data on polyphenol retention due to food processing. Here, we analyze these data to investigate the effect of different variables on polyphenol content and make recommendations aimed at refining estimation of intake in epidemiological studies. Data on the effects of processing upon 161 polyphenols compiled for the Phenol-Explorer database were analyzed to investigate the effects of polyphenol structure, food, and process upon polyphenol loss. These were expressed as retention factors (RFs), fold changes in polyphenol content due to processing. Domestic cooking of common plant foods caused considerable losses (median RF = 0.45-0.70), although variability was high. Food storage caused fewer losses, regardless of food or polyphenol (median RF = 0.88, 0.95, 0.92 for ambient, refrigerated, and frozen storage, respectively). The food under study was often a more important determinant of retention than the process applied. Phenol-Explorer data enable polyphenol losses due to processing from many different foods to be rapidly compared. Where experimentally determined polyphenol contents of a processed food are not available, only published RFs matching at least the food and polyphenol of interest should be used when building food composition tables for epidemiological studies.
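The water-content adjustment behind these retention factors matters because simple water loss (concentration) or uptake (dilution) would otherwise masquerade as a change in polyphenol content. One common form of the correction compares contents on a dry-matter basis; the function and numbers below are an illustrative sketch, not Phenol-Explorer's exact formula.

```python
# Hedged sketch of a dry-matter-corrected retention factor (RF).
def retention_factor(raw_mg_per_100g, proc_mg_per_100g,
                     raw_water_frac, proc_water_frac):
    """RF on a dry-matter basis; RF < 1 means genuine loss during processing."""
    raw_dry = raw_mg_per_100g / (1.0 - raw_water_frac)
    proc_dry = proc_mg_per_100g / (1.0 - proc_water_frac)
    return proc_dry / raw_dry

# Boiling example (invented values): the food took up water, so the naive
# per-100 g ratio (30/50 = 0.6) overstates the loss; the dry-matter RF is 0.8.
rf = retention_factor(50.0, 30.0, 0.80, 0.85)
```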
Exploring consumer exposure pathways and patterns of use for chemicals in the environment through the Chemical/Product Categories Database (CPCat). Presented by Kathie Dionisio, Sc.D., NERL, US EPA, Research Triangle Park, NC (1/23/2014).
Uhlirova, Hana; Tian, Peifang; Kılıç, Kıvılcım; Thunemann, Martin; Sridhar, Vishnu B; Chmelik, Radim; Bartsch, Hauke; Dale, Anders M; Devor, Anna; Saisan, Payam A
2018-05-04
The importance of sharing experimental data in neuroscience grows with the amount and complexity of data acquired and various techniques used to obtain and process these data. However, the majority of experimental data, especially from individual studies of regular-sized laboratories, never reaches the wider research community. A graphical user interface (GUI) engine called Neurovascular Network Explorer 2.0 (NNE 2.0) has been created as a tool for simple and low-cost sharing and exploring of vascular imaging data. NNE 2.0 interacts with a database containing optogenetically-evoked dilation/constriction time-courses of individual vessels measured in mice somatosensory cortex in vivo by 2-photon microscopy. NNE 2.0 enables selection and display of the time-courses based on different criteria (subject, branching order, cortical depth, vessel diameter, arteriolar tree) as well as simple mathematical manipulation (e.g. averaging, peak-normalization) and data export. It supports visualization of the vascular network in 3D and enables localization of the individual functional vessel diameter measurements within vascular trees. NNE 2.0, its source code, and the corresponding database are freely downloadable from the UCSD Neurovascular Imaging Laboratory website. The source code can be utilized by the users to explore the associated database or as a template for databasing and sharing their own experimental results provided the appropriate format.
Cooper, Chris; Lovell, Rebecca; Husk, Kerryn; Booth, Andrew; Garside, Ruth
2018-06-01
We undertook a systematic review to evaluate the health benefits of environmental enhancement and conservation activities. We were concerned that a conventional process of study identification, focusing on exhaustive searches of bibliographic databases as the primary search method, would be ineffective, offering limited value. The focus of this study is comparing study identification methods. We compare (1) an approach led by searches of bibliographic databases with (2) an approach led by supplementary search methods. We retrospectively assessed the effectiveness and value of both approaches. Effectiveness was determined by comparing (1) the total number of studies identified and screened and (2) the number of includable studies uniquely identified by each approach. Value was determined by comparing included study quality and by using qualitative sensitivity analysis to explore the contribution of studies to the synthesis. The bibliographic databases approach identified 21 409 studies to screen and 2 included qualitative studies were uniquely identified. Study quality was moderate, and contribution to the synthesis was minimal. The supplementary search approach identified 453 studies to screen and 9 included studies were uniquely identified. Four quantitative studies were poor quality but made a substantive contribution to the synthesis; 5 studies were qualitative: 3 studies were good quality, one was moderate quality, and 1 study was excluded from the synthesis due to poor quality. All 4 included qualitative studies made significant contributions to the synthesis. This case study found value in aligning primary methods of study identification to maximise location of relevant evidence. Copyright © 2017 John Wiley & Sons, Ltd.
ERIC Educational Resources Information Center
Bennett, Judith; Lubben, Fred; Hampden-Thompson, Gillian
2013-01-01
This paper presents the findings of the qualitative component of a combined methods research study that explores a range of individual and school factors that influence the uptake of chemistry and physics in post-compulsory study in England. The first phase involves using the National Pupil Database to provide a sampling frame to identify four…
Idaho and Montana non-fuel exploration database 1980-1997
Buckingham, David A.; DiFrancesco, Carl A.; Porter, Kenneth E.; Bleiwas, Donald I.; Causey, J. Douglas; Ferguson, William B.
2006-01-01
This report describes a relational database containing information about mineral exploration projects in the States of Idaho and Montana for the years 1980 through 1997, and a spatial (geographic) database constructed using data from the relational database. The focus of this project was to collect information on exploration for mineral commodities, with the exception of sand, gravel, coal, geothermal, oil, and gas. The associated databases supplied with this report are prototypes that can be used or modified as needed. The following sources were used to create the databases: serial mining periodicals; annual mineral publications; mining company reports; U.S. Bureau of Mines (USBM) and U.S. Geological Survey (USGS) publications; an Idaho mineral property database developed by Dave Boleneus, USGS, Spokane, Washington; Montana state publications; and discussions with representatives of Montana, principally the Montana Bureau of Mines and Geology and the Department of Environmental Quality. Fifty commodity groups were reported among the 596 exploration projects identified in this study. Precious metals (gold, silver, or platinum group elements) were the primary targets for about 67 percent of the exploration projects. Information on 17 of the projects did not include commodities. No location could be determined for 51 projects, all in Idaho. During the time period evaluated, some mineral properties were developed into large mining operations (for example, Beal Mountain Mine, Stillwater Mine, Troy Mine, Montana Tunnels Mine) and six properties were reclaimed. Environmental Impact Statements were done on four properties. Some operating mines either closed or went through one or more shutdowns and re-openings. Other properties, where significant resources were delineated by recent exploration during this time frame, await the outcome of factors important for development, such as defining additional reserves, higher metal prices, and the permitting process.
Many of these projects examined relatively minor mineral occurrences. Approximately half of the exploration projects are located on Federal lands and about 40 percent were on lands managed by the U.S. Forest Service. More than 75 percent of the exploration occurred in areas with significant previous mineral activity.
Exploring Genetic, Genomic, and Phenotypic Data at the Rat Genome Database
Laulederkind, Stanley J. F.; Hayman, G. Thomas; Wang, Shur-Jen; Lowry, Timothy F.; Nigam, Rajni; Petri, Victoria; Smith, Jennifer R.; Dwinell, Melinda R.; Jacob, Howard J.; Shimoyama, Mary
2013-01-01
The laboratory rat, Rattus norvegicus, is an important model of human health and disease, and experimental findings in the rat have relevance to human physiology and disease. The Rat Genome Database (RGD, http://rgd.mcw.edu) is a model organism database that provides access to a wide variety of curated rat data including disease associations, phenotypes, pathways, molecular functions, biological processes and cellular components for genes, quantitative trait loci, and strains. We present an overview of the database followed by specific examples that can be used to gain experience in employing RGD to explore the wealth of functional data available for the rat. PMID:23255149
NASA Astrophysics Data System (ADS)
Mioulet, L.; Bideault, G.; Chatelain, C.; Paquet, T.; Brunessaux, S.
2015-01-01
The BLSTM-CTC is a novel recurrent neural network architecture that has outperformed previous state-of-the-art algorithms on tasks such as speech recognition and handwriting recognition. It has the ability to process long-term dependencies in temporal signals in order to label unsegmented data. This paper describes different ways of combining features using a BLSTM-CTC architecture. We explore not only low-level combination (feature space combination), but also mid-level combination (internal system representation combination) and high-level combination (decoding combination). The results are compared on the RIMES word database. Our results show that the low-level combination works best, thanks to the powerful data modeling of the LSTM neurons.
Failure mode and effects analysis outputs: are they valid?
Shebl, Nada Atef; Franklin, Bryony Dean; Barber, Nick
2012-06-10
Failure Mode and Effects Analysis (FMEA) is a prospective risk assessment tool that has been widely used within the aerospace and automotive industries and has been utilised within healthcare since the early 1990s. The aim of this study was to explore the validity of FMEA outputs within a hospital setting in the United Kingdom. Two multidisciplinary teams each conducted an FMEA for the use of vancomycin and gentamicin. Four validity tests were conducted: face validity, by comparing the FMEA participants' mapped processes with observational work; content validity, by presenting the FMEA findings to other healthcare professionals; criterion validity, by comparing the FMEA findings with data reported in the trust's incident report database; and construct validity, by exploring the mathematical theories involved in calculating the FMEA risk priority number (RPN). Face validity was positive, as the researcher documented the same processes of care as mapped by the FMEA participants. However, other healthcare professionals identified potential failures missed by the FMEA teams. Furthermore, the FMEA groups failed to include failures related to omitted doses, yet these were the failures most commonly reported in the trust's incident database. Calculating the RPN by multiplying severity, probability and detectability scores was deemed invalid because it is based on calculations that breach the mathematical properties of the scales used. There are significant methodological challenges in validating FMEA. It is a useful tool to aid multidisciplinary groups in mapping and understanding a process of care; however, the results of our study cast doubt on its validity. FMEA teams are likely to need different sources of information, besides their personal experience and knowledge, to identify potential failures.
As for FMEA's methodology for scoring failures, there were discrepancies between the teams' estimates and similar incidents reported on the trust's incident database. Furthermore, the concept of multiplying ordinal scales to prioritise failures is mathematically flawed. Until FMEA's validity is further explored, healthcare organisations should not solely depend on their FMEA results to prioritise patient safety issues.
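The mathematical flaw identified above, multiplying ordinal severity, probability (occurrence), and detectability scores into a risk priority number (RPN), is easy to demonstrate. The sketch below is illustrative only (the scores are invented, not taken from the study): it shows that very different failure modes can receive identical RPNs, and that a legitimate relabeling of an ordinal scale can reverse a ranking.

```python
# Illustrative sketch (not from the paper) of why multiplying ordinal
# FMEA scores into a Risk Priority Number (RPN) is problematic.

def rpn(severity, occurrence, detectability):
    return severity * occurrence * detectability

# Two very different failure modes can receive the identical RPN:
catastrophic_but_rare = rpn(10, 2, 3)  # worst possible severity
minor_but_common = rpn(3, 10, 2)       # low severity
# both equal 60, so the ranking cannot distinguish them

# RPN rankings are also not stable under monotone relabeling of an
# ordinal scale (which is a legal transformation for ordinal data).
def relabel(s):
    # collapse a 1..10 scale to 1..5: 1-2 -> 1, 3-4 -> 2, ...
    return (s + 1) // 2

a = (9, 3, 2)                       # RPN 54
b = (4, 7, 2)                       # RPN 56 -> b ranked riskier
a2 = (relabel(a[0]), a[1], a[2])    # (5, 3, 2) -> RPN 30
b2 = (relabel(b[0]), b[1], b[2])    # (2, 7, 2) -> RPN 28 -> a now riskier
```

The relabeled scale preserves the order of every severity score, yet the priority ordering of the two failure modes flips, which is exactly the kind of breach of scale properties the study describes.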
Hansen, Bjoern Oest; Meyer, Etienne H; Ferrari, Camilla; Vaid, Neha; Movahedi, Sara; Vandepoele, Klaas; Nikoloski, Zoran; Mutwil, Marek
2018-03-01
Recent advances in gene function prediction rely on ensemble approaches that integrate results from multiple inference methods to produce superior predictions. Yet, these developments remain largely unexplored in plants. We have explored and compared two methods to integrate 10 gene co-function networks for Arabidopsis thaliana and demonstrate how the integration of these networks produces more accurate gene function predictions for a larger fraction of genes with unknown function. These predictions were used to identify genes involved in mitochondrial complex I formation, and for five of them, we confirmed the predictions experimentally. The ensemble predictions are provided as a user-friendly online database, EnsembleNet. The methods presented here demonstrate that ensemble gene function prediction is a powerful method to boost prediction performance, whereas the EnsembleNet database provides a cutting-edge community tool to guide experimentalists. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.
MSDB: A Comprehensive Database of Simple Sequence Repeats
Avvaru, Akshay Kumar; Saxena, Saketh; Mishra, Rakesh Kumar
2017-01-01
Abstract Microsatellites, also known as Simple Sequence Repeats (SSRs), are short tandem repeats of 1–6 nt motifs present in all genomes, particularly eukaryotes. Besides their usefulness as genome markers, SSRs have been shown to perform important regulatory functions, and variations in their length at coding regions are linked to several disorders in humans. Microsatellites show a taxon-specific enrichment in eukaryotic genomes, and some may be functional. MSDB (Microsatellite Database) is a collection of >650 million SSRs from 6,893 species including Bacteria, Archaea, Fungi, Plants, and Animals. This database is by far the most exhaustive resource to access and analyze SSR data of multiple species. In addition to exploring data in a customizable tabular format, users can view and compare the data of multiple species simultaneously using our interactive plotting system. MSDB is developed using the Django framework and MySQL. It is freely available at http://tdb.ccmb.res.in/msdb. PMID:28854643
Exploring Discretization Error in Simulation-Based Aerodynamic Databases
NASA Technical Reports Server (NTRS)
Aftosmis, Michael J.; Nemec, Marian
2010-01-01
This work examines the level of discretization error in simulation-based aerodynamic databases and introduces strategies for error control. Simulations are performed using a parallel, multi-level Euler solver on embedded-boundary Cartesian meshes. Discretization errors in user-selected outputs are estimated using the method of adjoint-weighted residuals, and we use adaptive mesh refinement to reduce these errors to specified tolerances. Using this framework, we examine the behavior of discretization error throughout a token database computed for a NACA 0012 airfoil consisting of 120 cases. We compare the cost and accuracy of two approaches for aerodynamic database generation. In the first approach, mesh adaptation is used to compute all cases in the database to a prescribed level of accuracy. The second approach conducts all simulations using the same computational mesh without adaptation. We quantitatively assess the error landscape and computational costs in both databases. This investigation highlights sensitivities of the database under a variety of conditions. The presence of transonic shocks or the stiffness of the governing equations near the incompressible limit are shown to dramatically increase discretization error, requiring additional mesh resolution to control. Results show that such pathologies lead to error levels that vary by over a factor of 40 when using a fixed mesh throughout the database. Alternatively, controlling this sensitivity through mesh adaptation leads to mesh sizes which span two orders of magnitude. We propose strategies to minimize simulation cost in sensitive regions and discuss the role of error estimation in database quality.
NREL: Renewable Resource Data Center - Biomass Resource Publications
Marginal Lands in APEC Economies NREL Publications Database For a comprehensive list of other NREL biomass resource publications, explore NREL's Publications Database. When searching the database, search on "
KGCAK: a K-mer based database for genome-wide phylogeny and complexity evaluation.
Wang, Dapeng; Xu, Jiayue; Yu, Jun
2015-09-16
The K-mer approach, which treats genomic sequences as simple character strings and counts the relative abundance of each substring of a fixed length K, has been extensively applied to phylogeny inference for genome assembly, annotation, and comparison. To meet increasing demands for comparing large genome sequences and to promote the use of the K-mer approach, we developed a versatile database, KGCAK ( http://kgcak.big.ac.cn/KGCAK/ ), containing ~8,000 genomes that include genome sequences of diverse life forms (viruses, prokaryotes, protists, animals, and plants) and cellular organelles of eukaryotic lineages. It builds phylogenies from genomic elements in an alignment-free fashion and provides in-depth data processing, enabling users to compare the complexity of genome sequences based on K-mer distribution. We hope that KGCAK becomes a powerful tool for exploring relationships within and among groups of species in the tree of life based on genomic data.
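As a concrete illustration of the alignment-free K-mer idea described above, the sketch below counts fixed-length substrings and compares two sequences by the distance between their frequency profiles. The cosine distance used here is one common choice for illustration; KGCAK's actual distance measure and implementation may differ.

```python
# Minimal illustration of the K-mer approach: count length-K substrings
# and compare sequences without alignment. Cosine distance is one common
# profile-comparison choice, not necessarily the one KGCAK uses.
from collections import Counter
from math import sqrt

def kmer_counts(seq, k):
    """Relative abundance of each length-k substring in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}

def cosine_distance(p, q):
    """1 - cosine similarity between two k-mer frequency profiles."""
    keys = set(p) | set(q)
    dot = sum(p.get(x, 0) * q.get(x, 0) for x in keys)
    norm = sqrt(sum(v * v for v in p.values())) \
        * sqrt(sum(v * v for v in q.values()))
    return 1 - dot / norm

# Similar sequences give a small distance; unrelated ones a large one.
d_same = cosine_distance(kmer_counts("ATATAT", 2), kmer_counts("ATATATAT", 2))
d_diff = cosine_distance(kmer_counts("ATATAT", 2), kmer_counts("GGGCGC", 2))
```

A pairwise distance matrix built this way can then feed any standard tree-building method, which is the essence of alignment-free phylogeny.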
"There's so Much Data": Exploring the Realities of Data-Based School Governance
ERIC Educational Resources Information Center
Selwyn, Neil
2016-01-01
Educational governance is commonly predicated around the generation, collation and processing of data through digital technologies. Drawing upon an empirical study of two Australian secondary schools, this paper explores the different forms of data-based governance that are being enacted by school leaders, managers, administrators and teachers.…
Distributed Structure-Searchable Toxicity (DSSTox) Database Network: Making Public Toxicity Data Resources More Accessible and Usable for Data Exploration and SAR Development
Many sources of public toxicity data are not currently linked to chemical structure, are not ...
Computer Assisted Learning Feature--Using Databases in Economics and Business Studies.
ERIC Educational Resources Information Center
Davies, Peter; Allison, Ron.
1989-01-01
Describes ways in which databases can be used in economics and business education classes. Explores arguments put forth by advocates for the use of databases in the classroom. Offers information on British software and discusses six online database systems listing the features of each. (KO)
A web-based relational database for monitoring and analyzing mosquito population dynamics.
Sucaet, Yves; Van Hemert, John; Tucker, Brad; Bartholomay, Lyric
2008-07-01
Mosquito population dynamics have been monitored on an annual basis in the state of Iowa since 1969. The primary goal of this project was to integrate light trap data from these efforts into a centralized back-end database and interactive website that is available through the internet at http://iowa-mosquito.ent.iastate.edu. For comparative purposes, all data were categorized according to the week of the year and normalized according to the number of traps running. Users can readily view current, weekly mosquito abundance compared with data from previous years. Additional interactive capabilities facilitate analyses of the data based on mosquito species, distribution, or a time frame of interest. All data can be viewed in graphical and tabular format and can be downloaded to a comma separated value (CSV) file for import into a spreadsheet or more specialized statistical software package. Having this long-term dataset in a centralized database/website is useful for informing mosquito and mosquito-borne disease control and for exploring the ecology of the species represented therein. In addition to mosquito population dynamics, this database is available as a standardized platform that could be modified and applied to a multitude of projects that involve repeated collection of observational data. The development and implementation of this tool provides capacity for the user to mine data from standard spreadsheets into a relational database and then view and query the data in an interactive website.
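The normalization described above, grouping light-trap counts by week of year and dividing by the number of traps running, can be sketched as follows. The record layout and species counts are invented for illustration and are not the Iowa database schema.

```python
# Illustrative sketch of per-trap normalization of weekly light-trap
# counts, so weeks with different trapping effort stay comparable.
# Field names and numbers are invented, not the Iowa database schema.
from collections import defaultdict

records = [
    # (year, week-of-year, species, total count, traps running)
    (2007, 23, "Culex pipiens", 120, 4),
    (2007, 23, "Aedes vexans", 90, 4),
    (2008, 23, "Culex pipiens", 45, 3),
]

per_trap = defaultdict(dict)
for year, week, species, count, traps in records:
    per_trap[(year, week)][species] = count / traps

# 2007 week 23: 120 mosquitoes over 4 traps -> 30 per trap, which can be
# compared directly with 2008 week 23 (45 over 3 traps -> 15 per trap).
```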
Alaska Geochemical Database - Mineral Exploration Tool for the 21st Century - PDF of presentation
Granitto, Matthew; Schmidt, Jeanine M.; Labay, Keith A.; Shew, Nora B.; Gamble, Bruce M.
2012-01-01
The U.S. Geological Survey has created a geochemical database of geologic material samples collected in Alaska. This database is readily accessible to anyone with access to the Internet. Designed as a tool for mineral or environmental assessment, land management, or mineral exploration, the initial version of the Alaska Geochemical Database - U.S. Geological Survey Data Series 637 - contains geochemical, geologic, and geospatial data for 264,158 samples collected from 1962-2009: 108,909 rock samples; 92,701 sediment samples; 48,209 heavy-mineral-concentrate samples; 6,869 soil samples; and 7,470 mineral samples. In addition, the Alaska Geochemical Database contains mineralogic data for 18,138 nonmagnetic-fraction heavy mineral concentrates, making it the first U.S. Geological Survey database of this scope that contains both geochemical and mineralogic data. Examples from the Alaska Range will illustrate potential uses of the Alaska Geochemical Database in mineral exploration. Data from the Alaska Geochemical Database have been extensively checked for accuracy of sample media description, sample site location, and analytical method using U.S. Geological Survey sample-submittal archives and U.S. Geological Survey publications (plus field notebooks and sample site compilation base maps from the Alaska Technical Data Unit in Anchorage, Alaska). The database is also the repository for nearly all previously released U.S. Geological Survey Alaska geochemical datasets. Although the Alaska Geochemical Database is a fully relational database in Microsoft® Access 2003 and 2010 formats, these same data are also provided as a series of spreadsheet files in Microsoft® Excel 2003 and 2010 formats, and as ASCII text files. A DVD version of the Alaska Geochemical Database was released in October 2011, as U.S. Geological Survey Data Series 637, and data downloads are available at http://pubs.usgs.gov/ds/637/. 
Also, all Alaska Geochemical Database data have been incorporated into the interactive U.S. Geological Survey Mineral Resource Data web portal, available at http://mrdata.usgs.gov/.
Maetens, Arno; De Schreye, Robrecht; Faes, Kristof; Houttekier, Dirk; Deliens, Luc; Gielen, Birgit; De Gendt, Cindy; Lusyne, Patrick; Annemans, Lieven; Cohen, Joachim
2016-10-18
The use of full-population databases is under-explored for studying the use, quality and costs of end-of-life care. Using the case of Belgium, we explored: (1) which full-population databases provide valid information about end-of-life care, (2) what procedures exist to use these databases, and (3) what is needed to integrate separate databases. Technical and privacy-related aspects of linking and accessing Belgian administrative databases and disease registries were assessed in cooperation with the database administrators and privacy commission bodies. For all relevant databases, we followed procedures in cooperation with database administrators to link the databases and to access the data. We identified several databases as suitable for end-of-life care research in Belgium: the InterMutualistic Agency's national registry of health care claims data, the Belgian Cancer Registry including data on incidence of cancer, and databases administered by Statistics Belgium including data from the death certificate database, the socio-economic survey and fiscal data. To obtain access to the data, approval was required from all database administrators, supervisory bodies and two separate national privacy bodies. Two Trusted Third Parties linked the databases via a deterministic matching procedure using multiple encrypted social security numbers. In this article we describe how various routinely collected population-level databases and disease registries can be accessed and linked to study patterns in the use, quality and costs of end-of-life care in the full population and in specific diagnostic groups.
Rothwell, Joseph A; Perez-Jimenez, Jara; Neveu, Vanessa; Medina-Remón, Alexander; M'hiri, Nouha; García-Lobato, Paula; Manach, Claudine; Knox, Craig; Eisner, Roman; Wishart, David S; Scalbert, Augustin
2013-01-01
Polyphenols are a major class of bioactive phytochemicals whose consumption may play a role in the prevention of a number of chronic diseases such as cardiovascular diseases, type II diabetes and cancers. Phenol-Explorer, launched in 2009, is the only freely available web-based database on the content of polyphenols in food and their in vivo metabolism and pharmacokinetics. Here we report the third release of the database (Phenol-Explorer 3.0), which adds data on the effects of food processing on polyphenol contents in foods. Data on >100 foods, covering 161 polyphenols or groups of polyphenols before and after processing, were collected from 129 peer-reviewed publications and entered into new tables linked to the existing relational design. The effect of processing on polyphenol content is expressed in the form of retention factor coefficients, or the proportion of a given polyphenol retained after processing, adjusted for change in water content. The result is the first database on the effects of food processing on polyphenol content and, following the model initially defined for Phenol-Explorer, all data may be traced back to original sources. The new update will allow polyphenol scientists to more accurately estimate polyphenol exposure from dietary surveys.
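A retention factor, as defined above, can be computed as the proportion of a polyphenol retained after processing, adjusted for the change in water content. The sketch below uses a weight-yield factor for that adjustment; both the correction and the numbers are assumptions made for illustration, and the exact procedure used by Phenol-Explorer may differ.

```python
# Illustrative computation of a retention factor (RF): the proportion of
# a polyphenol retained after processing. The water-content correction
# here uses a weight-yield factor (processed weight / raw weight); the
# exact adjustment used by Phenol-Explorer may differ. Toy numbers.

def retention_factor(content_raw, content_processed, yield_factor=1.0):
    """content_* in mg per 100 g of food; yield_factor corrects for the
    weight change (e.g. water loss or gain) caused by processing."""
    return (content_processed * yield_factor) / content_raw

# Boiling example: 100 g of raw onion yields ~80 g cooked (water loss),
# and the measured quercetin concentration drops from 30 to 20 mg/100 g.
rf = retention_factor(content_raw=30.0, content_processed=20.0,
                      yield_factor=0.8)
# rf is about 0.53: roughly half the quercetin survives boiling
```

Multiplying a raw-food composition value by such an RF is how a processed-food polyphenol content would be estimated when no direct measurement exists.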
Exploring Antarctic Land Surface Temperature Extremes Using Condensed Anomaly Databases
NASA Astrophysics Data System (ADS)
Grant, Glenn Edwin
Satellite observations have revolutionized the Earth Sciences and climate studies. However, data and imagery continue to accumulate at an accelerating rate, and efficient tools for data discovery, analysis, and quality checking lag behind. In particular, studies of long-term, continental-scale processes at high spatiotemporal resolutions are especially problematic. The traditional technique of downloading an entire dataset and using customized analysis code is often impractical or consumes too many resources. The Condensate Database Project was envisioned as an alternative method for data exploration and quality checking. The project's premise was that much of the data in any satellite dataset is unneeded and can be eliminated, compacting massive datasets into more manageable sizes. Dataset sizes are further reduced by retaining only anomalous data of high interest. Hosting the resulting "condensed" datasets in high-speed databases enables immediate availability for queries and exploration. Proof of the project's success relied on demonstrating that the anomaly database methods can enhance and accelerate scientific investigations. The hypothesis of this dissertation is that the condensed datasets are effective tools for exploring many scientific questions, spurring further investigations and revealing important information that might otherwise remain undetected. This dissertation uses condensed databases containing 17 years of Antarctic land surface temperature anomalies as its primary data. The study demonstrates the utility of the condensate database methods by discovering new information. In particular, the process revealed critical quality problems in the source satellite data. The results are used as the starting point for four case studies, investigating Antarctic temperature extremes, cloud detection errors, and the teleconnections between Antarctic temperature anomalies and climate indices. 
The results confirm the hypothesis that the condensate databases are a highly useful tool for Earth Science analyses. Moreover, the quality checking capabilities provide an important method for independent evaluation of dataset veracity.
ERIC Educational Resources Information Center
Freeman, Carla; And Others
In order to understand how the database software or online database functioned in the overall curricula, the use of database management (DBMs) systems was studied at eight elementary and middle schools through classroom observation and interviews with teachers and administrators, librarians, and students. Three overall areas were addressed:…
Engel, Stacia R.; Cherry, J. Michael
2013-01-01
The first completed eukaryotic genome sequence was that of the yeast Saccharomyces cerevisiae, and the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is the original model organism database. SGD remains the authoritative community resource for the S. cerevisiae reference genome sequence and its annotation, and continues to provide comprehensive biological information correlated with S. cerevisiae genes and their products. A diverse set of yeast strains have been sequenced to explore commercial and laboratory applications, and a brief history of those strains is provided. The publication of these new genomes has motivated the creation of new tools, and SGD will annotate and provide comparative analyses of these sequences, correlating changes with variations in strain phenotypes and protein function. We are entering a new era at SGD, as we incorporate these new sequences and make them accessible to the scientific community, all in an effort to continue in our mission of educating researchers and facilitating discovery. Database URL: http://www.yeastgenome.org/ PMID:23487186
Patel, Angira; Hickey, Edward; Mavroudis, Constantine; Jacobs, Jeffrey P; Jacobs, Marshall L; Backer, Carl L; Gevitz, Melanie; Mavroudis, Constantine D
2010-06-01
Hypoplastic left heart syndrome may coexist with noncardiac congenital defects or genetic syndromes. We explored the impact of such lesions on outcomes after staged univentricular palliation. Society of Thoracic Surgeons database, 2002 to 2006: children diagnosed with hypoplastic left heart syndrome who underwent stage 1 Norwood (n = 1,236), stage 2 superior cavopulmonary anastomosis (n = 702), or stage 3 Fontan (n = 553) procedures were studied. In-hospital mortality, postoperative complications, and length of stay were compared at each stage between those with and without noncardiac-genetic defects. Congenital Heart Surgeons' Society database, 1994 to 2001: all 703 infants enrolled in the Congenital Heart Surgeons' Society critical left ventricular outflow tract obstruction study who underwent primary stage 1 palliation were reviewed. The impact of noncardiac defects-syndromes on survival was explored using multivariable parametric models with bootstrap bagging. Society of Thoracic Surgeons database: stage 1 in-hospital mortality (26% vs 20%, p = 0.04) and mean postoperative length of stay (42 versus 31 days, p < 0.0001) were greater, and postoperative complications significantly more prevalent, in infants with noncardiac-genetic defects. Congenital Heart Surgeons' Society database: noncardiac-genetic defects were present in 55 (8%). Early hazard for death after the Norwood procedure was significantly worse in infants with noncardiac defects-syndromes (p = 0.008). Chromosomal defects (n = 14) were highly unfavorable: the early risk of death was doubled (10-year survival 25 ± 9% vs 54 ± 2%, p = 0.005). Turner syndrome accounted for the majority of chromosomal defects in this population (11 of 14, 79%). Mode of death was rarely attributable to the noncardiac-genetic defect. Survival in hypoplastic left heart syndrome is strongly influenced by the presence of noncardiac abnormalities. Strategies to reduce mortality in infants with noncardiac abnormalities should be explored.
Presence of chromosomal defects, especially Turner syndrome, should enter decision-management options for parents and physicians. 2010 The Society of Thoracic Surgeons. Published by Elsevier Inc. All rights reserved.
Improved Bond Equations for Fiber-Reinforced Polymer Bars in Concrete.
Pour, Sadaf Moallemi; Alam, M Shahria; Milani, Abbas S
2016-08-30
This paper explores a set of new equations to predict the bond strength between fiber-reinforced polymer (FRP) rebar and concrete. The proposed equations are based on a comprehensive statistical analysis of existing experimental results in the literature. The parameters most strongly affecting the bond behavior of FRP-reinforced concrete were first identified by applying a factorial analysis to part of the available database. The database, which contains 250 pullout tests, was then divided into four groups based on concrete compressive strength and rebar surface. Nonlinear regression analysis was performed for each group in order to determine the bond equations. The results show that the proposed equations predict bond strengths more accurately than previously reported models.
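The regression step can be illustrated with a toy fit. The sketch below fits a simple power-law bond model, tau = a * fc**b, by least squares on logarithms; the paper's actual equations involve more parameters (e.g. embedment length, bar diameter, surface type), so this only demonstrates the method on synthetic data.

```python
# Hedged sketch of the regression step: fit a power-law bond model
# tau = a * fc**b to pullout-test data via linear least squares on logs.
# The model form and the data are illustrative, not from the paper.
from math import exp, log

def fit_power_law(fc, tau):
    """Least-squares fit of log(tau) = log(a) + b*log(fc)."""
    x = [log(v) for v in fc]
    y = [log(v) for v in tau]
    n = len(x)
    xm, ym = sum(x) / n, sum(y) / n
    b = sum((xi - xm) * (yi - ym) for xi, yi in zip(x, y)) / \
        sum((xi - xm) ** 2 for xi in x)
    a = exp(ym - b * xm)
    return a, b

# Synthetic data generated from tau = 2.5 * fc**0.5 (no noise), so the
# fit should recover the generating parameters:
fc = [20.0, 30.0, 40.0, 60.0]   # concrete compressive strengths (MPa)
tau = [2.5 * f ** 0.5 for f in fc]
a, b = fit_power_law(fc, tau)
```

Grouping the tests first (by compressive strength and rebar surface, as the paper does) and fitting each group separately is what lets a simple functional form stay accurate.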
Image-Based Localization for Indoor Environment Using Mobile Phone
NASA Astrophysics Data System (ADS)
Huang, Y.; Wang, H.; Zhan, K.; Zhao, J.; Gui, P.; Feng, T.
2015-05-01
Real-time indoor localization based on supporting infrastructure such as wireless devices and QR codes is usually costly and labor-intensive to implement. In this study, we explored a cheap alternative approach to indoor localization based on images. Users can localize themselves simply by shooting a photo of the surrounding indoor environment with a mobile phone; no other equipment is required. This is achieved by employing image-matching and searching techniques with a dataset of pre-captured indoor images. First, a database of structured images of the indoor environment is constructed using image matching and the bundle adjustment algorithm. Each image's relative pose (its position and orientation) is then estimated, and the semantic locations of the images are tagged. A user's location can then be determined by comparing a photo taken with the mobile phone against the database, combining quick image searching, matching, and relative orientation. This study also explores image acquisition plans and the processing capacity of off-the-shelf mobile phones. Throughout the pipeline, a collection of indoor images with both rich and poor textures is examined. Several feature detectors are used and compared. Pre-processing of complex indoor photos is also implemented on the mobile phone. The preliminary experimental results prove the feasibility of this method. In future work, we will try to raise the efficiency of matching between indoor images and exploit fast 4G wireless communication to ensure the speed and accuracy of localization in a client-server framework.
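The descriptor-matching step at the core of such image-based localization can be sketched in a few lines. The toy 3-D descriptors below are invented; a real system would use SIFT/ORB-style features with an index for fast search. The ratio test (after Lowe) rejects matches that are not clearly better than the second-best candidate.

```python
# Illustrative brute-force descriptor matching with a ratio test, the
# kind of step used to compare a query photo against a database of
# pre-captured images. Toy 3-D descriptors stand in for real features.

def dist2(u, v):
    """Squared Euclidean distance between two descriptors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def match(query_desc, db_desc, ratio=0.8):
    """For each query descriptor, return the index of its best database
    match, or None when the best is not clearly better than the second
    (ratio test on squared distances, hence ratio ** 2)."""
    matches = []
    for q in query_desc:
        order = sorted(range(len(db_desc)), key=lambda i: dist2(q, db_desc[i]))
        best, second = order[0], order[1]
        ok = dist2(q, db_desc[best]) < (ratio ** 2) * dist2(q, db_desc[second])
        matches.append(best if ok else None)
    return matches

db = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.9, 0.1, 0.0)]
query = [(0.0, 0.05, 1.0),   # clearly closest to db[0] -> accepted
         (0.95, 0.05, 0.0)]  # db[1] and db[2] are ambiguous -> rejected
result = match(query, db)
```

Accepted matches between the query photo and a database image are what feed the subsequent relative-orientation step that yields the user's pose.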
Shifman, Mark A.; Sayward, Frederick G.; Mattie, Mark E.; Miller, Perry L.
2002-01-01
This case study describes a project that explores issues of quality of service (QoS) relevant to the next-generation Internet (NGI), using the PathMaster application in a testbed environment. PathMaster is a prototype computer system that analyzes digitized cell images from cytology specimens and compares those images against an image database, returning a ranked set of “similar” cell images from the database. To perform NGI testbed evaluations, we used a cluster of nine parallel computation workstations configured as three subclusters using Cisco routers. This architecture provides a local “simulated Internet” in which we explored the following QoS strategies: (1) first-in-first-out queuing, (2) priority queuing, (3) weighted fair queuing, (4) weighted random early detection, and (5) traffic shaping. The study describes the results of using these strategies with a distributed version of the PathMaster system in the presence of different amounts of competing network traffic and discusses certain of the issues that arise. The goal of the study is to help introduce NGI QoS issues to the Medical Informatics community and to use the PathMaster NGI testbed to illustrate concretely certain of the QoS issues that arise. PMID:12223501
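Two of the queuing disciplines the study compares, first-in-first-out and strict priority queuing, can be contrasted with a minimal discrete sketch. Packet priorities and service times below are invented; this stands in for, rather than reproduces, the Cisco router configurations used in the testbed.

```python
# Minimal sketch: FIFO vs strict priority queuing for packets queued at
# t=0. Packets are (id, priority, service_time); lower priority value
# means more important. Illustrative only.

import heapq

def fifo_completion(packets):
    """Serve in arrival order; return completion time per packet id."""
    t, done = 0, {}
    for pid, _prio, service in packets:
        t += service
        done[pid] = t
    return done

def priority_completion(packets):
    """Serve by priority (ties by id); all packets queued at t=0."""
    heap = [(prio, pid, service) for pid, prio, service in packets]
    heapq.heapify(heap)
    t, done = 0, {}
    while heap:
        _prio, pid, service = heapq.heappop(heap)
        t += service
        done[pid] = t
    return done

# One urgent packet (priority 0) stuck behind bulk traffic.
pkts = [("bulk1", 5, 10), ("bulk2", 5, 10), ("urgent", 0, 1)]
print(fifo_completion(pkts)["urgent"])      # 21: waits behind both bulk packets
print(priority_completion(pkts)["urgent"])  # 1: served first
```

Weighted fair queuing, WRED, and traffic shaping refine this same idea by sharing bandwidth, dropping early, and pacing transmissions respectively.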
ERIC Educational Resources Information Center
Hollis, Michael J.
2010-01-01
The purpose of this study was to explore areas of research in regards to how students learn about violent crime on university campuses and what level of awareness they hold regarding their personal safety. A combination of databases was used to measure reported rates of violent crime on campus and in the community and these were compared with…
Bigger Is (Maybe) Better: Librarians' Views of Interdisciplinary Databases
ERIC Educational Resources Information Center
Gilbert, Julie K.
2010-01-01
This study investigates librarians' satisfaction with general interdisciplinary databases for undergraduate research and explores possibilities for improving these databases. Results from a national survey suggest that librarians at a variety of institutions are relatively satisfied overall with the content and usability of general,…
Owens, John
2009-01-01
Technological advances in the acquisition of DNA and protein sequence information and the resulting onrush of data can quickly overwhelm the scientist unprepared for the volume of information that must be evaluated and carefully dissected to discover its significance. Few laboratories have the luxury of dedicated personnel to organize, analyze, or consistently record a mix of arriving sequence data. A methodology based on a modern relational-database manager is presented that is both a natural storage vessel for antibody sequence information and a conduit for organizing and exploring sequence data and accompanying annotation text. The expertise necessary to implement such a plan is equal to that required by electronic word processors or spreadsheet applications. Antibody sequence projects maintained as independent databases are selectively unified by the relational-database manager into larger database families that contribute to local analyses, reports, interactive HTML pages, or exported to facilities dedicated to sophisticated sequence analysis techniques. Database files are transposable among current versions of Microsoft, Macintosh, and UNIX operating systems.
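A minimal sketch of the relational approach described above, using Python's built-in sqlite3 in place of a desktop relational-database manager; the table layout, clone identifiers, and sequence data are hypothetical.

```python
# Sketch of a relational store for antibody sequence data plus
# annotation text, queried in a unified way. Schema is illustrative.

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE antibody_seq (
        clone_id   TEXT PRIMARY KEY,
        chain      TEXT,            -- 'heavy' or 'light'
        sequence   TEXT,
        annotation TEXT
    )""")
rows = [
    ("mAb-01",  "heavy", "EVQLVESGGGLV", "anti-X candidate"),
    ("mAb-01L", "light", "DIQMTQSPSSLS", "paired light chain"),
    ("mAb-02",  "heavy", "QVQLQQSGAELV", "low affinity"),
]
con.executemany("INSERT INTO antibody_seq VALUES (?, ?, ?, ?)", rows)

# Unified query across the project: all heavy chains with their notes.
for clone, note in con.execute(
        "SELECT clone_id, annotation FROM antibody_seq "
        "WHERE chain = 'heavy' ORDER BY clone_id"):
    print(clone, "-", note)
```

Separate sequencing projects kept as independent database files can be attached and queried together, which is the "database families" idea the abstract describes.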
Exploring Deep Learning and Transfer Learning for Colonic Polyp Classification
Uhl, Andreas; Wimmer, Georg; Häfner, Michael
2016-01-01
Recently, Deep Learning, especially through Convolutional Neural Networks (CNNs), has been widely used to enable the extraction of highly representative features. This is done among the network layers by filtering and selecting features, which are then used in the last fully connected layers for pattern classification. However, CNN training for automated endoscopic image classification remains challenging due to the lack of large, publicly available annotated databases. In this work we explore Deep Learning for the automated classification of colonic polyps, using different configurations for training CNNs from scratch (full training) and distinct architectures of pretrained CNNs, tested on 8 HD endoscopic image databases acquired using different modalities. We compare our results with some commonly used features for colonic polyp classification, and the good results suggest that features learned by CNNs trained from scratch and "off-the-shelf" CNN features can be highly relevant for the automated classification of colonic polyps. Moreover, we also show that combining classical features with "off-the-shelf" CNN features can be a good approach to further improve the results. PMID:27847543
Intelligent communication assistant for databases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jakobson, G.; Shaked, V.; Rowley, S.
1983-01-01
An intelligent communication assistant for databases, called FRED (Front End for Databases), is explored. FRED is designed to facilitate access to database systems by users of varying levels of experience. It is a second-generation natural language front-end for databases that aims to solve two critical interface problems between end-users and databases: connectivity and communication. The authors report their experiences in developing software for natural language query processing, dialog control, and knowledge representation, as well as directions for future work. 10 references.
GreenPhylDB v2.0: comparative and functional genomics in plants.
Rouard, Mathieu; Guignon, Valentin; Aluome, Christelle; Laporte, Marie-Angélique; Droc, Gaëtan; Walde, Christian; Zmasek, Christian M; Périn, Christophe; Conte, Matthieu G
2011-01-01
GreenPhylDB is a database designed for comparative and functional genomics based on complete genomes. Version 2 now contains sixteen full genomes of members of the plantae kingdom, ranging from algae to angiosperms, automatically clustered into gene families. Gene families are manually annotated and then analyzed phylogenetically in order to elucidate orthologous and paralogous relationships. The database offers various lists of gene families including plant, phylum and species specific gene families. For each gene cluster or gene family, easy access to gene composition, protein domains, publications, external links and orthologous gene predictions is provided. Web interfaces have been further developed to improve the navigation through information related to gene families. New analysis tools are also available, such as a gene family ontology browser that facilitates exploration. GreenPhylDB is a component of the South Green Bioinformatics Platform (http://southgreen.cirad.fr/) and is accessible at http://greenphyl.cirad.fr. It enables comparative genomics in a broad taxonomy context to enhance the understanding of evolutionary processes and thus tends to speed up gene discovery.
Ambiguity and variability of database and software names in bioinformatics.
Duck, Geraint; Kovacevic, Aleksandar; Robertson, David L; Stevens, Robert; Nenadic, Goran
2015-01-01
There are numerous options available to achieve various tasks in bioinformatics, but until recently, there were no tools that could systematically identify mentions of databases and tools within the literature. In this paper we explore the variability and ambiguity of database and software name mentions and compare dictionary and machine learning approaches to their identification. Through the development and analysis of a corpus of 60 full-text documents manually annotated at the mention level, we report high variability and ambiguity in database and software mentions. On a test set of 25 full-text documents, a baseline dictionary look-up achieved an F-score of 46 %, highlighting not only variability and ambiguity but also the extensive number of new resources introduced. A machine learning approach achieved an F-score of 63 % (with precision of 74 %) and 70 % (with precision of 83 %) for strict and lenient matching respectively. We characterise the issues with various mention types and propose potential ways of capturing additional database and software mentions in the literature. Our analyses show that identification of mentions of databases and tools is a challenging task that cannot be achieved by relying on current manually-curated resource repositories. Although machine learning shows improvement and promise (primarily in precision), more contextual information needs to be taken into account to achieve a good degree of accuracy.
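The reported F-scores combine precision and recall; the helper below makes the strict-matching trade-off concrete. The mention counts are invented, chosen only to approximately reproduce the reported strict-matching precision of 74% and F-score of 63%.

```python
# F-score from true positives, false positives, and false negatives.
# Counts below are hypothetical, picked to illustrate the reported
# strict-matching figures (precision ~0.74, F-score ~0.63).

def f_score(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

tp, fp, fn = 74, 26, 61  # hypothetical mention counts
print(round(f_score(tp, fp, fn), 2))  # 0.63, with precision 74/100 = 0.74
```

The gap between the 46% dictionary baseline and the 63% machine-learning result is thus driven largely by precision, which matches the authors' observation that ambiguity hurts simple look-up most.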
EPAUS9R - An Energy Systems Database for use with the Market Allocation (MARKAL) Model
EPA’s MARKAL energy system databases estimate future-year technology dispersals and associated emissions. These databases are valuable tools for exploring a variety of future scenarios for the U.S. energy-production systems that can impact climate change c
Potential use of routine databases in health technology assessment.
Raftery, J; Roderick, P; Stevens, A
2005-05-01
To develop criteria for classifying databases in relation to their potential use in health technology (HT) assessment and to apply them to a list of databases of relevance in the UK. To explore the extent to which prioritized databases could pick up those HTs being assessed by the National Coordinating Centre for Health Technology Assessment (NCCHTA) and the extent to which these databases have been used in HT assessment. To explore the validation of the databases and their cost. Electronic databases. Key literature sources. Experienced users of routine databases. A 'first principles' examination of the data necessary for each type of HT assessment was carried out, supplemented by literature searches and a historical review. The principal investigators applied the criteria to the databases. Comments of the 'keepers' of the prioritized databases were incorporated. Details of 161 topics funded by the NHS R&D Health Technology Assessment (HTA) programme were reviewed iteratively by the principal investigators. Uses of databases in HTAs were identified by literature searches, which included the title of each prioritized database as a keyword. Annual reports of databases were examined and 'keepers' queried. The validity of each database was assessed using criteria based on a literature search and involvement by the authors in a national academic network. The costs of databases were established from annual reports, enquiries to 'keepers' of databases and 'guesstimates' based on cost per record. For assessing effectiveness, equity and diffusion, routine databases were classified into three broad groups: (1) group I databases, identifying both HTs and health states, (2) group II databases, identifying the HTs, but not a health state, and (3) group III databases, identifying health states, but not an HT. Group I datasets were disaggregated into clinical registries, clinical administrative databases and population-oriented databases. 
Group III were disaggregated into adverse event reporting, confidential enquiries, disease-only registers and health surveys. Databases in group I can be used not only to assess effectiveness but also to assess diffusion and equity. Databases in group II can only assess diffusion. Group III has restricted scope for assessing HTs, except for analysis of adverse events. For use in costing, databases need to include unit costs or prices. Some databases included unit cost as well as a specific HT. A list of around 270 databases was identified at the level of UK, England and Wales or England (over 1000 including Scotland, Wales and Northern Ireland). Allocation of these to the above groups identified around 60 databases with some potential for HT assessment, roughly half to group I. Eighteen clinical registers were identified as having the greatest potential although the clinical administrative datasets had potential mainly owing to their inclusion of a wide range of technologies. Only two databases were identified that could directly be used in costing. The review of the potential capture of HTs prioritized by the UK's NHS R&D HTA programme showed that only 10% would be captured in these databases, mainly drugs prescribed in primary care. The review of the use of routine databases in any form of HT assessment indicated that clinical registers were mainly used for national comparative audit. Some databases have only been used in annual reports, usually time trend analysis. A few peer-reviewed papers used a clinical register to assess the effectiveness of a technology. Accessibility is suggested as a barrier to using most databases. Clinical administrative databases (group Ib) have mainly been used to build population needs indices and performance indicators. A review of the validity of used databases showed that although internal consistency checks were common, relatively few had any form of external audit. 
Some comparative audit databases have data scrutinised by participating units. Issues around coverage and coding have, in general, received little attention. NHS funding of databases has been mainly for 'Central Returns' for management purposes, which excludes those databases with the greatest potential for HT assessment. Funding for databases was various, but some are unfunded, relying on goodwill. The estimated total cost of databases in group I plus selected databases from groups II and III has been estimated at £50 million, or around 0.1% of annual NHS spend. A few databases with limited potential for HT assessment account for the bulk of spending. Suggestions for policy include clarification of responsibility for the strategic development of databases, improved resourcing, and issues around coding, confidentiality, ownership and access, maintenance of clinical support, optimal use of information technology, filling gaps and remedying deficiencies. Recommendations for researchers include closer policy links between routine data and R&D, and selective investment in the more promising databases. Recommended research topics include optimal capture and coding of the range of HTs, international comparisons of the role, funding and use of routine data in healthcare systems and use of routine databases in trials and in modelling. Independent evaluations are recommended for information strategies (such as those around the National Service Frameworks and various collaborations) and for electronic patient and health records.
A Visual Analytics Paradigm Enabling Trillion-Edge Graph Exploration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wong, Pak C.; Haglin, David J.; Gillen, David S.
We present a visual analytics paradigm and a system prototype for exploring web-scale graphs. A web-scale graph is described as a graph with ~one trillion edges and ~50 billion vertices. While there is an aggressive R&D effort in processing and exploring web-scale graphs among internet vendors such as Facebook and Google, visualizing a graph of that scale still remains an underexplored R&D area. The paper describes a nontraditional peek-and-filter strategy that facilitates the exploration of a graph database of unprecedented size for visualization and analytics. We demonstrate that our system prototype can 1) preprocess a graph with ~25 billion edges in less than two hours and 2) support database query and visualization on the processed graph database afterward. Based on our computational performance results, we argue that we most likely will achieve the one trillion edge mark (a computational performance improvement of 40 times) for graph visual analytics in the near future.
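The peek-and-filter strategy can be caricatured in a few lines: peek at a cheap per-vertex summary first, and only expand (filter) the neighbours worth visualising. The in-memory dict below stands in for the preprocessed graph database; all names and thresholds are illustrative, not the prototype's actual interface.

```python
# Toy rendering of "peek-and-filter": never load the whole graph;
# consult a cheap summary (peek) before fetching neighbours (filter).

graph = {  # vertex -> neighbours (a miniature stand-in for the database)
    "a": ["b", "c", "d"],
    "b": ["a"],
    "c": ["a", "d"],
    "d": ["a", "c", "e"],
    "e": ["d"],
}

def peek(vertex):
    """Cheap summary query: neighbour count only, no edge payloads."""
    return len(graph[vertex])

def expand(vertex, min_degree=2):
    """Filter step: fetch only neighbours worth visualising."""
    return [n for n in graph[vertex] if peek(n) >= min_degree]

print(expand("a"))  # ['c', 'd'] ('b' is pruned: degree 1)
```

At trillion-edge scale the summaries would be precomputed during the two-hour preprocessing pass, so interactive exploration touches only a tiny filtered slice of the graph.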
MSDB: A Comprehensive Database of Simple Sequence Repeats.
Avvaru, Akshay Kumar; Saxena, Saketh; Sowpati, Divya Tej; Mishra, Rakesh Kumar
2017-06-01
Microsatellites, also known as Simple Sequence Repeats (SSRs), are short tandem repeats of 1-6 nt motifs present in all genomes, particularly eukaryotes. Besides their usefulness as genome markers, SSRs have been shown to perform important regulatory functions, and variations in their length at coding regions are linked to several disorders in humans. Microsatellites show a taxon-specific enrichment in eukaryotic genomes, and some may be functional. MSDB (Microsatellite Database) is a collection of >650 million SSRs from 6,893 species including Bacteria, Archaea, Fungi, Plants, and Animals. This database is by far the most exhaustive resource to access and analyze SSR data of multiple species. In addition to exploring data in a customizable tabular format, users can view and compare the data of multiple species simultaneously using our interactive plotting system. MSDB is developed using the Django framework and MySQL. It is freely available at http://tdb.ccmb.res.in/msdb. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
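A minimal SSR scan in the spirit of MSDB can be written with a back-referencing regular expression over 1-6 nt motifs; the minimum repeat count below is an illustrative threshold, not MSDB's actual criterion.

```python
# Detect perfect tandem repeats of 1-6 nt motifs (SSRs) with a lazy
# motif group and a backreference. Thresholds are illustrative.

import re

def find_ssrs(seq, min_repeats=3):
    """Yield (motif, repeat_count, start) for perfect tandem repeats."""
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    for m in pattern.finditer(seq):
        motif = m.group(1)
        count = len(m.group(0)) // len(motif)
        yield motif, count, m.start()

for ssr in find_ssrs("GGATATATATCCACGACGACGTT"):
    print(ssr)  # ('AT', 4, 2) then ('ACG', 3, 12)
```

The lazy quantifier `{1,6}?` keeps the motif minimal, so `ATATATAT` is reported as AT repeated four times rather than ATAT repeated twice; a production scanner like MSDB's also handles overlapping motif classes and genome-scale input.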
Geographical and temporal distribution of basic research experiments in homeopathy.
Clausen, Jürgen; van Wijk, Roeland; Albrecht, Henning
2014-07-01
The database HomBRex (Homeopathy Basic Research experiments) was established in 2002 to provide an overview of the basic research already done on homeopathy (http://www.carstens-stiftung.de/hombrex). By this means, it facilitates the exploration of the Similia Principle and the working mechanism of homeopathy. Since 2002, the total number of experiments listed has almost doubled. The current review reports the history of basic research in homeopathy as evidenced by publication dates and origin of publications. In July 2013, the database held 1868 entries. Most publications were reported from France (n = 267), followed by Germany (n = 246) and India (n = 237). In the last ten years, the number of publications from Brazil dramatically increased from n = 13 (before 2004) to n = 164 (compared to n = 251 published in France before 2004, and n = 16 between 2004 and 2013). The oldest database entry was from Germany (1832). Copyright © 2014 The Faculty of Homeopathy. Published by Elsevier Ltd. All rights reserved.
Phenol-Explorer: an online comprehensive database on polyphenol contents in foods.
Neveu, V; Perez-Jiménez, J; Vos, F; Crespy, V; du Chaffaut, L; Mennen, L; Knox, C; Eisner, R; Cruz, J; Wishart, D; Scalbert, A
2010-01-01
A number of databases on the plant metabolome describe the chemistry and biosynthesis of plant chemicals. However, no such database is specifically focused on foods and more precisely on polyphenols, one of the major classes of phytochemicals. As antioxidants, polyphenols influence human health and may play a role in the prevention of a number of chronic diseases such as cardiovascular diseases, some cancers or type 2 diabetes. To determine polyphenol intake in populations and study their association with health, it is essential to have detailed information on their content in foods. However this information is not easily collected due to the variety of their chemical structures and the variability of their content in a given food. Phenol-Explorer is the first comprehensive web-based database on polyphenol content in foods. It contains more than 37,000 original data points collected from 638 scientific articles published in peer-reviewed journals. The quality of these data has been evaluated before they were aggregated to produce final representative mean content values for 502 polyphenols in 452 foods. The web interface allows making various queries on the aggregated data to identify foods containing a given polyphenol or polyphenols present in a given food. For each mean content value, it is possible to trace all original content values and their literature sources. Phenol-Explorer is a major step forward in the development of databases on food constituents and the food metabolome. It should help researchers to better understand the role of phytochemicals in the technical and nutritional quality of food, and food manufacturers to develop tailor-made healthy foods. Database URL: http://www.phenol-explorer.eu.
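The aggregation step described above, pooling original content values into one representative mean per food/polyphenol pair while keeping traceability back to the source publications, can be sketched as follows; the data values and source names are invented.

```python
# Pool original per-publication content values into representative
# means, keeping each value traceable to its source. Data is invented.

from collections import defaultdict

original = [  # (food, polyphenol, mg_per_100g, source)
    ("apple", "quercetin", 4.0, "Smith 1999"),
    ("apple", "quercetin", 6.0, "Lee 2003"),
    ("apple", "catechin",  1.2, "Lee 2003"),
]

aggregated = defaultdict(list)
for food, compound, value, source in original:
    aggregated[(food, compound)].append((value, source))

for (food, compound), entries in sorted(aggregated.items()):
    mean = sum(v for v, _ in entries) / len(entries)
    sources = ", ".join(s for _, s in entries)
    print(f"{food} / {compound}: {mean:.1f} mg/100g (from: {sources})")
```

Keeping the raw (value, source) pairs alongside the mean is what lets the web interface trace every aggregated figure back to its original literature source.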
Louisse, Jochem; Dingemans, Milou M L; Baken, Kirsten A; van Wezel, Annemarie P; Schriks, Merijn
2018-06-14
The present study explores the ToxCast/Tox21 database to select candidate bioassays as bioanalytical tools for measuring groups of chemicals in water. To this aim, the ToxCast/Tox21 database was explored for bioassays that detect polycyclic aromatic hydrocarbons (PAHs), aromatic amines (AAs), (chloro)phenols ((C)Ps) and halogenated aliphatic hydrocarbons (HAliHs), which are included in the European and/or Dutch Drinking Water Directives. Based on the analysis of the availability and performance of bioassays included in the database, we concluded that several bioassays are suitable as bioanalytical tools for assessing the presence of PAHs and (C)Ps in drinking water sources. No bioassays were identified for AAs and HAliHs, due to the limited activity of these chemicals and/or the limited amount of data on these chemicals in the database. A series of bioassays was selected that measure molecular or cellular effects that are covered by bioassays currently in use for chemical water quality monitoring. Interestingly, bioassays were also selected that represent molecular or cellular effects not covered by currently applied bioassays. The usefulness of these newly identified bioassays as bioanalytical tools should be further evaluated in follow-up studies. Altogether, this study shows how exploration of the ToxCast/Tox21 database provides a series of candidate bioassays as bioanalytical tools for measuring groups of chemicals in water. This assessment can be performed for any group of chemicals of interest (if represented in the database), and may provide candidate bioassays that can be used to complement the currently applied bioassays for chemical water quality assessment. Copyright © 2018. Published by Elsevier Ltd.
Exploration of options for publishing databases and supplemental material in society journals
USDA-ARS?s Scientific Manuscript database
As scientific information becomes increasingly more abundant, there is increasing interest among members of our societies to share databases. These databases have great value, for example, in providing long-term perspectives of various scientific problems and for use by modelers to extend the inform...
Challenges in Database Design with Microsoft Access
ERIC Educational Resources Information Center
Letkowski, Jerzy
2014-01-01
Design, development and explorations of databases are popular topics covered in introductory courses taught at business schools. Microsoft Access is the most popular software used in those courses. Despite quite high complexity of Access, it is considered to be one of the most friendly database programs for beginners. A typical Access textbook…
Modeling and Databases for Teaching Petrology
NASA Astrophysics Data System (ADS)
Asher, P.; Dutrow, B.
2003-12-01
With the widespread availability of high-speed computers with massive storage and the ready transport of large amounts of data, computational and petrologic modeling and the use of databases provide new tools with which to teach petrology. Modeling can be used to gain insights into a system, predict system behavior, describe a system's processes, compare with a natural system, or simply to be illustrative. These aspects result from data-driven or empirical, analytical, or numerical models, or the concurrent examination of multiple lines of evidence. At the same time, the use of models can enhance core foundations of the geosciences by improving critical thinking skills and by reinforcing prior knowledge. However, the use of modeling to teach petrology is dictated by the level of expectation we have for students and their facility with modeling approaches. For example, do we expect students to push buttons and navigate a program, understand the conceptual model, and/or evaluate the results of a model? Whatever the desired level of sophistication, specific elements of design should be incorporated into a modeling exercise for effective teaching. These include, but are not limited to: use of the scientific method, use of prior knowledge, a clear statement of purpose and goals, attainable goals, a connection to the natural/actual system, a demonstration that complex heterogeneous natural systems are amenable to analysis by these techniques, and, ideally, connections to other disciplines and the larger earth system. Databases offer another avenue with which to explore petrology. Large datasets are available that allow the integration of multiple lines of evidence to attack a petrologic problem or understand a petrologic process. These are collected into databases that offer tools for exploring, organizing, and analyzing the data. For example, datasets may be geochemical, mineralogic, experimental, and/or visual in nature, covering global, regional, and local scales.
These datasets provide students with access to large amounts of related data through space and time. Goals of the database working group include educating earth scientists about information systems in general, about the importance of metadata, about ways of using databases and datasets as educational tools, and about the availability of existing datasets and databases. The modeling and databases groups hope to create additional petrologic teaching tools using these approaches and invite the community to contribute to the effort.
ExplorEnz: the primary source of the IUBMB enzyme list
McDonald, Andrew G.; Boyce, Sinéad; Tipton, Keith F.
2009-01-01
ExplorEnz is the MySQL database that is used for the curation and dissemination of the International Union of Biochemistry and Molecular Biology (IUBMB) Enzyme Nomenclature. A simple web-based query interface is provided, along with an advanced search engine for more complex Boolean queries. The WWW front-end is accessible at http://www.enzyme-database.org, from where downloads of the database as SQL and XML are also available. An associated form-based curatorial application has been developed to facilitate the curation of enzyme data as well as the internal and public review processes that occur before an enzyme entry is made official. Suggestions for new enzyme entries, or modifications to existing ones, can be made using the forms provided at http://www.enzyme-database.org/forms.php. PMID:18776214
Public variant databases: liability?
Thorogood, Adrian; Cook-Deegan, Robert; Knoppers, Bartha Maria
2017-07-01
Public variant databases support the curation, clinical interpretation, and sharing of genomic data, thus reducing harmful errors or delays in diagnosis. As variant databases are increasingly relied on in the clinical context, there is concern that negligent variant interpretation will harm patients and attract liability. This article explores the evolving legal duties of laboratories, public variant databases, and physicians in clinical genomics and recommends a governance framework for databases to promote responsible data sharing.Genet Med advance online publication 15 December 2016.
2011-09-01
National Park system. Britch et al. explored relationships between the Normalized Difference Vegetation Index and 2003-2005 APHCR-W mosquito...outside all polygons and points that did not match relations for the country and state names. Data were imported into ARCVIEW GIS 3.3...data points, especially for rare species. Compared with the arthropod, vertebrate, and plant species analyzed in Heyer et al., the mosquito curves
3D matching techniques using OCT fingerprint point clouds
NASA Astrophysics Data System (ADS)
Gutierrez da Costa, Henrique S.; Silva, Luciano; Bellon, Olga R. P.; Bowden, Audrey K.; Czovny, Raphael K.
2017-02-01
Optical Coherence Tomography (OCT) makes viable the acquisition of 3D fingerprints from both the dermis and epidermis skin layers and their interfaces, exposing features that can be explored to improve biometric identification, such as curvatures and distinctive 3D regions. Scanned images from eleven volunteers allowed the construction of, to our knowledge, the first OCT 3D fingerprint database containing epidermal and dermal fingerprints. 3D dermal fingerprints can be used to overcome cases of Failure to Enroll (FTE) due to poor ridge image quality and skin alterations, cases that affect 2D matching performance. We evaluate three matching techniques, including the well-established Iterative Closest Point (ICP) algorithm, the Surface Interpenetration Measure (SIM), and the well-known KH Curvature Maps, all assessed using a 3D OCT fingerprint database, the first of its kind for this purpose. Two of these techniques are based on registration and one on curvatures. These were evaluated and compared, and the fusion of matching scores was assessed. We applied a sequence of steps to extract regions of interest (ROI) named minutiae clouds, representing small regions around distinctive minutiae, usually located at ridge/valley endings or bifurcations. The ROI is acquired from the epidermis and the dermis-epidermis interface by OCT imaging. A comparative analysis of identification accuracy was performed across different scenarios, and the obtained results show improvements for biometric identification. A comparison against 2D fingerprint matching algorithms is also presented to assess the improvements.
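As a greatly simplified sketch of rigid point-cloud matching, the code below aligns two 3D clouds by translation only (centroid alignment) and scores the match by mean nearest-neighbour distance. A full ICP would also estimate rotation and iterate, so this is illustrative rather than the evaluated algorithm; all data is invented.

```python
# Translation-only point-cloud matching sketch: align by centroids,
# then score by mean nearest-neighbour distance (lower = better match).

import math

def centroid(cloud):
    n = len(cloud)
    return tuple(sum(p[i] for p in cloud) / n for i in range(3))

def translate(cloud, offset):
    return [tuple(p[i] + offset[i] for i in range(3)) for p in cloud]

def match_score(probe, gallery):
    """Mean nearest-neighbour distance after centroid alignment."""
    cp, cg = centroid(probe), centroid(gallery)
    shift = tuple(cg[i] - cp[i] for i in range(3))
    aligned = translate(probe, shift)
    total = 0.0
    for p in aligned:
        total += min(math.dist(p, g) for g in gallery)
    return total / len(aligned)

gallery = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
probe = [(p[0] + 5, p[1] - 2, p[2]) for p in gallery]  # same shape, shifted
print(match_score(probe, gallery))  # 0.0: a perfect match once aligned
```

Real ICP alternates this nearest-neighbour association with a full rigid (rotation plus translation) estimate until the score converges, and identification then thresholds or ranks the final scores.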
NASA Technical Reports Server (NTRS)
Shull, Sarah A.; Gralla, Erica L.; deWeck, Olivier L.; Shishko, Robert
2006-01-01
One of the major logistical challenges in human space exploration is asset management. This paper presents observations on the practice of asset management in support of human space flight to date and discusses a functional-based supply classification and a framework for an integrated database that could be used to improve asset management and logistics for human missions to the Moon, Mars and beyond.
Exploration of the Chemical Space of Public Genomic Databases
The current project aims to chemically index the content of public genomic databases to make these data accessible in relation to other publicly available, chemically-indexed toxicological information.
Improved Bond Equations for Fiber-Reinforced Polymer Bars in Concrete
Pour, Sadaf Moallemi; Alam, M. Shahria; Milani, Abbas S.
2016-01-01
This paper explores a set of new equations to predict the bond strength between fiber-reinforced polymer (FRP) rebar and concrete. The proposed equations are based on a comprehensive statistical analysis of existing experimental results in the literature. First, the parameters most affecting the bond behavior of FRP-reinforced concrete were identified by applying a factorial analysis to part of the available database. The database, which contains 250 pullout tests, was then divided into four groups based on concrete compressive strength and rebar surface. Afterward, nonlinear regression analysis was performed for each group to determine the bond equations. The results show that the proposed equations predict bond strengths more accurately than previously reported models. PMID:28773859
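The abstract does not give the fitted equations themselves. As a minimal illustration of the regression step, here is a sketch that fits a hypothetical power-law bond-strength form u = a·fc^b by linearizing in log space (real FRP bond models include further terms such as embedment length and bar diameter, and the exponents here are illustrative):

```python
import math

def fit_power_law(fc, bond):
    """Fit u = a * fc**b by linear least squares in log space:
    log u = log a + b * log fc."""
    xs = [math.log(f) for f in fc]
    ys = [math.log(u) for u in bond]
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) \
        / sum((x - xbar) ** 2 for x in xs)
    a = math.exp(ybar - b * xbar)   # intercept back-transformed from log space
    return a, b
```

Splitting the test database into groups (as the paper does, by concrete strength and rebar surface) and fitting each group separately amounts to calling a routine like this once per group.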
Failure mode and effects analysis outputs: are they valid?
2012-01-01
Background: Failure Mode and Effects Analysis (FMEA) is a prospective risk assessment tool that has been widely used within the aerospace and automotive industries and has been utilised within healthcare since the early 1990s. The aim of this study was to explore the validity of FMEA outputs within a hospital setting in the United Kingdom. Methods: Two multidisciplinary teams each conducted an FMEA for the use of vancomycin and gentamicin. Four validity tests were conducted: (1) face validity, by comparing the FMEA participants' mapped processes with observational work; (2) content validity, by presenting the FMEA findings to other healthcare professionals; (3) criterion validity, by comparing the FMEA findings with data reported in the trust's incident report database; and (4) construct validity, by exploring the mathematical theories involved in calculating the FMEA risk priority number (RPN). Results: Face validity was positive, as the researcher documented the same processes of care as mapped by the FMEA participants. However, other healthcare professionals identified potential failures missed by the FMEA teams. Furthermore, the FMEA groups failed to include failures related to omitted doses, yet these were the failures most commonly reported in the trust's incident database. Calculating the RPN by multiplying severity, probability and detectability scores was deemed invalid because it breaches the mathematical properties of the scales used. Conclusion: There are significant methodological challenges in validating FMEA. It is a useful tool for helping multidisciplinary groups map and understand a process of care; however, the results of our study cast doubt on its validity. FMEA teams are likely to need sources of information beyond their personal experience and knowledge to identify potential failures.
As for FMEA's methodology for scoring failures, there were discrepancies between the teams' estimates and similar incidents reported in the trust's incident database. Furthermore, the concept of multiplying ordinal scales to prioritise failures is mathematically flawed. Until FMEA's validity is further explored, healthcare organisations should not rely solely on their FMEA results to prioritise patient safety issues. PMID:22682433
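The mathematical objection to the risk priority number can be shown with a few lines of arithmetic: severity, probability and detectability are ordinal scores, and multiplying them both collapses very different failure modes onto the same RPN and is not invariant under order-preserving relabelling of the scales. A small illustration with hypothetical scores:

```python
def rpn(severity, probability, detectability):
    """The conventional FMEA risk priority number: S * P * D."""
    return severity * probability * detectability

# Two failure modes with identical RPNs but very different risk profiles:
catastrophic_rare = rpn(10, 1, 5)   # severity 10, probability 1
minor_common      = rpn(2, 5, 5)    # severity 2, probability 5
assert catastrophic_rare == minor_common == 50

# An order-preserving relabelling of the scale values (legitimate for
# ordinal data, where only rank order carries meaning) changes the
# comparison, so the product has no stable interpretation:
relabel = {1: 1, 2: 2, 5: 6, 10: 11}          # hypothetical monotone mapping
a = relabel[10] * 1 * relabel[5]              # relabelled catastrophic_rare
b = relabel[2] * relabel[5] * relabel[5]      # relabelled minor_common
assert a != b                                 # equal ranking is lost
```

This is the construct-validity problem the study identifies: two teams using slightly different (but equally defensible) scale anchors can produce opposite prioritisations from the same judgments.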
Messner, Donna A; Mohr, Penny; Towse, Adrian
2015-08-01
Explore key factors influencing future expectations for the production of evidence from comparative effectiveness research for drugs in the USA in 2020 and construct three plausible future scenarios. Semistructured key informant interviews and three rounds of modified Delphi with systematic scenario-building methods. Most influential key factors were: health delivery system integration; electronic health record development; exploitation of very large databases and mixed data sources; and proactive patient engagement in research. The scenario deemed most likely entailed uneven development of large integrated health systems with pockets of increased provider risk for patient care, enhanced data collection systems, changing incentives to do comparative effectiveness research and new opportunities for evidence generation partnerships.
Majewska, Małgorzata; Wysokińska, Halina; Kuźma, Łukasz; Szymczyk, Piotr
2018-02-20
The complete exploration of the regulation of gene expression remains one of the top-priority goals for researchers. As the regulation is mainly controlled at the level of transcription by promoters, study on promoters and findings are of great importance. This review summarizes forty selected databases that centralize experimental and theoretical knowledge regarding the organization of promoters, interacting transcription factors (TFs) and microRNAs (miRNAs) in many eukaryotic and prokaryotic species. The presented databases offer researchers valuable support in elucidating the regulation of gene transcription. Copyright © 2017 Elsevier B.V. All rights reserved.
Li, Jiazhong; Gramatica, Paola
2010-11-01
Quantitative structure-activity relationship (QSAR) methodology aims to explore the relationship between molecular structures and experimental endpoints, producing a model for the prediction of new data; the predictive performance of the model must be checked by external validation. Clearly, the quality of the chemical structure information and experimental endpoints, as well as the statistical parameters used to verify external predictivity, strongly influence QSAR model reliability. Here, we emphasize the importance of these three aspects by analyzing our models of estrogen receptor binders (Endocrine Disruptor Knowledge Base (EDKB) database). Endocrine disrupting chemicals, which mimic or antagonize endogenous hormones such as estrogens, are a hot topic in environmental and toxicological sciences. QSAR is of great value in predicting estrogenic activity and exploring the interactions between the estrogen receptor and its ligands. We verified our previously published model by additional external validation on new EDKB chemicals. Having found some errors in the 3D molecular conformations used, we redeveloped the model using the same data set with corrected structures, the same method (ordinary least-squares regression, OLS) and DRAGON descriptors. The new model, based on somewhat different descriptors, is more predictive on external prediction sets. Three different formulas for calculating the correlation coefficient for the external prediction set (Q2 EXT) were compared, and the results indicated that the new proposal of Consonni et al. gave more reasonable results, consistent with the conclusions from the regression line, Williams plot and root mean square error (RMSE) values.
Finally, the importance of reliable endpoint values was highlighted by comparing the classification assignments of EDKB with those of another estrogen receptor binder database (METI): we found that 16.1% of the assignments of the common compounds were opposite (20 among 124 common compounds). To verify the real assignments for these inconsistent compounds, we predicted them, as a blind external set, with our regression models and compared the results with the two databases. The results indicated that most of the predictions were consistent with METI. Furthermore, we built a kNN classification model using the 104 consistent compounds to predict the inconsistent ones, and most of these predictions were also in agreement with the METI database.
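The three external-validation coefficients compared in studies of this kind are commonly Q2_F1 (Shi et al.), Q2_F2 (Schüürmann et al.) and Q2_F3 (Consonni et al.), which differ in the reference mean and normalization used in the denominator. A sketch of all three (assuming these are the formulas meant; the abstract does not spell them out):

```python
def q2_f1(y_ext, yhat, y_train):
    """Q2_F1: external residuals normalized by deviation of the
    external observations from the TRAINING-set mean."""
    ybar_tr = sum(y_train) / len(y_train)
    press = sum((o - p) ** 2 for o, p in zip(y_ext, yhat))
    return 1 - press / sum((o - ybar_tr) ** 2 for o in y_ext)

def q2_f2(y_ext, yhat):
    """Q2_F2: same numerator, but normalized by deviation from the
    EXTERNAL-set mean."""
    ybar_ext = sum(y_ext) / len(y_ext)
    press = sum((o - p) ** 2 for o, p in zip(y_ext, yhat))
    return 1 - press / sum((o - ybar_ext) ** 2 for o in y_ext)

def q2_f3(y_ext, yhat, y_train):
    """Q2_F3 (Consonni et al.): mean squared external error divided by
    the training-set variance, so the value does not depend on how the
    external compounds happen to be distributed."""
    ybar_tr = sum(y_train) / len(y_train)
    mse_ext = sum((o - p) ** 2 for o, p in zip(y_ext, yhat)) / len(y_ext)
    tss_tr = sum((o - ybar_tr) ** 2 for o in y_train) / len(y_train)
    return 1 - mse_ext / tss_tr
```

F1 and F2 depend on the composition of the external set, whereas F3 normalizes per-compound error by the training-set variance, which is one argument for the Consonni et al. proposal favoured in the abstract.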
Massarotti, Alberto; Brunco, Angelo; Sorba, Giovanni; Tron, Gian Cesare
2014-02-24
Since Professors Sharpless, Finn, and Kolb first introduced the concept of "click reactions" in 2001 as powerful tools in drug discovery, 1,4-disubstituted-1,2,3-triazoles have become important in medicinal chemistry due to the simultaneous discovery by Sharpless, Fokin, and Meldal of a perfect click 1,3-dipolar cycloaddition reaction between azides and alkynes catalyzed by copper salts. Because of their chemical features, these triazoles are proposed to be aggressive pharmacophores that participate in drug-receptor interactions while maintaining an excellent chemical and metabolic profile. Surprisingly, no virtual libraries of 1,4-disubstituted-1,2,3-triazoles have been generated for the systematic investigation of the click-chemical space. In this manuscript, a database of triazoles called ZINClick is generated from literature-reported alkynes and azides that can be synthesized within three steps from commercially available products. This combinatorial database contains over 16 million 1,4-disubstituted-1,2,3-triazoles that are easily synthesizable, new, and patentable! The structural diversity of ZINClick ( http://www.symech.it/ZINClick ) will be explored. ZINClick will also be compared to other available databases, and its application during the design of novel bioactive molecules containing triazole nuclei will be discussed.
Public variant databases: liability?
Thorogood, Adrian; Cook-Deegan, Robert; Knoppers, Bartha Maria
2017-01-01
Public variant databases support the curation, clinical interpretation, and sharing of genomic data, thus reducing harmful errors or delays in diagnosis. As variant databases are increasingly relied on in the clinical context, there is concern that negligent variant interpretation will harm patients and attract liability. This article explores the evolving legal duties of laboratories, public variant databases, and physicians in clinical genomics and recommends a governance framework for databases to promote responsible data sharing. Genet Med advance online publication 15 December 2016 PMID:27977006
Lee, Jennifer F.; Hesselberth, Jay R.; Meyers, Lauren Ancel; Ellington, Andrew D.
2004-01-01
The aptamer database is designed to contain comprehensive sequence information on aptamers and unnatural ribozymes that have been generated by in vitro selection methods. Such data are not normally collected in ‘natural’ sequence databases, such as GenBank. Besides serving as a storehouse of sequences that may have diagnostic or therapeutic utility, the database serves as a valuable resource for theoretical biologists who describe and explore fitness landscapes. The database is updated monthly and is publicly available at http://aptamer.icmb.utexas.edu/. PMID:14681367
ERIC Educational Resources Information Center
Takusi, Gabriel Samuto
2010-01-01
This quantitative analysis explored the intrinsic and extrinsic turnover factors of relational database support specialists. Two hundred and nine relational database support specialists were surveyed for this research. The research was conducted based on Hackman and Oldham's (1980) Job Diagnostic Survey. Regression analysis and a univariate ANOVA…
Reference point detection for camera-based fingerprint image based on wavelet transformation.
Khalil, Mohammed S
2015-04-30
Fingerprint recognition systems essentially require core-point detection prior to fingerprint matching. The core point is used as a reference point to align the fingerprint with a template database, and when processing a larger fingerprint database it is necessary to consider the core point during feature extraction. Numerous core-point detection methods have been reported in the literature; however, these methods are generally applied to scanner-based images. Hence, this paper explores the feasibility of applying a core-point detection method to fingerprint images obtained with a camera phone. The proposed method utilizes a discrete wavelet transform to extract ridge information from a color image. Performance is evaluated in terms of accuracy and consistency, two indicators calculated automatically by comparing the method's output with the defined core points. The proposed method was tested on two data sets, collected from 13 subjects in controlled and uncontrolled environments. In the controlled environment, the method achieved a detection rate of 82.98%; in the uncontrolled environment, it yielded a detection rate of 78.21%. The method thus yields promising results on the collected image database and outperforms an existing method.
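The abstract does not specify which wavelet or decomposition level is used. As an illustration of how a discrete wavelet transform separates ridge detail from slowly varying illumination in a camera image, here is a one-level 2D Haar decomposition (unnormalized averaging variant, assuming even image dimensions) in plain Python:

```python
def haar2d(img):
    """One-level 2-D Haar DWT: returns the approximation (LL) and the
    three detail sub-bands (LH, HL, HH), each half the input size.
    Detail bands respond to horizontal, vertical and diagonal edges,
    which is where ridge structure concentrates."""
    h, w = len(img), len(img[0])
    LL, LH, HL, HH = [], [], [], []
    for i in range(0, h, 2):
        ll, lh, hl, hh = [], [], [], []
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            ll.append((a + b + c + d) / 4)   # local average (illumination)
            lh.append((a - b + c - d) / 4)   # horizontal detail
            hl.append((a + b - c - d) / 4)   # vertical detail
            hh.append((a - b - c + d) / 4)   # diagonal detail
        LL.append(ll); LH.append(lh); HL.append(hl); HH.append(hh)
    return LL, LH, HL, HH
```

In a pipeline like the one described, a core-point detector would work on the detail sub-bands (or on orientation fields derived from them), since the LL band mostly carries lighting variation that camera images suffer from.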
Chacon, Diego; Beck, Dominik; Perera, Dilmi; Wong, Jason W H; Pimanda, John E
2014-01-01
The BloodChIP database (http://www.med.unsw.edu.au/CRCWeb.nsf/page/BloodChIP) supports exploration and visualization of combinatorial transcription factor (TF) binding at a particular locus in human CD34-positive and other normal and leukaemic cells or retrieval of target gene sets for user-defined combinations of TFs across one or more cell types. Increasing numbers of genome-wide TF binding profiles are being added to public repositories, and this trend is likely to continue. For the power of these data sets to be fully harnessed by experimental scientists, there is a need for these data to be placed in context and easily accessible for downstream applications. To this end, we have built a user-friendly database that has at its core the genome-wide binding profiles of seven key haematopoietic TFs in human stem/progenitor cells. These binding profiles are compared with binding profiles in normal differentiated and leukaemic cells. We have integrated these TF binding profiles with chromatin marks and expression data in normal and leukaemic cell fractions. All queries can be exported into external sites to construct TF-gene and protein-protein networks and to evaluate the association of genes with cellular processes and tissue expression.
Fungal genome resources at NCBI.
Robbertse, B; Tatusova, T
2011-09-01
The National Center for Biotechnology Information (NCBI) is well known for its nucleotide sequence archive, GenBank, and its sequence analysis tool, BLAST. However, NCBI integrates many types of biomolecular data from a variety of sources and makes them available to the scientific community as interactive web resources as well as organized releases of bulk data. These tools are available to explore and compare fungal genomes. Searching all databases with Fungi [organism] at http://www.ncbi.nlm.nih.gov/ is the quickest way to find resources with fungal entries. Some tools, though, are resource specific and can only be accessed indirectly from a particular database in the Entrez system. These include graphical viewers and comparative analysis tools such as TaxPlot, TaxMap and UniGene DDD (found via the UniGene homepage). Gene and BioProject pages also serve as portals to external data such as community annotation websites, BioGRID and UniProt. There are many different ways of accessing genomic data at NCBI; depending on the focus and goal of a research project, a user would select a particular route for accessing genomic databases and resources. This review article describes methods of accessing fungal genome data and provides examples that illustrate the use of analysis tools.
Chamrad, Daniel C; Körting, Gerhard; Schäfer, Heike; Stephan, Christian; Thiele, Herbert; Apweiler, Rolf; Meyer, Helmut E; Marcus, Katrin; Blüggel, Martin
2006-09-01
A novel software tool named PTM-Explorer has been applied to LC-MS/MS datasets acquired within the Human Proteome Organisation (HUPO) Brain Proteome Project (BPP). PTM-Explorer enables automatic identification of peptide MS/MS spectra that are not explained in typical sequence database searches. The main focus is the detection of PTMs, but PTM-Explorer also detects unspecific peptide cleavage, mass measurement errors, experimental modifications, amino acid substitutions, transpeptidation products and unknown mass shifts. To avoid a combinatorial explosion, the search is restricted to a set of selected protein sequences drawn from previous protein identifications made with a common sequence database search. Prior to application to the HUPO BPP data, PTM-Explorer was evaluated on thoroughly manually characterized LC-MS/MS data sets from alpha-A-crystallin gel spots obtained from mouse eye lens. Besides various PTMs, including phosphorylation, a wealth of experimental modifications and unspecific cleavage products were successfully detected, completing the primary structure information of the measured proteins. Our results indicate that many MS/MS spectra that currently remain unidentified in standard database searches contain valuable information that can only be elucidated using suitable software tools.
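The core of this kind of search can be sketched as matching the observed-minus-theoretical mass difference of a peptide against a table of known modification deltas. A minimal sketch (the monoisotopic shifts below are standard values; the table, tolerance, and function name are illustrative, not PTM-Explorer's actual configuration):

```python
# Monoisotopic mass shifts in daltons for a few common modifications
# (standard values; a real tool would use a full modification catalogue).
KNOWN_SHIFTS = {
    "phosphorylation": 79.9663,
    "oxidation":       15.9949,
    "acetylation":     42.0106,
    "methylation":     14.0157,
}

def explain_mass_shift(observed_mass, theoretical_mass, tol=0.02):
    """Return the modifications whose mass delta explains the difference
    between the observed and theoretical peptide mass, within tol Da.
    An empty list means the shift is unexplained (an 'unknown mass
    shift' in the abstract's terms)."""
    delta = observed_mass - theoretical_mass
    return [name for name, shift in KNOWN_SHIFTS.items()
            if abs(delta - shift) <= tol]
```

Restricting this search to peptides of already-identified proteins, as the abstract describes, is what keeps the number of (peptide, modification) candidates tractable.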
DOE Office of Scientific and Technical Information (OSTI.GOV)
D. D. Blackwell; K. W. Wisian; M. C. Richards
2000-04-01
Several activities related to geothermal resources in the western United States are described in this report. A database of geothermal site-specific thermal gradient and heat flow results from individual exploration wells in the western US has been assembled. Extensive temperature gradient and heat flow exploration data from the active exploration of the 1970s and 1980s were collected, compiled, and synthesized, emphasizing previously unavailable company data. Examples of the use and applications of the database are described, and the database and results are available on the world wide web. In this report, numerical models are used to establish basic qualitative relationships between structure, heat input, and permeability distribution, and the resulting geothermal system. A series of steady-state, two-dimensional numerical models evaluates the effect of permeability and structural variations on an idealized, generic Basin and Range geothermal system, and the results are described.
Interactive Exploration for Continuously Expanding Neuron Databases.
Li, Zhongyu; Metaxas, Dimitris N; Lu, Aidong; Zhang, Shaoting
2017-02-15
This paper proposes a novel framework to help biologists explore and analyze neurons through retrieval from neuron morphological databases. In recent years, continuously expanding neuron databases have provided a rich source of information for associating neuronal morphologies with their functional properties. We design a coarse-to-fine framework for efficient and effective data retrieval from large-scale neuron databases. At the coarse level, for efficiency at large scale, we employ a binary coding method that compresses morphological features into binary codes of tens of bits. Short binary codes allow real-time similarity searching in Hamming space. Because the neuron databases are continuously expanding, it is inefficient to re-train the binary coding model from scratch whenever new neurons are added. To solve this problem, we extend binary coding with online updating schemes that consider only the newly added neurons and update the model on-the-fly, without accessing the whole database. At the fine-grained level, we introduce domain experts/users into the framework, who give relevance feedback on the binary-coding-based retrieval results. This interactive strategy improves retrieval performance by re-ranking the coarse results, using a new similarity measure that takes the feedback into account. Our framework is validated on more than 17,000 neuron cells, showing promising retrieval accuracy and efficiency. Moreover, we demonstrate its use in assisting biologists to identify and explore unknown neurons. Copyright © 2017 Elsevier Inc. All rights reserved.
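The coarse level described above, tens-of-bits binary codes ranked by Hamming distance, with new items appended without retraining, can be sketched as follows (the binary coding model that produces the bit vectors from morphological features is assumed and not reproduced; the class and its API are illustrative):

```python
class HammingIndex:
    """Coarse retrieval over short binary codes: each code is packed
    into one integer, and candidates are ranked by Hamming distance
    computed as XOR followed by a popcount."""

    def __init__(self, nbits=32):
        self.nbits = nbits
        self.codes = {}               # item id -> packed integer code

    def _pack(self, bits):
        assert len(bits) == self.nbits
        code = 0
        for b in bits:                # pack the bit list into one int
            code = (code << 1) | (1 if b else 0)
        return code

    def add(self, item_id, bits):
        """Online insertion: appending a new neuron's code never
        requires touching the rest of the index."""
        self.codes[item_id] = self._pack(bits)

    def query(self, bits, k=3):
        """Return the ids of the k codes nearest in Hamming space."""
        probe = self._pack(bits)
        ranked = sorted(self.codes.items(),
                        key=lambda kv: bin(kv[1] ^ probe).count("1"))
        return [item_id for item_id, _ in ranked[:k]]
```

Because XOR and popcount on machine words are extremely cheap, even an exhaustive scan like this is real-time for databases of tens of thousands of items; the paper's fine-grained stage then re-ranks this short list using relevance feedback.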
MGIS: managing banana (Musa spp.) genetic resources information and high-throughput genotyping data
Guignon, V.; Sempere, G.; Sardos, J.; Hueber, Y.; Duvergey, H.; Andrieu, A.; Chase, R.; Jenny, C.; Hazekamp, T.; Irish, B.; Jelali, K.; Adeka, J.; Ayala-Silva, T.; Chao, C.P.; Daniells, J.; Dowiya, B.; Effa effa, B.; Gueco, L.; Herradura, L.; Ibobondji, L.; Kempenaers, E.; Kilangi, J.; Muhangi, S.; Ngo Xuan, P.; Paofa, J.; Pavis, C.; Thiemele, D.; Tossou, C.; Sandoval, J.; Sutanto, A.; Vangu Paka, G.; Yi, G.; Van den houwe, I.; Roux, N.
2017-01-01
Unraveling the genetic diversity held in genebanks on a large scale is underway, thanks to advances in next-generation sequencing (NGS)-based technologies that produce high-density genetic markers for a large number of samples at low cost. Genebank users should be in a position to identify and select germplasm from the global genepool based on a combination of passport, genotypic and phenotypic data. To facilitate this, a new generation of information systems is being designed to efficiently handle data and link them with external resources such as genome or breeding databases. The Musa Germplasm Information System (MGIS), the database for globally held ex situ banana genetic resources, has been developed to address these needs in a user-friendly way. In developing MGIS, we selected a generic database schema (Chado), the robust content management system Drupal for the user interface, and Tripal, a set of Drupal modules that links the Chado schema to Drupal. MGIS allows germplasm collection examination, accession browsing, advanced search functions, and germplasm orders. Additionally, we developed unique graphical interfaces to compare accessions and to explore them based on their taxonomic information. Accession-based data have been enriched with publications, genotyping studies and associated genotyping datasets reporting on germplasm use. Finally, an interoperability layer has been implemented to facilitate links with complementary databases such as the Banana Genome Hub and the MusaBase breeding database. Database URL: https://www.crop-diversity.org/mgis/ PMID:29220435
Thakar, Sambhaji B; Ghorpade, Pradnya N; Kale, Manisha V; Sonawane, Kailas D
2015-01-01
Fern plants are known for their ethnomedicinal applications, but a huge amount of information on medicinal ferns is scattered across the literature in text form, so developing a database is an appropriate way to consolidate it. Given the importance of medicinally useful ferns, we developed a web-based database containing information about several groups of ferns, their medicinal uses, chemical constituents, and protein/enzyme sequences isolated from different fern plants. The fern ethnomedicinal plant database is a comprehensive, content-managed, web-based database system used to retrieve factual knowledge related to ethnomedicinal fern species. Most of the protein/enzyme sequences were extracted from the NCBI protein sequence database. For each fern, the species, family name, identification, NCBI taxonomy ID, geographical occurrence, conditions it is used for, plant parts used, ethnomedicinal importance and morphological characteristics were collected from scientific literature and journals available in text form. Links to the NCBI BLAST, InterPro, phylogeny and Clustal W web resources are also provided for future comparative studies, so users can find information on fern plants and their medicinal applications in one place. The database includes information on 100 medicinal fern species and should be useful for computational drug discovery and for botanists and those interested in botany, pharmacologists, researchers, biochemists, plant biotechnologists, ayurvedic practitioners, doctors and pharmacists, traditional medicine users, farmers, agricultural students and teachers at universities and colleges, and fern plant lovers.
This effort should provide users with essential knowledge about applications for drug discovery and about the conservation of fern species around the world, and help create social awareness.
SMART operational field test evaluation : operations database report : final report
DOT National Transportation Integrated Search
1997-09-01
Based on the Suburban Mobility Authority for Regional Transportation's (SMART) weekly operating reports from its Macomb, Wayne, Troy, and Pontiac terminals, this Operations Database Report explores productivity measures over time, and examines how ...
Labrèche, France; Kosatsky, Tom; Przybysz, Raymond
2008-01-01
The absence of ongoing surveillance for childhood asthma in Montreal, Quebec, prompted the present investigation to assess the validity and practicality of administrative databases as a foundation for surveillance. To explore the consistency between cases of asthma identified through physician billings compared with hospital discharge summaries. Rates of service use for asthma in 1998 among Montreal children aged one, four and eight years were estimated. Correspondence between the two databases (physician billing claims versus medical billing claims) were explored during three different time periods: the first day of hospitalization, during the entire hospital stay, and during the hospital stay plus a one-day margin before admission and after discharge ('hospital stay +/- 1 day'). During 1998, 7.6% of Montreal children consulted a physician for asthma at least once and 0.6% were hospitalized with a principal diagnosis of asthma. There were no contemporaneous physician billings for asthma 'in hospital' during hospital stay +/- 1 day for 22% of hospitalizations in which asthma was the primary diagnosis recorded at discharge. Conversely, among children with a physician billing for asthma 'in hospital', 66% were found to have a contemporaneous in-hospital record of a stay for 'asthma'. Both databases of hospital and medical billing claims are useful for estimating rates of hospitalization for asthma in children. The potential for diagnostic imprecision is of concern, especially if capturing the exact number of uses is more important than establishing patterns of use.
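The 'hospital stay +/- 1 day' window can be implemented as a simple date-interval test when linking a physician billing record to a hospitalization. A sketch with hypothetical dates (the helper name and margin parameter are illustrative):

```python
from datetime import date, timedelta

def billing_matches_stay(billing_date, admit, discharge, margin_days=1):
    """True if a physician billing falls within the hospital stay
    extended by margin_days on either side, the loosest of the three
    correspondence windows described in the abstract."""
    lo = admit - timedelta(days=margin_days)
    hi = discharge + timedelta(days=margin_days)
    return lo <= billing_date <= hi
```

Tightening `margin_days` to 0, or comparing only against the admission date, reproduces the two stricter windows; the gap between the match rates under these windows is exactly the kind of discordance the study quantifies.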
PExFInS: An Integrative Post-GWAS Explorer for Functional Indels and SNPs
Cheng, Zhongshan; Chu, Hin; Fan, Yanhui; Li, Cun; Song, You-Qiang; Zhou, Jie; Yuen, Kwok-Yung
2015-01-01
Expression quantitative trait loci (eQTLs) mapping and linkage disequilibrium (LD) analysis have been widely employed to interpret findings of genome-wide association studies (GWAS). With the availability of deep sequencing data of 423 lymphoblastoid cell lines (LCLs) from six global populations and the microarray expression data, we performed eQTL analysis, identified more than 228 K SNP cis-eQTLs and 21 K indel cis-eQTLs and generated a LCL cis-eQTL database. We demonstrate that the percentages of population-shared and population-specific cis-eQTLs are comparable; while indel cis-eQTLs in the population-specific subsection make more contribution to gene expression variations than those in the population-shared subsection. We found cis-eQTLs, especially the population-shared cis-eQTLs are significantly enriched toward transcription start site. Moreover, the National Human Genome Research Institute cataloged GWAS SNPs are enriched for LCL cis-eQTLs. Specifically, 32.8% GWAS SNPs are LCL cis-eQTLs, among which 12.5% can be tagged by indel cis-eQTLs, suggesting the fundamental contribution of indel cis-eQTLs to GWAS association signals. To search for functional indels and SNPs tagging GWAS SNPs, a pipeline Post-GWAS Explorer for Functional Indels and SNPs (PExFInS) has been developed, integrating LD analysis, functional annotation from public databases, cis-eQTL mapping with our LCL cis-eQTL database and other published cis-eQTL datasets. PMID:26612672
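A cis-eQTL test at its simplest regresses expression on allele dosage. A sketch of the per-variant association statistic (the function is a hypothetical helper; real pipelines like the one described add covariates, normalization and multiple-testing control across hundreds of thousands of variants):

```python
from statistics import mean

def cis_eqtl_assoc(genotypes, expression):
    """Simple-regression view of a cis-eQTL test: regress expression on
    allele dosage (0/1/2 copies of the alternate allele, which for an
    indel counts inserted/deleted copies) and report the slope (effect
    size) and r-squared (variance explained)."""
    gx, ex = mean(genotypes), mean(expression)
    sxy = sum((g - gx) * (e - ex) for g, e in zip(genotypes, expression))
    sxx = sum((g - gx) ** 2 for g in genotypes)
    syy = sum((e - ex) ** 2 for e in expression)
    slope = sxy / sxx
    r2 = (sxy * sxy) / (sxx * syy)
    return slope, r2
```

Running such a test for every variant within a window of each gene's transcription start site, separately per population, is what produces the SNP and indel cis-eQTL catalogues the abstract describes.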
Is Library Database Searching a Language Learning Activity?
ERIC Educational Resources Information Center
Bordonaro, Karen
2010-01-01
This study explores how non-native speakers of English think of words to enter into library databases when they begin the process of searching for information in English. At issue is whether or not language learning takes place when these students use library databases. Language learning in this study refers to the use of strategies employed by…
ERIC Educational Resources Information Center
Qin, Jian
2000-01-01
Explores similarities or dissimilarities between citation-semantic and analytic indexing based on a study of records in the Science Citation Index and MEDLINE databases on antibiotic resistance in pneumonia. Concludes that disparate indexing terms may be an advantage for better recall and precision in information retrieval. (Contains 42…
Patterns of Undergraduates' Use of Scholarly Databases in a Large Research University
ERIC Educational Resources Information Center
Mbabu, Loyd Gitari; Bertram, Albert; Varnum, Ken
2013-01-01
Authentication data was utilized to explore undergraduate usage of subscription electronic databases. These usage patterns were linked to the information literacy curriculum of the library. The data showed that out of the 26,208 enrolled undergraduate students, 42% of them accessed a scholarly database at least once in the course of the entire…
Data-Based Decision-Making: Developing a Method for Capturing Teachers' Understanding of CBM Graphs
ERIC Educational Resources Information Center
Espin, Christine A.; Wayman, Miya Miura; Deno, Stanley L.; McMaster, Kristen L.; de Rooij, Mark
2017-01-01
In this special issue, we explore the decision-making aspect of "data-based decision-making". The articles in the issue address a wide range of research questions, designs, methods, and analyses, but all focus on data-based decision-making for students with learning difficulties. In this first article, we introduce the topic of…
Rossi, Ernest; Mortimer, Jane; Rossi, Kathryn
2013-04-01
Culturomics is a new scientific discipline within the digital humanities: the use of computer algorithms to search for meaning in large databases of text and media. Here this new digital discipline is used to explore 200 years of the history of hypnosis and psychotherapy in over five million digitized books from more than 40 university libraries around the world, graphically comparing the frequencies of English words about hypnosis, hypnotherapy, psychoanalysis, psychotherapy, and their founders from 1800 to 2008. This new perspective explores questions such as: Who were the major innovators in the history of therapeutic hypnosis, psychoanalysis, and psychotherapy? How well does this new digital approach to the humanities correspond to traditional histories of hypnosis and psychotherapy?
Persistent hydrogen bonding in polymorphic crystal structures.
Galek, Peter T A; Fábián, László; Allen, Frank H
2009-02-01
The significance of hydrogen bonding and its variability in polymorphic crystal structures is explored using new automated structural analysis methods. The concept of a chemically equivalent hydrogen bond is defined, which may be identified in pairs of structures, revealing those types of bonds that may persist, or not, in moving from one polymorphic form to another. Their frequency and nature are investigated in 882 polymorphic structures from the Cambridge Structural Database. A new method to compare conformations of equivalent molecules is introduced and applied to derive distinct subsets of conformational and packing polymorphs. The roles of chemical functionality and hydrogen-bond geometry in persistent interactions are systematically explored. Detailed structural comparisons reveal a large majority of persistent hydrogen bonds that are energetically crucial to structural stability.
A Dictionary Learning Approach for Signal Sampling in Task-Based fMRI for Reduction of Big Data
Ge, Bao; Li, Xiang; Jiang, Xi; Sun, Yifei; Liu, Tianming
2018-01-01
The exponential growth of fMRI big data offers researchers an unprecedented opportunity to explore functional brain networks. However, this opportunity has not been fully exploited due to the lack of effective and efficient tools for handling such fMRI big data. One major challenge is that computing capabilities still lag behind the growth of large-scale fMRI databases; e.g., it takes many days to perform dictionary learning and sparse coding of whole-brain fMRI data for an fMRI database of average size. Therefore, how to reduce the data size without losing important information becomes an increasingly pressing issue. To address this problem, we propose a signal sampling approach for significant fMRI data reduction before performing structurally guided dictionary learning and sparse coding of whole-brain fMRI data. We compared the proposed structurally guided sampling method with no sampling, random sampling and uniform sampling schemes, and experiments on the Human Connectome Project (HCP) task fMRI data demonstrated that the proposed method can achieve a speed-up of more than 15 times without sacrificing accuracy in identifying task-evoked functional brain networks. PMID:29706880
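The baseline sampling schemes the paper compares against, uniform and random selection of time points, are easy to sketch; the proposed structurally guided scheme additionally uses anatomical information and is not reproduced here. Function names and the `keep` parameter are illustrative:

```python
import random

def uniform_sample(signal, keep):
    """Keep roughly every k-th time point of a voxel's time series,
    retaining exactly `keep` samples."""
    step = max(1, len(signal) // keep)
    return signal[::step][:keep]

def random_sample(signal, keep, seed=0):
    """Keep a random subset of time points, order preserved. A fixed
    seed makes the subsampling reproducible across voxels/runs."""
    rng = random.Random(seed)
    idx = sorted(rng.sample(range(len(signal)), keep))
    return [signal[i] for i in idx]
```

Either scheme shrinks the matrix handed to dictionary learning by the ratio `keep / len(signal)`; the paper's point is that choosing the retained samples with structural guidance preserves the task-evoked networks better than these uninformed baselines at the same reduction ratio.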
RISE: a database of RNA interactome from sequencing experiments
Gong, Jing; Shao, Di; Xu, Kui
2018-01-01
Abstract We present RISE (http://rise.zhanglab.net), a database of RNA Interactome from Sequencing Experiments. RNA-RNA interactions (RRIs) are essential for RNA regulation and function. RISE provides a comprehensive collection of RRIs that mainly come from recent transcriptome-wide sequencing-based experiments like PARIS, SPLASH, LIGR-seq, and MARIO, as well as targeted studies like RIA-seq, RAP-RNA and CLASH. It also includes interactions aggregated from other primary databases and publications. The RISE database currently contains 328,811 RNA-RNA interactions, mainly in human, mouse and yeast. While most existing RNA databases mainly contain miRNA-targeting interactions, notably, more than half of the RRIs in RISE involve mRNAs and long non-coding RNAs. We compared different RRI datasets in RISE and found limited overlap between interactions resolved by different techniques and in different cell lines, which may reflect technology preferences as well as the dynamic nature of RRIs. We also analyzed the basic features of the human and mouse RRI networks and found that they tend to be scale-free, small-world, hierarchical and modular. The analysis may nominate important RNAs or RRIs for further investigation. Finally, RISE provides a Circos plot and several table views for integrative visualization, with extensive molecular and functional annotations to facilitate exploration of biological functions for any RRI of interest. PMID:29040625
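The network properties mentioned here (scale-free, small-world) can be probed with standard graph metrics. The sketch below uses a synthetic preferential-attachment graph as a stand-in for a real RRI edge list downloaded from RISE; only the metric calls, not the data, reflect the paper's analysis.

```python
import networkx as nx

# Toy stand-in for an RNA-RNA interaction network; a real analysis would
# build the graph from a RISE edge list instead.
G = nx.barabasi_albert_graph(500, 2, seed=0)  # preferential attachment -> heavy-tailed degrees

degrees = [d for _, d in G.degree()]
print("max degree:", max(degrees))  # a few hub RNAs dominate in a scale-free network
print("avg clustering:", round(nx.average_clustering(G), 3))
print("avg shortest path:", round(nx.average_shortest_path_length(G), 2))
```

A heavy-tailed degree distribution together with a short average path length is the usual evidence offered for "scale-free" and "small-world" structure.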
AtlasCBS: a web server to map and explore chemico-biological space
NASA Astrophysics Data System (ADS)
Cortés-Cabrera, Álvaro; Morreale, Antonio; Gago, Federico; Abad-Zapatero, Celerino
2012-09-01
New approaches are needed that can help decrease the unsustainable failure in small-molecule drug discovery. Ligand Efficiency Indices (LEI) are making a great impact on early-stage compound selection and prioritization. Given a target-ligand database with chemical structures and associated biological affinities/activities for a target, the AtlasCBS server generates two-dimensional, dynamical representations of its contents in terms of LEI. These variables allow an effective decoupling of the chemical (angular) and biological (radial) components. BindingDB, PDBBind and ChEMBL databases are currently implemented. Proprietary datasets can also be uploaded and compared. The utility of this atlas-like representation in the future of drug design is highlighted with some examples. The web server can be accessed at http://ub.cbm.uam.es/atlascbs and https://www.ebi.ac.uk/chembl/atlascbs.
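The abstract does not spell out which index definitions AtlasCBS uses, so the sketch below assumes the commonly cited forms of two ligand efficiency indices, BEI = pKi / MW (MW in kDa) and SEI = pKi / (PSA per 100 A^2), and shows the polar decoupling idea: the angle of the (SEI, BEI) point reflects the chemistry of the ligand, the radius its overall efficiency. The example ligand and variable names are illustrative only.

```python
import math

def lei(pKi, mw_da, psa_a2):
    """Ligand efficiency indices, using commonly cited definitions (an
    assumption here, not taken from the AtlasCBS paper itself)."""
    bei = pKi / (mw_da / 1000.0)   # binding efficiency index, MW in kDa
    sei = pKi / (psa_a2 / 100.0)   # surface efficiency index, PSA per 100 A^2
    return bei, sei

# Hypothetical ligand: pKi 8.0, MW 400 Da, polar surface area 90 A^2
bei, sei = lei(8.0, 400.0, 90.0)
r = math.hypot(sei, bei)                   # radial (biological) component
theta = math.degrees(math.atan2(bei, sei)) # angular (chemical) component
print(round(bei, 2), round(sei, 2), round(r, 2), round(theta, 1))
```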
Application of connectivity mapping in predictive toxicology based on gene-expression similarity.
Smalley, Joshua L; Gant, Timothy W; Zhang, Shu-Dong
2010-02-09
Connectivity mapping is the process of establishing connections between different biological states using gene-expression profiles or signatures. There are a number of applications but in toxicology the most pertinent is for understanding mechanisms of toxicity. In its essence the process involves comparing a query gene signature generated as a result of exposure of a biological system to a chemical to those in a database that have been previously derived. In the ideal situation the query gene-expression signature is characteristic of the event and will be matched to similar events in the database. Key criteria are therefore the means of choosing the signature to be matched and the means by which the match is made. In this article we explore these concepts with examples applicable to toxicology. (c) 2009 Elsevier Ireland Ltd. All rights reserved.
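The matching step described here — comparing a query gene-expression signature against a database of previously derived signatures — can be sketched with a simple cosine-similarity ranking. Real connectivity-mapping systems use more elaborate rank-based scores; the data below are synthetic.

```python
import numpy as np

def connectivity_scores(query, reference_db):
    """Rank reference gene-expression signatures by cosine similarity to a
    query signature (a simplified stand-in for connectivity mapping)."""
    q = query / np.linalg.norm(query)
    refs = reference_db / np.linalg.norm(reference_db, axis=1, keepdims=True)
    return refs @ q

rng = np.random.default_rng(1)
db = rng.standard_normal((50, 200))             # 50 reference signatures x 200 genes
query = db[7] + 0.1 * rng.standard_normal(200)  # noisy copy of signature 7

scores = connectivity_scores(query, db)
print(int(np.argmax(scores)))  # the best match recovers signature 7
```

The two key design criteria the abstract names map directly onto this sketch: which genes go into `query` (the signature choice) and which similarity function replaces the cosine (the matching choice).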
BioQ: tracing experimental origins in public genomic databases using a novel data provenance model.
Saccone, Scott F; Quan, Jiaxi; Jones, Peter L
2012-04-15
Public genomic databases, which are often used to guide genetic studies of human disease, are now being applied to genomic medicine through in silico integrative genomics. These databases, however, often lack tools for systematically determining the experimental origins of the data. We introduce a new data provenance model that we have implemented in a public web application, BioQ, for assessing the reliability of the data by systematically tracing its experimental origins to the original subjects and biologics. BioQ allows investigators to both visualize data provenance as well as explore individual elements of experimental process flow using precise tools for detailed data exploration and documentation. It includes a number of human genetic variation databases such as the HapMap and 1000 Genomes projects. BioQ is freely available to the public at http://bioq.saclab.net.
Active Exploration of Large 3D Model Repositories.
Gao, Lin; Cao, Yan-Pei; Lai, Yu-Kun; Huang, Hao-Zhi; Kobbelt, Leif; Hu, Shi-Min
2015-12-01
With broader availability of large-scale 3D model repositories, the need for efficient and effective exploration becomes more and more urgent. Existing model retrieval techniques do not scale well with the size of the database since often a large number of very similar objects are returned for a query, and the possibilities to refine the search are quite limited. We propose an interactive approach where the user feeds an active learning procedure by labeling either entire models or parts of them as "like" or "dislike" such that the system can automatically update an active set of recommended models. To provide an intuitive user interface, candidate models are presented based on their estimated relevance for the current query. From the methodological point of view, our main contribution is to exploit not only the similarity between a query and the database models but also the similarities among the database models themselves. We achieve this by an offline pre-processing stage, where global and local shape descriptors are computed for each model and a sparse distance metric is derived that can be evaluated efficiently even for very large databases. We demonstrate the effectiveness of our method by interactively exploring a repository containing over 100 K models.
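The like/dislike feedback loop described here can be sketched as a scoring pass over precomputed descriptors: models similar to liked exemplars move up, models similar to disliked ones move down. This is a minimal stand-in; the paper's method additionally exploits similarities among the database models themselves via a sparse distance metric, which is omitted here.

```python
import numpy as np

rng = np.random.default_rng(2)
# Offline pre-processing stand-in: one unit-norm shape descriptor per model.
descriptors = rng.standard_normal((1000, 32))
descriptors /= np.linalg.norm(descriptors, axis=1, keepdims=True)

def recommend(liked, disliked, k=5):
    """Score every model by similarity to 'liked' exemplars minus similarity
    to 'disliked' ones, then return the top-k unlabeled candidates."""
    score = np.zeros(len(descriptors))
    for i in liked:
        score += descriptors @ descriptors[i]
    for i in disliked:
        score -= descriptors @ descriptors[i]
    score[list(liked) + list(disliked)] = -np.inf  # don't re-show labeled models
    return np.argsort(score)[::-1][:k]

print(recommend(liked=[3, 17], disliked=[42]))
```

Each round of user labels simply re-runs `recommend`, which is why precomputing the descriptors offline is what lets the interaction scale to very large repositories.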
A Brief Review of RNA–Protein Interaction Database Resources
Yi, Ying; Zhao, Yue; Huang, Yan; Wang, Dong
2017-01-01
RNA–Protein interactions play critical roles in various biological processes. By collecting and analyzing the RNA–Protein interactions and binding sites from experiments and predictions, RNA–Protein interaction databases have become an essential resource for exploring the transcriptional and post-transcriptional regulatory network. Here, we briefly review several widely used RNA–Protein interaction database resources developed in recent years to provide a guide to these databases. The content and major functions of each database are presented. These brief descriptions help users quickly choose the database containing the information they are interested in. In short, these RNA–Protein interaction database resources are continually updated, and their current state reflects ongoing efforts to identify and analyze the large number of RNA–Protein interactions. PMID:29657278
Mansel, Charlotte; Davies, Sharon
2012-10-01
There are currently over 250,000 children between the ages of 10 and 18 years who have their genetic information stored on the National DNA Database. This paper explores the legal and ethical issues surrounding this controversial subject, with particular focus on juvenile capacity and the potential results of criminalizing young children and adolescents. The implications of the adverse legal judgement of the European Court of Human Rights in S and Marper v UK (2008) and the violation of Article 8 of the Convention are discussed. The authors have considered the requirement to balance the rights of the individual, particularly those of minors, against the need to protect the public and have compared the position in Scotland to that of the rest of the UK. The authors conclude that a more ethically acceptable alternative could be the creation of a separate forensic database for children aged 10-18 years, set up to safeguard the interests of those who have not been convicted of any crime.
Norum, Jan; Olsen, Aina Iren; Nohr, Frank Ivar; Heyd, Anca; Totth, Arpad
2014-01-01
Objectives: Attention-deficit/hyperactivity disorder (ADHD) is a lifelong neurological condition with a profound effect on quality of life. Prescription databases may document patterns of use. In this study we aimed to explore use in Norway employing such a database. Methods: All prescriptions of drugs for the treatment of ADHD between 2004 and 2011, as registered in the Norwegian Prescription Database (NPD), were analyzed. The following drugs were included: Amphetamine, dexamphetamine, methylphenidate and atomoxetine. In-hospital drug administration was excluded. Numbers of users per 1,000 inhabitants were calculated according to gender, age and residence. A sub-analysis compared users born in January-June with those born in July-December. Drug costs were calculated and converted into Euros (€ 1 = N.kr 7.4540). Results: Drugs for the treatment of ADHD were significantly more often prescribed in northern Norway than in any other Norwegian health region (P < 0.001). Within the northern region, Nordland County was the “culprit” (P < 0.02). Compared to Norwegian figures, significantly more females (aged 10-19 years) were treated in northern Norway [male/female ratios 3:1 and 2.2:1 (P < 0.01)] and especially in Nordland County (ratio 2.1:1). The sub-analysis did not indicate northern overtreatment of pupils who were relatively young for their school grade. The annual drug cost per user in Norway was € 919. Conclusions: The prescription rate was significantly higher in northern Norway and Nordland County was the culprit. A prescription database may be a tool for monitoring the national use of these drugs. PMID:24999151
Mathematical models for exploring different aspects of genotoxicity and carcinogenicity databases.
Benigni, R; Giuliani, A
1991-12-01
One great obstacle to understanding and using the information contained in the genotoxicity and carcinogenicity databases is the very size of such databases. Their vastness makes them difficult to read; this leads to inadequate exploitation of the information, which becomes costly in terms of time, labor, and money. In its search for adequate approaches to the problem, the scientific community has, curiously, almost entirely neglected an existent series of very powerful methods of data analysis: the multivariate data analysis techniques. These methods were specifically designed for exploring large data sets. This paper presents the multivariate techniques and reports a number of applications to genotoxicity problems. These studies show how biology and mathematical modeling can be combined and how successful this combination is.
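Principal component analysis is one representative of the multivariate techniques this paper advocates: it compresses a large chemicals-by-assays matrix into a few axes that can actually be read. The sketch below uses toy data; a real analysis would load a genotoxicity assay matrix in its place.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
# Rows: chemicals; columns: outcomes of short-term genotoxicity assays (toy data).
assays = rng.standard_normal((120, 10))

pca = PCA(n_components=2)
coords = pca.fit_transform(assays)  # project each chemical onto two principal axes
print(coords.shape)                  # (120, 2)
print(pca.explained_variance_ratio_.round(2))
```

Plotting `coords` gives the kind of low-dimensional map that lets a reviewer scan a database of hundreds of chemicals at once instead of row by row.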
The ADAMS interactive interpreter
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rietscha, E.R.
1990-12-17
The ADAMS (Advanced DAta Management System) project is exploring next-generation database technology. Database management does not follow the usual programming paradigm. Instead, the database dictionary provides an additional name space environment that should be interactively created and tested before writing application code. This document describes the implementation and operation of the ADAMS Interpreter, an interactive interface to the ADAMS data dictionary and runtime system. The Interpreter executes individual statements of the ADAMS Interface Language, providing a fast, interactive mechanism to define and access persistent databases.
Sperrin, Matthew; Rushton, Helen; Dixon, William G; Normand, Alexis; Villard, Joffrey; Chieh, Angela; Buchan, Iain
2016-01-21
Digital self-monitoring, particularly of weight, is increasingly prevalent. The associated data could be reused for clinical and research purposes. The aim was to compare participants who use connected smart scale technologies with the general population and explore how use of smart scale technology affects, or is affected by, weight change. This was a retrospective study comparing 2 databases: (1) the longitudinal height and weight measurement database of smart scale users and (2) the Health Survey for England, a cross-sectional survey of the general population in England. Baseline comparison was of body mass index (BMI) in the 2 databases via a regression model. For exploring engagement with the technology, two analyses were performed: (1) a regression model of BMI change predicted by measures of engagement and (2) a recurrent event survival analysis with instantaneous probability of a subsequent self-weighing predicted by previous BMI change. Among women, users of self-weighing technology had a mean BMI of 1.62 kg/m(2) (95% CI 1.03-2.22) lower than the general population (of the same age and height) (P<.001). Among men, users had a mean BMI of 1.26 kg/m(2) (95% CI 0.84-1.69) greater than the general population (of the same age and height) (P<.001). Reduction in BMI was independently associated with greater engagement with self-weighing. Self-weighing events were more likely when users had recently reduced their BMI. Users of self-weighing technology are a selected sample of the general population and this must be accounted for in studies that employ these data. Engagement with self-weighing is associated with recent weight change; more research is needed to understand the extent to which weight change encourages closer monitoring versus closer monitoring driving the weight change. The concept of isolated measures needs to give way to one of connected health metrics.
An open experimental database for exploring inorganic materials
Zakutayev, Andriy; Wunder, Nick; Schwarting, Marcus; Perkins, John D.; White, Robert; Munch, Kristin; Tumas, William; Phillips, Caleb
2018-04-03
The use of advanced machine learning algorithms in experimental materials science is limited by the lack of sufficiently large and diverse datasets amenable to data mining. If publicly open, such data resources would also enable materials research by scientists without access to expensive experimental equipment. Here, we report on our progress towards a publicly open High Throughput Experimental Materials (HTEM) Database (htem.nrel.gov). This database currently contains 140,000 sample entries, characterized by structural (100,000), synthetic (80,000), chemical (70,000), and optoelectronic (50,000) properties of inorganic thin film materials, grouped in >4,000 sample entries across >100 materials systems; more than half of these data are publicly available. This article shows how the HTEM database may enable scientists to explore materials by browsing a web-based user interface and using an application programming interface. This paper also describes the HTE approach to generating materials data, and discusses the laboratory information management system (LIMS) that underpins the HTEM database. Finally, this manuscript illustrates how advanced machine learning algorithms can be applied to materials science problems using this open data resource.
NASA Technical Reports Server (NTRS)
Donnellan, Andrea; Parker, Jay W.; Lyzenga, Gregory A.; Granat, Robert A.; Norton, Charles D.; Rundle, John B.; Pierce, Marlon E.; Fox, Geoffrey C.; McLeod, Dennis; Ludwig, Lisa Grant
2012-01-01
QuakeSim 2.0 improves understanding of earthquake processes by providing modeling tools and integrating model applications and various heterogeneous data sources within a Web services environment. QuakeSim is a multisource, synergistic, data-intensive environment for modeling the behavior of earthquake faults individually, and as part of complex interacting systems. Remotely sensed geodetic data products may be explored, compared with faults and landscape features, mined by pattern analysis applications, and integrated with models and pattern analysis applications in a rich Web-based and visualization environment. Integration of heterogeneous data products with pattern informatics tools enables efficient development of models. Federated database components and visualization tools allow rapid exploration of large datasets, while pattern informatics enables identification of subtle, but important, features in large data sets. QuakeSim is valuable for earthquake investigations and modeling in its current state, and also serves as a prototype and nucleus for broader systems under development. The framework provides access to physics-based simulation tools that model the earthquake cycle and related crustal deformation. Spaceborne GPS and Interferometric Synthetic Aperture Radar (InSAR) data provide information on near-term crustal deformation, while paleoseismic geologic data provide longer-term information on earthquake fault processes. These data sources are integrated into QuakeSim's QuakeTables database system, and are accessible by users or various model applications. UAVSAR repeat pass interferometry data products are added to the QuakeTables database, and are available through a browseable map interface or Representational State Transfer (REST) interfaces. Model applications can retrieve data from QuakeTables, or from third-party GPS velocity data services; alternatively, users can manually input parameters into the models.
Pattern analysis of GPS and seismicity data has proved useful for mid-term forecasting of earthquakes, and for detecting subtle changes in crustal deformation. The GPS time series analysis has also proved useful as a data-quality tool, enabling the discovery of station anomalies and data processing and distribution errors. Improved visualization tools enable more efficient data exploration and understanding. Tools provide flexibility to science users for exploring data in new ways through download links, but also facilitate standard, intuitive, and routine uses for science users and end users such as emergency responders.
Kim, Sun; Kim, Won; Wei, Chih-Hsuan; Lu, Zhiyong; Wilbur, W John
2012-01-01
The Comparative Toxicogenomics Database (CTD) contains manually curated literature that describes chemical-gene interactions, chemical-disease relationships and gene-disease relationships. Finding articles containing this information is the first and an important step to assist manual curation efficiency. However, the complex nature of named entities and their relationships make it challenging to choose relevant articles. In this article, we introduce a machine learning framework for prioritizing CTD-relevant articles based on our prior system for the protein-protein interaction article classification task in BioCreative III. To address new challenges in the CTD task, we explore a new entity identification method for genes, chemicals and diseases. In addition, latent topics are analyzed and used as a feature type to overcome the small size of the training set. Applied to the BioCreative 2012 Triage dataset, our method achieved 0.8030 mean average precision (MAP) in the official runs, resulting in the top MAP system among participants. Integrated with PubTator, a Web interface for annotating biomedical literature, the proposed system also received a positive review from the CTD curation team.
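Mean average precision (MAP), the official metric by which this system topped the BioCreative 2012 Triage task, rewards rankings that place relevant articles early. The sketch below computes it on two toy "queries"; the document sets and rankings are made up for illustration.

```python
def average_precision(relevant, ranked):
    """AP for one query: mean of precision@k taken at the rank of each relevant doc."""
    hits, precisions = 0, []
    for k, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / len(relevant) if relevant else 0.0

def mean_average_precision(queries):
    """MAP: average of per-query AP values."""
    return sum(average_precision(rel, rk) for rel, rk in queries) / len(queries)

# Two toy queries: (relevant article set, system ranking).
q1 = ({"a", "b"}, ["a", "x", "b", "y"])   # AP = (1/1 + 2/3) / 2
q2 = ({"c"}, ["x", "c", "y"])             # AP = 1/2
print(round(mean_average_precision([q1, q2]), 4))  # 0.6667
```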
mirEX: a platform for comparative exploration of plant pri-miRNA expression data.
Bielewicz, Dawid; Dolata, Jakub; Zielezinski, Andrzej; Alaba, Sylwia; Szarzynska, Bogna; Szczesniak, Michal W; Jarmolowski, Artur; Szweykowska-Kulinska, Zofia; Karlowski, Wojciech M
2012-01-01
mirEX is a comprehensive platform for comparative analysis of primary microRNA expression data. RT-qPCR-based gene expression profiles are stored in a universal and expandable database scheme and wrapped by an intuitive user-friendly interface. A new way of accessing gene expression data in mirEX includes a simple mouse operated querying system and dynamic graphs for data mining analyses. In contrast to other publicly available databases, the mirEX interface allows a simultaneous comparison of expression levels between various microRNA genes in diverse organs and developmental stages. Currently, mirEX integrates information about the expression profile of 190 Arabidopsis thaliana pri-miRNAs in seven different developmental stages: seeds, seedlings and various organs of mature plants. Additionally, by providing RNA structural models, publicly available deep sequencing results, experimental procedure details and careful selection of auxiliary data in the form of web links, mirEX can function as a one-stop solution for Arabidopsis microRNA information. A web-based mirEX interface can be accessed at http://bioinfo.amu.edu.pl/mirex.
A geographically-diverse collection of 418 human gut microbiome pathway genome databases
Hahn, Aria S.; Altman, Tomer; Konwar, Kishori M.; Hanson, Niels W.; Kim, Dongjae; Relman, David A.; Dill, David L.; Hallam, Steven J.
2017-01-01
Advances in high-throughput sequencing are reshaping how we perceive microbial communities inhabiting the human body, with implications for therapeutic interventions. Several large-scale datasets derived from hundreds of human microbiome samples sourced from multiple studies are now publicly available. However, idiosyncratic data processing methods between studies introduce systematic differences that confound comparative analyses. To overcome these challenges, we developed GutCyc, a compendium of environmental pathway genome databases (ePGDBs) constructed from 418 assembled human microbiome datasets using MetaPathways, enabling reproducible functional metagenomic annotation. We also generated metabolic network reconstructions for each metagenome using the Pathway Tools software, empowering researchers and clinicians interested in visualizing and interpreting metabolic pathways encoded by the human gut microbiome. For the first time, GutCyc provides consistent annotations and metabolic pathway predictions, making possible comparative community analyses between health and disease states in inflammatory bowel disease, Crohn’s disease, and type 2 diabetes. GutCyc data products are searchable online, or may be downloaded and explored locally using MetaPathways and Pathway Tools. PMID:28398290
Dynamic taxonomies applied to a web-based relational database for geo-hydrological risk mitigation
NASA Astrophysics Data System (ADS)
Sacco, G. M.; Nigrelli, G.; Bosio, A.; Chiarle, M.; Luino, F.
2012-02-01
In its 40 years of activity, the Research Institute for Geo-hydrological Protection of the Italian National Research Council has amassed a vast and varied collection of historical documentation on landslides, muddy-debris flows, and floods in northern Italy from 1600 to the present. Since 2008, the archive resources have been maintained through a relational database management system. The database is used for routine study and research purposes as well as for providing support during geo-hydrological emergencies, when data need to be quickly and accurately retrieved. Retrieval speed and accuracy are the main objectives of an implementation based on a dynamic taxonomies model. Dynamic taxonomies are a general knowledge management model for configuring complex, heterogeneous information bases that support exploratory searching. At each stage of the process, the user can explore or browse the database in a guided yet unconstrained way by selecting the alternatives suggested for further refining the search. Dynamic taxonomies have been successfully applied to such diverse and apparently unrelated domains as e-commerce and medical diagnosis. Here, we describe the application of dynamic taxonomies to our database and compare it to traditional relational database query methods. The dynamic taxonomy interface, essentially a point-and-click interface, is considerably faster and less error-prone than traditional form-based query interfaces that require the user to remember and type in the "right" search keywords. Finally, dynamic taxonomy users have confirmed that one of the principal benefits of this approach is the confidence of having considered all the relevant information. Dynamic taxonomies and relational databases work in synergy to provide fast and precise searching: one of the most important factors in timely response to emergencies.
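The dynamic-taxonomy interaction described above reduces, at its core, to two operations: filter the records by the user's current facet selections, and show counts for each remaining alternative so the next click is always informed. The sketch below uses made-up event records with hypothetical field names, not the institute's actual schema.

```python
from collections import Counter

# Toy event records; the field names are illustrative stand-ins for the archive's.
events = [
    {"type": "flood", "century": "1800s", "region": "Piedmont"},
    {"type": "landslide", "century": "1900s", "region": "Piedmont"},
    {"type": "flood", "century": "1900s", "region": "Liguria"},
    {"type": "debris flow", "century": "1900s", "region": "Piedmont"},
]

def refine(records, **selected):
    """Apply the user's current facet selections (one click per facet)."""
    return [r for r in records if all(r[k] == v for k, v in selected.items())]

def facet_counts(records, facet):
    """Counts shown beside each remaining alternative, guiding the next refinement."""
    return Counter(r[facet] for r in records)

subset = refine(events, century="1900s")
print(facet_counts(subset, "type"))  # event types still reachable after refinement
```

Because only non-empty alternatives (with their counts) are ever offered, the user cannot formulate a query that returns nothing, which is the point-and-click advantage over form-based keyword search noted in the abstract.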
Dececchi, T Alex; Mabee, Paula M; Blackburn, David C
2016-01-01
Databases of organismal traits that aggregate information from one or multiple sources can be leveraged for large-scale analyses in biology. Yet the differences among these data streams and how well they capture trait diversity have never been explored. We present the first analysis of the differences between phenotypes captured in free text of descriptive publications ('monographs') and those used in phylogenetic analyses ('matrices'). We focus our analysis on osteological phenotypes of the limbs of four extinct vertebrate taxa critical to our understanding of the fin-to-limb transition. We find that there is low overlap between the anatomical entities used in these two sources of phenotype data, indicating that phenotypes represented in matrices are not simply a subset of those found in monographic descriptions. Perhaps as expected, compared to characters found in matrices, phenotypes in monographs tend to emphasize descriptive and positional morphology, be somewhat more complex, and relate to fewer additional taxa. While based on a small set of focal taxa, these qualitative and quantitative data suggest that either source of phenotypes alone will result in incomplete knowledge of variation for a given taxon. As a broader community develops to use and expand databases characterizing organismal trait diversity, it is important to recognize the limitations of the data sources and develop strategies to more fully characterize variation both within species and across the tree of life.
Developing a national strategy to prevent dementia: Leon Thal Symposium 2009.
Khachaturian, Zaven S; Barnes, Deborah; Einstein, Richard; Johnson, Sterling; Lee, Virginia; Roses, Allen; Sager, Mark A; Shankle, William R; Snyder, Peter J; Petersen, Ronald C; Schellenberg, Gerard; Trojanowski, John; Aisen, Paul; Albert, Marilyn S; Breitner, John C S; Buckholtz, Neil; Carrillo, Maria; Ferris, Steven; Greenberg, Barry D; Grundman, Michael; Khachaturian, Ara S; Kuller, Lewis H; Lopez, Oscar L; Maruff, Paul; Mohs, Richard C; Morrison-Bogorad, Marcelle; Phelps, Creighton; Reiman, Eric; Sabbagh, Marwan; Sano, Mary; Schneider, Lon S; Siemers, Eric; Tariot, Pierre; Touchon, Jacques; Vellas, Bruno; Bain, Lisa J
2010-03-01
Among the major impediments to the design of clinical trials for the prevention of Alzheimer's disease (AD), the most critical is the lack of validated biomarkers, assessment tools, and algorithms that would facilitate identification of asymptomatic individuals with elevated risk who might be recruited as study volunteers. Thus, the Leon Thal Symposium 2009 (LTS'09), on October 27-28, 2009 in Las Vegas, Nevada, was convened to explore strategies to surmount the barriers in designing a multisite, comparative study to evaluate and validate various approaches for detecting and selecting asymptomatic people at risk for cognitive disorders/dementia. The deliberations of LTS'09 included presentations and reviews of different approaches (algorithms, biomarkers, or measures) for identifying asymptomatic individuals at elevated risk for AD who would be candidates for longitudinal or prevention studies. The key nested recommendations of LTS'09 included: (1) establishment of a National Database for Longitudinal Studies as a shared research core resource; (2) launch of a large collaborative study that will compare multiple screening approaches and biomarkers to determine the best method for identifying asymptomatic people at risk for AD; (3) initiation of a Global Database that extends the concept of the National Database for Longitudinal Studies for longitudinal studies beyond the United States; and (4) development of an educational campaign that will address public misconceptions about AD and promote healthy brain aging. 2010. Published by Elsevier Inc.
PhyloExplorer: a web server to validate, explore and query phylogenetic trees
Ranwez, Vincent; Clairon, Nicolas; Delsuc, Frédéric; Pourali, Saeed; Auberval, Nicolas; Diser, Sorel; Berry, Vincent
2009-01-01
Background Many important problems in evolutionary biology require molecular phylogenies to be reconstructed. Phylogenetic trees must then be manipulated for subsequent inclusion in publications or analyses such as supertree inference and tree comparisons. However, no tool is currently available to facilitate the management of tree collections providing, for instance: standardisation of taxon names among trees with respect to a reference taxonomy; selection of relevant subsets of trees or sub-trees according to a taxonomic query; or simply computation of descriptive statistics on the collection. Moreover, although several databases of phylogenetic trees exist, there is currently no easy way to find trees that are both relevant and complementary to a given collection of trees. Results We propose a tool to facilitate assessment and management of phylogenetic tree collections. Given an input collection of rooted trees, PhyloExplorer provides facilities for obtaining statistics describing the collection, correcting invalid taxon names, extracting taxonomically relevant parts of the collection using a dedicated query language, and identifying related trees in the TreeBASE database. Conclusion PhyloExplorer is a simple and interactive website implemented through underlying Python libraries and MySQL databases. It is available at: and the source code can be downloaded from: . PMID:19450253
Benigni, Romualdo; Bossa, Cecilia; Richard, Ann M; Yang, Chihae
2008-01-01
Mutagenicity and carcinogenicity databases are crucial resources for toxicologists and regulators involved in chemicals risk assessment. Until recently, existing public toxicity databases have been constructed primarily as "look-up-tables" of existing data, and most often did not contain chemical structures. Concepts and technologies originated from the structure-activity relationships science have provided powerful tools to create new types of databases, where the effective linkage of chemical toxicity with chemical structure can facilitate and greatly enhance data gathering and hypothesis generation, by permitting: a) exploration across both chemical and biological domains; and b) structure-searchability through the data. This paper reviews the main public databases, together with the progress in the field of chemical relational databases, and presents the ISSCAN database on experimental chemical carcinogens.
Jeffryes, James G; Colastani, Ricardo L; Elbadawi-Sidhu, Mona; Kind, Tobias; Niehaus, Thomas D; Broadbelt, Linda J; Hanson, Andrew D; Fiehn, Oliver; Tyo, Keith E J; Henry, Christopher S
2015-01-01
In spite of its great promise, metabolomics has proven difficult to execute in an untargeted and generalizable manner. Liquid chromatography-mass spectrometry (LC-MS) has made it possible to gather data on thousands of cellular metabolites. However, matching metabolites to their spectral features continues to be a bottleneck, meaning that much of the collected information remains uninterpreted and that new metabolites are seldom discovered in untargeted studies. These challenges require new approaches that consider compounds beyond those available in curated biochemistry databases. Here we present Metabolic In silico Network Expansions (MINEs), an extension of known metabolite databases to include molecules that have not been observed, but are likely to occur based on known metabolites and common biochemical reactions. We utilize an algorithm called the Biochemical Network Integrated Computational Explorer (BNICE) and expert-curated reaction rules based on the Enzyme Commission classification system to propose the novel chemical structures and reactions that comprise MINE databases. Starting from the Kyoto Encyclopedia of Genes and Genomes (KEGG) COMPOUND database, the MINE contains over 571,000 compounds, of which 93% are not present in the PubChem database. However, these MINE compounds have on average higher structural similarity to natural products than compounds from KEGG or PubChem. MINE databases were able to propose annotations for 98.6% of a set of 667 MassBank spectra, 14% more than KEGG alone and equivalent to PubChem while returning far fewer candidates per spectra than PubChem (46 vs. 1715 median candidates). Application of MINEs to LC-MS accurate mass data enabled the identity of an unknown peak to be confidently predicted. MINE databases are freely accessible for non-commercial use via user-friendly web-tools at http://minedatabase.mcs.anl.gov and developer-friendly APIs. 
MINEs improve metabolomics peak identification as compared to general chemical databases whose results include irrelevant synthetic compounds. Furthermore, MINEs complement and expand on previous in silico generated compound databases that focus on human metabolism. We are actively developing the database; future versions of this resource will incorporate transformation rules for spontaneous chemical reactions and more advanced filtering and prioritization of candidate structures. Graphical abstractMINE database construction and access methods. The process of constructing a MINE database from the curated source databases is depicted on the left. The methods for accessing the database are shown on the right.
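The expansion idea behind MINE construction — applying generalized reaction rules to known metabolites to enumerate plausible but unobserved products — can be sketched in miniature. The compound strings and rewrite rules below are invented stand-ins, not actual BNICE operators or KEGG entries:

```python
# Toy network expansion: compounds are SMILES-like strings, and rules are
# simple string rewrites standing in for generalized enzymatic transformations.

known = {"CCO", "CCN"}  # hypothetical seed "metabolites"

# Each rule maps a substructure pattern to its transformed form.
rules = [
    ("CO", "C(=O)"),   # hypothetical oxidation-like rewrite
    ("CN", "C(=N)"),   # hypothetical imine-forming rewrite
]

def expand(compounds, rules, generations=1):
    """Apply every rule to every compound, accumulating novel products."""
    current = set(compounds)
    for _ in range(generations):
        products = set()
        for c in current:
            for old, new in rules:
                if old in c:
                    products.add(c.replace(old, new))
        current |= products
    return current

mine = expand(known, rules)
novel = mine - known  # compounds proposed by the rules but never observed
```

A real implementation operates on molecular graphs with cheminformatics tooling rather than strings, but the combinatorial shape is the same: iterate rules over the growing compound set and retain the novel structures.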
Evaluating deep learning architectures for Speech Emotion Recognition.
Fayek, Haytham M; Lech, Margaret; Cavedon, Lawrence
2017-08-01
Speech Emotion Recognition (SER) can be regarded as a static or dynamic classification problem, which makes SER an excellent test bed for investigating and comparing various deep learning architectures. We describe a frame-based formulation to SER that relies on minimal speech processing and end-to-end deep learning to model intra-utterance dynamics. We use the proposed SER system to empirically explore feed-forward and recurrent neural network architectures and their variants. Experiments conducted illuminate the advantages and limitations of these architectures in paralinguistic speech recognition and emotion recognition in particular. As a result of our exploration, we report state-of-the-art results on the IEMOCAP database for speaker-independent SER and present quantitative and qualitative assessments of the models' performances. Copyright © 2017 Elsevier Ltd. All rights reserved.
Object technology: A white paper
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jordan, S.R.; Arrowood, L.F.; Cain, W.D.
1992-05-11
Object-Oriented Technology (OOT), although not a new paradigm, has recently been prominently featured in the trade press and even general business publications. Indeed, the promises of object technology are alluring: the ability to handle complex design and engineering information through the full manufacturing production life cycle or to manipulate multimedia information, and the ability to improve programmer productivity in creating and maintaining high quality software. Groups at a number of the DOE facilities have been exploring the use of object technology for engineering, business, and other applications. In this white paper, the technology is explored thoroughly and compared with previous means of developing software and storing databases of information. Several specific projects within the DOE Complex are described, and the state of the commercial marketplace is indicated.
The brain MRI classification problem from wavelets perspective
NASA Astrophysics Data System (ADS)
Bendib, Mohamed M.; Merouani, Hayet F.; Diaba, Fatma
2015-02-01
Haar and Daubechies 4 (DB4) are the most used wavelets for brain MRI (Magnetic Resonance Imaging) classification. The former is simple and fast to compute while the latter is more complex and offers a better resolution. This paper explores the potential of both of them in performing Normal versus Pathological discrimination on the one hand, and Multiclassification on the other hand. The Whole Brain Atlas is used as a validation database, and the Random Forest (RF) algorithm is employed as a learning approach. The achieved results are discussed and statistically compared.
The development of a virtual camera system for astronaut-rover planetary exploration.
Platt, Donald W; Boy, Guy A
2012-01-01
A virtual assistant is being developed for use by astronauts as they use rovers to explore the surface of other planets. This assistant, called the Virtual Camera (VC), is an interactive database that allows the user to have better situational awareness for exploration. It can be used for training, data analysis and augmentation of actual surface exploration. This paper describes the development efforts and Human-Computer Interaction considerations for implementing a first-generation VC on a tablet mobile computer device. Scenarios for use will be presented. Evaluation and success criteria such as efficiency in terms of processing time and precision, situational awareness, learnability, usability, and robustness will also be presented. Initial testing and the impact of HCI design considerations of manipulation and improvement in situational awareness using a prototype VC will be discussed.
DataSpread: Unifying Databases and Spreadsheets.
Bendre, Mangesh; Sun, Bofan; Zhang, Ding; Zhou, Xinyan; Chang, Kevin ChenChuan; Parameswaran, Aditya
2015-08-01
Spreadsheet software is often the tool of choice for ad-hoc tabular data management, processing, and visualization, especially on tiny data sets. On the other hand, relational database systems offer significant power, expressivity, and efficiency over spreadsheet software for data management, while lacking in the ease of use and ad-hoc analysis capabilities. We demonstrate DataSpread, a data exploration tool that holistically unifies databases and spreadsheets. It continues to offer a Microsoft Excel-based spreadsheet front-end, while in parallel managing all the data in a back-end database, specifically, PostgreSQL. DataSpread retains all the advantages of spreadsheets, including ease of use, ad-hoc analysis and visualization capabilities, and a schema-free nature, while also adding the advantages of traditional relational databases, such as scalability and the ability to use arbitrary SQL to import, filter, or join external or internal tables and have the results appear in the spreadsheet. DataSpread needs to reason about and reconcile differences in the notions of schema, addressing of cells and tuples, and the current "pane" (which exists in spreadsheets but not in traditional databases), and support data modifications at both the front-end and the back-end. Our demonstration will center on our first and early prototype of the DataSpread, and will give the attendees a sense for the enormous data exploration capabilities offered by unifying spreadsheets and databases.
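One way to picture the front-end/back-end mapping such a system needs — spreadsheet cells addressed by (row, column) but stored relationally so arbitrary SQL can reach them — is this minimal sketch using SQLite. The "one row per cell" schema is an assumption for illustration; DataSpread itself uses PostgreSQL and a more sophisticated design:

```python
import sqlite3

# A naive relational encoding of a spreadsheet pane: one record per cell.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cells (row INTEGER, col TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO cells VALUES (?, ?, ?)",
    [(1, "A", "species"), (1, "B", "count"),
     (2, "A", "rat"),     (2, "B", "12"),
     (3, "A", "mouse"),   (3, "B", "7")],
)

# Arbitrary SQL over the sheet: a self-join reassembles columns A and B,
# skipping the header row, the way a database-backed sheet could filter data
# and render the result back into the grid.
counts = conn.execute(
    "SELECT a.value, b.value FROM cells a JOIN cells b "
    "ON a.row = b.row AND a.col = 'A' AND b.col = 'B' "
    "WHERE a.row > 1 ORDER BY a.row"
).fetchall()
print(counts)  # [('rat', '12'), ('mouse', '7')]
```

The sketch also shows why the paper calls out schema reconciliation as the hard part: the cell store has no notion of a typed column, so every query must rebuild tabular structure from positional addressing.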
Definition of architectural ideotypes for good yield capacity in Coffea canephora.
Cilas, Christian; Bar-Hen, Avner; Montagnon, Christophe; Godin, Christophe
2006-03-01
Yield capacity is a target trait for selection of agronomically desirable lines; it is preferred to simple yields recorded over different harvests. Yield capacity is derived from certain architectural parameters that measure its components. Observation protocols for describing architecture and yield capacity were applied to six clones of coffee trees (Coffea canephora) in a comparative trial. The observations were used to establish architectural databases, which were explored using AMAPmod, a software package dedicated to the analysis of plant architecture data. The traits extracted from the database were used to identify architectural parameters for predicting the yield of the plant material studied. Architectural traits are highly heritable and some display strong genetic correlations with cumulated yield. In particular, the proportion of fruiting nodes at plagiotropic level 15, counting from the top of the tree, proved to be a good predictor of yield over two fruiting cycles.
Exploring human disease using the Rat Genome Database.
Shimoyama, Mary; Laulederkind, Stanley J F; De Pons, Jeff; Nigam, Rajni; Smith, Jennifer R; Tutaj, Marek; Petri, Victoria; Hayman, G Thomas; Wang, Shur-Jen; Ghiasvand, Omid; Thota, Jyothi; Dwinell, Melinda R
2016-10-01
Rattus norvegicus, the laboratory rat, has been a crucial model for studies of the environmental and genetic factors associated with human diseases for over 150 years. It is the primary model organism for toxicology and pharmacology studies, and has features that make it the model of choice in many complex-disease studies. Since 1999, the Rat Genome Database (RGD; http://rgd.mcw.edu) has been the premier resource for genomic, genetic, phenotype and strain data for the laboratory rat. The primary role of RGD is to curate rat data and validate orthologous relationships with human and mouse genes, and make these data available for incorporation into other major databases such as NCBI, Ensembl and UniProt. RGD also provides official nomenclature for rat genes, quantitative trait loci, strains and genetic markers, as well as unique identifiers. The RGD team adds enormous value to these basic data elements through functional and disease annotations, the analysis and visual presentation of pathways, and the integration of phenotype measurement data for strains used as disease models. Because much of the rat research community focuses on understanding human diseases, RGD provides a number of datasets and software tools that allow users to easily explore and make disease-related connections among these datasets. RGD also provides comprehensive human and mouse data for comparative purposes, illustrating the value of the rat in translational research. This article introduces RGD and its suite of tools and datasets to researchers - within and beyond the rat community - who are particularly interested in leveraging rat-based insights to understand human diseases. © 2016. Published by The Company of Biologists Ltd.
BioCarian: search engine for exploratory searches in heterogeneous biological databases.
Zaki, Nazar; Tennakoon, Chandana
2017-10-02
There are a large number of biological databases publicly available for scientists in the web. Also, there are many private databases generated in the course of research projects. These databases are in a wide variety of formats. Web standards have evolved in the recent times and semantic web technologies are now available to interconnect diverse and heterogeneous sources of data. Therefore, integration and querying of biological databases can be facilitated by techniques used in semantic web. Heterogeneous databases can be converted into Resource Description Format (RDF) and queried using SPARQL language. Searching for exact queries in these databases is trivial. However, exploratory searches need customized solutions, especially when multiple databases are involved. This process is cumbersome and time consuming for those without a sufficient background in computer science. In this context, a search engine facilitating exploratory searches of databases would be of great help to the scientific community. We present BioCarian, an efficient and user-friendly search engine for performing exploratory searches on biological databases. The search engine is an interface for SPARQL queries over RDF databases. We note that many of the databases can be converted to tabular form. We first convert the tabular databases to RDF. The search engine provides a graphical interface based on facets to explore the converted databases. The facet interface is more advanced than conventional facets. It allows complex queries to be constructed, and have additional features like ranking of facet values based on several criteria, visually indicating the relevance of a facet value and presenting the most important facet values when a large number of choices are available. For the advanced users, SPARQL queries can be run directly on the databases. Using this feature, users will be able to incorporate federated searches of SPARQL endpoints. 
We used the search engine to do an exploratory search on previously published viral integration data and were able to deduce the main conclusions of the original publication. BioCarian is accessible via http://www.biocarian.com . We have developed a search engine to explore RDF databases that can be used by both novice and advanced users.
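The conversion-then-query pattern the paper describes — tabular biological data recast as subject-predicate-object triples, then explored facet by facet — looks roughly like the pure-Python sketch below. The gene records and predicate names are hypothetical, and a plain pattern-matching helper stands in for a real SPARQL engine over RDF:

```python
# Minimal triple store standing in for an RDF graph; the match() helper
# plays the role of a SPARQL basic graph pattern, with None as a variable.
triples = set()

def add(s, p, o):
    triples.add((s, p, o))

# Tabular rows (gene, organism) converted into triple form.
for gene, organism in [("BRCA1", "human"), ("Tp53", "rat"), ("TP53", "human")]:
    add(gene, "organism", organism)

def match(s=None, p=None, o=None):
    """Return all triples matching a pattern; None matches anything."""
    return {(ts, tp, to) for ts, tp, to in triples
            if s in (None, ts) and p in (None, tp) and o in (None, to)}

# Analogue of: SELECT ?gene WHERE { ?gene :organism "human" }
human_genes = sorted(t[0] for t in match(p="organism", o="human"))
print(human_genes)  # ['BRCA1', 'TP53']
```

A faceted interface like the one described is essentially this query run once per candidate facet value, with the result sizes surfaced to the user as ranked counts.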
Empirical evidence of the importance of comparative studies of diagnostic test accuracy.
Takwoingi, Yemisi; Leeflang, Mariska M G; Deeks, Jonathan J
2013-04-02
Systematic reviews that "compare" the accuracy of 2 or more tests often include different sets of studies for each test. To investigate the availability of direct comparative studies of test accuracy and to assess whether summary estimates of accuracy differ between meta-analyses of noncomparative and comparative studies. Systematic reviews in any language from the Database of Abstracts of Reviews of Effects and the Cochrane Database of Systematic Reviews from 1994 to October 2012. 1 of 2 assessors selected reviews that evaluated at least 2 tests and identified meta-analyses that included both noncomparative studies and comparative studies. 1 of 3 assessors extracted data about review and study characteristics and test performance. 248 reviews compared test accuracy; of the 6915 studies, 2113 (31%) were comparative. Thirty-six reviews (with 52 meta-analyses) had adequate studies to compare results of noncomparative and comparative studies by using a hierarchical summary receiver-operating characteristic meta-regression model for each test comparison. In 10 meta-analyses, noncomparative studies ranked tests in the opposite order of comparative studies. A total of 25 meta-analyses showed more than a 2-fold discrepancy in the relative diagnostic odds ratio between noncomparative and comparative studies. Differences in accuracy estimates between noncomparative and comparative studies were greater than expected by chance (P < 0.001). A paucity of comparative studies limited exploration of direction in bias. Evidence derived from noncomparative studies often differs from that derived from comparative studies. Robustly designed studies in which all patients receive all tests or are randomly assigned to receive one or other of the tests should be more routinely undertaken and are preferred for evidence to guide test selection. National Institute for Health Research (United Kingdom).
ERIC Educational Resources Information Center
Lamothe, Alain R.
2011-01-01
The purpose of this paper is to report the results of a quantitative analysis exploring the interaction and relationship between the online database and electronic journal collections at the J. N. Desmarais Library of Laurentian University. A very strong relationship exists between the number of searches and the size of the online database…
Response to Pilaar Birch and Graham
USDA-ARS's Scientific Manuscript database
We are delighted that our call for IsoBank, a database for isotopes, has generated interest among our colleagues, and we applaud Pilaar Birch and Graham in their letter for offering a potential repository, Neotoma Paleoecological Database. Their suggestion is promising, and should be explored. We en...
BioQ: tracing experimental origins in public genomic databases using a novel data provenance model
Saccone, Scott F.; Quan, Jiaxi; Jones, Peter L.
2012-01-01
Motivation: Public genomic databases, which are often used to guide genetic studies of human disease, are now being applied to genomic medicine through in silico integrative genomics. These databases, however, often lack tools for systematically determining the experimental origins of the data. Results: We introduce a new data provenance model that we have implemented in a public web application, BioQ, for assessing the reliability of the data by systematically tracing its experimental origins to the original subjects and biologics. BioQ allows investigators to both visualize data provenance as well as explore individual elements of experimental process flow using precise tools for detailed data exploration and documentation. It includes a number of human genetic variation databases such as the HapMap and 1000 Genomes projects. Availability and implementation: BioQ is freely available to the public at http://bioq.saclab.net Contact: ssaccone@wustl.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22426342
Southan, Christopher; Várkonyi, Péter; Muresan, Sorel
2009-07-06
Since 2004 public cheminformatic databases and their collective functionality for exploring relationships between compounds, protein sequences, literature and assay data have advanced dramatically. In parallel, commercial sources that extract and curate such relationships from journals and patents have also been expanding. This work updates a previous comparative study of databases chosen because of their bioactive content, availability of downloads and facility to select informative subsets. Where they could be calculated, extracted compounds-per-journal article were in the range of 12 to 19 but compound-per-protein counts increased with document numbers. Chemical structure filtration to facilitate standardised comparisons typically reduced source counts by between 5% and 30%. The pair-wise overlaps between 23 databases and subsets were determined, as well as changes between 2006 and 2008. While all compound sets have increased, PubChem has doubled to 14.2 million. The 2008 comparison matrix shows not only overlap but also unique content across all sources. Many of the detailed differences could be attributed to individual strategies for data selection and extraction. While there was a big increase in patent-derived structures entering PubChem since 2006, GVKBIO contains over 0.8 million unique structures from this source. Venn diagrams showed extensive overlap between compounds extracted by independent expert curation from journals by GVKBIO, WOMBAT (both commercial) and BindingDB (public) but each included unique content. In contrast, the approved drug collections from GVKBIO, MDDR (commercial) and DrugBank (public) showed surprisingly low overlap. Aggregating all commercial sources established that while 1 million compounds overlapped with PubChem 1.2 million did not. On the basis of chemical structure content per se public sources have covered an increasing proportion of commercial databases over the last two years. 
However, commercial products included in this study provide links between compounds and information from patents and journals at a larger scale than current public efforts. They also continue to capture a significant proportion of unique content. Our results thus demonstrate not only an encouraging overall expansion of data-supported bioactive chemical space but also that both commercial and public sources are complementary for its exploration.
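The pairwise-overlap and unique-content comparisons described above reduce, at their core, to set operations over standardised structure identifiers. A minimal sketch, with invented identifiers standing in for real InChIKeys and only three of the sources named in the abstract; the contents are illustrative, not real data.

```python
# Sketch of a pairwise-overlap comparison between compound databases.
# Database names are from the abstract; the member identifiers are invented.
from itertools import combinations

databases = {
    "PubChem":   {"AAA", "BBB", "CCC", "DDD"},
    "GVKBIO":    {"BBB", "CCC", "EEE"},
    "BindingDB": {"CCC", "DDD", "EEE", "FFF"},
}

def overlap_matrix(dbs):
    """Return {(name1, name2): shared-structure count} for every pair."""
    return {
        (a, b): len(dbs[a] & dbs[b])
        for a, b in combinations(sorted(dbs), 2)
    }

def unique_content(dbs, name):
    """Structures found in `name` but in no other source."""
    others = set().union(*(s for n, s in dbs.items() if n != name))
    return dbs[name] - others

print(overlap_matrix(databases))
print(unique_content(databases, "PubChem"))
```

The same two functions, run over real identifier sets, would reproduce both the overlap matrix and the per-source unique content discussed in the abstract.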
A database of new zeolite-like materials.
Pophale, Ramdas; Cheeseman, Phillip A; Deem, Michael W
2011-07-21
We here describe a database of computationally predicted zeolite-like materials. These crystals were discovered by a Monte Carlo search for zeolite-like materials. Positions of Si atoms as well as unit cell, space group, density, and number of crystallographically unique atoms were explored in the construction of this database. The database contains over 2.6 million unique structures. Roughly 15% of these are within +30 kJ mol⁻¹ Si of α-quartz, the band in which most of the known zeolites lie. These structures have topological, geometrical, and diffraction characteristics that are similar to those of known zeolites. The database is the result of refinement by two interatomic potentials that both satisfy the Pauli exclusion principle. The database has been deposited in the publicly available PCOD database and in www.hypotheticalzeolites.net/database/deem/. This journal is © the Owner Societies 2011
76 FR 67732 - Agency Information Collection Activities: Proposed Collection; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2011-11-02
... proposed information collection project: ``Nursing Home Survey on Patient Safety Culture Comparative... Nursing Home Survey on Patient Safety Culture Comparative Database The Agency for Healthcare Research and... Culture (Nursing Home SOPS) Comparative Database. The Nursing Home SOPS Comparative Database consists of...
Hottes, Travis S.; Skowronski, Danuta M.; Hiebert, Brett; Janjua, Naveed Z.; Roos, Leslie L.; Van Caeseele, Paul; Law, Barbara J.; De Serres, Gaston
2011-01-01
Background Administrative databases provide efficient methods to estimate influenza vaccine effectiveness (IVE) against severe outcomes in the elderly but are prone to intractable bias. This study returns to one of the linked population databases by which IVE against hospitalization and death in the elderly was first assessed. We explore IVE across six more recent influenza seasons, including periods before, during, and after peak activity to identify potential markers for bias. Methods and Findings Acute respiratory hospitalization and all-cause mortality were compared between immunized/non-immunized community-dwelling seniors ≥65 years through administrative databases in Manitoba, Canada between 2000-01 and 2005-06. IVE was compared during pre-season/influenza/post-season periods through logistic regression with multivariable adjustment (age/sex/income/residence/prior influenza or pneumococcal immunization/medical visits/comorbidity), stratification based on prior influenza immunization history, and propensity scores. Analysis during pre-season periods assessed baseline differences between immunized and unimmunized groups. The study population included ∼140,000 seniors, of whom 50–60% were immunized annually. Adjustment for key covariates and use of propensity scores consistently increased IVE. Estimates were paradoxically higher pre-season and for all-cause mortality vs. acute respiratory hospitalization. Stratified analysis showed that those twice consecutively and currently immunized were always at significantly lower hospitalization/mortality risk with odds ratios (OR) of 0.60 [95% CI 0.48–0.75] and 0.58 [0.53–0.64] pre-season and 0.77 [0.69–0.86] and 0.71 [0.66–0.77] during influenza circulation, relative to the consistently unimmunized.
Conversely, those forgoing immunization when twice previously immunized were always at significantly higher hospitalization/mortality risk with OR of 1.41 [1.14–1.73] and 2.45 [2.21–2.72] pre-season and 1.21 [1.03–1.43] and 1.78 [1.61–1.96] during influenza circulation. Conclusions The most pronounced IVE estimates were paradoxically observed pre-season, indicating bias tending to over-estimate vaccine protection. Change in immunization habit from that of the prior two years may be a marker for this bias in administrative data sets; however, no analytic technique explored could adjust for its influence. Improved methods to achieve valid interpretation of protection in the elderly are needed. PMID:21818350
CD-ROM End-User Instruction: A Planning Model.
ERIC Educational Resources Information Center
Johnson, Mary E.; Rosen, Barbara S.
1990-01-01
Discusses methods and content of library instruction for CD-ROM searching in terms of the needs of end-users. Instructional methods explored include staff instruction, structured instruction, database documentation, tutorials and help screens, and floaters. Suggestions for effective instruction in transfer of skills, database content, database…
Bourke, Jenny; Wong, Kingsley; Leonard, Helen
2018-01-23
To investigate how well intellectual disability (ID) can be ascertained using hospital morbidity data compared with a population-based data source. All children born in 1983-2010 with a hospital admission in the Western Australian Hospital Morbidity Data System (HMDS) were linked with the Western Australian Intellectual Disability Exploring Answers (IDEA) database. The International Classification of Diseases hospital codes consistent with ID were also identified. The characteristics of those children identified with ID through either or both sources were investigated. Of the 488 905 individuals in the study, 10 218 (2.1%) were identified with ID in either IDEA or HMDS with 1435 (14.0%) individuals identified in both databases, 8305 (81.3%) unique to the IDEA database and 478 (4.7%) unique to the HMDS dataset only. Of those unique to the HMDS dataset, about a quarter (n=124) had died before 1 year of age and most of these (75%) before 1 month. Children with ID who were also coded as such in the HMDS data were more likely to be aged under 1 year, female, non-Aboriginal and have a severe level of ID, compared with those not coded in the HMDS data. The sensitivity of using HMDS to identify ID was 14.7%, whereas the specificity was much higher at 99.9%. Hospital morbidity data are not a reliable source for identifying ID within a population, and epidemiological researchers need to take these findings into account in their study design. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Bourke, Jenny; Wong, Kingsley
2018-01-01
Objectives To investigate how well intellectual disability (ID) can be ascertained using hospital morbidity data compared with a population-based data source. Design, setting and participants All children born in 1983–2010 with a hospital admission in the Western Australian Hospital Morbidity Data System (HMDS) were linked with the Western Australian Intellectual Disability Exploring Answers (IDEA) database. The International Classification of Diseases hospital codes consistent with ID were also identified. Main outcome measures The characteristics of those children identified with ID through either or both sources were investigated. Results Of the 488 905 individuals in the study, 10 218 (2.1%) were identified with ID in either IDEA or HMDS with 1435 (14.0%) individuals identified in both databases, 8305 (81.3%) unique to the IDEA database and 478 (4.7%) unique to the HMDS dataset only. Of those unique to the HMDS dataset, about a quarter (n=124) had died before 1 year of age and most of these (75%) before 1 month. Children with ID who were also coded as such in the HMDS data were more likely to be aged under 1 year, female, non-Aboriginal and have a severe level of ID, compared with those not coded in the HMDS data. The sensitivity of using HMDS to identify ID was 14.7%, whereas the specificity was much higher at 99.9%. Conclusion Hospital morbidity data are not a reliable source for identifying ID within a population, and epidemiological researchers need to take these findings into account in their study design. PMID:29362262
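The sensitivity and specificity reported above follow directly from the linkage counts in the abstract, treating the IDEA register as the reference standard and an HMDS ID code as the test. A small sketch reproducing the arithmetic:

```python
# Reproducing the reported sensitivity/specificity from the linkage counts.
# IDEA is taken as the reference standard; HMDS coding is the "test".
total = 488_905          # all children in the study
both = 1_435             # ID in IDEA and coded in HMDS (true positives)
idea_only = 8_305        # ID in IDEA but not coded in HMDS (false negatives)
hmds_only = 478          # coded in HMDS but not in IDEA (false positives)

true_negatives = total - both - idea_only - hmds_only

sensitivity = both / (both + idea_only)
specificity = true_negatives / (true_negatives + hmds_only)

print(f"sensitivity = {sensitivity:.1%}")   # matches the reported 14.7%
print(f"specificity = {specificity:.1%}")   # matches the reported 99.9%
```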
Fossil-Fuel CO2 Emissions Database and Exploration System
NASA Astrophysics Data System (ADS)
Krassovski, M.; Boden, T.
2012-04-01
The Carbon Dioxide Information Analysis Center (CDIAC) at Oak Ridge National Laboratory (ORNL) quantifies the release of carbon from fossil-fuel use and cement production each year at global, regional, and national spatial scales. These estimates are vital to climate change research given the strong evidence suggesting fossil-fuel emissions are responsible for unprecedented levels of carbon dioxide (CO2) in the atmosphere. The CDIAC fossil-fuel emissions time series are based largely on annual energy statistics published for all nations by the United Nations (UN). Publications containing historical energy statistics make it possible to estimate fossil-fuel CO2 emissions back to 1751, before the Industrial Revolution. From these core fossil-fuel CO2 emission time series, CDIAC has developed a number of additional data products to satisfy modeling needs and to address other questions aimed at improving our understanding of the global carbon cycle budget. For example, CDIAC also produces a time series of gridded fossil-fuel CO2 emission estimates and isotopic (e.g., C13) emissions estimates. The gridded data are generated using the methodology described in Andres et al. (2011) and provide monthly and annual estimates for 1751-2008 at 1° latitude by 1° longitude resolution. These gridded emission estimates are being used in the latest IPCC Scientific Assessment (AR4). Isotopic estimates are possible thanks to detailed information for individual nations regarding the carbon content of select fuels (e.g., the carbon signature of natural gas from Russia). CDIAC has recently developed a relational database to house these baseline emissions estimates and associated derived products and a web-based interface to help users worldwide query these data holdings.
Users can identify, explore and download desired CDIAC fossil-fuel CO2 emissions data. This presentation introduces the architecture and design of the new relational database and web interface, summarizes the present state and functionality of the Fossil-Fuel CO2 Emissions Database and Exploration System, and highlights future plans for expansion of the relational database and interface.
Sridhar, Vishnu B; Tian, Peifang; Dale, Anders M; Devor, Anna; Saisan, Payam A
2014-01-01
We present database client software, Neurovascular Network Explorer 1.0 (NNE 1.0), that uses a MATLAB®-based Graphical User Interface (GUI) for interaction with a database of 2-photon single-vessel diameter measurements from our previous publication (Tian et al., 2010). These data are of particular interest for modeling the hemodynamic response. NNE 1.0 is downloaded by the user and then runs either as a MATLAB script or as a standalone program on a Windows platform. The GUI allows browsing the database according to parameters specified by the user, simple manipulation and visualization of the retrieved records (such as averaging and peak-normalization), and export of the results. Further, we provide the NNE 1.0 source code. With this source code, the user can store their own experimental results in the database, given the appropriate data structure and naming conventions, and thus share their data in a user-friendly format with other investigators. NNE 1.0 provides an example of a seamless and low-cost solution for sharing of experimental data by a regular-size neuroscience laboratory and may serve as a general template, facilitating dissemination of biological results and accelerating data-driven modeling approaches.
Viral taxonomy needs a spring clean; its exploration era is over.
Gibbs, Adrian J
2013-08-09
The International Committee on Taxonomy of Viruses has recently changed its approved definition of a viral species, and also discontinued work on its database of virus descriptions. These events indicate that the exploration era of viral taxonomy has ended; over the past century the principles of viral taxonomy have been established, the tools for phylogenetic inference invented, and the ultimate discriminatory data required for taxonomy, namely gene sequences, are now readily available. Further changes would make viral taxonomy more informative. First, the status of a 'taxonomic species' with an italicized name should only be given to viruses that are specifically linked with a single 'type genomic sequence' like those in the NCBI Reference Sequence Database. Secondly all approved taxa should be predominantly monophyletic, and uninformative higher taxa disendorsed. These are 'quality assurance' measures and would improve the value of viral nomenclature to its users. The ICTV should also promote the use of a public database, such as Wikipedia, to replace the ICTV database as a store of the primary metadata of individual viruses, and should publish abstracts of the ICTV Reports in that database, so that they are 'Open Access'.
Exploring the single-cell RNA-seq analysis landscape with the scRNA-tools database.
Zappia, Luke; Phipson, Belinda; Oshlack, Alicia
2018-06-25
As single-cell RNA-sequencing (scRNA-seq) datasets have become more widespread the number of tools designed to analyse these data has dramatically increased. Navigating the vast sea of tools now available is becoming increasingly challenging for researchers. In order to better facilitate selection of appropriate analysis tools we have created the scRNA-tools database (www.scRNA-tools.org) to catalogue and curate analysis tools as they become available. Our database collects a range of information on each scRNA-seq analysis tool and categorises them according to the analysis tasks they perform. Exploration of this database gives insights into the areas of rapid development of analysis methods for scRNA-seq data. We see that many tools perform tasks specific to scRNA-seq analysis, particularly clustering and ordering of cells. We also find that the scRNA-seq community embraces an open-source and open-science approach, with most tools available under open-source licenses and preprints being extensively used as a means to describe methods. The scRNA-tools database provides a valuable resource for researchers embarking on scRNA-seq analysis and records the growth of the field over time.
A systematic study of chemogenomics of carbohydrates.
Gu, Jiangyong; Luo, Fang; Chen, Lirong; Yuan, Gu; Xu, Xiaojie
2014-03-04
Chemogenomics focuses on the interactions between biologically active molecules and protein targets for drug discovery. Carbohydrates are the most abundant compounds in natural products. Compared with other drugs, the carbohydrate drugs show weaker side effects. Searching for multi-target carbohydrate drugs can be regarded as a solution to improve therapeutic efficacy and safety. In this work, we collected 60 344 carbohydrates from the Universal Natural Products Database (UNPD) and explored the chemical space of carbohydrates by principal component analysis. We found that there is a large quantity of potential lead compounds among carbohydrates. Then we explored the potential of carbohydrates in drug discovery by using a network-based multi-target computational approach. All carbohydrates were docked to 2389 target proteins. The most potential carbohydrates for drug discovery and their indications were predicted based on a docking score-weighted prediction model. We also explored the interactions between carbohydrates and target proteins to find the pathological networks, potential drug candidates and new indications.
Bioinformatics: Cheap and robust method to explore biomaterial from Indonesia biodiversity
NASA Astrophysics Data System (ADS)
Widodo
2015-02-01
Indonesia has a huge amount of biodiversity, which may contain many biomaterials for pharmaceutical application. The potency of these resources should be explored to discover new drugs for human health. However, bioactive screening using conventional methods is very expensive and time-consuming. Therefore, we developed a methodology for screening the potential of natural resources based on bioinformatics. The method is built on the fact that organisms in the same taxon will have similar genes, metabolism and secondary metabolite products. We employ bioinformatics to explore the potency of biomaterials from Indonesian biodiversity by comparing species with well-known taxa containing active compounds reported in published papers or chemical databases. We then analyze the drug-likeness, bioactivity and target proteins of the active compounds based on their molecular structure. The target protein is examined for its interactions with other proteins in the cell to determine the action mechanism of the active compound at the cellular level, as well as to predict its side effects and toxicity. Using this method, we succeeded in screening anti-cancer, immunomodulatory and anti-inflammatory agents from Indonesian biodiversity. For example, we found an anti-cancer candidate from a marine invertebrate by employing the method. The anti-cancer activity was explored based on the isolated compounds of the marine invertebrate reported in published articles and databases; we then identified the protein targets, followed by molecular pathway analysis. The data suggested that the active compound of the invertebrate is able to kill cancer cells. Further, we collected and extracted the active compound from the invertebrate and then examined its activity on cancer cells (MCF7). The MTT result showed that the methanol extract of the marine invertebrate was highly potent in killing MCF7 cells.
Therefore, we concluded that bioinformatics is a cheap and robust way to explore bioactive compounds from Indonesian biodiversity as a source of drugs and other pharmaceutical materials.
A geologic and mineral exploration spatial database for the Stillwater Complex, Montana
Zientek, Michael L.; Parks, Heather L.
2014-01-01
This report provides essential spatially referenced datasets based on geologic mapping and mineral exploration activities conducted from the 1920s to the 1990s. This information will facilitate research on the complex and provide background material needed to explore for mineral resources and to develop sound land-management policy.
End-User Use of Data Base Query Language: Pros and Cons.
ERIC Educational Resources Information Center
Nicholes, Walter
1988-01-01
Man-machine interface, the concept of a computer "query," a review of database technology, and a description of the use of query languages at Brigham Young University are discussed. The pros and cons of end-user use of database query languages are explored. (Author/MLW)
Proteomics: Protein Identification Using Online Databases
ERIC Educational Resources Information Center
Eurich, Chris; Fields, Peter A.; Rice, Elizabeth
2012-01-01
Proteomics is an emerging area of systems biology that allows simultaneous study of thousands of proteins expressed in cells, tissues, or whole organisms. We have developed this activity to enable high school or college students to explore proteomic databases using mass spectrometry data files generated from yeast proteins in a college laboratory…
Clark, A. S.; Shea, S.
1991-01-01
The use of Folio Views, a PC DOS-based product for free-text databases, is explored in three applications in an Integrated Academic Information System (IAIMS): (1) a telephone directory, (2) a grants and contracts newsletter, and (3) nursing care plans. PMID:1666967
Toward a National Computerized Database for Moving Image Materials.
ERIC Educational Resources Information Center
Gartenberg, Jon
This report summarizes a project conducted by a group of catalogers from film archives devoted to nitrate preservation, which explored ways of developing a database to provide a complete film and television information service that would be available nationwide and could contain filmographic data, information on holdings in archives and…
ProtaBank: A repository for protein design and engineering data.
Wang, Connie Y; Chang, Paul M; Ary, Marie L; Allen, Benjamin D; Chica, Roberto A; Mayo, Stephen L; Olafson, Barry D
2018-03-25
We present ProtaBank, a repository for storing, querying, analyzing, and sharing protein design and engineering data in an actively maintained and updated database. ProtaBank provides a format to describe and compare all types of protein mutational data, spanning a wide range of properties and techniques. It features a user-friendly web interface and programming layer that streamlines data deposition and allows for batch input and queries. The database schema design incorporates a standard format for reporting protein sequences and experimental data that facilitates comparison of results across different data sets. A suite of analysis and visualization tools is provided to facilitate discovery, to guide future designs, and to benchmark and train new predictive tools and algorithms. ProtaBank will provide a valuable resource to the protein engineering community by storing and safeguarding newly generated data, allowing for fast searching and identification of relevant data from the existing literature, and exploring correlations between disparate data sets. ProtaBank invites researchers to contribute data to the database to make it accessible for search and analysis. ProtaBank is available at https://protabank.org. © 2018 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
The HARPS-N archive through a Cassandra, NoSQL database suite?
NASA Astrophysics Data System (ADS)
Molinari, Emilio; Guerra, Jose; Harutyunyan, Avet; Lodi, Marcello; Martin, Adrian
2016-07-01
The TNG-INAF is developing the science archive for the WEAVE instrument. The underlying architecture of the archive is based on a non-relational database, more precisely an Apache Cassandra cluster, which uses NoSQL technology. In order to test and validate the use of this architecture, we created a local archive which we populated with all the HARPS-N spectra collected at the TNG since the instrument's start of operations in mid-2012, as well as developed tools for the analysis of this data set. The HARPS-N data set is two orders of magnitude smaller than WEAVE, but we want to demonstrate the ability to walk through a complete data set and produce scientific output, as valuable as that produced by an ordinary pipeline, though without directly accessing the FITS files. The analytics are done with Apache Solr and Spark, and on a relational PostgreSQL database. As an example, we produce observables like metallicity indexes for the targets in the archive and compare the results with the ones coming from the HARPS-N regular data reduction software. The aim of this experiment is to explore the viability of a high-availability cluster and distributed NoSQL database as a platform for complex scientific analytics on a large data set, which will then be ported to the WEAVE Archive System (WAS) which we are developing for the WEAVE multi-object fiber spectrograph.
Monolithic Cu-Cr-Nb Alloys for High Temperature, High Heat Flux Applications
NASA Technical Reports Server (NTRS)
Ellis, David L.; Locci, Ivan E.; Michal, Gary M.; Humphrey, Derek M.
1999-01-01
Work during the prior four years of this grant has resulted in significant advances in the development of Cu-8Cr-4Nb and related Cu-Cr-Nb alloys. The alloys are nearing commercial use in the Reusable Launch Vehicle (RLV) where they are candidate materials for the thrust cell liners of the aerospike engines being developed by Rocketdyne. During the fifth and final year of the grant, it is proposed to complete development of the design-level database of mechanical and thermophysical properties and transfer it to NASA Glenn Research Center and Rocketdyne. The database development work will be divided into three main areas: Thermophysical Database Augmentation, Mechanical Testing, and Metallography and Fractography. In addition to the database development, work will continue that is focused on the production of alternatives to the powder metallurgy alloys currently used. Exploration of alternative alloys will be aimed at both the development of lower cost materials and higher performance materials. A key element of this effort will be the use of Thermo-Calc software to survey the solubility behavior of a wide range of alloying elements in a copper matrix. The ultimate goals would be to define suitable alloy compositions and processing routes to produce thin sheets of the material at either a lower cost, or, with improved mechanical and thermal properties compared to the current Cu-Cr-Nb powder metallurgy alloys.
The LSST Data Mining Research Agenda
NASA Astrophysics Data System (ADS)
Borne, K.; Becla, J.; Davidson, I.; Szalay, A.; Tyson, J. A.
2008-12-01
We describe features of the LSST science database that are amenable to scientific data mining, object classification, outlier identification, anomaly detection, image quality assurance, and survey science validation. The data mining research agenda includes: scalability (at petabyte scales) of existing machine learning and data mining algorithms; development of grid-enabled parallel data mining algorithms; designing a robust system for brokering classifications from the LSST event pipeline (which may produce 10,000 or more event alerts per night); multi-resolution methods for exploration of petascale databases; indexing of multi-attribute, multi-dimensional astronomical databases (beyond spatial indexing) for rapid querying of petabyte databases; and more.
Database trial impact on graduate nursing comprehensive exams
Pionke, Katharine; Huckstadt, Alicia
2015-01-01
During a trial period of new databases, the question arose of whether database access affects outcomes of graduate nursing comprehensive examinations. This study explored that question using citation analysis of exams taken during the database trial and exams that were not. The findings showed no difference in examination pass/fail rates. While the pass/fail rates did not change, a great deal was learned in terms of citation accuracy and the types of materials that students used, leading to discussions about changing how citation and plagiarism awareness were taught. PMID:26512218
78 FR 46338 - Agency Information Collection Activities: Proposed Collection; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2013-07-31
... Quality's (AHRQ) Hospital Survey on Patient Safety Culture Comparative Database.'' In accordance with the... Safety Culture Comparative Database Request for information collection approval. The Agency for... on Patient Safety Culture (Hospital SOPS) Comparative Database; OMB NO. 0935-0162, last approved on...
International exploration of Mars. A special bibliography
NASA Technical Reports Server (NTRS)
1991-01-01
This bibliography lists 173 reports, articles, and other documents introduced into the NASA Scientific and Technical Information Database on the exploration of Mars. Historical references are cited for background. The bibliography was created for the 1991 session of the International Space University.
Rieckmann, Andreas; Tamason, Charlotte C.; Gurley, Emily S.; Rod, Naja Hulvej; Jensen, Peter Kjær Mackie
2018-01-01
Cholera outbreaks in Africa have been attributed to both droughts and floods, but whether the risk of a cholera outbreak is elevated during droughts is unknown. We estimated the risk of cholera outbreaks during droughts and floods compared with drought- and flood-free periods in 40 sub-Saharan African countries during 1990–2010 based on data from the Emergency Events Database: the Office of Foreign Disaster Assistance/Centre for Research on the Epidemiology of Disasters International Disaster Database (www.emdat.be). A cholera outbreak was registered in one of every three droughts and one of every 15 floods. We observed an increased incidence rate of cholera outbreaks during drought periods (incidence rate ratio [IRR] = 4.3, 95% confidence interval [CI] = 2.9–7.2) and during flood periods (IRR = 144, 95% CI = 101–208) when compared with drought/flood-free periods. Floods are more strongly associated with cholera outbreaks, yet the prevalence of cholera outbreaks is higher during droughts because of droughts' long durations. The results suggest that droughts, in addition to floods, call for increased cholera preparedness. PMID:29512484
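The incidence rate ratios above compare outbreak rates between exposed (drought or flood) and unexposed periods. A minimal sketch of that calculation with a Wald confidence interval on the log scale; the event counts and exposure times here are hypothetical, since the abstract reports only the resulting estimates:

```python
# Incidence rate ratio (IRR) with an approximate 95% Wald CI on the log
# scale. Counts and exposure times below are invented for illustration.
import math

def irr_with_ci(events_exposed, time_exposed,
                events_unexposed, time_unexposed, z=1.96):
    """Rate ratio of exposed vs. unexposed periods, with ~95% CI."""
    irr = (events_exposed / time_exposed) / (events_unexposed / time_unexposed)
    se_log = math.sqrt(1 / events_exposed + 1 / events_unexposed)
    lo = math.exp(math.log(irr) - z * se_log)
    hi = math.exp(math.log(irr) + z * se_log)
    return irr, lo, hi

# Hypothetical: 30 outbreaks in 500 drought-months vs. 70 in 5000 other months
irr, lo, hi = irr_with_ci(30, 500, 70, 5000)
print(f"IRR = {irr:.1f} (95% CI {lo:.1f}-{hi:.1f})")
```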
Bolbase: a comprehensive genomics database for Brassica oleracea.
Yu, Jingyin; Zhao, Meixia; Wang, Xiaowu; Tong, Chaobo; Huang, Shunmou; Tehrim, Sadia; Liu, Yumei; Hua, Wei; Liu, Shengyi
2013-09-30
Brassica oleracea is a morphologically diverse species in the family Brassicaceae and contains a group of nutrition-rich vegetable crops, including common heading cabbage, cauliflower, broccoli, kohlrabi, kale, and Brussels sprouts. This diversity along with its phylogenetic membership in a group of three diploid and three tetraploid species, and the recent availability of genome sequences within Brassica provide an unprecedented opportunity to study intra- and inter-species divergence and evolution in this species and its close relatives. We have developed a comprehensive database, Bolbase, which provides access to the B. oleracea genome data and comparative genomics information. The whole genome of B. oleracea is available, including nine fully assembled chromosomes and 1,848 scaffolds, with 45,758 predicted genes, 13,382 transposable elements, and 3,581 non-coding RNAs. Comparative genomics information is available, including syntenic regions among B. oleracea, Brassica rapa and Arabidopsis thaliana, synonymous (Ks) and non-synonymous (Ka) substitution rates between orthologous gene pairs, gene families or clusters, and differences in quantity, category, and distribution of transposable elements on chromosomes. Bolbase provides useful search and data mining tools, including a keyword search, a local BLAST server, and a customized GBrowse tool, which can be used to extract annotations of genome components, identify similar sequences and visualize syntenic regions among species. Users can download all genomic data and explore comparative genomics in a highly visual setting. Bolbase is the first resource platform for the B. oleracea genome and for genomic comparisons with its relatives, and thus it will help the research community to better study the function and evolution of Brassica genomes as well as enhance molecular breeding research.
This database will be updated regularly with new features, improvements to genome annotation, and new genomic sequences as they become available. Bolbase is freely available at http://ocri-genomics.org/bolbase.
Bibliometrics of NIHR HTA monographs and their related journal articles
Royle, Pamela
2015-01-01
Objectives A bibliometric analysis of the UK National Institute for Health Research (NIHR) Health Technology Assessment (HTA) monographs and their related journal articles by: (1) exploring the differences in citations to the HTA monographs in Google Scholar (GS), Scopus and Web of Science (WoS), and (2) comparing Scopus citations to the monographs with their related journal articles. Setting A study of 111 HTA monographs published in 2010 and 2011, and their external journal articles. Main outcome measures Citations to the monographs in GS, Scopus and WoS, and to their external journal articles in Scopus. Results The number of citations varied among the three databases, with GS having the highest and WoS the lowest; however, the citation-based rankings among the databases were highly correlated. Overall, 56% of monographs had a related publication, with the highest proportion for primary research (76%) and lowest for evidence syntheses (43%). There was a large variation in how the monographs were cited compared to journal articles, resulting in more frequent problems with unlinked citations in Scopus and WoS. When comparing differences in the number of citations between monograph publications and their related journal articles from the same project, we found that monographs received more citations than their journal articles for evidence syntheses and methodology projects; by contrast, journal articles related to primary research monographs were more highly cited than their monograph. Conclusions The numbers of citations to the HTA monographs differed considerably between the databases, but were highly correlated. When an HTA monograph had a journal article from the same study, there were more citations to the journal article for primary research, but more to the monographs for evidence syntheses. Citations to the related journal articles were more reliably recorded than citations to the HTA monographs. PMID:25694457
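The observation that citation counts differ across databases while citation-based rankings stay highly correlated is naturally checked with Spearman's rank correlation, i.e. the Pearson correlation of the ranks. A self-contained sketch with invented citation counts (the study's actual data are not reproduced here):

```python
# Spearman rank correlation from scratch: counts differ across sources,
# but an identical ordering still gives rho = 1. Citation counts invented.
def ranks(values):
    """Rank values from 1 (smallest), averaging ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rho = Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

google_scholar = [120, 45, 80, 10, 60]   # hypothetical citation counts
web_of_science = [70, 20, 50, 5, 30]     # lower counts, same ordering

print(spearman(google_scholar, web_of_science))
```

With identical orderings the rank correlation is exactly 1 even though every individual count differs, which is the pattern the study reports.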
Exploring the molecular mechanism of rhinitis via bioinformatics methods
Song, Yufen; Yan, Zhaohui
2018-01-01
The aim of this study was to analyze gene expression profiles to explore the function and regulatory network of differentially expressed genes (DEGs) in the pathogenesis of rhinitis using bioinformatics methods. The gene expression profile of GSE43523 was downloaded from the Gene Expression Omnibus database. The dataset contained 7 seasonal allergic rhinitis samples and 5 non-allergic normal samples. DEGs between rhinitis samples and normal samples were identified via the limma package of R. The WebGestalt database was used to identify enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of the DEGs. The differentially co-expressed pairs of the DEGs were identified via the DCGL package in R, and a differential co-expression network was constructed based on these pairs. A protein-protein interaction (PPI) network of the DEGs was constructed based on the Search Tool for the Retrieval of Interacting Genes database. A total of 263 DEGs were identified in rhinitis samples compared with normal samples, including 125 downregulated ones and 138 upregulated ones. The DEGs were enriched in 7 KEGG pathways, and 308 differential co-expression gene pairs were obtained. A differential co-expression network was constructed, containing 212 nodes. In total, 148 PPI pairs of the DEGs were identified, and a PPI network was constructed based on these pairs. Bioinformatics methods could help identify significant genes and pathways related to the pathogenesis of rhinitis. The steroid biosynthesis pathway and metabolic pathways might play important roles in the development of allergic rhinitis (AR). Genes such as CDC42 effector protein 5, solute carrier family 39 member A11 and PR/SET domain 10 might also be associated with the pathogenesis of AR, providing references for the molecular mechanisms of AR. PMID:29257233
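DEG calling of the kind described above compares each gene's expression between case and control groups. The study used R's limma package, whose moderated statistics are more sophisticated; the sketch below uses a plain Welch t-statistic in Python, with hypothetical gene names and expression values, just to show the shape of the computation:

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t-statistic between two expression vectors (unequal variances)."""
    va, vb = variance(a), variance(b)
    return (mean(a) - mean(b)) / ((va / len(a) + vb / len(b)) ** 0.5)

# Hypothetical log2 expression values: 7 allergic vs 5 control samples,
# mirroring the GSE43523 group sizes but not its data.
expression = {
    "GENE_UP":   ([8.1, 8.3, 7.9, 8.4, 8.2, 8.0, 8.3], [5.1, 5.3, 4.9, 5.2, 5.0]),
    "GENE_FLAT": ([6.0, 6.2, 5.9, 6.1, 6.0, 6.1, 5.9], [6.1, 5.9, 6.0, 6.2, 6.0]),
}

# Flag genes whose |t| exceeds an (arbitrary, illustrative) cutoff.
degs = [g for g, (case, ctrl) in expression.items()
        if abs(welch_t(case, ctrl)) > 3.0]
print(degs)  # -> ['GENE_UP']
```

In practice one would also compute p-values and correct for multiple testing across all genes, which is exactly what limma's pipeline handles.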
pyGeno: A Python package for precision medicine and proteogenomics.
Daouda, Tariq; Perreault, Claude; Lemieux, Sébastien
2016-01-01
pyGeno is a Python package mainly intended for precision medicine applications that revolve around genomics and proteomics. It integrates reference sequences and annotations from Ensembl, genomic polymorphisms from the dbSNP database and data from next-gen sequencing into an easy-to-use, memory-efficient and fast framework, allowing the user to easily explore subject-specific genomes and proteomes. Compared with a standalone program, pyGeno gives the user access to the complete expressivity of Python, a general-purpose programming language. Its range of application therefore encompasses both short scripts and large-scale genome-wide studies. PMID:27785359
A new generation of intelligent trainable tools for analyzing large scientific image databases
NASA Technical Reports Server (NTRS)
Fayyad, Usama M.; Smyth, Padhraic; Atkinson, David J.
1994-01-01
The focus of this paper is on the detection of natural, as opposed to human-made, objects. The distinction is important because, in the context of image analysis, natural objects tend to possess much greater variability in appearance than human-made objects. Hence, we shall focus primarily on the use of algorithms that 'learn by example' as the basis for image exploration. The 'learn by example' approach is potentially more generally applicable compared to model-based vision methods since domain scientists find it relatively easier to provide examples of what they are searching for versus describing a model.
He, Ying; Chang, Tsung C; Li, Haijing; Shi, Gongyi; Tang, Yi-Wei
2011-07-01
More than 20 species of Legionella have been identified in relation to human infections. Rapid detection and identification of Legionella isolates is clinically useful to differentiate between infection and contamination and to determine treatment regimens. We explored the use of matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) Biotyper system (Bruker Daltonik GmbH, Bremen, Germany) for the identification of Legionella species. The MALDI MS spectra were generated and compared with the Biotyper database, which includes 25 Legionella strains covering 22 species and four Legionella pneumophila serogroups. A total of 83 blind-coded Legionella strains, consisting of 54 reference and 29 clinical strains, were analyzed in the study. Overall, the Biotyper system correctly identified 51 (61.4%) of all strains and isolates to the species level. For species included in the Biotyper database, the method identified 51 (86.4%) strains out of 59 Legionella strains to the correct species level, including 24 (100%) L. pneumophila and 27 (77.1%) non-L. pneumophila strains. The remaining 24 Legionella strains, belonging to species not covered by the Biotyper database, were either identified to the Legionella genus level or had no reliable identification. The Biotyper system produces constant and reproducible MALDI MS spectra for Legionella strains and can be used for rapid and accurate Legionella identification. More Legionella strains, especially the non-L. pneumophila strains, need to be included in the current Biotyper database to cover varieties of Legionella species and to increase identification accuracy.
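Biotyper-style identification is essentially nearest-neighbour search over peak profiles: the query spectrum is compared against each reference spectrum and the best-scoring species is reported. The actual Biotyper scoring algorithm is proprietary and more involved; the sketch below uses cosine similarity over invented binned peak-intensity vectors purely to illustrate the matching step:

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length intensity vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

# Hypothetical binned peak intensities for two reference spectra
# (real Biotyper references cover thousands of m/z bins).
library = {
    "L. pneumophila": [9, 0, 4, 1, 7, 0],
    "L. longbeachae": [1, 8, 0, 6, 0, 3],
}

query = [8, 1, 5, 0, 6, 1]  # spectrum of an unknown isolate
best = max(library, key=lambda sp: cosine(library[sp], query))
print(best)  # -> 'L. pneumophila'
```

The abstract's observation that species absent from the library yield only genus-level or unreliable calls follows directly: a nearest neighbour always exists, but its similarity score falls below the acceptance threshold.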
De Natale, Antonino; Pezzatti, Gianni Boris; Pollio, Antonino
2009-01-01
Background Ethnobotanical studies generally describe the traditional knowledge of a territory according to a "hic et nunc" principle. The need to approach this field by also embedding historical data has been frequently acknowledged. With their long history of civilization, some regions of the Mediterranean basin seem particularly well suited to an historical approach. Campania, a region of southern Italy, was selected for the implementation of a database containing present and past information on plant uses. Methods A relational database was built on the basis of information gathered from different historical sources, including diaries, travel accounts, and treatises on medicinal plants, written by explorers, botanists, and physicians who travelled in Campania during the last three centuries. Moreover, ethnobotanical uses described in historical herbal collections and in Ancient and Medieval texts from the Mediterranean Region have been included in the database. Results 1672 different uses, ranging from medicinal to alimentary, ceremonial, and veterinary, have been recorded for the 474 species listed in the database. Information is not uniformly spread over the Campanian territory, Sannio being the most studied geographical area and Cilento the least studied. About 50 plants have been used continuously over the last three centuries to treat the same affections. A comparison with the uses reported for the same species in Ancient treatises shows that the derivation of present ethnomedicine from old learned medical doctrines needs case-by-case confirmation. Conclusion The database is flexible enough to represent a useful tool for researchers who need to store and compare present and previous ethnobotanical uses from Mediterranean Countries. PMID:19228384
Exploration and Evaluation of Nanometer Low-power Multi-core VLSI Computer Architectures
2015-03-01
ICC, the Milkyway database was created using the command: milkyway –galaxy –nogui –tcl –log memory.log one.tcl As stated previously, it is...EDA tools. Typically, Synopsys® tools use Milkyway databases, whereas Cadence Design Systems® tools use Layout Exchange Format (LEF) files. To help
Proposal for Implementing Multi-User Database (MUD) Technology in an Academic Library.
ERIC Educational Resources Information Center
Filby, A. M. Iliana
1996-01-01
Explores the use of MOO (multi-user object oriented) virtual environments in academic libraries to enhance reference services. Highlights include the development of multi-user database (MUD) technology from gaming to non-recreational settings; programming issues; collaborative MOOs; MOOs as distinguished from other types of virtual reality; audio…
PharmDB-K: Integrated Bio-Pharmacological Network Database for Traditional Korean Medicine
Lee, Ji-Hyun; Park, Kyoung Mii; Han, Dong-Jin; Bang, Nam Young; Kim, Do-Hee; Na, Hyeongjin; Lim, Semi; Kim, Tae Bum; Kim, Dae Gyu; Kim, Hyun-Jung; Chung, Yeonseok; Sung, Sang Hyun; Surh, Young-Joon; Kim, Sunghoon; Han, Byung Woo
2015-01-01
Despite the growing attention given to Traditional Medicine (TM) worldwide, there is no well-known, publicly available, integrated bio-pharmacological Traditional Korean Medicine (TKM) database for researchers in drug discovery. In this study, we have constructed PharmDB-K, which offers comprehensive information relating to TKM-associated drugs (compound), disease indication, and protein relationships. To explore the underlying molecular interaction of TKM, we integrated fourteen different databases, six Pharmacopoeias, and literature, and established a massive bio-pharmacological network for TKM and experimentally validated some cases predicted from the PharmDB-K analyses. Currently, PharmDB-K contains information about 262 TKMs, 7,815 drugs, 3,721 diseases, 32,373 proteins, and 1,887 side effects. One of the unique sets of information in PharmDB-K includes 400 indicator compounds used for standardization of herbal medicine. Furthermore, we are operating PharmDB-K via phExplorer (a network visualization software) and BioMart (a data federation framework) for convenient search and analysis of the TKM network. Database URL: http://pharmdb-k.org, http://biomart.i-pharm.org. PMID:26555441
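The TKM-compound-protein-disease relationships that PharmDB-K integrates form a layered network, and queries like "which disease indications are reachable from a given formula?" reduce to graph traversal. A breadth-first-search sketch over a tiny invented network follows; Ssanghwa-tang, the compounds, and the proteins are real names, but the specific edges are hypothetical, not entries from PharmDB-K:

```python
from collections import deque

# Illustrative tripartite network: TKM formula -> compounds -> proteins -> diseases.
# Edges are hypothetical and for demonstration only.
edges = {
    "TKM:Ssanghwa-tang": ["cmpd:paeoniflorin", "cmpd:glycyrrhizin"],
    "cmpd:paeoniflorin": ["prot:TNF"],
    "cmpd:glycyrrhizin": ["prot:HMGB1"],
    "prot:TNF": ["dis:rheumatoid arthritis"],
    "prot:HMGB1": ["dis:sepsis"],
}

def reachable_diseases(start):
    """BFS over the drug-protein-disease network from a TKM entry."""
    seen, queue, hits = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        if node.startswith("dis:"):
            hits.append(node)
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(hits)

print(reachable_diseases("TKM:Ssanghwa-tang"))
# -> ['dis:rheumatoid arthritis', 'dis:sepsis']
```

Network-visualization front ends such as the phExplorer tool mentioned in the abstract essentially render this same traversal interactively.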
Cheng, Ching-Wu; Leu, Sou-Sen; Cheng, Ying-Mei; Wu, Tsung-Chih; Lin, Chen-Chung
2012-09-01
Construction accident research involves the systematic sorting, classification, and encoding of comprehensive databases of injuries and fatalities. The present study explores the causes and distribution of occupational accidents in the Taiwan construction industry by analyzing such a database using the data mining method known as classification and regression tree (CART). Utilizing a database of 1542 accident cases during the period 2000-2009, the study seeks to establish potential cause-and-effect relationships regarding serious occupational accidents in the industry. The results of this study show that the occurrence rules for falls and collapses in both public and private project construction industries serve as key factors to predict the occurrence of occupational injuries. The results of the study provide a framework for improving the safety practices and training programs that are essential to protecting construction workers from occasional or unexpected accidents. Copyright © 2011 Elsevier Ltd. All rights reserved.
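The CART method named above grows a tree by repeatedly choosing the split that minimizes weighted Gini impurity of the outcome. A minimal single-split sketch over invented accident records (the field names and values are illustrative, not the study's coding scheme):

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split(rows, features):
    """Find the (feature, value) equality split minimizing weighted Gini impurity."""
    best = None
    for f in features:
        for v in {r[f] for r in rows}:
            left = [r["severity"] for r in rows if r[f] == v]
            right = [r["severity"] for r in rows if r[f] != v]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, v)
    return best[1], best[2]

# Hypothetical accident records, echoing the abstract's falls/collapses theme.
rows = [
    {"cause": "fall",     "sector": "public",  "severity": "fatal"},
    {"cause": "fall",     "sector": "private", "severity": "fatal"},
    {"cause": "collapse", "sector": "public",  "severity": "fatal"},
    {"cause": "electric", "sector": "private", "severity": "injury"},
    {"cause": "electric", "sector": "public",  "severity": "injury"},
]

print(best_split(rows, ["cause", "sector"]))  # -> ('cause', 'electric')
```

A full CART implementation recurses on each side of the chosen split and prunes the resulting tree; the split criterion above is the core of it.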
Young, Katherine
2014-09-30
database.) In fiscal year 2015, NREL is working with universities to populate additional case studies on OpenEI. The goal is to provide a large enough dataset to start conducting analyses of exploration programs to identify correlations between successful exploration plans for areas with similar geologic occurrence models.
Catlin, Ann Christine; Fernando, Sumudinie; Gamage, Ruwan; Renner, Lorna; Antwi, Sampson; Tettey, Jonas Kusah; Amisah, Kofi Aikins; Kyriakides, Tassos; Cong, Xiangyu; Reynolds, Nancy R; Paintsil, Elijah
2015-01-01
Prevalence of pediatric HIV disclosure is low in resource-limited settings. Innovative, culturally sensitive, and patient-centered disclosure approaches are needed. Conducting such studies in resource-limited settings is not trivial considering the challenges of capturing, cleaning, and storing clinical research data. To overcome some of these challenges, the Sankofa pediatric disclosure intervention adopted an interactive cyber infrastructure for data capture and analysis. The Sankofa Project database system is built on the HUBzero cyber infrastructure (https://hubzero.org), an open source software platform. The hub database components support: (1) data management - the "databases" component creates, configures, and manages database access, backup, repositories, applications, and access control; (2) data collection - the "forms" component is used to build customized web case report forms that incorporate common data elements and include tailored form submit processing to handle error checking, data validation, and data linkage as the data are stored to the database; and (3) data exploration - the "dataviewer" component provides powerful methods for users to view, search, sort, navigate, explore, map, graph, visualize, aggregate, drill-down, compute, and export data from the database. The Sankofa cyber data management tool supports a user-friendly, secure, and systematic collection of all data. We have screened more than 400 child-caregiver dyads and enrolled nearly 300 dyads, with tens of thousands of data elements. The dataviews have successfully supported all data exploration and analysis needs of the Sankofa Project. Moreover, the ability of the sites to query and view data summaries has proven to be an incentive for collecting complete and accurate data. The data system has all the desirable attributes of an electronic data capture tool.
It also provides an added advantage of building data management capacity in resource-limited settings due to its innovative data query and summary views and availability of real-time support by the data management team.
Evaluation of Database Coverage: A Comparison of Two Methodologies.
ERIC Educational Resources Information Center
Tenopir, Carol
1982-01-01
Describes experiment which compared two techniques used for evaluating and comparing database coverage of a subject area, e.g., "bibliography" and "subject profile." Differences in time, cost, and results achieved are compared by applying techniques to field of volcanology using two databases, Geological Reference File and GeoArchive. Twenty…
Konc, Janez; Cesnik, Tomo; Konc, Joanna Trykowska; Penca, Matej; Janežič, Dušanka
2012-02-27
ProBiS-Database is a searchable repository of precalculated local structural alignments in proteins detected by the ProBiS algorithm in the Protein Data Bank. Identification of functionally important binding regions of the protein is facilitated by structural similarity scores mapped to the query protein structure. PDB structures that have been aligned with a query protein may be rapidly retrieved from the ProBiS-Database, which is thus able to generate hypotheses concerning the roles of uncharacterized proteins. Presented with uncharacterized protein structure, ProBiS-Database can discern relationships between such a query protein and other better known proteins in the PDB. Fast access and a user-friendly graphical interface promote easy exploration of this database of over 420 million local structural alignments. The ProBiS-Database is updated weekly and is freely available online at http://probis.cmm.ki.si/database.
Fei, Lin; Zhao, Jing; Leng, Jiahao; Zhang, Shujian
2017-10-12
The ALIPORC full-text database is a specialized full-text database of acupuncture literature from the Republic of China era. Its construction began in 2015 and is still under way, focusing on acupuncture-related books, articles and advertising documents completed or published during the Republic of China period. The database aims to make this acupuncture medical literature a shared resource through diverse retrieval approaches and accurate content presentation; it facilitates exchange among scholars, reduces the paper damage caused by repeated handling of the originals, and simplifies retrieval of rare literature. The authors describe the database in terms of its sources, characteristics and current state of construction, and discuss improving its efficiency and integrity and deepening the development of acupuncture literature of the Republic of China.
A Framework for Cloudy Model Optimization and Database Storage
NASA Astrophysics Data System (ADS)
Calvén, Emilia; Helton, Andrew; Sankrit, Ravi
2018-01-01
We present a framework for producing Cloudy photoionization models of the nebular emission from novae ejecta and storing a subset of the results in an SQL database for later use. The database can be searched for the models best fitting observed spectral line ratios. Additionally, the framework includes an optimization feature that can be used in tandem with the database to search for and improve on models by creating new Cloudy models while varying the parameters. The database search and optimization can be used to explore the structures of nebulae by deriving their properties from the best-fit models. The goal is to provide the community with a large database of Cloudy photoionization models, generated from parameters reflecting conditions within novae ejecta, that can be easily fitted to observed spectral lines; either by directly accessing the database using the framework code or by usage of a website specifically made for this purpose.
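The best-fit lookup described above reduces to a least-squares ORDER BY over stored line ratios. A minimal sketch using Python's sqlite3; the schema and ratio column names are invented for illustration, not Cloudy's actual output format:

```python
import sqlite3

# In-memory table of model outputs: two illustrative line ratios per model.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE models (id INTEGER PRIMARY KEY, o3_hb REAL, n2_ha REAL)")
con.executemany("INSERT INTO models VALUES (?, ?, ?)",
                [(1, 3.2, 0.8), (2, 5.1, 0.4), (3, 1.0, 1.5)])

# Observed spectral line ratios (hypothetical values).
obs_o3, obs_n2 = 5.0, 0.5

# Rank models by squared distance to the observation; take the closest.
row = con.execute(
    "SELECT id FROM models "
    "ORDER BY (o3_hb - ?) * (o3_hb - ?) + (n2_ha - ?) * (n2_ha - ?) LIMIT 1",
    (obs_o3, obs_o3, obs_n2, obs_n2)).fetchone()
print(row[0])  # -> 2
```

Pushing the distance computation into the SQL engine, as here, is what makes searching a large precomputed model grid cheap compared with rerunning Cloudy.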
Asian Americans and materialism: Exploring the phenomenon and its why and when.
Zhang, Jia Wei
2018-05-24
Consumer values, including but not limited to materialism, have received much less attention than other topics within research on Asian Americans. Across 3 studies (N = 6,955), the author explored the difference between Asian Americans and White/European Americans on materialism, and the mediating and moderating mechanisms. Studies 1a-1c found Asian Americans, compared to White/European Americans, more strongly endorsed materialistic values. In Study 2, the author tested a multiple mediation model and demonstrated that Asian Americans, compared to White/European Americans, more strongly endorse materialistic values because they reported higher extrinsic aspirations (i.e., stronger desires for money, image, and popularity). Finally, in Study 3, the author tested a moderation model and found that Asian Americans who are higher on a general tendency to adhere to norms endorse a greater level of materialism than White/European Americans. The author discussed the implications of these results for expanding the research topics within research on Asian Americans and the consequences for mental health, and provided future directions for counteracting materialism. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Exploring Large-Scale Cross-Correlation for Teleseismic and Regional Seismic Event Characterization
NASA Astrophysics Data System (ADS)
Dodge, Doug; Walter, William; Myers, Steve; Ford, Sean; Harris, Dave; Ruppert, Stan; Buttler, Dave; Hauk, Terri
2013-04-01
The decrease in costs of both digital storage space and computation power invites new methods of seismic data processing. At Lawrence Livermore National Laboratory (LLNL) we operate a growing research database of seismic events and waveforms for nuclear explosion monitoring and other applications. Currently the LLNL database contains several million events associated with tens of millions of waveforms at thousands of stations. We are making use of this database to explore the power of seismic waveform correlation to quantify signal similarities, to discover new events not in catalogs, and to more accurately locate events and identify source types. Building on the very efficient correlation methodologies of Harris and Dodge (2011) we computed the waveform correlation for event pairs in the LLNL database in two ways. First we performed entire waveform cross-correlation over seven distinct frequency bands. The correlation coefficient exceeds 0.6 for more than 40 million waveform pairs for several hundred thousand events at more than a thousand stations. These correlations reveal clusters of mining events and aftershock sequences, which can be used to readily identify and locate events. Second we determine relative pick times by correlating signals in time windows for distinct seismic phases. These correlated picks are then used to perform very high accuracy event relocations. We are examining the percentage of events that correlate as a function of magnitude and observing station distance in selected high seismicity regions. Combining these empirical results and those using synthetic data, we are working to quantify relationships between correlation and event pair separation (in epicenter and depth) as well as mechanism differences. Our exploration of these techniques on a large seismic database is in process and we will report on our findings in more detail at the meeting.
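The waveform-correlation measure underlying both steps of this work can be sketched as template matching: slide a short signal along a longer trace and report the lag maximizing the normalized correlation coefficient. Synthetic number sequences stand in for real seismograms here, and no mean removal or band-pass filtering is applied, so this is a bare-bones sketch of the idea rather than the authors' pipeline:

```python
from math import sqrt

def best_lag(trace, template):
    """Return (lag, coefficient) maximizing the energy-normalized
    cross-correlation of a short template against a longer waveform."""
    m = len(template)
    et = sqrt(sum(x * x for x in template))  # template energy
    best = (0, -1.0)
    for lag in range(len(trace) - m + 1):
        win = trace[lag:lag + m]
        ew = sqrt(sum(x * x for x in win))
        if ew == 0:
            continue  # skip flat windows to avoid division by zero
        cc = sum(a * b for a, b in zip(win, template)) / (ew * et)
        if cc > best[1]:
            best = (lag, cc)
    return best

# Synthetic waveforms: the template equals the trace segment at sample 3.
trace = [0.0, 0.1, -0.2, 1.0, -0.8, 0.5, 0.2, -0.1, 0.0]
template = [1.0, -0.8, 0.5]
lag, cc = best_lag(trace, template)
print(lag, round(cc, 3))  # -> 3 1.0
```

The same coefficient, computed between whole waveforms of event pairs, is what the 0.6 threshold in the abstract is applied to; computed on per-phase windows, its peak lag gives the relative pick times used for relocation.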
Exploring Large-Scale Cross-Correlation for Teleseismic and Regional Seismic Event Characterization
NASA Astrophysics Data System (ADS)
Dodge, D.; Walter, W. R.; Myers, S. C.; Ford, S. R.; Harris, D.; Ruppert, S.; Buttler, D.; Hauk, T. F.
2012-12-01
The decrease in costs of both digital storage space and computation power invites new methods of seismic data processing. At Lawrence Livermore National Laboratory (LLNL) we operate a growing research database of seismic events and waveforms for nuclear explosion monitoring and other applications. Currently the LLNL database contains several million events associated with tens of millions of waveforms at thousands of stations. We are making use of this database to explore the power of seismic waveform correlation to quantify signal similarities, to discover new events not in catalogs, and to more accurately locate events and identify source types. Building on the very efficient correlation methodologies of Harris and Dodge (2011) we computed the waveform correlation for event pairs in the LLNL database in two ways. First we performed entire waveform cross-correlation over seven distinct frequency bands. The correlation coefficient exceeds 0.6 for more than 40 million waveform pairs for several hundred thousand events at more than a thousand stations. These correlations reveal clusters of mining events and aftershock sequences, which can be used to readily identify and locate events. Second we determine relative pick times by correlating signals in time windows for distinct seismic phases. These correlated picks are then used to perform very high accuracy event relocations. We are examining the percentage of events that correlate as a function of magnitude and observing station distance in selected high seismicity regions. Combining these empirical results and those using synthetic data, we are working to quantify relationships between correlation and event pair separation (in epicenter and depth) as well as mechanism differences. Our exploration of these techniques on a large seismic database is in process and we will report on our findings in more detail at the meeting.
Ghandikota, Sudhir; Hershey, Gurjit K Khurana; Mersha, Tesfaye B
2018-03-24
Advances in high-throughput sequencing technologies have made it possible to generate multiple omics data at an unprecedented rate and scale. The accumulation of these omics data far outpaces the rate at which biologists can mine them and generate new hypotheses to test experimentally. There is an urgent need to develop powerful tools to efficiently and effectively search and filter these resources to address specific post-GWAS functional genomics questions. To date, however, these resources are scattered across several databases and often lack a unified portal for data annotation and analytics. In addition, existing tools to analyze and visualize these databases are highly fragmented, forcing researchers to access multiple applications and perform manual interventions for each gene or variant in an ad hoc fashion until all their questions are answered. In this study, we present GENEASE, a web-based one-stop bioinformatics tool designed not only to query and explore multi-omics and phenotype databases (e.g., GTEx, ClinVar, dbGaP, GWAS Catalog, ENCODE, Roadmap Epigenomics, KEGG, Reactome, Gene and Phenotype Ontology) in a single web interface but also to perform seamless post genome-wide association downstream functional and overlap analysis for non-coding regulatory variants. GENEASE accesses over 50 different databases in the public domain, including model organism-specific databases, to facilitate gene/variant and disease exploration, enrichment and overlap analysis in real time. It is a user-friendly tool with a point-and-click interface containing links to support information, including a user manual and examples. GENEASE can be accessed freely at http://research.cchmc.org/mershalab/genease_new/login.html. Tesfaye.Mersha@cchmc.org, Sudhir.Ghandikota@cchmc.org. Supplementary data are available at Bioinformatics online.
Specialized microbial databases for inductive exploration of microbial genome sequences
Fang, Gang; Ho, Christine; Qiu, Yaowu; Cubas, Virginie; Yu, Zhou; Cabau, Cédric; Cheung, Frankie; Moszer, Ivan; Danchin, Antoine
2005-01-01
Background The enormous amount of genome sequence data calls for user-oriented databases to manage sequences and annotations. Queries must include search tools permitting function identification through exploration of related objects. Methods The GenoList package for collecting and mining microbial genome databases has been rewritten using MySQL as the database management system. Functions that were not available in MySQL, such as nested subqueries, have been implemented. Results Inductive reasoning in the study of genomes starts from "islands of knowledge", centered around genes with some known background. With this concept of "neighborhood" in mind, a modified version of the GenoList structure has been used for organizing sequence data from prokaryotic genomes of particular interest in China. GenoChore, a set of 17 specialized end-user-oriented microbial databases (including one instance of Microsporidia, Encephalitozoon cuniculi, a member of Eukarya) has been made publicly available. These databases allow the user to browse genome sequence and annotation data using standard queries. In addition they provide a weekly update of searches against the world-wide protein sequence data libraries, allowing one to monitor annotation updates on genes of interest. Finally, they allow users to search for patterns in DNA or protein sequences, taking into account a clustering of genes into formal operons, as well as providing extra facilities to query sequences using predefined sequence patterns. Conclusion This growing set of specialized microbial databases organizes data created by the first Chinese bacterial genome programs (ThermaList, Thermoanaerobacter tengcongensis; LeptoList, with two different genomes of Leptospira interrogans; and SepiList, Staphylococcus epidermidis) associated with related organisms for comparison. PMID:15698474
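The "neighborhood" queries described here are naturally expressed as nested subqueries, which the abstract notes had to be implemented on top of the MySQL of that era. A sketch of such a query using Python's built-in sqlite3 (which supports them directly); the mini operon table uses real E. coli gene names but is otherwise illustrative:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE genes (name TEXT, operon TEXT, product TEXT);
INSERT INTO genes VALUES
  ('thrA', 'op1', 'aspartokinase'),
  ('thrB', 'op1', 'homoserine kinase'),
  ('lacZ', 'op2', 'beta-galactosidase'),
  ('lacY', 'op2', 'lactose permease');
""")

-- = None  # (placeholder removed)
```

(Note: the line above is illustrative only.)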
Pareja, Eduardo; Pareja-Tobes, Pablo; Manrique, Marina; Pareja-Tobes, Eduardo; Bonal, Javier; Tobes, Raquel
2006-01-01
Background Transcriptional regulation processes are the principal mechanisms of adaptation in prokaryotes. In these processes, the regulatory proteins and the regulatory DNA signals located in extragenic regions are the key elements involved. As all extragenic spaces are putative regulatory regions, ExtraTrain covers all extragenic regions of available genomes and regulatory proteins from bacteria and archaea included in the UniProt database. Description ExtraTrain provides integrated and easily manageable information for 679816 extragenic regions and for the genes delimiting each of them. In addition, ExtraTrain supplies a tool for exploring extragenic regions, named Palinsight, oriented towards detecting and searching for palindromic patterns. This interactive visual tool is fully integrated into the database, allowing the search for regulatory signals in user-defined sets of extragenic regions. The 26046 regulatory proteins included in ExtraTrain belong to the families AraC/XylS, ArsR, AsnC, Cold shock domain, CRP-FNR, DeoR, GntR, IclR, LacI, LuxR, LysR, MarR, MerR, NtrC/Fis, OmpR and TetR. The database follows the InterPro criteria to define these families. The information about regulators includes manually curated sets of references specifically associated to regulator entries. In order to achieve a sustainable and maintainable knowledge database, ExtraTrain is a platform open to the contribution of knowledge by the scientific community, providing a system for the incorporation of textual knowledge. Conclusion ExtraTrain is a new database for exploring Extragenic regions and Transcriptional information in bacteria and archaea. ExtraTrain database is available at . PMID:16539733
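Palinsight-style palindrome search can be sketched as scanning an extragenic region for reverse-complement palindromes, the symmetry characteristic of many protein binding sites. The sequence below is invented (loosely promoter-flavored), and the fixed window size is a simplification of what a real tool would offer:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def palindromic_sites(region, size=6):
    """Start positions of reverse-complement palindromes of a given length,
    e.g. GAATTC, whose reverse complement is itself."""
    return [i for i in range(len(region) - size + 1)
            if (w := region[i:i + size]) == revcomp(w)]

# Hypothetical extragenic sequence containing one 6-bp palindrome (GAATTC).
region = "TTGACACCGAATTCGGTATAAT"
print(palindromic_sites(region))  # -> [8]
```

A full search tool would additionally allow degenerate bases, variable spacer lengths between the two half-sites, and user-supplied pattern libraries, as the abstract's description of Palinsight suggests.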
2016-12-01
Annual Variability of the Acoustic Propagation in the Mediterranean Sea Identified from a Synoptic Monthly Gridded Database as Compared with GDEM by...profiles obtained from the synoptic monthly gridded World Ocean Database (SMD-WOD) and Generalized Digital Environmental Model (GDEM) temperature (T
2013-01-01
Background Due to the growing number of biomedical entries in data repositories of the National Center for Biotechnology Information (NCBI), it is difficult to collect, manage and process all of these entries in one place by third-party software developers without significant investment in hardware and software infrastructure, its maintenance and administration. Web services allow development of software applications that integrate in one place the functionality and processing logic of distributed software components, without integrating the components themselves and without integrating the resources to which they have access. This is achieved by appropriate orchestration or choreography of available Web services and their shared functions. After the successful application of Web services in the business sector, this technology can now be used to build composite software tools that are oriented towards biomedical data processing. Results We have developed a new tool for efficient and dynamic data exploration in GenBank and other NCBI databases. A dedicated search GenBank system makes use of NCBI Web services and a package of Entrez Programming Utilities (eUtils) in order to provide extended searching capabilities in NCBI data repositories. In search GenBank users can use one of the three exploration paths: simple data searching based on the specified user’s query, advanced data searching based on the specified user’s query, and advanced data exploration with the use of macros. search GenBank orchestrates calls of particular tools available through the NCBI Web service providing requested functionality, while users interactively browse selected records in search GenBank and traverse between NCBI databases using available links. On the other hand, by building macros in the advanced data exploration mode, users create choreographies of eUtils calls, which can lead to the automatic discovery of related data in the specified databases. 
Conclusions search GenBank extends the standard capabilities of the NCBI Entrez search engine in querying biomedical databases. The possibility of creating and saving macros in search GenBank is a unique feature with great potential, which will grow further as the density of networks of relationships between data stored in particular databases increases. search GenBank is available for public use at http://sgb.biotools.pl/. PMID:23452691
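Tools like search GenBank build on NCBI's documented Entrez Programming Utilities (eUtils) endpoints. A minimal sketch of constructing an ESearch request URL, without issuing any network call; the database and search term are example values:

```python
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def esearch_url(db, term, retmax=20):
    """Build an Entrez eUtils ESearch URL (construction only, no request sent)."""
    return f"{EUTILS}/esearch.fcgi?" + urlencode(
        {"db": db, "term": term, "retmax": retmax, "retmode": "json"})

url = esearch_url("nucleotide", "Homo sapiens[Organism] AND BRCA1[Gene]")
print(url)
```

Orchestrating such calls (ESearch to get IDs, then EFetch or ESummary to retrieve records, following ELink relationships between databases) is exactly the kind of choreography the abstract describes macros automating.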
Mrozek, Dariusz; Małysiak-Mrozek, Bożena; Siążnik, Artur
2013-03-01
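The macro mechanism described above amounts to an ordered choreography of eUtils requests, where the IDs returned by one step feed the next. The sketch below expresses such a chain as plain request URLs; the esearch/elink/efetch endpoints and parameter names are the public NCBI eUtils ones, but the macro structure, query terms and IDs are invented for illustration, and no network call is made.

```python
# Sketch: composing an eUtils "macro" as a chain of request URLs.
# Endpoints and parameter names are the public NCBI eUtils ones;
# the chain structure, query terms and IDs are hypothetical.
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def esearch_url(db, term):
    return f"{EUTILS}/esearch.fcgi?" + urlencode({"db": db, "term": term})

def elink_url(dbfrom, db, ids):
    return f"{EUTILS}/elink.fcgi?" + urlencode(
        {"dbfrom": dbfrom, "db": db, "id": ",".join(ids)})

def efetch_url(db, ids, rettype="gb"):
    return f"{EUTILS}/efetch.fcgi?" + urlencode(
        {"db": db, "id": ",".join(ids), "rettype": rettype})

# A "macro" is an ordered list of steps; executing it would mean issuing
# each request and feeding the returned IDs into the next step.
macro = [
    ("search nucleotide", esearch_url("nucleotide", "BRCA1[Gene] AND human[Organism]")),
    ("link to protein",   elink_url("nucleotide", "protein", ["1234"])),
    ("fetch records",     efetch_url("protein", ["5678"], rettype="fasta")),
]
for name, url in macro:
    print(name, "->", url)
```

A real client would also pass the `tool` and `email` parameters that NCBI asks automated callers to supply.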
Charting the expansion of strategic exploratory behavior during adolescence.
Somerville, Leah H; Sasse, Stephanie F; Garrad, Megan C; Drysdale, Andrew T; Abi Akar, Nadine; Insel, Catherine; Wilson, Robert C
2017-02-01
Although models of exploratory decision making implicate a suite of strategies that guide the pursuit of information, the developmental emergence of these strategies remains poorly understood. This study takes an interdisciplinary perspective, merging computational decision making and developmental approaches to characterize age-related shifts in exploratory strategy from adolescence to young adulthood. Participants were 149 individuals aged 12-28 years who completed a computational explore-exploit paradigm that manipulated reward value, information value, and decision horizon (i.e., the utility that information holds for future choices). Strategic directed exploration, defined as information seeking selective for long time horizons, emerged during adolescence and maintained its level through early adulthood. This age difference was partially driven by adolescents valuing immediate reward over new information. Strategic random exploration, defined as stochastic choice behavior selective for long time horizons, was invoked at comparable levels over the age range, and predicted individual differences in attitudes toward risk taking in daily life within the adolescent portion of the sample. Collectively, these findings reveal an expansion of the diversity of strategic exploration over development, implicate distinct mechanisms for directed and random exploratory strategies, and suggest novel mechanisms for adolescent-typical shifts in decision making.
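The horizon manipulation at the heart of such explore-exploit paradigms can be pictured with a toy value rule: an arm's value is its estimated reward plus an information bonus that grows with the decision horizon. The numbers and weighting below are invented to illustrate the horizon effect, not the fitted parameters of this study's model.

```python
# Sketch: directed exploration as a horizon-scaled information bonus.
# All quantities are illustrative assumptions, not fitted parameters.
def choice_value(mean_reward, n_observations, horizon, info_weight=1.0):
    # information is worth more when many future choices remain,
    # and an arm is more informative the less it has been sampled
    info_bonus = info_weight * horizon / (1 + n_observations)
    return mean_reward + info_bonus

# A well-known arm vs a barely-sampled arm, under short and long horizons.
known   = dict(mean_reward=5.0, n_observations=10)
unknown = dict(mean_reward=4.0, n_observations=1)

for horizon in (1, 6):
    v_known   = choice_value(**known, horizon=horizon)
    v_unknown = choice_value(**unknown, horizon=horizon)
    print(horizon, "explore" if v_unknown > v_known else "exploit")
```

With these numbers the agent exploits at horizon 1 but explores at horizon 6, which is the signature of directed exploration the paper measures.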
On exploration of medical database of Crohn's disease
NASA Astrophysics Data System (ADS)
Manerowska, Anna; Dadalski, Maciej; Socha, Piotr; Mulawka, Jan
2010-09-01
The primary objective of this article is to find a new, more effective method of diagnosis of Crohn's disease. After creating a database on this disease, we sought the most suitable classification models, using algorithm implementations available in the R environment. The investigations yielded results of interest for clinical practice.
ERIC Educational Resources Information Center
Micco, Mary; Popp, Rich
Techniques for building a world-wide information infrastructure by reverse engineering existing databases to link them in a hierarchical system of subject clusters to create an integrated database are explored. The controlled vocabulary of the Library of Congress Subject Headings is used to ensure consistency and group similar items. Each database…
Future mobile access for open-data platforms and the BBC-DaaS system
NASA Astrophysics Data System (ADS)
Edlich, Stefan; Singh, Sonam; Pfennigstorf, Ingo
2013-03-01
In this paper, we develop an open data platform on multimedia devices to act as a marketplace of data for information seekers and data providers. We explore the important aspects of a Data-as-a-Service (DaaS) service in the cloud with a mobile access point. The basis of the DaaS service is to act as a marketplace for information, utilizing new technologies and recent scalable polyglot architectures based on NoSql databases. Whereas Open-Data platforms are beginning to be widely accepted, their mobile use is not. We compare similar products, their approach and possible mobile usage. We discuss several approaches to mobile access as a native app, html5 and a mobile-first approach, together with several frontend presentation techniques. Big data visualization itself is in its early days, and we explore some possibilities for making big data / open data accessible to mobile users.
Large-scale contamination of microbial isolate genomes by Illumina PhiX control.
Mukherjee, Supratim; Huntemann, Marcel; Ivanova, Natalia; Kyrpides, Nikos C; Pati, Amrita
2015-01-01
With the rapid growth and development of sequencing technologies, genomes have become the new go-to for exploring solutions to some of the world's biggest challenges such as searching for alternative energy sources and exploration of genomic dark matter. However, progress in sequencing has been accompanied by its share of errors that can occur during template or library preparation, sequencing, imaging or data analysis. In this study we screened over 18,000 publicly available microbial isolate genome sequences in the Integrated Microbial Genomes database and identified more than 1,000 genomes that are contaminated with PhiX, a control frequently used during Illumina sequencing runs. Approximately 10% of these genomes have been published in the literature, and 129 contaminated genomes were sequenced under the Human Microbiome Project. Raw sequence reads are prone to contamination from various sources and are usually eliminated during downstream quality control steps. Detection of PhiX-contaminated genomes indicates a lapse in either the application or effectiveness of proper quality control measures. The presence of PhiX contamination in several publicly available isolate genomes can result in additional errors when such data are used in comparative genomics analyses. Such contamination of public databases has far-reaching consequences in the form of erroneous data interpretation and analyses, and necessitates better measures to proofread raw sequences before releasing them to the broader scientific community.
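As a rough illustration of how such contamination can be flagged, the sketch below screens a contig against a control sequence by counting shared k-mers. The k-mer length, threshold and sequences are arbitrary assumptions for the example, not the screening procedure actually used in the study.

```python
# Sketch: flag sequences sharing k-mers with a known control (e.g. PhiX).
# k, the threshold, and all sequences here are illustrative assumptions.
def kmers(seq, k=8):
    """All length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def looks_contaminated(contig, control, k=8, min_shared=5):
    """True if contig shares at least min_shared k-mers with the control."""
    shared = kmers(contig, k) & kmers(control, k)
    return len(shared) >= min_shared

control = "GAGTTTTATCGCTTCCATGACGCAGAAGTT"        # illustrative fragment
clean   = "ACGTACGTTTGCAAGGCCTTAGCATCGGAT"        # unrelated sequence
dirty   = "GAGTTTTATCGCTTCCATGACGCAGAAGTTACGT"    # embeds the control

print(looks_contaminated(dirty, control))  # shares many 8-mers
print(looks_contaminated(clean, control))  # shares none
```

Production screens typically align reads against the full PhiX reference rather than using a bare k-mer count, but the principle is the same.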
Nanocubes for real-time exploration of spatiotemporal datasets.
Lins, Lauro; Klosowski, James T; Scheidegger, Carlos
2013-12-01
Consider real-time exploration of large multidimensional spatiotemporal datasets with billions of entries, each defined by a location, a time, and other attributes. Are certain attributes correlated spatially or temporally? Are there trends or outliers in the data? Answering these questions requires aggregation over arbitrary regions of the domain and attributes of the data. Many relational databases implement the well-known data cube aggregation operation, which in a sense precomputes every possible aggregate query over the database. Data cubes are sometimes assumed to take a prohibitively large amount of space, and to consequently require disk storage. In contrast, we show how to construct a data cube that fits in a modern laptop's main memory, even for billions of entries; we call this data structure a nanocube. We present algorithms to compute and query a nanocube, and show how it can be used to generate well-known visual encodings such as heatmaps, histograms, and parallel coordinate plots. When compared to exact visualizations created by scanning an entire dataset, nanocube plots have bounded screen error across a variety of scales, thanks to a hierarchical structure in space and time. We demonstrate the effectiveness of our technique on a variety of real-world datasets, and present memory, timing, and network bandwidth measurements. We find that the timings for the queries in our examples are dominated by network and user-interaction latencies.
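The data cube idea the paper builds on can be sketched in a few lines: precompute an aggregate for every combination of dimension values, including a wildcard rollup per dimension, so that any aggregate query becomes a lookup. The dimensions and rows below are invented; a real nanocube replaces this exhaustive table with a shared hierarchical structure to fit in memory.

```python
# Sketch: a toy data cube with full rollups. Dimension names and rows
# are invented; nanocubes compress this table hierarchically.
from itertools import product
from collections import Counter

ALL = "*"  # wildcard meaning "aggregate over this dimension"

def build_cube(rows, dims):
    cube = Counter()
    for row in rows:
        # each row contributes to every (value-or-*) combination
        for combo in product(*[(row[d], ALL) for d in dims]):
            cube[combo] += 1
    return cube

rows = [
    {"city": "NYC", "hour": 9},
    {"city": "NYC", "hour": 17},
    {"city": "LA",  "hour": 9},
]
cube = build_cube(rows, ["city", "hour"])

print(cube[("NYC", ALL)])  # all NYC events
print(cube[(ALL, 9)])      # all 9 o'clock events
print(cube[(ALL, ALL)])    # grand total
```

The cost of this naive version is 2^d entries per row for d dimensions, which is exactly the blow-up the nanocube's shared spatial/temporal hierarchy is designed to avoid.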
Comparative policy analysis for alcohol and drugs: Current state of the field.
Ritter, Alison; Livingston, Michael; Chalmers, Jenny; Berends, Lynda; Reuter, Peter
2016-05-01
A central policy research question concerns the extent to which specific policies produce certain effects, and cross-national (or between state/province) comparisons appear to be an ideal way to answer such a question. This paper explores the current state of comparative policy analysis (CPA) with respect to alcohol and drugs policies. We created a database of journal articles published between 2010 and 2014 as the body of CPA work for analysis. We used this database of 57 articles to clarify, extract and analyse the ways in which CPA has been defined. Quantitative and qualitative analyses of the CPA methods employed, the policy areas that have been studied, and differences between alcohol CPA and drug CPA are explored. There is a lack of clear definition as to what counts as a CPA. The two criteria for a CPA (explicit study of a policy, and comparison across two or more geographic locations) exclude descriptive epidemiology and single-state comparisons. Under this strict definition, most CPAs concerned alcohol (42%), although the most common single policy to be analysed was medical cannabis (23%). The vast majority of papers undertook quantitative data analysis, with a variety of advanced statistical methods. We identified five approaches to policy specification: classification or categorical coding of policy as present or absent; the use of an index; implied policy differences; described policy differences; and data-driven policy coding. Each of these has limitations, but perhaps the most common limitation was the inability of the methods to account for differences between policy-as-stated and policy-as-implemented. There is significant diversity in CPA methods for analysis of alcohol and drugs policy, and some substantial challenges with the currently employed methods.
The absence of clear boundaries to a definition of what counts as a 'comparative policy analysis' may account for the methodological plurality but also appears to stand in the way of advancing the techniques.
Davis, Allan Peter; Wiegers, Thomas C.; King, Benjamin L.; Wiegers, Jolene; Grondin, Cynthia J.; Sciaky, Daniela; Johnson, Robin J.; Mattingly, Carolyn J.
2016-01-01
Strategies for discovering common molecular events among disparate diseases hold promise for improving understanding of disease etiology and expanding treatment options. One technique is to leverage curated datasets found in the public domain. The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) manually curates chemical-gene, chemical-disease, and gene-disease interactions from the scientific literature. The use of official gene symbols in CTD interactions enables this information to be combined with the Gene Ontology (GO) file from NCBI Gene. By integrating these GO-gene annotations with CTD’s gene-disease dataset, we produce 753,000 inferences between 15,700 GO terms and 4,200 diseases, providing opportunities to explore presumptive molecular underpinnings of diseases and identify biological similarities. Through a variety of applications, we demonstrate the utility of this novel resource. As a proof-of-concept, we first analyze known repositioned drugs (e.g., raloxifene and sildenafil) and see that their target diseases have a greater degree of similarity when comparing GO terms vs. genes. Next, a computational analysis predicts seemingly non-intuitive diseases (e.g., stomach ulcers and atherosclerosis) as being similar to bipolar disorder, and these are validated in the literature as reported co-diseases. Additionally, we leverage other CTD content to develop testable hypotheses about thalidomide-gene networks to treat seemingly disparate diseases. Finally, we illustrate how CTD tools can rank a series of drugs as potential candidates for repositioning against B-cell chronic lymphocytic leukemia and predict cisplatin and the small molecule inhibitor JQ1 as lead compounds. The CTD dataset is freely available for users to navigate pathologies within the context of extensive biological processes, molecular functions, and cellular components conferred by GO. 
This inference set should aid researchers, bioinformaticists, and pharmaceutical drug makers in finding commonalities in disease mechanisms, which in turn could help identify new therapeutics, new indications for existing pharmaceuticals, potential disease comorbidities, and alerts for side effects. PMID:27171405
Study of Geological Analogues for Understanding the Radar Sounder Response of the RIME Targets
NASA Astrophysics Data System (ADS)
Thakur, S.; Bruzzone, L.
2017-12-01
Radar for Icy Moon Exploration (RIME), the radar sounder onboard the Jupiter Icy Moons Explorer (JUICE), is aimed at characterizing the ice shells of the Jovian moons - Ganymede, Europa and Callisto. RIME is optimized to operate at 9 MHz central frequency with bandwidth of 1 MHz and 2.7 MHz to achieve a penetration depth up to 9 km through ice. We have developed an approach to the definition of a database of simulated RIME radargrams by leveraging the data available from airborne and orbital radar sounder acquisitions over geological analogues of the expected icy moon features. These simulated radargrams are obtained by merging real radar sounder data with models of the subsurface of the Jupiter icy moons. They will be useful for geological interpretation of the RIME radargrams and for better predicting the performance of RIME. The database will also be useful in developing pre-processing and automatic feature extraction algorithms to support data analysis during the mission phase of RIME. Prior to the JUICE mission exploring the Jovian satellites with RIME, there exist radar sounders such as SHARAD (onboard MRO) and MARSIS (onboard MEX) probing Mars, the LRS (onboard SELENE) probing the Moon, and many airborne sounders probing the polar regions of Earth. Analogues have been identified in these places based on similarity in geo-morphological expression. Moreover, other analogues have been identified on the Earth for possible dedicated acquisition campaigns before the RIME operations. By assuming that the subsurface structure of the RIME targets is approximately represented in the analogue radargrams, the difference in composition is accounted for by imposing different dielectric and subsurface attenuation models. The RIME radargrams are simulated from the analogue radargrams using the radar equation and the RIME processing chain and accounting for different possible scenarios in terms of subsurface structure, dielectric properties and instrument parameters. 
For cross-validation, the database is compared with radargrams simulated from the analysis of radio wave propagation through geo-electrical models representing the subsurface hypotheses for the RIME targets.
Lee, Wonhoon; Park, Jongsun; Choi, Jaeyoung; Jung, Kyongyong; Park, Bongsoo; Kim, Donghan; Lee, Jaeyoung; Ahn, Kyohun; Song, Wonho; Kang, Seogchan; Lee, Yong-Hwan; Lee, Seunghwan
2009-01-01
Background: Sequences and organization of the mitochondrial genome have been used as markers to investigate evolutionary history and relationships in many taxonomic groups. The rapidly increasing mitochondrial genome sequences from diverse insects provide ample opportunities to explore various global evolutionary questions in the superclass Hexapoda. To adequately support such questions, it is imperative to establish an informatics platform that facilitates the retrieval and utilization of available mitochondrial genome sequence data. Results: The Insect Mitochondrial Genome Database (IMGD) is a new integrated platform that archives the mitochondrial genome sequences from 25,747 hexapod species, including 112 completely sequenced and 20 nearly completed genomes and 113,985 partially sequenced mitochondrial genomes. The Species-driven User Interface (SUI) of IMGD supports data retrieval and diverse analyses at multi-taxon levels. The Phyloviewer implemented in IMGD provides three methods for drawing phylogenetic trees and displays the resulting trees on the web. The SNP database incorporated into IMGD presents the distribution of SNPs and INDELs in the mitochondrial genomes of multiple isolates within eight species. A newly developed comparative SNU Genome Browser supports the graphical presentation of and an interactive interface for the identified SNPs/INDELs. Conclusion: The IMGD provides a solid foundation for the comparative mitochondrial genomics and phylogenetics of insects. All data and functions described here are available at the web site. PMID:19351385
Compilation of geothermal information: exploration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
1978-01-01
The Database for Geothermal Energy Exploration and Evaluation is a printout of selected references to publications covering the development of geothermal resources from the identification of an area to the production of electric power. This annotated bibliography contains four sections: references, author index, author affiliation index, and descriptor index.
Schwarzkopf, Larissa; Holle, Rolf; Schunk, Michaela
2017-01-01
Aims: This claims data-based study compares the intensity of diabetes care in community dwellers and nursing home residents with dementia. Methods: Delivery of diabetes-related medical examinations (DRMEs) was compared via logistic regression in 1,604 community dwellers and 1,010 nursing home residents with dementia. The intra-individual effect of nursing home transfer was evaluated within mixed models. Results: Delivery of DRMEs decreases with increasing care dependency, with more community-living individuals receiving DRMEs. Moreover, DRME provision decreases after nursing home transfer. Conclusion: Dementia patients receive fewer DRMEs than recommended, especially in cases of higher care dependency and particularly in nursing homes. This suggests a lack of awareness regarding the specific challenges of combined diabetes and dementia care. PMID:28413415
Genome-wide comparative analysis of four Indian Drosophila species.
Mohanty, Sujata; Khanna, Radhika
2017-12-01
Comparative analysis of multiple genomes of closely or distantly related Drosophila species undoubtedly creates excitement among evolutionary biologists in exploring genomic changes from an ecological and evolutionary perspective. We present herewith the de novo assembled whole genome sequences of four Drosophila species of Indian origin, D. bipectinata, D. takahashii, D. biarmipes and D. nasuta, produced using Next Generation Sequencing technology on an Illumina platform, along with their detailed assembly statistics. The comparative genomics analyses, e.g. gene predictions and annotations, functional and orthogroup analysis of coding sequences and genome-wide SNP distribution, were performed. The whole genome of Zaprionus indianus of Indian origin published earlier by us and the genome sequences of the previously sequenced 12 Drosophila species available in the NCBI database were included in the analysis. The present work is a part of our ongoing genomics project on Indian Drosophila species.
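At its simplest, a genome-wide SNP comparison like the one reported here reduces to per-site comparison of aligned sequences. Below is a minimal sketch with invented sequences, treating gap characters as indels rather than SNPs; real pipelines of course work from read alignments and apply quality filters.

```python
# Sketch: naive per-site SNP detection between two aligned sequences.
# Sequences and the "-" gap convention are illustrative assumptions.
def snps(ref, alt):
    """Return (position, ref_base, alt_base) for each mismatched site."""
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(ref, alt))
        if a != b and a != "-" and b != "-"  # gaps are indels, not SNPs
    ]

ref = "ATGCATGCAA"
alt = "ATGAATGCTA"
print(snps(ref, alt))  # two substitutions
```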
Face recognition using an enhanced independent component analysis approach.
Kwak, Keun-Chang; Pedrycz, Witold
2007-03-01
This paper is concerned with an enhanced independent component analysis (ICA) and its application to face recognition. Typically, face representations obtained by ICA involve unsupervised learning and high-order statistics. In this paper, we develop an enhancement of the generic ICA by augmenting this method by the Fisher linear discriminant analysis (LDA); hence, its abbreviation, FICA. The FICA is systematically developed and presented along with its underlying architecture. A comparative analysis explores four distance metrics, as well as classification with support vector machines (SVMs). We demonstrate that the FICA approach leads to the formation of well-separated classes in low-dimension subspace and is endowed with a great deal of insensitivity to large variation in illumination and facial expression. The comprehensive experiments are completed for the facial-recognition technology (FERET) face database; a comparative analysis demonstrates that FICA comes with improved classification rates when compared with some other conventional approaches such as eigenface, fisherface, and the ICA itself.
Using Web Ontology Language to Integrate Heterogeneous Databases in the Neurosciences
Lam, Hugo Y.K.; Marenco, Luis; Shepherd, Gordon M.; Miller, Perry L.; Cheung, Kei-Hoi
2006-01-01
Integrative neuroscience involves the integration and analysis of diverse types of neuroscience data involving many different experimental techniques. This data will increasingly be distributed across many heterogeneous databases that are web-accessible. Currently, these databases do not expose their schemas (database structures) and their contents to web applications/agents in a standardized, machine-friendly way. This limits database interoperation. To address this problem, we describe a pilot project that illustrates how neuroscience databases can be expressed using the Web Ontology Language, which is a semantically-rich ontological language, as a common data representation language to facilitate complex cross-database queries. In this pilot project, an existing tool called “D2RQ” was used to translate two neuroscience databases (NeuronDB and CoCoDat) into OWL, and the resulting OWL ontologies were then merged. An OWL-based reasoner (Racer) was then used to provide a sophisticated query language (nRQL) to perform integrated queries across the two databases based on the merged ontology. This pilot project is one step toward exploring the use of semantic web technologies in the neurosciences. PMID:17238384
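The essence of the merged-ontology query can be illustrated with plain triples: union two database exports that share entity identifiers, then query across the merge. The entities and predicates below are hypothetical; the actual pilot used D2RQ to generate OWL and the Racer reasoner's nRQL query language rather than raw Python sets.

```python
# Sketch: cross-database querying over merged triples. Entity and
# predicate names are invented; the pilot used D2RQ/OWL/Racer.
neurondb = {
    ("PurkinjeCell", "hasLocation", "Cerebellum"),
    ("PurkinjeCell", "hasCurrent", "I_Na"),
}
cocodat = {
    ("PurkinjeCell", "hasRecordingType", "patch-clamp"),
}

merged = neurondb | cocodat  # naive merge on shared identifiers

def query(store, subject):
    """Everything the merged store asserts about one subject."""
    return sorted((p, o) for s, p, o in store if s == subject)

# cross-database question: everything known about PurkinjeCell
print(query(merged, "PurkinjeCell"))
```

The hard part the paper addresses, which this sketch skips, is establishing that identifiers from the two schemas denote the same entities in the first place.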
Ventetuolo, Corey E; Hess, Edward; Austin, Eric D; Barón, Anna E; Klinger, James R; Lahm, Tim; Maddox, Thomas M; Plomondon, Mary E; Thompson, Lauren; Zamanian, Roham T; Choudhary, Gaurav; Maron, Bradley A
2017-01-01
Women have an increased risk of pulmonary hypertension (PH) but better survival compared to men. Few studies have explored sex-based differences in population-based cohorts with PH. We sought to determine whether sex was associated with hemodynamics and survival in US veterans with PH (mean pulmonary artery pressure [mPAP] ≥ 25 mm Hg) from the Veterans Affairs Clinical Assessment, Reporting, and Tracking database. The relationship between sex and hemodynamics was assessed with multivariable linear mixed modeling. Cox proportional hazards models were used to compare survival by sex for those with PH and precapillary PH (mPAP ≥ 25 mm Hg, pulmonary artery wedge pressure [PAWP] ≤ 15 mm Hg and pulmonary vascular resistance [PVR] > 3 Wood units) respectively. The study population included 15,464 veterans with PH, 516 (3%) of whom were women; 1,942 patients (13%) had precapillary PH, of whom 120 (6%) were women. Among those with PH, women had higher PVR and pulmonary artery pulse pressure, and lower right atrial pressure and PAWP (all p <0.001) compared with men. There were no significant differences in hemodynamics according to sex in veterans with precapillary PH. Women with PH had 18% greater survival compared to men with PH (adjusted HR 0.82, 95% CI 0.69-0.97, p = 0.020). Similarly, women with precapillary PH were 29% more likely to survive as compared to men with PH (adjusted HR 0.71, 95% CI 0.52-0.98, p = 0.040). In conclusion, female veterans with PH have better survival than males despite higher pulmonary afterload.
The Plant Genome Integrative Explorer Resource: PlantGenIE.org.
Sundell, David; Mannapperuma, Chanaka; Netotea, Sergiu; Delhomme, Nicolas; Lin, Yao-Cheng; Sjödin, Andreas; Van de Peer, Yves; Jansson, Stefan; Hvidsten, Torgeir R; Street, Nathaniel R
2015-12-01
Accessing and exploring large-scale genomics data sets remains a significant challenge to researchers without specialist bioinformatics training. We present the integrated PlantGenIE.org platform for exploration of Populus, conifer and Arabidopsis genomics data, which includes expression networks and associated visualization tools. Standard features of a model organism database are provided, including genome browsers, gene list annotation, Blast homology searches and gene information pages. Community annotation updating is supported via integration of WebApollo. We have produced an RNA-sequencing (RNA-Seq) expression atlas for Populus tremula and have integrated these data within the expression tools. An updated version of the ComPlEx resource for performing comparative plant expression analyses of gene coexpression network conservation between species has also been integrated. The PlantGenIE.org platform provides intuitive access to large-scale and genome-wide genomics data from model forest tree species, facilitating both community contributions to annotation improvement and tools supporting use of the included data resources to inform biological insight.
SANSparallel: interactive homology search against Uniprot
Somervuo, Panu; Holm, Liisa
2015-01-01
Proteins evolve by mutations and natural selection. The network of sequence similarities is a rich source for mining homologous relationships that inform on protein structure and function. There are many servers available to browse the network of homology relationships but one has to wait up to a minute for results. The SANSparallel webserver provides protein sequence database searches with immediate response and professional alignment visualization by third-party software. The output is a list, pairwise alignment or stacked alignment of sequence-similar proteins from Uniprot, UniRef90/50, Swissprot or Protein Data Bank. The stacked alignments are viewed in Jalview or as sequence logos. The database search uses the suffix array neighborhood search (SANS) method, which has been re-implemented as a client-server, improved and parallelized. The method is extremely fast and as sensitive as BLAST above 50% sequence identity. Benchmarks show that the method is highly competitive compared to previously published fast database search programs: UBLAST, DIAMOND, LAST, LAMBDA, RAPSEARCH2 and BLAT. The web server can be accessed interactively or programmatically at http://ekhidna2.biocenter.helsinki.fi/cgi-bin/sans/sans.cgi. It can be used to make protein functional annotation pipelines more efficient, and it is useful in interactive exploration of the detailed evidence supporting the annotation of particular proteins of interest. PMID:25855811
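The suffix-array idea underlying SANS can be sketched as follows: sort all suffixes of the database once, then locate query seeds by binary search, so lookups cost O(log n) comparisons rather than a scan. This is a generic textbook illustration on a toy sequence, not the published SANS neighborhood-search implementation.

```python
# Sketch: suffix-array construction and exact seed lookup.
# Toy database string; real tools index whole sequence databases.
import bisect

def build_suffix_array(text):
    """Indices of all suffixes of text, in lexicographic order."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def find_occurrences(text, sa, pattern):
    """Start positions of pattern in text, via binary search on suffixes."""
    suffixes = [text[i:] for i in sa]  # for clarity; real code avoids this copy
    lo = bisect.bisect_left(suffixes, pattern)
    hits = []
    while lo < len(sa) and suffixes[lo].startswith(pattern):
        hits.append(sa[lo])
        lo += 1
    return sorted(hits)

db = "MKVLLTAMKVLQ"
sa = build_suffix_array(db)
print(find_occurrences(db, sa, "MKVL"))  # both occurrences of the seed
```

SANS extends this idea by examining the lexicographic neighborhood around the query's position in the suffix array, which is what makes it approximate and fast rather than exact-match only.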
Conformational flexibility of two RNA trimers explored by computational tools and database search.
Fadrná, Eva; Koca, Jaroslav
2003-04-01
Two RNA sequences, AAA and AUG, were studied by the conformational search program CICADA and by molecular dynamics (MD) in the framework of the AMBER force field, and also via a thorough PDB database search. CICADA was used to provide detailed information about conformers and conformational interconversions on the energy surfaces of the above molecules. Several conformational families were found for both sequences. Analysis of the results shows differences, especially in the energies of the individual families, and also in flexibility and concerted conformational movement. Therefore, several MD trajectories (altogether 16 ns) were run to obtain more details about both the stability of conformers belonging to different conformational families and the dynamics of the two systems. Results show that the trajectories strongly depend on the starting structure. When the MD start from the global minimum found by CICADA, they provide a stable run, while MD starting from another conformational family generates a trajectory in which several different conformational families are visited. The results obtained by theoretical methods are compared with the thorough database search data. It is concluded that all but the highest-energy conformational families found in the theoretical results also appear in the experimental data. Registry numbers: adenylyl-(3' --> 5')-adenylyl-(3' --> 5')-adenosine [917-44-2]; adenylyl-(3' --> 5')-uridylyl-(3' --> 5')-guanosine [3494-35-7].
Castañón, Jesús; Román, José Pablo; Jessop, Theodore C; de Blas, Jesús; Haro, Rubén
2018-06-01
DNA-encoded libraries (DELs) have emerged as an efficient and cost-effective drug discovery tool for the exploration and screening of very large chemical space using small-molecule collections of unprecedented size. Herein, we report an integrated automation and informatics system designed to enhance the quality, efficiency, and throughput of the production and affinity selection of these libraries. The platform is governed by software developed according to a database-centric architecture to ensure data consistency, integrity, and availability. Through its versatile protocol management functionalities, this application captures the wide diversity of experimental processes involved with DEL technology, keeps track of working protocols in the database, and uses them to command robotic liquid handlers for the synthesis of libraries. This approach provides full traceability of building-blocks and DNA tags in each split-and-pool cycle. Affinity selection experiments and high-throughput sequencing reads are also captured in the database, and the results are automatically deconvoluted and visualized in customizable representations. Researchers can compare results of different experiments and use machine learning methods to discover patterns in data. As of this writing, the platform has been validated through the generation and affinity selection of various libraries, and it has become the cornerstone of the DEL production effort at Lilly.
SCPortalen: human and mouse single-cell centric database
Noguchi, Shuhei; Böttcher, Michael; Hasegawa, Akira; Kouno, Tsukasa; Kato, Sachi; Tada, Yuhki; Ura, Hiroki; Abe, Kuniya; Shin, Jay W; Plessy, Charles; Carninci, Piero
2018-01-01
Abstract Published single-cell datasets are rich resources for investigators who want to address questions not originally asked by the creators of the datasets. The single-cell datasets might be obtained by different protocols and diverse analysis strategies. The main challenge in utilizing such single-cell data is how to make the various large-scale datasets comparable and reusable in a different context. To address this issue, we developed the single-cell centric database ‘SCPortalen’ (http://single-cell.clst.riken.jp/). The current version of the database covers human and mouse single-cell transcriptomics datasets that are publicly available from the INSDC sites. The original metadata were manually curated and single-cell samples were annotated with standard ontology terms. Following that, common quality assessment procedures were conducted to check the quality of the raw sequence data. Furthermore, primary processing of the raw data, followed by advanced analyses and interpretation, has been performed from scratch using our pipeline. In addition to the transcriptomics data, SCPortalen provides access to single-cell image files whenever available. The target users of SCPortalen are all researchers interested in specific cell types or population heterogeneity. Through the web interface of SCPortalen, users can easily search, explore and download the single-cell datasets of their interest. PMID:29045713
Cerqueira, Gustavo C; Arnaud, Martha B; Inglis, Diane O; Skrzypek, Marek S; Binkley, Gail; Simison, Matt; Miyasato, Stuart R; Binkley, Jonathan; Orvis, Joshua; Shah, Prachi; Wymore, Farrell; Sherlock, Gavin; Wortman, Jennifer R
2014-01-01
The Aspergillus Genome Database (AspGD; http://www.aspgd.org) is a freely available web-based resource that was designed for Aspergillus researchers and is also a valuable source of information for the entire fungal research community. In addition to being a repository and central point of access to genome, transcriptome and polymorphism data, AspGD hosts a comprehensive comparative genomics toolbox that facilitates the exploration of precomputed orthologs among the 20 currently available Aspergillus genomes. AspGD curators perform gene product annotation based on review of the literature for four key Aspergillus species: Aspergillus nidulans, Aspergillus oryzae, Aspergillus fumigatus and Aspergillus niger. We have iteratively improved the structural annotation of Aspergillus genomes through the analysis of publicly available transcription data, mostly expressed sequence tags, as described in a previous NAR Database article (Arnaud et al. 2012). In this update, we report substantive structural annotation improvements for the A. nidulans, A. oryzae and A. fumigatus genomes based on recently available RNA-Seq data. Over 26 000 loci were updated across these species; although those updates primarily comprise the addition and extension of untranslated regions (UTRs), the new analysis also enabled over 1000 modifications affecting the coding sequence of genes in each target genome.
Functionally Graded Materials Database
NASA Astrophysics Data System (ADS)
Kisara, Katsuto; Konno, Tomomi; Niino, Masayuki
2008-02-01
The Functionally Graded Materials Database (hereinafter referred to as the FGMs Database) was opened to the public via the Internet in October 2002, and since then it has been managed by the Japan Aerospace Exploration Agency (JAXA). As of October 2006, the database includes 1,703 research information entries, along with data on 2,429 researchers, 509 institutions and so on. Reading materials such as "Applicability of FGMs Technology to Space Plane" and "FGMs Application to Space Solar Power System (SSPS)" were prepared in FY 2004 and 2005, respectively. The English version of "FGMs Application to Space Solar Power System (SSPS)" is now under preparation. This paper explains the FGMs Database, describing the research information data, the sitemap and how to use it. User access results and users' interests, drawn from the access analysis, are also discussed.
DOT National Transportation Integrated Search
2017-03-01
This research explored the second Strategic Highway Research Program (SHRP2) Naturalistic Driving Study (NDS) database for the potential to identify freeway entrance and exit ramps and teen drivers' behavior while traveling those ramps. This is in ...
Participation Patterns of Korean Adolescents in School-Based Career Exploration Activities
ERIC Educational Resources Information Center
Rojewski, Jay W.; Lee, In Heok; Hill, Roger B.
2014-01-01
Variations in the school-based career exploration activities of Korean high school students were examined. Data represented 5,227 Korean adolescents in Grade 11 contained in the Korean Education Longitudinal Study of 2005, a nationally representative longitudinal database administered by the Korean Educational Development Institute. Latent class…
Black African Parents' Experiences of an Educational Psychology Service
ERIC Educational Resources Information Center
Lawrence, Zena
2014-01-01
The evidence base that explores Black African parents' experiences of an Educational Psychology Service (EPS) is limited. This article describes an exploratory mixed methods research study undertaken during 2009-2011, that explored Black African parents' engagement with a UK EPS. Quantitative data were gathered from the EPS preschool database and…
Nascimento, Leandro Costa; Salazar, Marcela Mendes; Lepikson-Neto, Jorge; Camargo, Eduardo Leal Oliveira; Parreiras, Lucas Salera; Carazzolle, Marcelo Falsarella
2017-01-01
Abstract Tree species of the genus Eucalyptus are the most valuable and widely planted hardwoods in the world. Given the economic importance of Eucalyptus trees, much effort has been made towards the generation of specimens with superior forestry properties that can deliver high-quality feedstocks, customized to the industry's needs for both cellulosic (paper) and lignocellulosic biomass production. In line with these efforts, large sets of molecular data have been generated by several scientific groups, providing invaluable information that can be applied in the development of improved specimens. In order to fully explore the potential of the available datasets, the development of a public database that provides integrated access to genomic and transcriptomic data from Eucalyptus is needed. EUCANEXT is a database that analyses and integrates publicly available Eucalyptus molecular data, such as the E. grandis genome assembly and predicted genes, ESTs from several species and digital gene expression from 26 RNA-Seq libraries. The database has been implemented on a Fedora Linux machine running MySQL and Apache, while Perl CGI was used for the web interfaces. EUCANEXT provides a user-friendly web interface for easy access and analysis of publicly available molecular data from Eucalyptus species. This integrated database allows for complex searches by gene name, keyword or sequence similarity and is publicly accessible at http://www.lge.ibi.unicamp.br/eucalyptusdb. Through EUCANEXT, users can perform complex analyses to identify genes related to traits of interest using RNA-Seq libraries and tools for differential expression analysis. Moreover, the entire bioinformatics pipeline described here, including the database schema and Perl scripts, is readily available and can be applied to any genomic and transcriptomic project, regardless of the organism. Database URL: http://www.lge.ibi.unicamp.br/eucalyptusdb PMID:29220468
Software Classifications: Trends in Literacy Software Publication and Marketing.
ERIC Educational Resources Information Center
Balajthy, Ernest
First in a continuing series of reports on trends in marketing and publication of software for literacy education, a study explored the development of a database to track the trends and reported on trends seen in 1995. The final version of the 1995 database consisted of 1011 software titles, 165 of which had been published in 1995 and 846…
3DNALandscapes: a database for exploring the conformational features of DNA.
Zheng, Guohui; Colasanti, Andrew V; Lu, Xiang-Jun; Olson, Wilma K
2010-01-01
3DNALandscapes, located at: http://3DNAscapes.rutgers.edu, is a new database for exploring the conformational features of DNA. In contrast to most structural databases, which archive the Cartesian coordinates and/or derived parameters and images for individual structures, 3DNALandscapes enables searches of conformational information across multiple structures. The database contains a wide variety of structural parameters and molecular images, computed with the 3DNA software package and known to be useful for characterizing and understanding the sequence-dependent spatial arrangements of the DNA sugar-phosphate backbone, sugar-base side groups, base pairs, base-pair steps, groove structure, etc. The data comprise all DNA-containing structures (both free and bound to proteins, drugs and other ligands) currently available in the Protein Data Bank. The web interface allows the user to link, report, plot and analyze this information from numerous perspectives and thereby gain insight into DNA conformation, deformability and interactions in different sequence and structural contexts. The data accumulated from known, well-resolved DNA structures can serve as useful benchmarks for the analysis and simulation of new structures. The collective data can also help to understand how DNA deforms in response to proteins and other molecules and undergoes conformational rearrangements.
NASA Astrophysics Data System (ADS)
Hazelhoff, Lykele; Creusen, Ivo M.; Woudsma, Thomas; de With, Peter H. N.
2015-11-01
Combined databases of road markings and traffic signs provide a complete and full description of the present traffic legislation and instructions. Such databases contribute to efficient signage maintenance, improve navigation, and benefit autonomous driving vehicles. A system is presented for the automated creation of such combined databases, which additionally investigates the benefit of this combination for automated contextual placement analysis. This analysis involves verification of the co-occurrence of traffic signs and road markings to retrieve a list of potentially incorrectly signaled (and thus potentially unsafe) road situations. This co-occurrence verification is specifically explored for both pedestrian crossings and yield situations. Evaluations on 420 km of road have shown that individual detection of traffic signs and road markings denoting these road situations can be performed with accuracies of 98% and 85%, respectively. Combining both approaches shows that over 95% of the pedestrian crossings and give-way situations can be identified. An exploration toward additional co-occurrence analysis of signs and markings shows that inconsistently signaled situations can successfully be extracted, such that specific safety actions can be directed toward cases lacking signs or markings, while most consistently signaled situations can be omitted from this analysis.
Dellaire, G.; Farrall, R.; Bickmore, W.A.
2003-01-01
The Nuclear Protein Database (NPD) is a curated database that contains information on more than 1300 vertebrate proteins that are thought, or are known, to localise to the cell nucleus. Each entry is annotated with information on predicted protein size and isoelectric point, as well as any repeats, motifs or domains within the protein sequence. In addition, information on the sub-nuclear localisation of each protein is provided and the biological and molecular functions are described using Gene Ontology (GO) terms. The database is searchable by keyword, protein name, sub-nuclear compartment and protein domain/motif. Links to other databases are provided (e.g. Entrez, SWISS-PROT, OMIM, PubMed, PubMed Central). Thus, NPD provides a gateway through which the nuclear proteome may be explored. The database can be accessed at http://npd.hgu.mrc.ac.uk and is updated monthly. PMID:12520015
Garrity, Christopher P.; Soller, David R.
2009-01-01
The Geological Society of America's (GSA) Geologic Map of North America (Reed and others, 2005; 1:5,000,000) shows the geology of a significantly large area of the Earth, centered on North and Central America and including the submarine geology of parts of the Atlantic and Pacific Oceans. This map is now converted to a Geographic Information System (GIS) database that contains all geologic and base-map information shown on the two printed map sheets and the accompanying explanation sheet. We anticipate this map database will be revised at some unspecified time in the future, likely through the actions of a steering committee managed by the Geological Society of America (GSA) and staffed by scientists from agencies including, but not limited to, those responsible for the original map compilation (U.S. Geological Survey, Geological Survey of Canada, and Woods Hole Oceanographic Institute). Regarding the use of this product, as noted by the map's compilers: 'The Geologic Map of North America is an essential educational tool for teaching the geology of North America to university students and for the continuing education of professional geologists in North America and elsewhere. In addition, simplified maps derived from the Geologic Map of North America are useful for enlightening younger students and the general public about the geology of the continent.' With publication of this database, the preparation of any type of simplified map is made significantly easier. More important perhaps, the database provides a more accessible means to explore the map information and to compare and analyze it in conjunction with other types of information (for example, land use, soils, biology) to better understand the complex interrelations among factors that affect Earth resources, hazards, ecosystems, and climate.
NASA Astrophysics Data System (ADS)
Paiva, L. M. S.; Bodstein, G. C. R.; Pimentel, L. C. G.
2014-08-01
Large-eddy simulations are performed using the Advanced Regional Prediction System (ARPS) code at horizontal grid resolutions as fine as 300 m to assess the influence of detailed and updated surface databases on the modeling of local atmospheric circulation systems of urban areas with complex terrain. Applications to air pollution and wind energy are sought. These databases are comprised of 3 arc-sec topographic data from the Shuttle Radar Topography Mission, 10 arc-sec vegetation-type data from the European Space Agency (ESA) GlobCover project, and 30 arc-sec leaf area index and fraction of absorbed photosynthetically active radiation data from the ESA GlobCarbon project. Simulations are carried out for the metropolitan area of Rio de Janeiro using six one-way nested-grid domains that allow the choice of distinct parametric models and vertical resolutions associated with each grid. ARPS is initialized using the Global Forecasting System with 0.5°-resolution data from the National Center of Environmental Prediction, which is also used every 3 h as a lateral boundary condition. Topographic shading is turned on and two soil layers are used to compute the soil temperature and moisture budgets in all runs. Results for two simulated runs covering three periods of time are compared to surface and upper-air observational data to explore the dependence of the simulations on initial and boundary conditions, grid resolution, and topographic and land-use databases. Our comparisons show overall good agreement between simulated and observational data, mainly for the potential temperature and the wind speed fields, and clearly indicate that the use of high-resolution databases significantly improves our ability to predict the local atmospheric circulation.
Shepshelovich, D; Goldvaser, H; Wang, L; Abdul Razak, A R
2017-12-13
Introduction The role of phase I cancer trials is constantly evolving and they are increasingly being used in 'go/no-go' decisions in drug development. As a result, there is a growing need to ensure trials are published when completed. There are limited data on the publication rate and the factors associated with publication in phase I trials. Methods The ClinicalTrials.gov database was searched for completed adult phase I cancer trials with reported results. PubMed was searched for matching publications published prior to April 1, 2017. Logistic regression was used to identify factors associated with unpublished trials. Linear regression was used to explore factors associated with time lag from study database lock to publication for published trials. Results The study cohort included 319 trials. 95 (30%) trials had no matching publication. Thirty (9%) trials were not published even in abstract form. On multivariable analysis, the most significant factor associated with unpublished trials was industry funding (odds ratio 3.3, 95% confidence interval 1.7-6.6, p=0.019). For published trials, time lag between database lock and publication was longer by 10.9 months (standard error 3.6, p<0.001) for industry funded trials compared with medical center funded trials. Conclusions Timely publishing of early cancer clinical trials results remains unsatisfactory. Industry funded phase I cancer trials were more likely to remain unpublished, and were associated with a longer time lag from database lock to publication. Policies that promote transparency and data sharing in clinical trial research might improve accountability among industry and investigators and improve timely results publication.
IsoPlot: a database for comparison of mRNA isoforms in fruit fly and mosquitoes
Ng, I-Man; Tsai, Shang-Chi
2017-01-01
Abstract Alternative splicing (AS), a mechanism by which different forms of mature messenger RNAs (mRNAs) are generated from the same gene, occurs widely in metazoan genomes. Knowledge about isoform variants and abundance is crucial for understanding the functional context in the molecular diversity of the species. With increasing transcriptome data for model and non-model species, a database for visualization and comparison of AS events with up-to-date information is needed for further research. IsoPlot is a publicly available database with visualization tools for the exploration of AS events, covering three major mosquito species, Aedes aegypti, Anopheles gambiae and Culex quinquefasciatus, as well as the fruit fly Drosophila melanogaster, a model insect species. IsoPlot includes not only 88,663 annotated transcripts but also 17,037 newly predicted transcripts from massive transcriptome data at different developmental stages of mosquitoes. The web interface enables users to explore the patterns and abundance of isoforms under different experimental conditions as well as cross-species sequence comparison of orthologous transcripts. IsoPlot provides a platform for researchers to access comprehensive information about AS events in mosquitoes and fruit fly. Our database is available on the web via an interactive user interface with an intuitive graphical design, which is applicable for the comparison of complex isoforms within or between species. Database URL: http://isoplot.iis.sinica.edu.tw/ PMID:29220459
Elucidation of metabolic pathways from enzyme classification data.
McDonald, Andrew G; Tipton, Keith F
2014-01-01
The IUBMB Enzyme List is widely used by other databases as a source for avoiding ambiguity in the recognition of enzymes as catalytic entities. However, it was not designed for metabolic pathway tracing, which has become increasingly important in systems biology. A Reactions Database has been created from the material in the Enzyme List to allow reactions to be searched by substrate/product, and pathways to be traced from any selected starting/seed substrate. An extensive synonym glossary allows searches by many of the alternative names, including accepted abbreviations, by which a chemical compound may be known. This database was necessary for the development of the application Reaction Explorer ( http://www.reaction-explorer.org ), which was written in Real Studio ( http://www.realsoftware.com/realstudio/ ) to search the Reactions Database and draw metabolic pathways from reactions selected by the user. Having input the name of the starting compound (the "seed"), the user is presented with a list of all reactions containing that compound and then selects the product of interest as the next point on the ensuing graph. The pathway diagram is then generated as the process iterates. A contextual menu is provided, which allows the user: (1) to remove a compound from the graph, along with all associated links; (2) to search the reactions database again for additional reactions involving the compound; (3) to search for the compound within the Enzyme List.
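The seed-and-select loop described above can be sketched with a toy reaction table. The EC numbers and compounds below are a hypothetical glycolysis-like fragment, not the Reactions Database itself, and the function names are invented for illustration.

```python
# Toy reaction table: each entry maps an EC number to its substrate and
# product sets (a hypothetical glycolysis-like fragment).
REACTIONS = {
    "2.7.1.1":  ({"glucose", "ATP"}, {"glucose 6-phosphate", "ADP"}),
    "5.3.1.9":  ({"glucose 6-phosphate"}, {"fructose 6-phosphate"}),
    "2.7.1.11": ({"fructose 6-phosphate", "ATP"},
                 {"fructose 1,6-bisphosphate", "ADP"}),
}

def reactions_for(compound):
    # All reactions in which the compound appears as substrate or product,
    # mirroring the list presented to the user after entering a seed.
    return [ec for ec, (subs, prods) in REACTIONS.items()
            if compound in subs or compound in prods]

def extend_pathway(path, next_product):
    # The user picks a product of interest; the pathway grows by one step,
    # mirroring the iterative seed-and-select loop described above.
    current = path[-1]
    for ec, (subs, prods) in REACTIONS.items():
        if current in subs and next_product in prods:
            return path + [next_product]
    raise ValueError(f"no reaction converts {current} to {next_product}")
```

Starting from the seed "glucose" and repeatedly choosing the next product reproduces, in miniature, the iterative graph construction that Reaction Explorer performs.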
NASA Astrophysics Data System (ADS)
Diamant, Idit; Shalhon, Moran; Goldberger, Jacob; Greenspan, Hayit
2016-03-01
Classification of clustered breast microcalcifications into benign and malignant categories is an extremely challenging task for computerized algorithms and expert radiologists alike. In this paper we present a novel method for feature selection based on mutual information (MI) criterion for automatic classification of microcalcifications. We explored the MI based feature selection for various texture features. The proposed method was evaluated on a standardized digital database for screening mammography (DDSM). Experimental results demonstrate the effectiveness and the advantage of using the MI-based feature selection to obtain the most relevant features for the task and thus to provide for improved performance as compared to using all features.
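For discrete features, the MI criterion amounts to ranking each feature by its mutual information I(X; Y) with the class label and keeping the top scorers. The sketch below illustrates that criterion on toy binned data; it is not the paper's implementation, which operates on texture features extracted from mammograms.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    # I(X;Y) = sum over observed pairs of p(x,y) * log2(p(x,y) / (p(x)p(y))).
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def select_features(feature_table, labels, k=1):
    # Rank discrete features by MI with the class label; keep the top k.
    scores = {name: mutual_information(col, labels)
              for name, col in feature_table.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

A feature that perfectly tracks the benign/malignant label scores 1 bit; an independent feature scores 0 and is dropped, which is the behavior the selection step relies on.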
TIPdb-3D: the three-dimensional structure database of phytochemicals from Taiwan indigenous plants
Tung, Chun-Wei; Lin, Ying-Chi; Chang, Hsun-Shuo; Wang, Chia-Chi; Chen, Ih-Sheng; Jheng, Jhao-Liang; Li, Jih-Heng
2014-01-01
The rich indigenous and endemic plants in Taiwan serve as a resourceful bank for biologically active phytochemicals. Based on our TIPdb database curating bioactive phytochemicals from Taiwan indigenous plants, this study presents a three-dimensional (3D) chemical structure database named TIPdb-3D to support the discovery of novel pharmacologically active compounds. The Merck Molecular Force Field (MMFF94) was used to generate 3D structures of phytochemicals in TIPdb. The 3D structures could facilitate the analysis of 3D quantitative structure–activity relationship, the exploration of chemical space and the identification of potential pharmacologically active compounds using protein–ligand docking. Database URL: http://cwtung.kmu.edu.tw/tipdb. PMID:24930145
Exploring Protein Function Using the Saccharomyces Genome Database.
Wong, Edith D
2017-01-01
Elucidating the function of individual proteins will help to create a comprehensive picture of cell biology, as well as shed light on human disease mechanisms, possible treatments, and cures. Due to its compact genome, and extensive history of experimentation and annotation, the budding yeast Saccharomyces cerevisiae is an ideal model organism in which to determine protein function. This information can then be leveraged to infer functions of human homologs. Despite the large amount of research and biological data about S. cerevisiae, many proteins' functions remain unknown. Here, we explore ways to use the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org ) to predict the function of proteins and gain insight into their roles in various cellular processes.
Moving BASISplus and TECHLIBplus from VAX/VMS to UNIX
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dominiak, R.
1993-12-31
BASISplus is used at the Laboratory by the Technical Information Services (TIS) Department, which is part of the Information and Publishing Division at Argonne. TIS operates the Argonne Information Management System (AIM). The AIM System consists of the ANL Libraries On-Line Database (a TECHLIBplus database), the Current Journals Database (IDI's Current Contents search), the ANL Publications Tracking Database (a TECHLIBplus database), the Powder Diffraction File Database, and several CD-ROM databases available through a Novell network. The AIM System is available from the desktop of ANL staff through modem and network connections, as well as from the 10 science libraries at Argonne. TIS has been a BASISplus and TECHLIBplus site from the start, and never migrated from BASIS K. The decision was made to migrate from the VAX/VMS platform to a UNIX platform. Migrating a product from one platform to another involves many decisions and considerations. These justifications, decisions, and considerations are explored in this report.
Building An Integrated Neurodegenerative Disease Database At An Academic Health Center
Xie, Sharon X.; Baek, Young; Grossman, Murray; Arnold, Steven E.; Karlawish, Jason; Siderowf, Andrew; Hurtig, Howard; Elman, Lauren; McCluskey, Leo; Van Deerlin, Vivianna; Lee, Virginia M.-Y.; Trojanowski, John Q.
2010-01-01
Background It is becoming increasingly important to study common and distinct etiologies, clinical and pathological features, and mechanisms related to neurodegenerative diseases such as Alzheimer’s disease (AD), Parkinson’s disease (PD), amyotrophic lateral sclerosis (ALS), and frontotemporal lobar degeneration (FTLD). These comparative studies rely on powerful database tools to quickly generate data sets that match the diverse and complementary criteria set by the studies. Methods In this paper, we present a novel Integrated NeuroDegenerative Disease (INDD) database developed at the University of Pennsylvania (Penn) through a consortium of Penn investigators. Since these investigators work on AD, PD, ALS and FTLD, this allowed us to achieve the goal of developing an INDD database for these major neurodegenerative disorders. We used Microsoft SQL Server as the platform, with built-in backwards compatibility that allows Microsoft Access to serve as a front-end client for interfacing with the database. We used the PHP Hypertext Preprocessor to create the front-end web interface and then integrated the individual neurodegenerative disease databases using a master lookup table. We also present methods of data entry, database security, database backups, and database audit trails for this INDD database. Results We compare the results of a biomarker study using the INDD database to those obtained with an alternative approach that queries each individual database separately. Conclusions We have demonstrated that the Penn INDD database has the ability to query multiple database tables from a single console with high accuracy and reliability. The INDD database provides a powerful tool for generating data sets in comparative studies across several neurodegenerative diseases. PMID:21784346
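The master-lookup-table pattern can be illustrated with a small self-contained sketch, using SQLite in place of SQL Server and invented table and column names (the actual INDD schema is not given in the abstract): the master table supplies the cross-disease cohort, and joins pull in whichever disease-specific measures exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Master lookup table routing each patient to a disease-specific table.
    CREATE TABLE master (patient_id TEXT PRIMARY KEY, disease TEXT);
    CREATE TABLE ad (patient_id TEXT, mmse INTEGER);   -- AD measures
    CREATE TABLE pd (patient_id TEXT, updrs INTEGER);  -- PD measures
    INSERT INTO master VALUES ('P1', 'AD'), ('P2', 'PD');
    INSERT INTO ad VALUES ('P1', 24);
    INSERT INTO pd VALUES ('P2', 31);
""")

# A single query spanning diseases: the master table lists the cohort,
# and LEFT JOINs attach whichever disease-specific values are present.
rows = conn.execute("""
    SELECT m.patient_id, m.disease, ad.mmse, pd.updrs
    FROM master m
    LEFT JOIN ad ON ad.patient_id = m.patient_id
    LEFT JOIN pd ON pd.patient_id = m.patient_id
    ORDER BY m.patient_id
""").fetchall()
```

One console, one query, multiple disease databases: missing measures simply come back as NULL, which is the behavior the comparative-study use case depends on.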
Flower, Andrew; Lewith, George T; Little, Paul
2007-11-01
For most complementary and alternative medicine interventions, the absence of a high-quality evidence base to define good practice presents a serious problem for clinicians, educators, and researchers. The Delphi process may offer a pragmatic way to establish good practice guidelines until more rigorous forms of assessment can be undertaken. To use a modified Delphi to develop good practice guidelines for a feasibility study exploring the role of Chinese herbal medicine (CHM) in the treatment of endometriosis. To compare the outcomes from Delphi with data derived from a systematic review of the Chinese language database. An expert group was convened for a three-round Delphi that initially produced key statements relating to the CHM diagnosis and treatment of endometriosis (round 1) and then anonymously rated these on a 1-7 Likert scale (rounds 2 and 3). Statements with a median score of 5 and above were regarded as demonstrating positive group consensus. The differential diagnoses within Chinese Medicine and rating of the clinical value of individual herbs were then contrasted with comparable data from a review of Chinese language reports in the Chinese Biomedical Retrieval System (1978-2002), and China Academy of Traditional Chinese Medicine (1985-2002) databases and the Chinese TCM and magazine literature (1984-2004) databases. Consensus (good practice) guidelines for the CHM treatment of endometriosis relating to common diagnostic patterns, herb selection, dosage, and patient management were produced. The Delphi guidelines demonstrated a high degree of congruence with the information from the Chinese language databases. In the absence of rigorous evidence, Delphi offers a way to synthesize expert knowledge relating to diagnosis, patient management, and herbal selection in the treatment of endometriosis. 
The limitations of the expert group and the inability of Delphi to capture the subtle nuances of individualized clinical decision-making limit the usefulness of this approach.
GEMINI: Integrative Exploration of Genetic Variation and Genome Annotations
Paila, Umadevi; Chapman, Brad A.; Kirchner, Rory; Quinlan, Aaron R.
2013-01-01
Modern DNA sequencing technologies enable geneticists to rapidly identify genetic variation among many human genomes. However, isolating the minority of variants underlying disease remains an important, yet formidable challenge for medical genetics. We have developed GEMINI (GEnome MINIng), a flexible software package for exploring all forms of human genetic variation. Unlike existing tools, GEMINI integrates genetic variation with a diverse and adaptable set of genome annotations (e.g., dbSNP, ENCODE, UCSC, ClinVar, KEGG) into a unified database to facilitate interpretation and data exploration. Whereas other methods provide an inflexible set of variant filters or prioritization methods, GEMINI allows researchers to compose complex queries based on sample genotypes, inheritance patterns, and both pre-installed and custom genome annotations. GEMINI also provides methods for ad hoc queries and data exploration, a simple programming interface for custom analyses that leverage the underlying database, and both command line and graphical tools for common analyses. We demonstrate GEMINI's utility for exploring variation in personal genomes and family based genetic studies, and illustrate its ability to scale to studies involving thousands of human samples. GEMINI is designed for reproducibility and flexibility and our goal is to provide researchers with a standard framework for medical genomics. PMID:23874191
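The kind of genotype-plus-annotation query GEMINI supports can be mimicked in a toy form. The records, field names and genotype encoding below are hypothetical stand-ins; GEMINI itself exposes such filters through SQL-style queries over its unified variant database.

```python
# Toy variant records: per-sample genotypes plus an annotation field,
# loosely mimicking rows of a unified variants table (names invented).
VARIANTS = [
    {"chrom": "chr1", "pos": 1234, "impact": "missense",
     "gt": {"mom": "0/0", "dad": "0/0", "kid": "0/1"}},
    {"chrom": "chr2", "pos": 5678, "impact": "synonymous",
     "gt": {"mom": "0/1", "dad": "0/0", "kid": "0/1"}},
]

def de_novo_candidates(variants, child, parents, impact=None):
    # A variant is a de novo candidate when the child is heterozygous
    # while both parents are homozygous reference; optionally also
    # require a given functional impact annotation.
    out = []
    for v in variants:
        if v["gt"][child] == "0/1" and all(v["gt"][p] == "0/0" for p in parents):
            if impact is None or v["impact"] == impact:
                out.append((v["chrom"], v["pos"]))
    return out
```

Combining an inheritance pattern (the genotype test) with an annotation filter (the impact test) in one pass is exactly the style of composite query the unified-database design is meant to make easy.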
Direct hydrocarbon identification using AVO analysis in the Malay Basin
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lye, Y.C.; Yaacob, M.R.; Birkett, N.E.
1994-07-01
Esso Production Malaysia Inc. and Petronas Carigali Sdn. Bhd. have been conducting AVO (amplitude versus offset) processing and interpretation since April 1991 in an attempt to identify hydrocarbon fluids predrill. The major part of this effort was in the PM-5, PM-8, and PM-9 contract areas where an extensive exploration program is underway. To date, more than 1000 km of seismic data have been analyzed using the AVO technique, and the results were used to support the drilling of more than 50 exploration and delineation wells. Gather modeling of well data was used to calibrate and predict the presence of hydrocarbons in proposed well locations. In order to generate accurate gather models, a geophysical properties and AVO database was needed, and great effort was spent in producing an accurate and complete database. This database is continuously being updated so that an experience file can be built to further improve the reliability of the AVO prediction.
Southern California Earthquake Center Geologic Vertical Motion Database
NASA Astrophysics Data System (ADS)
Niemi, Nathan A.; Oskin, Michael; Rockwell, Thomas K.
2008-07-01
The Southern California Earthquake Center Geologic Vertical Motion Database (VMDB) integrates disparate sources of geologic uplift and subsidence data at 10⁴- to 10⁶-year time scales into a single resource for investigations of crustal deformation in southern California. Over 1800 vertical deformation rate data points in southern California and northern Baja California populate the database. Four mature data sets are now represented: marine terraces, incised river terraces, thermochronologic ages, and stratigraphic surfaces. An innovative architecture and interface of the VMDB exposes distinct data sets and reference frames, permitting user exploration of this complex data set and allowing user control over the assumptions applied to convert geologic and geochronologic information into absolute uplift rates. Online exploration and download tools are available through all common web browsers, allowing the distribution of vertical motion results as HTML tables, tab-delimited GIS-compatible text files, or via a map interface through the Google Maps™ web service. The VMDB represents a mature product for research of fault activity and elastic deformation of southern California.
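The conversion the database applies, from a dated geologic marker to an absolute uplift rate, reduces in its simplest form to elevation change over age. A deliberately simplified sketch (values invented; the real VMDB also manages reference frames and user-selected corrections):

```python
def uplift_rate_mm_per_yr(elevation_m, paleo_sea_level_m, age_yr):
    """Simplified conversion: present elevation of a dated marine
    terrace, minus sea level at the time of formation, over its age."""
    return (elevation_m - paleo_sea_level_m) / age_yr * 1000.0

# A terrace now at 120 m, formed at +6 m sea level, 125,000 yr old:
rate = uplift_rate_mm_per_yr(120.0, 6.0, 125_000)
print(round(rate, 3))  # 0.912 mm/yr
```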
Exploring Human Cognition Using Large Image Databases.
Griffiths, Thomas L; Abbott, Joshua T; Hsu, Anne S
2016-07-01
Most cognitive psychology experiments evaluate models of human cognition using a relatively small, well-controlled set of stimuli. This approach stands in contrast to current work in neuroscience, perception, and computer vision, which have begun to focus on using large databases of natural images. We argue that natural images provide a powerful tool for characterizing the statistical environment in which people operate, for better evaluating psychological theories, and for bringing the insights of cognitive science closer to real applications. We discuss how some of the challenges of using natural images as stimuli in experiments can be addressed through increased sample sizes, using representations from computer vision, and developing new experimental methods. Finally, we illustrate these points by summarizing recent work using large image databases to explore questions about human cognition in four different domains: modeling subjective randomness, defining a quantitative measure of representativeness, identifying prior knowledge used in word learning, and determining the structure of natural categories. Copyright © 2016 Cognitive Science Society, Inc.
Machine learning landscapes and predictions for patient outcomes
NASA Astrophysics Data System (ADS)
Das, Ritankar; Wales, David J.
2017-07-01
The theory and computational tools developed to interpret and explore energy landscapes in molecular science are applied to the landscapes defined by local minima for neural networks. These machine learning landscapes correspond to fits of training data, where the inputs are vital signs and laboratory measurements for a database of patients, and the objective is to predict a clinical outcome. In this contribution, we test the predictions obtained by fitting to single measurements, and then to combinations of between 2 and 10 different patient medical data items. The effect of including measurements over different time intervals from the 48 h period in question is analysed, and the most recent values are found to be the most important. We also compare results obtained for neural networks as a function of the number of hidden nodes, and for different values of a regularization parameter. The predictions are compared with an alternative convex fitting function, and a strong correlation is observed. The dependence of these results on the patients randomly selected for training and testing decreases systematically with the size of the database available. The machine learning landscapes defined by neural network fits in this investigation have single-funnel character, which probably explains why it is relatively straightforward to obtain the global minimum solution, or a fit that behaves similarly to this optimal parameterization.
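The landscape picture above (fits as local minima, basins, funnels) can be illustrated on a deliberately tiny objective. This sketch is ours, not the authors' code: gradient descent from many starting points maps the distinct minima of a tilted double-well function.

```python
# Illustrative sketch only: locate the local minima of a small
# non-convex objective by descending from many random starts, the
# basic idea behind characterising "machine learning landscapes".
def grad(w):
    # objective f(w) = (w^2 - 1)^2 + 0.3*w, a tilted double well
    return 4 * w * (w * w - 1) + 0.3

def descend(w, lr=0.01, steps=2000):
    for _ in range(steps):
        w -= lr * grad(w)
    return w

starts = [-2.0 + 0.2 * i for i in range(21)]
minima = sorted({round(descend(w), 2) for w in starts})
print(minima)  # two basins, so two distinct local minima
```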
Clement, Fiona; Zimmer, Scott; Dixon, Elijah; Ball, Chad G.; Heitman, Steven J.; Swain, Mark; Ghosh, Subrata
2016-01-01
Importance At the turn of the 21st century, studies evaluating the change in incidence of appendicitis over time have reported inconsistent findings. Objectives We compared the differences in the incidence of appendicitis derived from a pathology registry versus an administrative database in order to validate coding in administrative databases and establish temporal trends in the incidence of appendicitis. Design We conducted a population-based comparative cohort study to identify all individuals with appendicitis from 2000 to 2008. Setting & Participants Two population-based data sources were used to identify cases of appendicitis: 1) a pathology registry (n = 8,822); and 2) a hospital discharge abstract database (n = 10,453). Intervention & Main Outcome The administrative database was compared to the pathology registry for the following a priori analyses: 1) to calculate the positive predictive value (PPV) of administrative codes; 2) to compare the annual incidence of appendicitis; and 3) to assess differences in temporal trends. Temporal trends were assessed using a generalized linear model that assumed a Poisson distribution and reported as an annual percent change (APC) with 95% confidence intervals (CI). Analyses were stratified by perforated and non-perforated appendicitis. Results The administrative database (PPV = 83.0%) overestimated the incidence of appendicitis (100.3 per 100,000) when compared to the pathology registry (84.2 per 100,000). Codes for perforated appendicitis were not reliable (PPV = 52.4%), leading to overestimation in the incidence of perforated appendicitis in the administrative database (34.8 per 100,000) as compared to the pathology registry (19.4 per 100,000). The incidence of appendicitis significantly increased over time in both the administrative database (APC = 2.1%; 95% CI: 1.3, 2.8) and pathology registry (APC = 4.1%; 95% CI: 3.1, 5.0). 
Conclusion & Relevance The administrative database overestimated the incidence of appendicitis, particularly among perforated appendicitis. Therefore, studies utilizing administrative data to analyze perforated appendicitis should be interpreted cautiously. PMID:27820826
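Two of the quantities reported above are easy to make concrete: the PPV is confirmed cases over administratively coded cases, and an annual percent change from a Poisson trend model is the exponentiated log-rate slope minus one. A short sketch (the counts are illustrative, not the study's case counts):

```python
from math import exp, log

def ppv(confirmed, coded_total):
    # positive predictive value of administrative codes against the
    # pathology gold standard
    return confirmed / coded_total

def annual_percent_change(beta):
    # beta is the fitted slope on log(rate) per calendar year
    return (exp(beta) - 1) * 100

print(round(100 * ppv(830, 1000), 1))               # 83.0 (illustrative counts)
print(round(annual_percent_change(log(1.021)), 1))  # 2.1, i.e. APC = 2.1%
```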
Establishment and Assessment of Plasma Disruption and Warning Databases from EAST
NASA Astrophysics Data System (ADS)
Wang, Bo; Robert, Granetz; Xiao, Bingjia; Li, Jiangang; Yang, Fei; Li, Junjun; Chen, Dalong
2016-12-01
A disruption database and a disruption warning database for the EAST tokamak have been established by the disruption research group. The disruption database, based on Structured Query Language (SQL), comprises 41 disruption parameters, which include current quench characteristics, EFIT equilibrium characteristics, kinetic parameters, halo currents, and vertical motion. Presently most disruption databases are based on plasma experiments of non-superconducting tokamak devices. The purposes of the EAST database are to find disruption characteristics and disruption statistics for the fully superconducting tokamak EAST, to elucidate the physics underlying tokamak disruptions, to explore the influence of disruptions on superconducting magnets, and to extrapolate toward future burning plasma devices. In order to quantitatively assess the usefulness of various plasma parameters for predicting disruptions, an SQL database similar to that of Alcator C-Mod has been created for EAST by compiling values for a number of proposed disruption-relevant parameters sampled from all plasma discharges in the 2015 campaign. Detailed statistical results and analysis of the two databases on the EAST tokamak are presented. Supported by the National Magnetic Confinement Fusion Science Program of China (No. 2014GB103000)
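The SQL organisation described above can be sketched with a toy table (shot numbers, column names, and values are invented, not EAST data): one row per discharge, with disruption statistics computed by ordinary aggregate queries.

```python
import sqlite3

# Hypothetical miniature of an SQL disruption database: one row per
# discharge holding a few disruption-relevant parameters.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE disruptions (
    shot INTEGER PRIMARY KEY, ip_ka REAL,
    quench_time_ms REAL, halo_fraction REAL, vde INTEGER)""")
db.executemany("INSERT INTO disruptions VALUES (?,?,?,?,?)", [
    (60123, 420.0, 4.2, 0.18, 0),
    (60188, 395.0, 2.9, 0.31, 1),
    (60240, 510.0, 6.1, 0.12, 0),
])
# A disruption statistic: mean current-quench time of non-VDE shots.
(mean_cq,) = db.execute(
    "SELECT AVG(quench_time_ms) FROM disruptions WHERE vde = 0").fetchone()
print(round(mean_cq, 2))  # 5.15
```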
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grabaskas, Dave; Brunett, Acacia J.; Bucknor, Matthew
GE Hitachi Nuclear Energy (GEH) and Argonne National Laboratory are currently engaged in a joint effort to modernize and develop probabilistic risk assessment (PRA) techniques for advanced non-light water reactors. At a high level the primary outcome of this project will be the development of next-generation PRA methodologies that will enable risk-informed prioritization of safety- and reliability-focused research and development, while also identifying gaps that may be resolved through additional research. A subset of this effort is the development of a reliability database (RDB) methodology to determine applicable reliability data for inclusion in the quantification of the PRA. The RDB method developed during this project seeks to satisfy the requirements of the Data Analysis element of the ASME/ANS Non-LWR PRA standard. The RDB methodology utilizes a relevancy test to examine reliability data and determine whether it is appropriate to include as part of the reliability database for the PRA. The relevancy test compares three component properties to establish the level of similarity to components examined as part of the PRA. These properties include the component function, the component failure modes, and the environment/boundary conditions of the component. The relevancy test is used to gauge the quality of data found in a variety of sources, such as advanced reactor-specific databases, non-advanced reactor nuclear databases, and non-nuclear databases. The RDB also establishes the integration of expert judgment or separate reliability analysis with past reliability data. This paper provides details on the RDB methodology, and includes an example application of the RDB methodology for determining the reliability of the intermediate heat exchanger of a sodium fast reactor. The example explores a variety of reliability data sources, and assesses their applicability for the PRA of interest through the use of the relevancy test.
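The three-property relevancy test lends itself to a small sketch. The scoring scheme and component records below are our illustration, not the paper's method in detail:

```python
# Sketch of the relevancy test described above, comparing the three
# component properties (function, failure modes, environment/boundary
# conditions). The equal-weight scoring is an assumption.
def relevancy(candidate, target):
    score = 0
    if candidate["function"] == target["function"]:
        score += 1
    if set(candidate["failure_modes"]) & set(target["failure_modes"]):
        score += 1
    if candidate["environment"] == target["environment"]:
        score += 1
    return score  # 0 (unrelated) .. 3 (fully relevant)

ihx = {"function": "heat exchange",
       "failure_modes": ["tube leak", "plugging"],
       "environment": "liquid sodium"}
lwr_sg = {"function": "heat exchange",
          "failure_modes": ["tube leak", "tube rupture"],
          "environment": "water/steam"}
print(relevancy(lwr_sg, ihx))  # 2: function and one failure mode match
```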
NeuroTransDB: highly curated and structured transcriptomic metadata for neurodegenerative diseases.
Bagewadi, Shweta; Adhikari, Subash; Dhrangadhariya, Anjani; Irin, Afroza Khanam; Ebeling, Christian; Namasivayam, Aishwarya Alex; Page, Matthew; Hofmann-Apitius, Martin; Senger, Philipp
2015-01-01
Neurodegenerative diseases are chronic debilitating conditions, characterized by progressive loss of neurons that represent a significant health care burden as the global elderly population continues to grow. Over the past decade, high-throughput technologies such as the Affymetrix GeneChip microarrays have provided new perspectives into the pathomechanisms underlying neurodegeneration. Public transcriptomic data repositories, namely Gene Expression Omnibus and curated ArrayExpress, enable researchers to conduct integrative meta-analysis; increasing the power to detect differentially regulated genes in disease and explore patterns of gene dysregulation across biologically related studies. The reliability of retrospective, large-scale integrative analyses depends on an appropriate combination of related datasets, in turn requiring detailed meta-annotations capturing the experimental setup. In most cases, we observe huge variation in compliance to defined standards for submitted metadata in public databases. Much of the information to complete, or refine meta-annotations are distributed in the associated publications. For example, tissue preparation or comorbidity information is frequently described in an article's supplementary tables. Several value-added databases have employed additional manual efforts to overcome this limitation. However, none of these databases explicate annotations that distinguish human and animal models in neurodegeneration context. Therefore, adopting a more specific disease focus, in combination with dedicated disease ontologies, will better empower the selection of comparable studies with refined annotations to address the research question at hand. In this article, we describe the detailed development of NeuroTransDB, a manually curated database containing metadata annotations for neurodegenerative studies. 
The database contains more than 20 dimensions of metadata annotations within 31 mouse, 5 rat and 45 human studies, defined in collaboration with domain disease experts. We elucidate the step-by-step guidelines used to critically prioritize studies from public archives and their metadata curation and discuss the key challenges encountered. Curated metadata for Alzheimer's disease gene expression studies are available for download. Database URL: www.scai.fraunhofer.de/NeuroTransDB.html. © The Author(s) 2015. Published by Oxford University Press.
ERIC Educational Resources Information Center
Reed, Deborah K.
2015-01-01
This study explored the data-based decision making of 12 teachers in grades 6-8 who were asked about their perceptions and use of three required interim measures of reading performance: oral reading fluency (ORF), retell, and a benchmark comprised of released state test items. Focus group participants reported they did not believe the benchmark or…
2009-03-01
Figure 8 New Information Sharing Model from United States Intelligence Community Information Sharing... PRIDE while the Coast Guard has MISSLE and the newly constructed WATCHKEEPER. All these databases contain intelligence on incoming vessels... decision making. Experts rely heavily on future projections as hallmarks of skilled performance. (Endsley et al. 2006) The SA model above
Parente, Eugenio; Cocolin, Luca; De Filippis, Francesca; Zotta, Teresa; Ferrocino, Ilario; O'Sullivan, Orla; Neviani, Erasmo; De Angelis, Maria; Cotter, Paul D; Ercolini, Danilo
2016-02-16
Amplicon targeted high-throughput sequencing has become a popular tool for the culture-independent analysis of microbial communities. Although the data obtained with this approach are portable and the number of sequences available in public databases is increasing, no tool has been developed yet for the analysis and presentation of data obtained in different studies. This work describes an approach for the development of a database for the rapid exploration and analysis of data on food microbial communities. Data from seventeen studies investigating the structure of bacterial communities in dairy, meat, sourdough and fermented vegetable products, obtained by 16S rRNA gene targeted high-throughput sequencing, were collated and analysed using Gephi, a network analysis software. The resulting database, which we named FoodMicrobionet, was used to analyse nodes and network properties and to build an interactive web-based visualisation. The latter allows the visual exploration of the relationships between Operational Taxonomic Units (OTUs) and samples and the identification of core- and sample-specific bacterial communities. It also provides additional search tools and hyperlinks for the rapid selection of food groups and OTUs and for rapid access to external resources (NCBI taxonomy, digital versions of the original articles). Microbial interaction network analysis was carried out using CoNet on datasets extracted from FoodMicrobionet: the complexity of interaction networks was much lower than that found for other bacterial communities (human microbiome, soil and other environments). This may reflect both a bias in the dataset (which was dominated by fermented foods and starter cultures) and the lower complexity of food bacterial communities. Although some technical challenges exist, and are discussed here, the net result is a valuable tool for the exploration of food bacterial communities by the scientific community and food industry. Copyright © 2015. 
Published by Elsevier B.V.
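The OTU-sample network idea can be shown in miniature (taxa, samples, and abundances below are invented): samples link to the OTUs detected in them, and OTU degree separates core taxa from sample-specific ones.

```python
from collections import defaultdict

# Toy bipartite OTU-sample network of the FoodMicrobionet kind:
# an edge means the OTU was detected in the sample.
abundance = {  # sample -> {OTU: relative abundance}, illustrative
    "cheese_1":  {"Lactococcus": 0.70, "Lactobacillus": 0.25},
    "salami_1":  {"Lactobacillus": 0.50, "Staphylococcus": 0.40},
    "sourdough": {"Lactobacillus": 0.80},
}
edges = [(s, otu) for s, prof in abundance.items() for otu in prof]
degree = defaultdict(int)
for _, otu in edges:
    degree[otu] += 1
# "Core" taxa appear in every sample; the rest are sample-specific.
core = [otu for otu, d in degree.items() if d == len(abundance)]
print(core)  # ['Lactobacillus']
```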
ERIC Educational Resources Information Center
Zhang, Xiaorong
2011-01-01
We incorporated a bioinformatics component into the freshman biology course that allows students to explore cystic fibrosis (CF), a common genetic disorder, using bioinformatics tools and skills. Students learn about CF through searching genetic databases, analyzing genetic sequences, and observing the three-dimensional structures of proteins…
The NOAO Data Lab PHAT Photometry Database
NASA Astrophysics Data System (ADS)
Olsen, Knut; Williams, Ben; Fitzpatrick, Michael; PHAT Team
2018-01-01
We present a database containing both the combined photometric object catalog and the single epoch measurements from the Panchromatic Hubble Andromeda Treasury (PHAT). This database is hosted by the NOAO Data Lab (http://datalab.noao.edu), and as such exposes a number of data services to the PHAT photometry, including access through a Table Access Protocol (TAP) service, direct PostgreSQL queries, web-based and programmatic query interfaces, remote storage space for personal database tables and files, and a JupyterHub-based Notebook analysis environment, as well as image access through a Simple Image Access (SIA) service. We show how the Data Lab database and Jupyter Notebook environment allow for straightforward and efficient analyses of PHAT catalog data, including maps of object density, depth, and color, extraction of light curves of variable objects, and proper motion exploration.
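An analysis like the object-density map mentioned above amounts to a spatial GROUP BY over the photometry table. The sketch below runs against a local stand-in database, not the Data Lab service itself; the table layout and coordinates are invented.

```python
import sqlite3

# Local stand-in for a catalog database: count objects in 0.1-degree
# bins of RA/Dec, the kind of SQL sent to a photometry table to map
# object density on the sky.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE phot (ra REAL, dec REAL, f475w_mag REAL)")
db.executemany("INSERT INTO phot VALUES (?,?,?)", [
    (11.20, 41.60, 24.1), (11.21, 41.61, 25.3),
    (11.20, 41.62, 23.8), (11.30, 41.70, 26.0),
])
rows = db.execute("""
    SELECT ROUND(ra, 1) AS ra_bin, ROUND(dec, 1) AS dec_bin, COUNT(*)
    FROM phot GROUP BY ra_bin, dec_bin ORDER BY 3 DESC""").fetchall()
print(rows[0])  # densest bin: (11.2, 41.6, 3)
```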
Human spaceflight technology needs-a foundation for JSC's technology strategy
NASA Astrophysics Data System (ADS)
Stecklein, J. M.
Human space exploration has always been heavily influenced by goals to achieve a specific mission on a specific schedule. This approach drove rapid technology development, the rapidity of which added risks and became a major driver for costs and cost uncertainty. The National Aeronautics and Space Administration (NASA) is now approaching the extension of human presence throughout the solar system by balancing a proactive yet less schedule-driven development of technology with opportunistic scheduling of missions as the needed technologies are realized. This approach should provide cost effective, low risk technology development that will enable efficient and effective manned spaceflight missions. As a first step, the NASA Human Spaceflight Architecture Team (HAT) has identified a suite of critical technologies needed to support future manned missions across a range of destinations, including in cis-lunar space, near earth asteroid visits, lunar exploration, Mars moons, and Mars exploration. The challenge now is to develop a strategy and plan for technology development that efficiently enables these missions over a reasonable time period, without increasing technology development costs unnecessarily due to schedule pressure, and subsequently mitigating development and mission risks. NASA's Johnson Space Center (JSC), as the nation's primary center for human exploration, is addressing this challenge through an innovative approach in allocating Internal Research and Development funding to projects. The HAT Technology Needs (Tech Needs) Database has been developed to correlate across critical technologies and the NASA Office of Chief Technologist Technology Area Breakdown Structure (TABS). The TechNeeds Database illuminates that many critical technologies may support a single technical capability gap, that many HAT technology needs may map to a single TABS technology discipline, and that a single HAT technology need may map to multiple TABS technology disciplines. 
The TechNeeds Database greatly clarifies understanding of the complex relationships of critical technologies to mission and architecture element needs. Extensions to the core TechNeeds Database allow JSC to factor in and appropriately weight JSC core technology competencies, and considerations of commercialization potential and partnership potential. The inherent coupling among these, along with an appropriate importance weighting, has provided an initial prioritization for allocation of technology development research funding at JSC. The HAT Technology Needs Database, with a core of built-in reports, clarifies and communicates complex technology needs for cost effective human space exploration so that an organization seeking to assure that research prioritization supports human spaceflight of the future can be successful.
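The many-to-many mapping and weighted prioritization described above can be sketched as follows. The technology names, disciplines, weights, and scores are all hypothetical:

```python
# One HAT technology need may map to several TABS disciplines:
need_to_tabs = {
    "closed-loop life support": ["TA06 Human Health", "TA14 Thermal"],
    "autonomous rendezvous":    ["TA04 Robotics", "TA05 Comm/Nav"],
    "cryo propellant storage":  ["TA14 Thermal"],
}
# Weighted roll-up of the prioritization factors (weights invented):
weights = {"mission criticality": 0.5, "JSC core competency": 0.3,
           "partnership potential": 0.2}
scores = {"closed-loop life support": (0.9, 0.8, 0.4),
          "autonomous rendezvous":    (0.7, 0.5, 0.9),
          "cryo propellant storage":  (0.8, 0.4, 0.6)}
priority = {
    need: round(sum(w * s for w, s in zip(weights.values(), vals)), 2)
    for need, vals in scores.items()}
print(sorted(need_to_tabs["closed-loop life support"]))
print(max(priority, key=priority.get))  # highest-priority need
```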
Human Spaceflight Technology Needs - A Foundation for JSC's Technology Strategy
NASA Technical Reports Server (NTRS)
Stecklein, Jonette M.
2013-01-01
Human space exploration has always been heavily influenced by goals to achieve a specific mission on a specific schedule. This approach drove rapid technology development, the rapidity of which adds risks and is a major driver of costs and cost uncertainty. The National Aeronautics and Space Administration (NASA) is now approaching the extension of human presence throughout the solar system by balancing a proactive yet less schedule-driven development of technology with opportunistic scheduling of missions as the needed technologies are realized. This approach should provide cost effective, low risk technology development that will enable efficient and effective manned spaceflight missions. As a first step, the NASA Human Spaceflight Architecture Team (HAT) has identified a suite of critical technologies needed to support future manned missions across a range of destinations, including in cis-lunar space, near earth asteroid visits, lunar exploration, Mars moons, and Mars exploration. The challenge now is to develop a strategy and plan for technology development that efficiently enables these missions over a reasonable time period, without increasing technology development costs unnecessarily due to schedule pressure, and subsequently mitigating development and mission risks. NASA's Johnson Space Center (JSC), as the nation's primary center for human exploration, is addressing this challenge through an innovative approach in allocating Internal Research and Development funding to projects. The HAT Technology Needs (TechNeeds) Database has been developed to correlate across critical technologies and the NASA Office of Chief Technologist Technology Area Breakdown Structure (TABS). 
The TechNeeds Database illuminates that many critical technologies may support a single technical capability gap, that many HAT technology needs may map to a single TABS technology discipline, and that a single HAT technology need may map to multiple TABS technology disciplines. The TechNeeds Database greatly clarifies understanding of the complex relationships of critical technologies to mission and architecture element needs. Extensions to the core TechNeeds Database allow JSC to factor in and appropriately weight JSC Center Core Technology Competencies, and considerations of Commercialization Potential and Partnership Potential. The inherent coupling among these, along with an appropriate importance weighting, has provided an initial prioritization for allocation of technology development research funding for JSC. The HAT Technology Needs Database, with a core of built-in reports, clarifies and communicates complex technology needs for cost effective human space exploration such that an organization seeking to assure that research prioritization supports human spaceflight of the future can be successful.
Building an integrated neurodegenerative disease database at an academic health center.
Xie, Sharon X; Baek, Young; Grossman, Murray; Arnold, Steven E; Karlawish, Jason; Siderowf, Andrew; Hurtig, Howard; Elman, Lauren; McCluskey, Leo; Van Deerlin, Vivianna; Lee, Virginia M-Y; Trojanowski, John Q
2011-07-01
It is becoming increasingly important to study common and distinct etiologies, clinical and pathological features, and mechanisms related to neurodegenerative diseases such as Alzheimer's disease, Parkinson's disease, amyotrophic lateral sclerosis, and frontotemporal lobar degeneration. These comparative studies rely on powerful database tools to quickly generate data sets that match the diverse and complementary criteria set by investigators. In this article, we present a novel integrated neurodegenerative disease (INDD) database, which was developed at the University of Pennsylvania (Penn) with the help of a consortium of Penn investigators. Because the work of these investigators is based on Alzheimer's disease, Parkinson's disease, amyotrophic lateral sclerosis, and frontotemporal lobar degeneration, it allowed us to achieve the goal of developing an INDD database for these major neurodegenerative disorders. We used the Microsoft SQL server as a platform, with built-in "backwards" functionality to provide Access as a frontend client to interface with the database. We used PHP Hypertext Preprocessor to create the "frontend" web interface and then used a master lookup table to integrate the individual neurodegenerative disease databases. We also present methods of data entry, database security, database backups, and database audit trails for this INDD database. Using the INDD database, we compared the results of a biomarker study with those obtained by querying the individual databases separately. We have demonstrated that the Penn INDD database has the ability to query multiple database tables from a single console with high accuracy and reliability. The INDD database provides a powerful tool for generating data sets in comparative studies on several neurodegenerative diseases. Copyright © 2011 The Alzheimer's Association. Published by Elsevier Inc. All rights reserved.
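The master-lookup-table integration described above can be sketched with a toy schema (tables, columns, and values are hypothetical, not the Penn INDD schema): each disease database keeps its own records, and a master table maps a global patient ID onto them, so one query can span diseases.

```python
import sqlite3

# Hypothetical master lookup table joining two per-disease tables.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE master (indd_id INTEGER, source TEXT, local_id INTEGER);
CREATE TABLE ad (local_id INTEGER, csf_abeta42 REAL);
CREATE TABLE pd (local_id INTEGER, updrs INTEGER);
INSERT INTO master VALUES (1,'ad',10),(1,'pd',77),(2,'ad',11);
INSERT INTO ad VALUES (10, 310.0),(11, 520.0);
INSERT INTO pd VALUES (77, 23);
""")
# One query spanning both disease databases via the master table:
# patients with records in BOTH the AD and PD tables.
rows = db.execute("""
    SELECT m.indd_id, a.csf_abeta42, p.updrs
    FROM master m
    JOIN ad a ON m.source = 'ad' AND a.local_id = m.local_id
    JOIN pd p ON EXISTS (SELECT 1 FROM master m2
                         WHERE m2.indd_id = m.indd_id
                           AND m2.source = 'pd'
                           AND m2.local_id = p.local_id)""").fetchall()
print(rows)  # [(1, 310.0, 23)]
```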
Kao, Wei-Heng; Hong, Ji-Hong; See, Lai-Chu; Yu, Huang-Ping; Hsu, Jun-Te; Chou, I-Jun; Chou, Wen-Chi; Chiou, Meng-Jiun; Wang, Chun-Chieh; Kuo, Chang-Fu
2017-08-16
We aimed to evaluate the validity of cancer diagnosis in the National Health Insurance (NHI) database, which has routinely collected the health information of almost the entire Taiwanese population since 1995, compared with the Taiwan National Cancer Registry (NCR). There were 26,542,445 active participants registered in the NHI database between 2001 and 2012. NCR and NHI database records were compared for cancer diagnosis; date of cancer diagnosis; and 1, 2, and 5 year survival. In addition, the 10 leading causes of cancer deaths in Taiwan were analyzed. There were 908,986 cancer diagnoses in the NCR and NHI databases combined: 782,775 (86.1%) appeared in both, 53,192 (5.9%) in the NHI database only, and 73,019 (8.0%) in the NCR only. The positive predictive value of the NHI database cancer diagnoses was 94% for all cancers; the positive predictive value of the 10 specific cancers ranged from 95% (lung cancer) to 82% (cervical cancer). The date of diagnosis in the NHI database was generally delayed by a median of 15 days (interquartile range 8-18) compared with the NCR. The 1, 2, and 5 year survival rates were 71.21%, 60.85%, and 47.44% using the NHI database and 71.18%, 60.17%, and 46.09% using NCR data. Recording of cancer diagnoses and survival estimates based on these diagnosis codes in the NHI database are generally consistent with the NCR. Studies using NHI database data must pay careful attention to eligibility and record linkage; use of both sources is recommended. Copyright © 2017 John Wiley & Sons, Ltd.
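The headline PPV can be reproduced directly from the counts in the abstract; the matched diagnosis dates below are invented to illustrate the delay calculation.

```python
from datetime import date
from statistics import median

# PPV of NHI cancer codes: diagnoses confirmed in the NCR over all
# NHI diagnoses (counts from the abstract: 782,775 in both,
# 53,192 in the NHI database only).
ppv = 782_775 / (782_775 + 53_192)
print(round(100 * ppv))  # 94

# Diagnosis-date delay for a matched pair is NHI date minus NCR date;
# these three pairs are invented for illustration:
linked = [(date(2005, 3, 1), date(2005, 3, 16)),
          (date(2006, 7, 10), date(2006, 7, 18)),
          (date(2008, 1, 5), date(2008, 1, 23))]
delays = [(nhi - ncr).days for ncr, nhi in linked]
print(median(delays))  # 15
```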
76 FR 72929 - Agency Information Collection Activities: Proposed Collection; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2011-11-28
... proposed information collection project: ``Medical Office Survey on Patient Safety Culture Comparative... Medical Office Survey on Patient Safety Culture Comparative Database. The Agency for Healthcare Research... Patient Safety Culture (Medical Office SOPS) Comparative Database. The Medical Office SOPS Comparative...
Catlin, Ann Christine; Fernando, Sumudinie; Gamage, Ruwan; Renner, Lorna; Antwi, Sampson; Tettey, Jonas Kusah; Amisah, Kofi Aikins; Kyriakides, Tassos; Cong, Xiangyu; Reynolds, Nancy R.; Paintsil, Elijah
2015-01-01
Prevalence of pediatric HIV disclosure is low in resource-limited settings. Innovative, culturally sensitive, and patient-centered disclosure approaches are needed. Conducting such studies in resource-limited settings is not trivial considering the challenges of capturing, cleaning, and storing clinical research data. To overcome some of these challenges, the Sankofa pediatric disclosure intervention adopted an interactive cyber infrastructure for data capture and analysis. The Sankofa Project database system is built on the HUBzero cyber infrastructure (https://hubzero.org), an open source software platform. The hub database components support: (1) data management – the “databases” component creates, configures, and manages database access, backup, repositories, applications, and access control; (2) data collection – the “forms” component is used to build customized web case report forms that incorporate common data elements and include tailored form submit processing to handle error checking, data validation, and data linkage as the data are stored to the database; and (3) data exploration – the “dataviewer” component provides powerful methods for users to view, search, sort, navigate, explore, map, graph, visualize, aggregate, drill-down, compute, and export data from the database. The Sankofa cyber data management tool supports a user-friendly, secure, and systematic collection of all data. We have screened more than 400 child–caregiver dyads and enrolled nearly 300 dyads, with tens of thousands of data elements. The dataviews have successfully supported all data exploration and analysis needs of the Sankofa Project. Moreover, the ability of the sites to query and view data summaries has proven to be an incentive for collecting complete and accurate data. The data system has all the desirable attributes of an electronic data capture tool. 
It also provides an added advantage of building data management capacity in resource-limited settings due to its innovative data query and summary views and availability of real-time support by the data management team. PMID:26616131
Yip, A M; Kephart, G; Rockwood, K
2001-01-01
The Canadian Study of Health and Aging (CSHA) was a cohort study that included 528 Nova Scotian community-dwelling participants. Linkage of CSHA and provincial Medical Services Insurance (MSI) data enabled examination of health care utilization in this subsample. This article discusses methodological and ethical issues of database linkage and explores variation in the use of health services by demographic variables and health status. Utilization over 24 months following baseline was extracted from MSI's physician claims, hospital discharge abstracts, and Pharmacare claims databases. Twenty-nine subjects refused consent for access to their MSI file; health card numbers for three others could not be retrieved. A significant difference in healthcare use by age and self-rated health was revealed. Linkage of population-based data with provincial administrative health care databases has the potential to guide health care planning and resource allocation. This process must include steps to ensure protection of confidentiality. Standard practices for linkage consent and routine follow-up should be adopted. The Canadian Study of Health and Aging (CSHA) began in 1991-92 to explore dementia, frailty, and adverse health outcomes (Canadian Study of Health and Aging Working Group, 1994). The original CSHA proposal included linkage to provincial administrative health care databases by the individual CSHA study centers to enhance information on health care utilization and outcomes of study participants. In Nova Scotia, the Medical Services Insurance (MSI) administration, which drew the sampling frame for the original CSHA, did not retain the list of corresponding health card numbers. Furthermore, consent for this access was not asked of participants at the time of the first interview. 
The objectives of the study reported here were to examine the feasibility and ethical considerations of linking data from the CSHA to MSI utilization data, and to explore variation in health services use by demographic and health status characteristics in the Nova Scotia community cohort.
Selby, Luke V; Sjoberg, Daniel D; Cassella, Danielle; Sovel, Mindy; Weiser, Martin R; Sepkowitz, Kent; Jones, David R; Strong, Vivian E
2015-06-15
Surgical quality improvement requires accurate tracking and benchmarking of postoperative adverse events. We track surgical site infections (SSIs) with two systems: our in-house surgical secondary events (SSE) database and the National Surgical Quality Improvement Project (NSQIP). The SSE database, a modification of the Clavien-Dindo classification, categorizes SSIs by their anatomic site, whereas NSQIP categorizes them by level. Our aim was to directly compare these different definitions. NSQIP and the SSE database entries for all surgeries performed in 2011 and 2012 were compared. To match NSQIP definitions, and while blinded to NSQIP results, entries in the SSE database were categorized as either incisional (superficial or deep) or organ space infections. These categorizations were compared with NSQIP records; agreement was assessed with Cohen kappa. The 5028 patients in our cohort had a 6.5% SSI rate in the SSE database and a 4% rate in NSQIP, with an overall agreement of 95% (kappa = 0.48, P < 0.0001). The rates of categorized infections were similarly well matched: incisional rates of 4.1% and 2.7% for the SSE database and NSQIP, and organ space rates of 2.6% and 1.5%. Overall agreements were 96% (kappa = 0.36, P < 0.0001) and 98% (kappa = 0.55, P < 0.0001), respectively. Over 80% of cases recorded by the SSE database but not by NSQIP did not meet NSQIP criteria. The SSE database is an accurate, real-time record of postoperative SSIs. Institutional databases that capture all surgical cases can be used in conjunction with NSQIP with excellent concordance. Copyright © 2015 Elsevier Inc. All rights reserved.
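Agreement statistics like the Cohen kappa reported above can be computed directly from paired classifications. A minimal sketch on hypothetical data (not the study's actual records):

```python
def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(labels_a)
    assert n == len(labels_b) and n > 0
    categories = set(labels_a) | set(labels_b)
    # Observed agreement: fraction of cases where the two systems concur.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if the two systems labeled independently,
    # computed from each system's marginal category frequencies.
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

# Hypothetical SSI classifications for four cases from two tracking systems.
sse = ["incisional", "incisional", "none", "none"]
nsqip = ["incisional", "none", "none", "none"]
print(round(cohen_kappa(sse, nsqip), 2))  # 0.5
```

Kappa near 0.5 indicates moderate agreement beyond chance, which is the range the abstract reports for the two databases.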
Recommendations for Exploration Space Medicine from the Apollo Medical Operations Project
NASA Technical Reports Server (NTRS)
Scheuring, R. a.; Davis, J. R.; Duncan, J. M.; Polk, J. D.; Jones, J. A.; Gillis, D. B.
2007-01-01
Introduction: A study was requested in December, 2005 by the Space Medicine Division at the NASA-Johnson Space Center (JSC) to identify Apollo mission issues relevant to medical operations that had impact to crew health and/or performance. The objective was to use this new information to develop medical requirements for the future Crew Exploration Vehicle (CEV), Lunar Surface Access Module (LSAM), Lunar Habitat, and Advanced Extravehicular Activity (EVA) suits that are currently being developed within the exploration architecture. Methods: Available resources pertaining to medical operations on the Apollo 7 through 17 missions were reviewed. Ten categories of hardware, systems, or crew factors were identified in the background research, generating 655 data records in a database. A review of the records resulted in 280 questions that were then posed to surviving Apollo crewmembers by mail, face-to-face meetings, or online interaction. Response analysis to these questions formed the basis of recommendations to items in each of the categories. Results: Thirteen of 22 surviving Apollo astronauts (59%) participated in the project. Approximately 236 pages of responses to the questions were captured, resulting in 107 recommendations offered for medical consideration in the design of future vehicles and EVA suits based on the Apollo experience. Discussion: The goals of this project included: 1) Develop or modify medical requirements for new vehicles; 2) create a centralized database for future access; and 3) take this new knowledge and educate the various directorates at NASA-JSC who are participating in the exploration effort. To date, the Apollo Medical Operations recommendations are being incorporated into the exploration mission architecture at various levels and a centralized database has been developed. 
The Apollo crewmembers' input has proved to be an invaluable resource, prompting ongoing collaboration as the requirements for the future exploration missions continue to evolve and be refined.
Databases for rRNA gene profiling of microbial communities
Ashby, Matthew
2013-07-02
The present invention relates to methods for performing surveys of the genetic diversity of a population. The invention also relates to methods for performing genetic analyses of a population. The invention further relates to methods for the creation of databases comprising the survey information and the databases created by these methods. The invention also relates to methods for analyzing the information to correlate the presence of nucleic acid markers with desired parameters in a sample. These methods have application in the fields of geochemical exploration, agriculture, bioremediation, environmental analysis, clinical microbiology, forensic science and medicine.
Chen, Yi- Ping Phoebe; Hanan, Jim
2002-01-01
Models of plant architecture allow us to explore how genotype-environment interactions affect the development of plant phenotypes. Such models generate masses of data organised in complex hierarchies. This paper presents a generic system for creating and automatically populating a relational database from data generated by the widely used L-system approach to modelling plant morphogenesis. Techniques from compiler technology are applied to generate attributes (new fields) in the database, simplifying query development for the recursively structured branching relationship. Use of biological terminology in an interactive query builder helps make the system biologist-friendly.
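One common way to store and query a recursively structured branching relationship in a relational database is a parent-link table traversed with a recursive query. A minimal sketch using SQLite (the schema and module names here are hypothetical, not the paper's actual design):

```python
import sqlite3

# Hypothetical schema: each plant module (internode, leaf, apex) stores a
# parent link, mirroring the branching topology an L-system derivation yields.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE module (
    id INTEGER PRIMARY KEY, parent_id INTEGER, kind TEXT, length REAL)""")
rows = [
    (1, None, "internode", 5.0),
    (2, 1, "internode", 3.0),
    (3, 2, "leaf", 1.0),
    (4, 1, "internode", 2.5),
    (5, 4, "apex", 0.2),
]
conn.executemany("INSERT INTO module VALUES (?, ?, ?, ?)", rows)

# Recursive CTE: every module borne, directly or transitively, by module 4 --
# the kind of branching query such a system would generate automatically.
descendants = conn.execute("""
    WITH RECURSIVE sub(id) AS (
        SELECT 4
        UNION ALL
        SELECT m.id FROM module m JOIN sub s ON m.parent_id = s.id
    )
    SELECT module.id, kind FROM module JOIN sub USING (id) ORDER BY module.id
""").fetchall()
print(descendants)  # [(4, 'internode'), (5, 'apex')]
```

The paper's compiler-generated attribute fields would precompute information that this recursive traversal derives at query time.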
This document may be of assistance in applying the Title V air operating permit regulations. This document is part of the Title V Petition Database available at www2.epa.gov/title-v-operating-permits/title-v-petition-database. Some documents in the database are a scanned or retyped version of a paper photocopy of the original. Although we have taken considerable effort to quality assure the documents, some may contain typographical errors. Contact the office that issued the document if you need a copy of the original.
Utilizing semantic networks to database and retrieve generalized stochastic colored Petri nets
NASA Technical Reports Server (NTRS)
Farah, Jeffrey J.; Kelley, Robert B.
1992-01-01
Previous work introduced the Planning Coordinator (PCOORD), a coordinator functioning within the hierarchy of the Intelligent Machine Model. Within the structure of the Planning Coordinator resides the Primitive Structure Database (PSDB), which provides the primitive structures the Planning Coordinator uses to establish error-recovery or on-line path plans. This report further explores the Primitive Structure Database and establishes the potential of using semantic networks as a means of efficiently storing and retrieving the Generalized Stochastic Colored Petri Nets from which the error-recovery plans are derived.
TIPdb-3D: the three-dimensional structure database of phytochemicals from Taiwan indigenous plants.
Tung, Chun-Wei; Lin, Ying-Chi; Chang, Hsun-Shuo; Wang, Chia-Chi; Chen, Ih-Sheng; Jheng, Jhao-Liang; Li, Jih-Heng
2014-01-01
The rich indigenous and endemic plants in Taiwan serve as a resourceful bank for biologically active phytochemicals. Based on our TIPdb database curating bioactive phytochemicals from Taiwan indigenous plants, this study presents a three-dimensional (3D) chemical structure database named TIPdb-3D to support the discovery of novel pharmacologically active compounds. The Merck Molecular Force Field (MMFF94) was used to generate 3D structures of phytochemicals in TIPdb. The 3D structures could facilitate the analysis of 3D quantitative structure-activity relationship, the exploration of chemical space and the identification of potential pharmacologically active compounds using protein-ligand docking. Database URL: http://cwtung.kmu.edu.tw/tipdb. © The Author(s) 2014. Published by Oxford University Press.
Medical data mining: knowledge discovery in a clinical data warehouse.
Prather, J. C.; Lobach, D. F.; Goodwin, L. K.; Hales, J. W.; Hage, M. L.; Hammond, W. E.
1997-01-01
Clinical databases have accumulated large quantities of information about patients and their medical conditions. Relationships and patterns within this data could provide new medical knowledge. Unfortunately, few methodologies have been developed and applied to discover this hidden knowledge. In this study, the techniques of data mining (also known as Knowledge Discovery in Databases) were used to search for relationships in a large clinical database. Specifically, data accumulated on 3,902 obstetrical patients were evaluated for factors potentially contributing to preterm birth using exploratory factor analysis. Three factors were identified by the investigators for further exploration. This paper describes the processes involved in mining a clinical database including data warehousing, data query and cleaning, and data analysis. PMID:9357597
ERIC Educational Resources Information Center
Flatley, Robert K.; Lilla, Rick; Widner, Jack
2007-01-01
This study compared Social Work Abstracts and Social Services Abstracts databases in terms of indexing, journal coverage, and searches. The authors interviewed editors, analyzed journal coverage, and compared searches. It was determined that the databases complement one another more than compete. The authors conclude with some considerations.
The intelligent database machine
NASA Technical Reports Server (NTRS)
Yancey, K. E.
1985-01-01
The IDM 500 database machine was compared with Oracle to determine which would better serve the needs of the MSFC database management system. The two were compared and the performance of the IDM was studied. The implementations that work best on each database are indicated; the choice is left to the database administrator.
78 FR 28848 - Information Collection Activities; Proposed Collection; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2013-05-16
... Quality's (AHRQ) Hospital Survey on Patient Safety Culture Comparative Database.'' In accordance with the... for Healthcare Research and Quality's (AHRQ) Hospital Survey on Patient Safety Culture Comparative... SOPS) Comparative Database; OMB No. 0935-
Goulet, Marie-Hélène; Larue, Caroline; Alderson, Marie
2016-04-01
This paper reports on an analysis of the concept of reflective practice. Reflective practice, a concept borrowed from the field of education, is widely used in nursing. However, to date, no study has explored whether this appropriation has resulted in a definition of the concept specific to the nursing discipline. A sample comprised of 42 articles in the field of nursing drawn from the CINAHL database and 35 articles in education from the ERIC database (1989-2013) was analyzed. A concept analysis using the method proposed by Bowers and Schatzman was conducted to explore the differing meanings of reflective practice in nursing and education. In nursing, the dimensions of the concept differ depending on context. In the clinical context, the dimensions may be summarized as theory-practice gap, development, and caring; in training, as learning, guided process, and development; and in research, as knowledge, method, and social change. In education, the concept is also used in the contexts of training (the dimensions being development, deliberate review, emotions, and evaluation) and research (knowledge, temporal distance, and method). The humanist dimension in nursing thus reflects a use of the concept more specific to the discipline. The concept analysis helped clarify the meaning of reflective practice in nursing and its specific use in the discipline. This observation leads to a consideration of how the concept has developed since its appropriation by nursing; the adoption of a terminology particular to nursing may well be worth contemplating. © 2015 Wiley Periodicals, Inc.
NASA Technical Reports Server (NTRS)
Fong, Danny; Odell, Dorice; Barry, Peter; Abrahamian, Tomik
2008-01-01
This software provides internal, automated search mechanics of GIDEP (Government- Industry Data Exchange Program) Alert data imported from the GIDEP government Web site. The batching tool allows the import of a single parts list in tab-delimited text format into the local JPL GIDEP database. Delimiters from every part number are removed. The original part numbers with delimiters are compared, as well as the newly generated list without the delimiters. The two lists run against the GIDEP imports, and output any matches. This feature only works with Netscape 2.0 or greater, or Internet Explorer 4.0 or greater. The user selects the browser button to choose a text file to import. When the submit button is pressed, this script will import alerts from the text file into the local JPL GIDEP database. This batch tool provides complete in-house control over exported material and data for automated batch match abilities. The batching tool has the ability to match capabilities of the parts list to tables, and yields results that aid further research and analysis. This provides more control over GIDEP information for metrics and reports information not provided by the government site. This software yields results quickly and gives more control over external data from the government site in order to generate other reports not available from the external source. There is enough space to store years of data. The program relates to risk identification and management with regard to projects and GIDEP alert information encompassing flight parts for space exploration.
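The delimiter-stripping comparison described above can be sketched with a short normalization routine. The function names and sample part numbers here are hypothetical; the actual JPL implementation is not published:

```python
import re

def normalize(part_number):
    """Strip delimiters and case so 'MS-1234/5A' and 'ms12345a' compare equal."""
    return re.sub(r"[^A-Za-z0-9]", "", part_number).upper()

def match_alerts(parts_list, alert_parts):
    """Return entries from an imported parts list that match any alert part,
    comparing with delimiters removed (as the batching tool does)."""
    alerts = {normalize(p) for p in alert_parts}
    return [p for p in parts_list if normalize(p) in alerts]

# Hypothetical tab-delimited parts list matched against alert part numbers.
print(match_alerts(["MS-1234/5A", "XJ 9"], ["ms12345a"]))  # ['MS-1234/5A']
```

Matching on both the original and the delimiter-stripped forms, as the tool does, catches records where vendors punctuate part numbers inconsistently.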
Using the Saccharomyces Genome Database (SGD) for analysis of genomic information
Skrzypek, Marek S.; Hirschman, Jodi
2011-01-01
Analysis of genomic data requires access to software tools that place the sequence-derived information in the context of biology. The Saccharomyces Genome Database (SGD) integrates functional information about budding yeast genes and their products with a set of analysis tools that facilitate exploring their biological details. This unit describes how the various types of functional data available at SGD can be searched, retrieved, and analyzed. Starting with the guided tour of the SGD Home page and Locus Summary page, this unit highlights how to retrieve data using YeastMine, how to visualize genomic information with GBrowse, how to explore gene expression patterns with SPELL, and how to use Gene Ontology tools to characterize large-scale datasets. PMID:21901739
Faster Smith-Waterman database searches with inter-sequence SIMD parallelisation.
Rognes, Torbjørn
2011-06-01
The Smith-Waterman algorithm for local sequence alignment is more sensitive than heuristic methods for database searching, but also more time-consuming. The fastest approach to parallelisation with SIMD technology has previously been described by Farrar in 2007. The aim of this study was to explore whether further speed could be gained by other approaches to parallelisation. A faster approach and implementation is described and benchmarked. In the new tool SWIPE, residues from sixteen different database sequences are compared in parallel to one query residue. Using a 375 residue query sequence a speed of 106 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon X5650 six-core processor system, which is over six times more rapid than software based on Farrar's 'striped' approach. SWIPE was about 2.5 times faster when the programs used only a single thread. For shorter queries, the increase in speed was larger. SWIPE was about twice as fast as BLAST when using the BLOSUM50 score matrix, while BLAST was about twice as fast as SWIPE for the BLOSUM62 matrix. The software is designed for 64 bit Linux on processors with SSSE3. Source code is available from http://dna.uio.no/swipe/ under the GNU Affero General Public License. Efficient parallelisation using SIMD on standard hardware makes it possible to run Smith-Waterman database searches more than six times faster than before. The approach described here could significantly widen the potential application of Smith-Waterman searches. Other applications that require optimal local alignment scores could also benefit from improved performance.
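For reference, the recurrence that SWIPE parallelises can be written as a plain scalar implementation. This sketch computes only the optimal local alignment score, with simple linear gap costs rather than the affine gaps, substitution matrices, and inter-sequence SIMD vectorisation used in SWIPE:

```python
def smith_waterman_score(query, db_seq, match=2, mismatch=-2, gap=-3):
    """Scalar Smith-Waterman local alignment score (linear gap penalty).

    SWIPE evaluates this same recurrence for sixteen database sequences at
    once with SIMD instructions; here each cell is computed one at a time.
    """
    prev = [0] * (len(db_seq) + 1)  # previous DP row
    best = 0
    for q in query:
        curr = [0]
        for j, d in enumerate(db_seq, 1):
            score = max(
                0,                                               # restart locally
                prev[j - 1] + (match if q == d else mismatch),   # (mis)match
                prev[j] + gap,                                   # gap in db seq
                curr[j - 1] + gap,                               # gap in query
            )
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best

print(smith_waterman_score("ACGT", "ACGT"))  # 8
```

Because every cell depends only on three neighbours, the inner loop is what SIMD approaches such as Farrar's striped layout or SWIPE's inter-sequence layout vectorise.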
Pediatric burns: Kids' Inpatient Database vs the National Burn Repository.
Soleimani, Tahereh; Evans, Tyler A; Sood, Rajiv; Hartman, Brett C; Hadad, Ivan; Tholpady, Sunil S
2016-04-01
Burn injuries are one of the leading causes of morbidity and mortality in young children. The Kids' Inpatient Database (KID) and National Burn Repository (NBR) are two large national databases that can be used to evaluate outcomes and help quality improvement in burn care. Differences in the design of the KID and NBR could lead to differing results affecting resultant conclusions and quality improvement programs. This study was designed to validate the use of KID for burn epidemiologic studies, as an adjunct to the NBR. Using the KID (2003, 2006, and 2009), a total of 17,300 nonelective burn patients younger than 20 y old were identified. Data from 13,828 similar patients were collected from the NBR. Outcome variables were compared between the two databases. Comparisons revealed similar patient distribution by gender, race, and burn size. Inhalation injury was more common among the NBR patients and was associated with increased mortality. The rates of respiratory failure, wound infection, cellulitis, sepsis, and urinary tract infection were higher in the KID. Multiple regression analysis adjusting for potential confounders demonstrated similar mortality rate but significantly longer length of stay for patients in the NBR. Despite differences in the design and sampling of the KID and NBR, the overall demographic and mortality results are similar. The differences in complication rate and length of stay should be explored by further studies to clarify underlying causes. Investigations into these differences should also better inform strategies to improve burn prevention and treatment. Copyright © 2016 Elsevier Inc. All rights reserved.
Rivas, Elena; Lang, Raymond; Eddy, Sean R
2012-02-01
The standard approach for single-sequence RNA secondary structure prediction uses a nearest-neighbor thermodynamic model with several thousand experimentally determined energy parameters. An attractive alternative is to use statistical approaches with parameters estimated from growing databases of structural RNAs. Good results have been reported for discriminative statistical methods using complex nearest-neighbor models, including CONTRAfold, Simfold, and ContextFold. Little work has been reported on generative probabilistic models (stochastic context-free grammars [SCFGs]) of comparable complexity, although probabilistic models are generally easier to train and to use. To explore a range of probabilistic models of increasing complexity, and to directly compare probabilistic, thermodynamic, and discriminative approaches, we created TORNADO, a computational tool that can parse a wide spectrum of RNA grammar architectures (including the standard nearest-neighbor model and more) using a generalized super-grammar that can be parameterized with probabilities, energies, or arbitrary scores. By using TORNADO, we find that probabilistic nearest-neighbor models perform comparably to (but not significantly better than) discriminative methods. We find that complex statistical models are prone to overfitting RNA structure and that evaluations should use structurally nonhomologous training and test data sets. Overfitting has affected at least one published method (ContextFold). The most important barrier to improving statistical approaches for RNA secondary structure prediction is the lack of diversity of well-curated single-sequence RNA secondary structures in current RNA databases.
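The grammars benchmarked by TORNADO generalize a much simpler dynamic program: the Nussinov algorithm, which merely maximizes the number of nested base pairs. It is shown here only for intuition about the recursion shape such grammars parameterize; it is not one of TORNADO's grammars:

```python
def nussinov_pairs(rna, min_loop=3):
    """Maximum number of nested base pairs (Nussinov-style DP).

    min_loop enforces a minimum hairpin loop of unpaired bases.
    """
    allowed = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"),
               ("G", "U"), ("U", "G")}
    n = len(rna)
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):          # widen the subsequence
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]                  # case: j left unpaired
            for k in range(i, j - min_loop):     # case: j pairs with some k
                if (rna[k], rna[j]) in allowed:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1] if n else 0

print(nussinov_pairs("GGGAAACCC"))  # 3
```

Probabilistic grammars replace the "count a pair as +1" scoring with log probabilities (or energies) attached to each production, which is exactly the parameterization TORNADO makes interchangeable.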
Stereoselective virtual screening of the ZINC database using atom pair 3D-fingerprints.
Awale, Mahendra; Jin, Xian; Reymond, Jean-Louis
2015-01-01
Tools to explore large compound databases in search for analogs of query molecules provide a strategically important support in drug discovery to help identify available analogs of any given reference or hit compound by ligand based virtual screening (LBVS). We recently showed that large databases can be formatted for very fast searching with various 2D-fingerprints using the city-block distance as similarity measure, in particular a 2D-atom pair fingerprint (APfp) and the related category extended atom pair fingerprint (Xfp) which efficiently encode molecular shape and pharmacophores, but do not perceive stereochemistry. Here we investigated related 3D-atom pair fingerprints to enable rapid stereoselective searches in the ZINC database (23.2 million 3D structures). Molecular fingerprints counting atom pairs at increasing through-space distance intervals were designed using either all atoms (16-bit 3DAPfp) or different atom categories (80-bit 3DXfp). These 3D-fingerprints retrieved molecular shape and pharmacophore analogs (defined by OpenEye ROCS scoring functions) of 110,000 compounds from the Cambridge Structural Database with equal or better accuracy than the 2D-fingerprints APfp and Xfp, and showed comparable performance in recovering actives from decoys in the DUD database. LBVS by 3DXfp or 3DAPfp similarity was stereoselective and gave very different analogs when starting from different diastereomers of the same chiral drug. Results were also different from LBVS with the parent 2D-fingerprints Xfp or APfp. 3D- and 2D-fingerprints also gave very different results in LBVS of folded molecules where through-space distances between atom pairs are much shorter than topological distances. 3DAPfp and 3DXfp are suitable for stereoselective searches for shape and pharmacophore analogs of query molecules in large databases. 
Web-browsers for searching ZINC by 3DAPfp and 3DXfp similarity are accessible at www.gdb.unibe.ch and should provide useful assistance to drug discovery projects. Graphical abstract: Atom pair fingerprints based on through-space distances (3DAPfp) provide better shape encoding than atom pair fingerprints based on topological distances (APfp), as measured by the recovery of ROCS shape analogs by fingerprint similarity.
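The core idea of a count-based 3D atom-pair fingerprint compared with the city-block distance can be sketched as follows. This is deliberately simplified: all atoms are treated as a single category and the bin width is arbitrary, whereas the published 3DAPfp/3DXfp designs use specific distance intervals and atom categories:

```python
import math
from collections import Counter
from itertools import combinations

def ap3d_fingerprint(coords, bin_width=1.0, n_bins=16):
    """Count atom pairs per through-space distance bin (3DAPfp-like sketch)."""
    fp = Counter()
    for a, b in combinations(coords, 2):
        d = math.dist(a, b)                          # Euclidean 3D distance
        fp[min(int(d / bin_width), n_bins - 1)] += 1  # clamp to last bin
    return fp

def city_block_distance(fp_a, fp_b):
    """L1 (city-block) distance between two count fingerprints."""
    return sum(abs(fp_a[b] - fp_b[b]) for b in set(fp_a) | set(fp_b))

# Two toy "conformers": three collinear atoms at different spacings.
fp1 = ap3d_fingerprint([(0, 0, 0), (1, 0, 0), (2, 0, 0)])
fp2 = ap3d_fingerprint([(0, 0, 0), (1.5, 0, 0), (3, 0, 0)])
print(city_block_distance(fp1, fp2))  # 2
```

Because the fingerprint is a fixed-length vector of counts, nearest-neighbour searches over millions of precomputed fingerprints reduce to fast integer arithmetic, which is what makes LBVS over the full ZINC database feasible.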
An intercomparison of tropical cyclone best-track products for the southwest Pacific
NASA Astrophysics Data System (ADS)
Magee, Andrew D.; Verdon-Kidd, Danielle C.; Kiem, Anthony S.
2016-06-01
Recent efforts to understand tropical cyclone (TC) activity in the southwest Pacific (SWP) have led to the development of numerous TC databases. The methods used to compile each database vary and are based on data from different meteorological centres, standalone TC databases and archived synoptic charts. Therefore the aims of this study are to (i) provide a spatio-temporal comparison of three TC best-track (BT) databases and explore any differences between them (and any associated implications) and (ii) investigate whether there are any spatial, temporal or statistical differences between pre-satellite (1945-1969), post-satellite (1970-2011) and post-geostationary satellite (1982-2011) era TC data given the changing observational technologies with time. To achieve this, we compare three best-track TC databases for the SWP region (0-35° S, 135° E-120° W) from 1945 to 2011: the Joint Typhoon Warning Center (JTWC), the International Best Track Archive for Climate Stewardship (IBTrACS) and the Southwest Pacific Enhanced Archive of Tropical Cyclones (SPEArTC). The results of this study suggest that SPEArTC is the most complete repository of TCs for the SWP region. In particular, we show that the SPEArTC database includes a number of additional TCs, not included in either the JTWC or IBTrACS database. These SPEArTC events do occur under environmental conditions conducive to tropical cyclogenesis (TC genesis), including anomalously negative 700 hPa vorticity (VORT), anomalously negative vertical shear of zonal winds (VSZW), anomalously negative 700 hPa geopotential height (GPH), cyclonic (absolute) 700 hPa winds and low values of absolute vertical wind shear (EVWS). Further, while changes in observational technologies from 1945 have undoubtedly improved our ability to detect and monitor TCs, we show that the number of TCs detected prior to the satellite era (1945-1969) are not statistically different to those in the post-satellite era (post-1970). 
Although data from pre-satellite and pre-geostationary satellite periods are currently inadequate for investigating TC intensity, this study suggests that SPEArTC data (from 1945) may be used to investigate long-term variability of TC counts and TC genesis locations.
Twenty-Five Year Survival of Children with Intellectual Disability in Western Australia.
Bourke, Jenny; Nembhard, Wendy N; Wong, Kingsley; Leonard, Helen
2017-09-01
To investigate survival up to early adulthood for children with intellectual disability and compare their risk of mortality with that of children without intellectual disability. This was a retrospective cohort study of all live births in Western Australia between January 1, 1983 and December 31, 2010. Children with an intellectual disability (n = 10 593) were identified from the Western Australian Intellectual Disability Exploring Answers Database. Vital status was determined from linkage to the Western Australian Mortality database. Kaplan-Meier product limit estimates and 95% CIs were computed by level of intellectual disability. Hazard ratios (HRs) and 95% CIs were calculated from Cox proportional hazard regression models adjusting for potential confounders. After adjusting for potential confounders, compared with those without intellectual disability, children with intellectual disability had a 6-fold increased risk of mortality at 1-5 years of age (adjusted HR [aHR] = 6.0, 95%CI: 4.8, 7.6), a 12-fold increased risk at 6-10 years of age (aHR = 12.6, 95% CI: 9.0, 17.7) and a 5-fold increased risk at 11-25 years of age (aHR = 4.9, 95% CI: 3.9, 6.1). Children with severe intellectual disability were at even greater risk. No difference in survival was observed for Aboriginal children with intellectual disability compared with non-Aboriginal children with intellectual disability. Although children with intellectual disability experience higher mortality at all ages compared with those without intellectual disability, the greatest burden is for those with severe intellectual disability. However, even children with mild to moderate intellectual disability have increased risk of death compared with unaffected children. Copyright © 2017 Elsevier Inc. All rights reserved.
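Kaplan-Meier product-limit estimates of the kind computed in this study can be obtained with a short routine. A sketch on toy data (in practice a vetted statistics package would be used, and the study's actual data are not reproduced here):

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimator.

    events[i] is True for a death and False for censoring at times[i].
    Returns [(event_time, S(t))] at each distinct death time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, points, i = 1.0, [], 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e)
        censored = sum(1 for tt, e in data if tt == t and not e)
        if deaths:
            surv *= 1 - deaths / n_at_risk   # product-limit update
            points.append((t, surv))
        n_at_risk -= deaths + censored       # these subjects leave the risk set
        i += deaths + censored
    return points

# Toy follow-up data: deaths at t=1, 2, 3; censoring at t=2 and t=4.
print(kaplan_meier([1, 2, 2, 3, 4], [True, True, False, True, False]))
```

Censored subjects contribute to the risk set up to their censoring time, which is what distinguishes this estimator from a naive survival fraction; the hazard ratios in the abstract come from the related Cox model, which additionally adjusts for covariates.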
Leveraging Cognitive Context for Object Recognition
2014-06-01
Context is most often viewed as a static concept, learned from large image databases. We build upon this concept by exploring cognitive context, demonstrating how rich dynamic context provided by... context that people rely upon as they perceive the world. Context in ACT-R/E takes the form of associations between related concepts that are learned... and accuracy of object recognition.
Bolbase: a comprehensive genomics database for Brassica oleracea
2013-01-01
Background: Brassica oleracea is a morphologically diverse species in the family Brassicaceae and contains a group of nutrition-rich vegetable crops, including common heading cabbage, cauliflower, broccoli, kohlrabi, kale, and Brussels sprouts. This diversity, along with its phylogenetic membership in a group of three diploid and three tetraploid species and the recent availability of genome sequences within Brassica, provides an unprecedented opportunity to study intra- and inter-species divergence and evolution in this species and its close relatives. Description: We have developed a comprehensive database, Bolbase, which provides access to the B. oleracea genome data and comparative genomics information. The whole genome of B. oleracea is available, including nine fully assembled chromosomes and 1,848 scaffolds, with 45,758 predicted genes, 13,382 transposable elements, and 3,581 non-coding RNAs. Comparative genomics information is available, including syntenic regions among B. oleracea, Brassica rapa and Arabidopsis thaliana, synonymous (Ks) and non-synonymous (Ka) substitution rates between orthologous gene pairs, gene families or clusters, and differences in quantity, category, and distribution of transposable elements on chromosomes. Bolbase provides useful search and data mining tools, including a keyword search, a local BLAST server, and a customized GBrowse tool, which can be used to extract annotations of genome components, identify similar sequences and visualize syntenic regions among species. Users can download all genomic data and explore comparative genomics in a highly visual setting. Conclusions: Bolbase is the first resource platform for the B. oleracea genome and for genomic comparisons with its relatives, and thus it will help the research community to better study the function and evolution of Brassica genomes as well as enhance molecular breeding research.
This database will be updated regularly with new features, improvements to genome annotation, and new genomic sequences as they become available. Bolbase is freely available at http://ocri-genomics.org/bolbase. PMID:24079801
Bibliometrics of NIHR HTA monographs and their related journal articles.
Royle, Pamela; Waugh, Norman
2015-02-18
A bibliometric analysis of the UK National Institute for Health Research (NIHR) Health Technology Assessment (HTA) monographs and their related journal articles by: (1) exploring the differences in citations to the HTA monographs in Google Scholar (GS), Scopus and Web of Science (WoS), and (2) comparing Scopus citations to the monographs with their related journal articles. A study of 111 HTA monographs published in 2010 and 2011, and their external journal articles. Citations to the monographs in GS, Scopus and WoS, and to their external journal articles in Scopus. The number of citations varied among the three databases, with GS having the highest and WoS the lowest; however, the citation-based rankings among the databases were highly correlated. Overall, 56% of monographs had a related publication, with the highest proportion for primary research (76%) and lowest for evidence syntheses (43%). There was a large variation in how the monographs were cited, compared to journal articles, resulting in more frequent problems with unlinked citations in Scopus and WoS. When comparing the number of citations between monographs and their related journal articles from the same project, we found that monographs received more citations than their journal articles for evidence syntheses and methodology projects; by contrast, journal articles related to primary research monographs were more highly cited than their monograph. The numbers of citations to the HTA monographs differed considerably between the databases, but were highly correlated. When a HTA monograph had a journal article from the same study, there were more citations to the journal article for primary research, but more to the monographs for evidence syntheses. Citations to the related journal articles were more reliably recorded than citations to the HTA monographs. Published by the BMJ Publishing Group Limited.
For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Privacy considerations in the context of an Australian observational database.
Duszynski, K M; Beilby, J J; Marley, J E; Walker, D C; Pratt, N L
2001-12-01
Observational databases are increasingly acknowledged for their value in clinical investigation. Australian general practice in particular presents an exciting opportunity to examine treatment in a natural setting. The paper explores issues such as privacy and confidentiality, foremost considerations when conducting this form of pharmacoepidemiological research. Australian legislation is currently addressing these exact issues in order to establish clear directives regarding ethical concerns. The development of a pharmacoepidemiological database arising from the integration of computerized Australian general practice records is described, in addition to the challenges associated with creating a database which considers patient privacy. The database, known as 'Medic-GP', presently contains more than 950,000 clinical notes (including consultations, pathology, diagnostic imaging and adverse reactions) over a 5-year time period and relates to 55,000 patients. The paper then details a retrospective study which utilized the database to examine the interaction between antibiotic prescribing and patient outcomes from a community perspective, following a policy intervention. This study illustrates the application of computerized general practice records in research.
The effectiveness of Pilates exercise in people with chronic low back pain: a systematic review.
Wells, Cherie; Kolt, Gregory S; Marshall, Paul; Hill, Bridget; Bialocerkowski, Andrea
2014-01-01
To evaluate the effectiveness of Pilates exercise in people with chronic low back pain (CLBP) through a systematic review of randomised controlled trials (RCTs). A search for RCTs was undertaken using Medical Search Terms and synonyms for "Pilates" and "low back pain" within the maximal date range of 10 databases. Databases included the Cumulative Index to Nursing and Allied Health Literature; Cochrane Library; Medline; Physiotherapy Evidence Database; ProQuest: Health and Medical Complete, Nursing and Allied Health Source, Dissertation and Theses; Scopus; Sport Discus; Web of Science. Two independent reviewers were involved in the selection of evidence. To be included, relevant RCTs needed to be published in the English language. From 152 studies, 14 RCTs were included. Two independent reviewers appraised the methodological quality of RCTs using the McMaster Critical Review Form for Quantitative Studies. The author(s), year of publication, and details regarding participants, Pilates exercise, comparison treatments, outcome measures, and findings were then extracted. The methodological quality of RCTs ranged from "poor" to "excellent". A meta-analysis of RCTs was not undertaken due to the heterogeneity of RCTs. Pilates exercise provided statistically significant improvements in pain and functional ability compared to usual care and physical activity between 4 and 15 weeks, but not at 24 weeks. There were no consistent statistically significant differences in improvements in pain and functional ability with Pilates exercise, massage therapy, or other forms of exercise at any time period. Pilates exercise offers greater improvements in pain and functional ability compared to usual care and physical activity in the short term. Pilates exercise offers equivalent improvements to massage therapy and other forms of exercise.
Future research should explore optimal Pilates exercise designs, and whether some people with CLBP may benefit from Pilates exercise more than others.
ERIC Educational Resources Information Center
Cole, Ann F.; McArdle, Geri; Clements, Kimberly D.
2005-01-01
Human resource development professionals are in a unique position to help organizations achieve maximum positive impact and avoid legal difficulties when implementing mentoring programs. This case study explored a formal mentoring program that was data-based and linked to HRD in order to advance the mentoring process as an effective individual and…
You Got a Problem with That? Exploring Evaluators' Disagreements about Ethics.
ERIC Educational Resources Information Center
Morris, Michael; Jacobs, Lynette
Research has suggested that evaluators vary in the extent to which they interpret the challenges they face in ethical terms. The question of what accounts for these differences was explored through a survey completed by 391 individuals listed in the database of the American Evaluation Association. The first section of the questionnaire presented…
ProteoLens: a visual analytic tool for multi-scale database-driven biological network data mining.
Huan, Tianxiao; Sivachenko, Andrey Y; Harrison, Scott H; Chen, Jake Y
2008-08-12
New systems biology studies require researchers to understand how the interplay among myriad biomolecular entities is orchestrated in order to achieve high-level cellular and physiological functions. Many software tools have been developed in the past decade to help researchers visually navigate large networks of biomolecular interactions with built-in template-based query capabilities. To further advance researchers' ability to interrogate global physiological states of cells through multi-scale visual network explorations, new visualization software tools still need to be developed to empower the analysis. A robust visual data analysis platform driven by database management systems to perform bi-directional data processing-to-visualizations with declarative querying capabilities is needed. We developed ProteoLens as a Java-based visual analytic software tool for creating, annotating and exploring multi-scale biological networks. It supports direct database connectivity to either Oracle or PostgreSQL database tables/views, on which SQL statements using both Data Definition Language (DDL) and Data Manipulation Language (DML) may be specified. The robust query languages embedded directly within the visualization software help users to bring their network data into a visualization context for annotation and exploration. ProteoLens supports graph/network represented data in standard Graph Modeling Language (GML) formats, and this enables interoperation with a wide range of other visual layout tools.
The architectural design of ProteoLens enables the de-coupling of complex network data visualization tasks into two distinct phases: 1) creating network data association rules, which are mapping rules between network node IDs or edge IDs and data attributes such as functional annotations, expression levels, scores, synonyms, descriptions etc; 2) applying network data association rules to build the network and perform the visual annotation of graph nodes and edges according to associated data values. We demonstrated the advantages of these new capabilities through three biological network visualization case studies: human disease association network, drug-target interaction network and protein-peptide mapping network. The architectural design of ProteoLens makes it suitable for bioinformatics expert data analysts who are experienced with relational database management to perform large-scale integrated network visual explorations. ProteoLens is a promising visual analytic platform that will facilitate knowledge discoveries in future network and systems biology studies.
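The two-phase design described in this abstract — first defining association rules that map node IDs to data attributes, then applying them to annotate the network — can be sketched in Python (a hypothetical illustration with invented gene names and attribute values; ProteoLens itself is a Java tool backed by Oracle/PostgreSQL tables, and none of these names come from its API):

```python
# Phase 1: association rules map network node IDs to data attributes
# (hypothetical values; ProteoLens reads these from database tables/views).
association_rules = {
    "expression_level": {"TP53": 8.2, "BRCA1": 3.1, "EGFR": 6.7},
    "annotation": {"TP53": "tumor suppressor", "EGFR": "receptor kinase"},
}

# A toy network: node IDs plus edges between them.
network = {
    "nodes": ["TP53", "BRCA1", "EGFR"],
    "edges": [("TP53", "BRCA1"), ("TP53", "EGFR")],
}

def annotate(network, rules):
    """Phase 2: attach every attribute a rule defines for each node ID."""
    return {
        node: {attr: values[node]
               for attr, values in rules.items()
               if node in values}
        for node in network["nodes"]
    }

annotated = annotate(network, association_rules)
print(annotated["TP53"])  # {'expression_level': 8.2, 'annotation': 'tumor suppressor'}
```

The point of the decoupling is that the same rule set can be reapplied whenever the underlying data values change, without rebuilding the network topology.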
Nørgaard, M; Johnsen, S P
2016-02-01
In Denmark, the need for monitoring of clinical quality and patient safety with feedback to the clinical, administrative and political systems has resulted in the establishment of a network of more than 60 publicly financed nationwide clinical quality databases. Although primarily devoted to monitoring and improving quality of care, the potential of these databases as data sources in clinical research is increasingly being recognized. In this review, we describe these databases focusing on their use as data sources for clinical research, including their strengths and weaknesses as well as future concerns and opportunities. The research potential of the clinical quality databases is substantial but has so far only been explored to a limited extent. Efforts related to technical, legal and financial challenges are needed in order to take full advantage of this potential. © 2016 The Association for the Publication of the Journal of Internal Medicine.
Zhang, Qingzhou; Yang, Bo; Chen, Xujiao; Xu, Jing; Mei, Changlin; Mao, Zhiguo
2014-01-01
We present a bioinformatics database named Renal Gene Expression Database (RGED), which contains comprehensive gene expression data sets from renal disease research. The web-based interface of RGED allows users to query the gene expression profiles in various kidney-related samples, including renal cell lines, human kidney tissues and murine model kidneys. Researchers can explore certain gene profiles, the relationships between genes of interests and identify biomarkers or even drug targets in kidney diseases. The aim of this work is to provide a user-friendly utility for the renal disease research community to query expression profiles of genes of their own interest without the requirement of advanced computational skills. Availability and implementation: Website is implemented in PHP, R, MySQL and Nginx and freely available from http://rged.wall-eva.net. Database URL: http://rged.wall-eva.net PMID:25252782
Machado, Helena; Silva, Susana
2015-01-01
The ethical aspects of biobanks and forensic DNA databases are often treated as separate issues. As a reflection of this, public participation, or the involvement of citizens in genetic databases, has been approached differently in the fields of forensics and medicine. This paper aims to cross the boundaries between medicine and forensics by exploring the flows between the ethical issues presented in the two domains and the subsequent conceptualisation of public trust and legitimisation. We propose to introduce the concept of ‘solidarity’, traditionally applied only to medical and research biobanks, into a consideration of public engagement in medicine and forensics. Inclusion of a solidarity-based framework, in both medical biobanks and forensic DNA databases, raises new questions that should be included in the ethical debate, in relation to both health services/medical research and activities associated with the criminal justice system. PMID:26139851
Analysis of human serum phosphopeptidome by a focused database searching strategy.
Zhu, Jun; Wang, Fangjun; Cheng, Kai; Song, Chunxia; Qin, Hongqiang; Hu, Lianghai; Figeys, Daniel; Ye, Mingliang; Zou, Hanfa
2013-01-14
As human serum is an important source for early diagnosis of many serious diseases, analysis of the serum proteome and peptidome has been extensively performed. However, the serum phosphopeptidome has been less explored, probably because an effective method for database searching has been lacking. Conventional database searching strategies always use the whole proteome database, which is very time-consuming for a phosphopeptidome search due to the huge search space resulting from the high redundancy of the database and the setting of dynamic modifications during searching. In this work, a focused database searching strategy using an in-house collected human serum pro-peptidome target/decoy database (HuSPep) was established. It was found that the searching time was significantly decreased without compromising identification sensitivity. By combining size-selective Ti (IV)-MCM-41 enrichment, RP-RP off-line separation, and complementary CID and ETD fragmentation with the new searching strategy, 143 unique endogenous phosphopeptides and 133 phosphorylation sites (109 novel sites) were identified from human serum with high reliability. Copyright © 2012 Elsevier B.V. All rights reserved.
A new relational database structure and online interface for the HITRAN database
NASA Astrophysics Data System (ADS)
Hill, Christian; Gordon, Iouli E.; Rothman, Laurence S.; Tennyson, Jonathan
2013-11-01
A new format for the HITRAN database is proposed. By storing the line-transition data in a number of linked tables described by a relational database schema, it is possible to overcome the limitations of the existing format, which have become increasingly apparent over the last few years as new and more varied data are being used by radiative-transfer models. Although the database in the new format can be searched using the well-established Structured Query Language (SQL), a web service, HITRANonline, has been deployed to allow users to make most common queries of the database using a graphical user interface in a web page. The advantages of the relational form of the database to ensuring data integrity and consistency are explored, and the compatibility of the online interface with the emerging standards of the Virtual Atomic and Molecular Data Centre (VAMDC) project is discussed. In particular, the ability to access HITRAN data using a standard query language from other websites, command line tools and from within computer programs is described.
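The linked-table idea can be sketched with Python's built-in sqlite3 module; the tables, columns, and values below are illustrative only, not HITRAN's actual schema or data:

```python
import sqlite3

# Toy relational schema: each transition references its molecule through a
# foreign key instead of repeating molecule metadata on every line record.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE molecule (
        id INTEGER PRIMARY KEY,
        formula TEXT NOT NULL
    );
    CREATE TABLE transition (
        id INTEGER PRIMARY KEY,
        molecule_id INTEGER REFERENCES molecule(id),
        wavenumber REAL,   -- cm^-1
        intensity REAL
    );
""")
con.execute("INSERT INTO molecule VALUES (1, 'H2O'), (2, 'CO2')")
con.executemany("INSERT INTO transition VALUES (?, ?, ?, ?)",
                [(1, 1, 1554.35, 2.1e-22), (2, 2, 667.38, 7.8e-20)])

# The kind of query a web front end would issue: transitions in a wavenumber range.
rows = con.execute("""
    SELECT m.formula, t.wavenumber
    FROM transition t JOIN molecule m ON m.id = t.molecule_id
    WHERE t.wavenumber BETWEEN 600 AND 700
""").fetchall()
print(rows)  # [('CO2', 667.38)]
```

Because molecule metadata lives in one row, corrections propagate to every transition automatically — one of the data-integrity advantages the paper attributes to the relational form.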
Mining the Galaxy Zoo Database: Machine Learning Applications
NASA Astrophysics Data System (ADS)
Borne, Kirk D.; Wallin, J.; Vedachalam, A.; Baehr, S.; Lintott, C.; Darg, D.; Smith, A.; Fortson, L.
2010-01-01
The new Zooniverse initiative is addressing the data flood in the sciences through a transformative partnership between professional scientists, volunteer citizen scientists, and machines. As part of this project, we are exploring the application of machine learning techniques to data mining problems associated with the large and growing database of volunteer science results gathered by the Galaxy Zoo citizen science project. We will describe the basic challenge, some machine learning approaches, and early results. One of the motivators for this study is the acquisition (through the Galaxy Zoo results database) of approximately 100 million classification labels for roughly one million galaxies, yielding a tremendously large and rich set of training examples for improving automated galaxy morphological classification algorithms. In our first case study, the goal is to learn which morphological and photometric features in the Sloan Digital Sky Survey (SDSS) database correlate most strongly with user-selected galaxy morphological class. As a corollary to this study, we are also aiming to identify which galaxy parameters in the SDSS database correspond to galaxies that have been the most difficult to classify (based upon large dispersion in their volunteer-provided classifications). Our second case study will focus on similar data mining analyses and machine learning algorithms applied to the Galaxy Zoo catalog of merging and interacting galaxies. The outcomes of this project will have applications in future large sky surveys, such as the LSST (Large Synoptic Survey Telescope) project, which will generate a catalog of 20 billion galaxies and will produce an additional astronomical alert database of approximately 100 thousand events each night for 10 years -- the capabilities and algorithms that we are exploring will assist in the rapid characterization and classification of such massive data streams. This research has been supported in part through NSF award #0941610.
Data Mining Research with the LSST
NASA Astrophysics Data System (ADS)
Borne, Kirk D.; Strauss, M. A.; Tyson, J. A.
2007-12-01
The LSST catalog database will exceed 10 petabytes, comprising several hundred attributes for 5 billion galaxies, 10 billion stars, and over 1 billion variable sources (optical variables, transients, or moving objects), extracted from over 20,000 square degrees of deep imaging in 5 passbands with thorough time domain coverage: 1000 visits over the 10-year LSST survey lifetime. The opportunities are enormous for novel scientific discoveries within this rich time-domain ultra-deep multi-band survey database. Data Mining, Machine Learning, and Knowledge Discovery research opportunities with the LSST are now under study, with a potential for new collaborations to develop to contribute to these investigations. We will describe features of the LSST science database that are amenable to scientific data mining, object classification, outlier identification, anomaly detection, image quality assurance, and survey science validation. We also give some illustrative examples of current scientific data mining research in astronomy, and point out where new research is needed. In particular, the data mining research community will need to address several issues in the coming years as we prepare for the LSST data deluge. The data mining research agenda includes: scalability (at petabyte scales) of existing machine learning and data mining algorithms; development of grid-enabled parallel data mining algorithms; designing a robust system for brokering classifications from the LSST event pipeline (which may produce 10,000 or more event alerts per night); multi-resolution methods for exploration of petascale databases; visual data mining algorithms for visual exploration of the data; indexing of multi-attribute multi-dimensional astronomical databases (beyond RA-Dec spatial indexing) for rapid querying of petabyte databases; and more.
Finally, we will identify opportunities for synergistic collaboration between the data mining research group and the LSST Data Management and Science Collaboration teams.
Zhou, Zhiqing E; Yan, Yu; Che, Xin Xuan; Meier, Laurenz L
2015-01-01
Although previous studies have linked workplace incivility with various negative outcomes, they mainly focused on the long-term effects of chronic exposure to workplace incivility, whereas targets' short-term reactions to incivility episodes have been largely neglected. Using a daily diary design, the current study examined effects of daily workplace incivility on end-of-work negative affect and explored potential individual and organizational moderators. Data collected from 76 full-time employees across 10 consecutive working days revealed that daily workplace incivility positively predicted end-of-work negative affect while controlling for before-work negative affect. Further, the relationship was stronger for people with low emotional stability, high hostile attribution bias, external locus of control, and people experiencing low chronic workload and more chronic organizational constraints, as compared with people with high emotional stability, low hostile attribution bias, internal locus of control, and people experiencing high chronic workload and fewer chronic organizational constraints, respectively. (PsycINFO Database Record (c) 2014 APA, all rights reserved).
A Visual Galaxy Classification Interface and its Classroom Application
NASA Astrophysics Data System (ADS)
Kautsch, Stefan J.; Phung, Chau; VanHilst, Michael; Castro, Victor H
2014-06-01
Galaxy morphology is an important topic in modern astronomy to understand questions concerning the evolution and formation of galaxies and their dark matter content. In order to engage students in exploring galaxy morphology, we developed a web-based, graphical interface that allows students to visually classify galaxy images according to various morphological types. The website is designed with HTML5, JavaScript, PHP, and a MySQL database. The classification interface provides hands-on research experience and training for students and interested clients, and allows them to contribute to studies of galaxy morphology. We present the first results of a pilot study and compare the visually classified types using our interface with that from automated classification routines.
ExpEdit: a webserver to explore human RNA editing in RNA-Seq experiments.
Picardi, Ernesto; D'Antonio, Mattia; Carrabino, Danilo; Castrignanò, Tiziana; Pesole, Graziano
2011-05-01
ExpEdit is a web application for assessing RNA editing in human at known or user-specified sites supported by transcript data obtained by RNA-Seq experiments. Mapping data (in SAM/BAM format) or directly sequence reads [in FASTQ/short read archive (SRA) format] can be provided as input to carry out a comparative analysis against a large collection of known editing sites collected in DARNED database as well as other user-provided potentially edited positions. Results are shown as dynamic tables containing University of California, Santa Cruz (UCSC) links for a quick examination of the genomic context. ExpEdit is freely available on the web at http://www.caspur.it/ExpEdit/.
Comparative analysis of data mining techniques for business data
NASA Astrophysics Data System (ADS)
Jamil, Jastini Mohd; Shaharanee, Izwan Nizal Mohd
2014-12-01
Data mining is the process of employing one or more computer learning techniques to automatically analyze and extract knowledge from data contained within a database. Companies are using this tool to further understand their customers, to design targeted sales and marketing campaigns, to predict what product customers will buy and the frequency of purchase, and to spot trends in customer preferences that can lead to new product development. In this paper, we take a systematic approach to exploring several data mining techniques in business applications. The experimental results reveal that all of the data mining techniques accomplish their goals, but each technique has its own characteristics and specifications that determine its accuracy, proficiency and preference.
Evaluation of Sub Query Performance in SQL Server
NASA Astrophysics Data System (ADS)
Oktavia, Tanty; Sujarwo, Surya
2014-03-01
The paper explores several sub query methods used in a query and their impact on query performance. The study uses an experimental approach to evaluate the performance of each sub query method combined with an indexing strategy. The sub query methods consist of in, exists, relational operator, and relational operator combined with the top operator. The experiments show that using a relational operator combined with an indexing strategy in a sub query gives greater performance than the same method without an indexing strategy, and also than the other methods. In summary, for applications that emphasize the performance of retrieving data from a database, it is better to use a relational operator combined with an indexing strategy. This study was done on Microsoft SQL Server 2012.
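The sub query forms the study compares can be roughly illustrated with Python's built-in sqlite3 (SQL Server was the engine actually benchmarked; the table names here are made up, and SQLite's LIMIT stands in for SQL Server's TOP operator):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    -- The 'indexing strategy': an index on the column the sub query probes.
    CREATE INDEX idx_orders_customer ON orders(customer_id);
""")
con.execute("INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid')")
con.execute("INSERT INTO orders VALUES (10, 1), (11, 3)")

# IN and EXISTS sub queries: different execution plans, identical result sets.
in_rows = con.execute(
    "SELECT name FROM customer"
    " WHERE id IN (SELECT customer_id FROM orders) ORDER BY name").fetchall()
exists_rows = con.execute(
    "SELECT name FROM customer c"
    " WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)"
    " ORDER BY name").fetchall()
assert in_rows == exists_rows == [('Ann',), ('Cid',)]

# Relational operator (=) with a row-limiting sub query, the analogue of
# SQL Server's 'WHERE id = (SELECT TOP 1 ...)'.
top_row = con.execute(
    "SELECT name FROM customer WHERE id ="
    " (SELECT customer_id FROM orders ORDER BY id LIMIT 1)").fetchone()
print(top_row)  # ('Ann',)
```

The forms are interchangeable only when the sub query's cardinality allows it; the study's point is that the optimizer treats them differently, and that an index on the probed column is what dominates performance.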
Exploitation of molecular profiling techniques for GM food safety assessment.
Kuiper, Harry A; Kok, Esther J; Engel, Karl-Heinz
2003-04-01
Several strategies have been developed to identify unintended alterations in the composition of genetically modified (GM) food crops that may occur as a result of the genetic modification process. These include comparative chemical analysis of single compounds in GM food crops and their conventional non-GM counterparts, and profiling methods such as DNA/RNA microarray technologies, proteomics and metabolite profiling. The potential of profiling methods is obvious, but further exploration of specificity, sensitivity and validation is needed. Moreover, the successful application of profiling techniques to the safety evaluation of GM foods will require linked databases to be built that contain information on variations in profiles associated with differences in developmental stages and environmental conditions.
Expedition Memory: Towards Agent-based Web Services for Creating and Using Mars Exploration Data.
NASA Technical Reports Server (NTRS)
Clancey, William J.; Sierhuis, Maarten; Briggs, Geoff; Sims, Mike
2005-01-01
Explorers ranging over kilometers of rugged, sometimes "feature-less" terrain for over a year could be overwhelmed by tracking and sharing what they have done and learned. An automated system based on the existing Mobile Agents design [1] and Mars Exploration Rover experience [2] could serve as an "expedition memory" that would be indexed by voice as well as a web interface, linking people, places, activities, records (voice notes, photographs, samples), and a descriptive scientific ontology. This database would be accessible during EVAs by astronauts, annotated by the remote science team, linked to EVA plans, and allow cross indexing between sites and expeditions. We consider the basic problem, our philosophical approach, technical methods, and uses of the expedition memory for facilitating long-term collaboration between Mars crews and Earth support teams. We emphasize that a "memory" does not mean a database per se, but an interactive service that combines different resources, and ultimately could be like a helpful librarian.
NASA Astrophysics Data System (ADS)
Gentry, Jeffery D.
2000-05-01
A relational database is a powerful tool for collecting and analyzing the vast amounts of interrelated data associated with the manufacture of composite materials. A relational database contains many individual database tables that store data that are related in some fashion. Manufacturing process variables as well as quality assurance measurements can be collected and stored in database tables indexed according to lot numbers, part type or individual serial numbers. Relationships between manufacturing process and product quality can then be correlated over a wide range of product types and process variations. This paper presents details on how relational databases are used to collect, store, and analyze process variables and quality assurance data associated with the manufacture of advanced composite materials. Important considerations are covered, including how the various types of data are organized and how relationships between the data are defined. Employing relational database techniques to establish correlative relationships between process variables and quality assurance measurements is then explored. Finally, the benefits of database techniques such as data warehousing, data mining and web-based client/server architectures are discussed in the context of composite material manufacturing.
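The lot-number indexing this paper describes can be sketched with Python's built-in sqlite3 (the table names, column names, and values are hypothetical): process variables and quality measurements live in separate tables and are paired through the shared lot number for correlation.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    -- Process variables and QA measurements, both keyed by lot number.
    CREATE TABLE process (lot TEXT PRIMARY KEY, cure_temp_c REAL);
    CREATE TABLE qa      (lot TEXT PRIMARY KEY, void_pct REAL);
""")
con.executemany("INSERT INTO process VALUES (?, ?)",
                [("L001", 121.0), ("L002", 135.0), ("L003", 118.5)])
con.executemany("INSERT INTO qa VALUES (?, ?)",
                [("L001", 1.2), ("L002", 0.4), ("L003", 1.5)])

# Join on lot number to pair each lot's process setting with its QA result;
# these pairs are what a correlation analysis would consume.
pairs = con.execute("""
    SELECT p.cure_temp_c, q.void_pct
    FROM process p JOIN qa q ON q.lot = p.lot
    ORDER BY p.lot
""").fetchall()
print(pairs)  # one (cure temperature, void content) pair per lot
```

Keeping the two measurement streams in separate tables, joined only at analysis time, is the normalization step that lets new QA metrics be added without touching the process-variable records.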
PlantTribes: a gene and gene family resource for comparative genomics in plants
Wall, P. Kerr; Leebens-Mack, Jim; Müller, Kai F.; Field, Dawn; Altman, Naomi S.; dePamphilis, Claude W.
2008-01-01
The PlantTribes database (http://fgp.huck.psu.edu/tribe.html) is a plant gene family database based on the inferred proteomes of five sequenced plant species: Arabidopsis thaliana, Carica papaya, Medicago truncatula, Oryza sativa and Populus trichocarpa. We used the graph-based clustering algorithm MCL [Van Dongen (Technical Report INS-R0010 2000) and Enright et al. (Nucleic Acids Res. 2002; 30: 1575–1584)] to classify all of these species’ protein-coding genes into putative gene families, called tribes, using three clustering stringencies (low, medium and high). For all tribes, we have generated protein and DNA alignments and maximum-likelihood phylogenetic trees. A parallel database of microarray experimental results is linked to the genes, which lets researchers identify groups of related genes and their expression patterns. Unified nomenclatures were developed, and tribes can be related to traditional gene families and conserved domain identifiers. SuperTribes, constructed through a second iteration of MCL clustering, connect distant, but potentially related gene clusters. The global classification of nearly 200 000 plant proteins was used as a scaffold for sorting ∼4 million additional cDNA sequences from over 200 plant species. All data and analyses are accessible through a flexible interface allowing users to explore the classification, to place query sequences within the classification, and to download results for further study. PMID:18073194
SANSparallel: interactive homology search against Uniprot.
Somervuo, Panu; Holm, Liisa
2015-07-01
Proteins evolve by mutations and natural selection. The network of sequence similarities is a rich source for mining homologous relationships that inform on protein structure and function. There are many servers available to browse the network of homology relationships but one has to wait up to a minute for results. The SANSparallel webserver provides protein sequence database searches with immediate response and professional alignment visualization by third-party software. The output is a list, pairwise alignment or stacked alignment of sequence-similar proteins from Uniprot, UniRef90/50, Swissprot or Protein Data Bank. The stacked alignments are viewed in Jalview or as sequence logos. The database search uses the suffix array neighborhood search (SANS) method, which has been re-implemented as a client-server, improved and parallelized. The method is extremely fast and as sensitive as BLAST above 50% sequence identity. Benchmarks show that the method is highly competitive compared to previously published fast database search programs: UBLAST, DIAMOND, LAST, LAMBDA, RAPSEARCH2 and BLAT. The web server can be accessed interactively or programmatically at http://ekhidna2.biocenter.helsinki.fi/cgi-bin/sans/sans.cgi. It can be used to make protein functional annotation pipelines more efficient, and it is useful in interactive exploration of the detailed evidence supporting the annotation of particular proteins of interest. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
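The suffix array neighborhood search (SANS) idea rests on the fact that sequences sharing a prefix sit adjacent in suffix-array order. The following is a didactic sketch of that underlying index, not the actual SANSparallel implementation; the sequences and names are invented:

```python
# Build a suffix array over concatenated protein sequences, then use binary
# search to find every suffix sharing a seed prefix with the query.
from bisect import bisect_left, bisect_right

seqs = {"protA": "MKVLAT", "protB": "MKVLGG", "protC": "QQRSTA"}
text = "$".join(seqs.values()) + "$"          # "$" separates sequences
sa = sorted(range(len(text)), key=lambda i: text[i:])  # naive O(n^2 log n) build

def find_prefix(p):
    """Return start positions of all suffixes beginning with prefix p."""
    suffixes = [text[i:] for i in sa]          # fine for a toy-sized text
    lo = bisect_left(suffixes, p)
    hi = bisect_right(suffixes, p + "\x7f")    # \x7f sorts after amino-acid letters
    return [sa[i] for i in range(lo, hi)]

print(find_prefix("MKVL"))  # -> [0, 7]: protA and protB share this seed
```

Real SANS additionally scans a fixed-size neighborhood around the query's position in suffix-array order, which is what makes it fast yet sensitive above ~50% identity.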
Computational Methods to Work as First-Pass Filter in Deleterious SNP Analysis of Alkaptonuria
Magesh, R.; George Priya Doss, C.
2012-01-01
A major challenge in the analysis of human genetic variation is to distinguish functional from nonfunctional SNPs. Discovering these functional SNPs is one of the main goals of modern genetics and genomics studies. There is a need to effectively and efficiently identify functionally important nsSNPs which may be deleterious or disease-causing and to identify their molecular effects. The prediction of the phenotype of nsSNPs by computational analysis may provide a good way to explore the function of nsSNPs and their relationship with susceptibility to disease. In this context, we surveyed and compared variation databases along with in silico prediction programs to assess the effects of deleterious functional variants on protein functions. We also applied these methods as a first-pass filter to identify the deleterious substitutions worth pursuing in further experimental research. In this analysis, we used existing computational methods to explore the mutation-structure-function relationship in the HGD gene causing alkaptonuria. PMID:22606059
A metabolomics guided exploration of marine natural product chemical space.
Floros, Dimitrios J; Jensen, Paul R; Dorrestein, Pieter C; Koyama, Nobuhiro
2016-09-01
Natural products from culture collections have enormous impact in advancing discovery programs for metabolites of biotechnological importance. These discovery efforts rely on the metabolomic characterization of strain collections. Many emerging approaches compare metabolomic profiles of such collections, but few enable the analysis and prioritization of thousands of samples from diverse organisms while delivering chemistry-specific readouts. In this work we utilize untargeted LC-MS/MS-based metabolomics together with molecular networking. This approach annotated 76 molecular families (a spectral match rate of 28%), including clinically and biotechnologically important molecules such as valinomycin, actinomycin D, and desferrioxamine E. Targeting a molecular family produced primarily by one microorganism led to the isolation and structure elucidation of two new molecules designated maridric acids A and B. Molecular-networking-guided exploration of large culture collections allows rapid dereplication of known molecules and can highlight producers of unique metabolites. These methods, together with large culture collections and growing databases, allow data-driven strain prioritization with a focus on novel chemistries.
Guhlin, Joseph; Silverstein, Kevin A T; Zhou, Peng; Tiffin, Peter; Young, Nevin D
2017-08-10
Rapid generation of omics data in recent years has resulted in vast amounts of disconnected datasets without systemic integration and knowledge building, while individual groups have made customized, annotated datasets available on the web with few ways to link them to in-lab datasets. With so many research groups generating their own data, the ability to relate it to the larger genomic and comparative genomic context is becoming increasingly crucial to make full use of the data. The Omics Database Generator (ODG) allows users to create customized databases that utilize published genomics data integrated with experimental data which can be queried using a flexible graph database. When provided with omics and experimental data, ODG will create a comparative, multi-dimensional graph database. ODG can import definitions and annotations from other sources such as InterProScan, the Gene Ontology, ENZYME, UniPathway, and others. This annotation data can be especially useful for studying new or understudied species for which transcripts have only been predicted, and rapidly gives additional layers of annotation to predicted genes. In better-studied species, ODG can perform syntenic annotation translations or rapidly identify characteristics of a set of genes or nucleotide locations, such as hits from an association study. ODG provides a web-based user interface for configuring the data import and for querying the database. Queries can also be run from the command line, and the database can be queried directly through programming-language hooks available for most languages. ODG supports most common genomic formats as well as a generic, easy-to-use tab-separated-value format for user-provided annotations. ODG is a user-friendly database generation and query tool that adapts to the supplied data to produce a comparative genomic database or multi-layered annotation database.
ODG provides rapid comparative genomic annotation and is therefore particularly useful for non-model or understudied species. For species for which more data are available, ODG can be used to conduct complex multi-omics, pattern-matching queries.
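ODG's graph-database queries can be illustrated with a minimal in-memory property-graph sketch. The node names, edge labels, and query below are invented for illustration; ODG derives its actual schema from the supplied omics data:

```python
# A tiny triple store: (subject, predicate, object) edges linking genes to
# annotations, queried by wildcard pattern matching, in the spirit of a graph
# database's pattern queries.
edges = [
    ("geneA", "annotated_with", "GO:0008150"),
    ("geneB", "annotated_with", "GO:0008150"),
    ("geneB", "annotated_with", "GO:0003674"),
    ("geneA", "orthologous_to", "geneB"),
]

def match(subject=None, predicate=None, obj=None):
    """Return all edges matching the given pattern; None acts as a wildcard."""
    return [(s, p, o) for s, p, o in edges
            if subject in (None, s) and predicate in (None, p) and obj in (None, o)]

# Which genes share the annotation GO:0008150?
genes = [s for s, _, _ in match(predicate="annotated_with", obj="GO:0008150")]
print(sorted(genes))  # -> ['geneA', 'geneB']
```

Layering further edge types (expression, synteny, pathway membership) onto the same store is what makes multi-omics pattern queries possible in a graph model.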
Chaotic Traversal (CHAT): Very Large Graphs Traversal Using Chaotic Dynamics
NASA Astrophysics Data System (ADS)
Changaival, Boonyarit; Rosalie, Martin; Danoy, Grégoire; Lavangnananda, Kittichai; Bouvry, Pascal
2017-12-01
Graph traversal algorithms find applications in various fields such as routing problems, natural language processing, and database querying. Traversal can be considered a first stepping stone toward knowledge extraction from graphs, which is now a popular topic. Classical solutions such as Breadth-First Search (BFS) and Depth-First Search (DFS) require huge amounts of memory when exploring very large graphs. In this research, we present a novel memoryless graph traversal algorithm, Chaotic Traversal (CHAT), which integrates chaotic dynamics, via the Lozi map and the Rössler system, to traverse large unknown graphs. To compare the effects of different dynamics on our algorithm, we present an original way to explore a parameter space using a bifurcation diagram with respect to the topological structure of attractors. The resulting algorithm is efficient and undemanding of resources, and is therefore well suited to partial traversal of very large and/or unknown environment graphs. Using the Lozi map, CHAT is shown to outperform the commonly known random walk in terms of the number of nodes visited (coverage percentage) and computation time when the environment is unknown and memory usage is restricted.
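A hedged sketch of the memoryless, chaos-guided traversal idea: the Lozi map's orbit selects each next neighbor, so no visited-set needs to be stored. The parameter values and the neighbor-selection rule below are illustrative assumptions, not the paper's exact scheme:

```python
def lozi(x, y, a=1.7, b=0.5):
    """One iteration of the Lozi map (a, b chosen in the chaotic regime)."""
    return 1.0 - a * abs(x) + y, b * x

def chat_walk(graph, start, steps, x=0.1, y=0.1):
    """Traverse `graph` for `steps` hops, steering by the chaotic orbit.
    Memory use is O(1) beyond the returned path: no visited-set is kept."""
    node, path = start, [start]
    for _ in range(steps):
        x, y = lozi(x, y)
        nbrs = graph[node]
        # Fold the bounded chaotic coordinate into a neighbor index
        # (an assumed selection rule for illustration).
        node = nbrs[int(abs(x) * 1000) % len(nbrs)]
        path.append(node)
    return path

ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}  # 8-node cycle graph
path = chat_walk(ring, start=0, steps=20)
print(len(set(path)), "distinct nodes visited out of 8")
```

Unlike a pseudo-random walk, the orbit is deterministic and reproducible, and different map parameters (explored in the paper via bifurcation diagrams) yield different coverage behavior.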
Large-scale Exploration of Neuronal Morphologies Using Deep Learning and Augmented Reality.
Li, Zhongyu; Butler, Erik; Li, Kang; Lu, Aidong; Ji, Shuiwang; Zhang, Shaoting
2018-02-12
Recently released large-scale neuron morphological data has greatly facilitated research in neuroinformatics. However, the sheer volume and complexity of these data pose significant challenges for efficient and accurate neuron exploration. In this paper, we propose an effective retrieval framework to address these problems, based on frontier techniques of deep learning and binary coding. For the first time, we develop a deep-learning-based feature representation method for neuron morphological data, in which the 3D neurons are first projected into binary images and features are then learned using an unsupervised deep neural network, i.e., stacked convolutional autoencoders (SCAEs). The deep features are subsequently fused with hand-crafted features for more accurate representation. Because exhaustive search is usually very time-consuming in large-scale databases, we employ a novel binary coding method to compress feature vectors into short binary codes. Our framework is validated on a public data set including 58,000 neurons, showing promising retrieval precision and efficiency compared with state-of-the-art methods. In addition, we develop a novel neuron visualization program based on augmented reality (AR) techniques, which helps users explore neuron morphologies in an interactive and immersive manner.
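The binary coding step can be mimicked with a simple stand-in scheme: sign-random-projection hashing compresses each feature vector into one machine word, and retrieval ranks candidates by Hamming distance. The hashing method and the sizes below are assumptions for illustration; the paper uses a learned binary coding over deep features:

```python
import random

random.seed(0)
DIM, BITS = 16, 32
# Random hyperplanes; the sign of each projection contributes one code bit.
planes = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(BITS)]

def encode(vec):
    """Pack the signs of BITS random projections into one integer code."""
    code = 0
    for k, p in enumerate(planes):
        if sum(a * b for a, b in zip(vec, p)) >= 0:
            code |= 1 << k
    return code

def hamming(a, b):
    return bin(a ^ b).count("1")

def retrieve(query, db, top=3):
    """Rank database entries by Hamming distance to the query's code."""
    q = encode(query)
    return sorted(db, key=lambda item: hamming(q, encode(item[1])))[:top]

db = [(f"neuron{i}", [random.gauss(0, 1) for _ in range(DIM)]) for i in range(100)]
name, vec = db[42]
hits = retrieve(vec, db)
print(hits[0][0])  # the query's own entry ranks first (Hamming distance 0)
```

Because the codes are plain integers, comparing millions of them is a fast XOR-and-popcount loop, which is the efficiency the short binary codes buy over exhaustive search in feature space.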
Two Ways to Explore Toxic Chemicals in Your Community: TOXMAP classic provides an Advanced ... group of TOXNET databases related to toxicology, hazardous chemicals, environmental health, and toxic releases.
SINEBase: a database and tool for SINE analysis.
Vassetzky, Nikita S; Kramerov, Dmitri A
2013-01-01
SINEBase (http://sines.eimb.ru) integrates the revisited body of knowledge about short interspersed elements (SINEs). A set of formal definitions concerning SINEs was introduced. All available sequence data were screened through these definitions and the genetic elements misidentified as SINEs were discarded. As a result, 175 SINE families have been recognized in animals, flowering plants and green algae. These families were classified by the modular structure of their nucleotide sequences and the frequencies of different patterns were evaluated. These data formed the basis for the database of SINEs. The SINEBase website can be used in two ways: first, to explore the database of SINE families, and second, to analyse candidate SINE sequences using specifically developed tools. This article presents an overview of the database and the process of SINE identification and analysis.
NASA Astrophysics Data System (ADS)
Scharberg, Maureen A.; Cox, Oran E.; Barelli, Carl A.
1997-07-01
"The Molecule of the Day" consumer chemical database has been created to allow introductory chemistry students to explore the molecular structures of chemicals in household products, and to provide opportunities in molecular modeling for undergraduate chemistry students. Before class begins, an overhead transparency is displayed showing a three-dimensional molecular structure of a household chemical and listing relevant features and uses of that chemical. In questionnaire responses, students have commented that this molecular graphics database has helped them visually connect the microscopic structure of a molecule with its physical and chemical properties, as well as its uses in consumer products. It is anticipated that this database will be incorporated into a navigational software package such as Netscape.
Matsuda, Fumio; Nakabayashi, Ryo; Sawada, Yuji; Suzuki, Makoto; Hirai, Masami Y.; Kanaya, Shigehiko; Saito, Kazuki
2011-01-01
A novel framework for automated elucidation of metabolite structures in liquid chromatography–mass spectrometer metabolome data was constructed by integrating databases. High-resolution tandem mass spectra data automatically acquired from each metabolite signal were used for database searches. Three distinct databases, KNApSAcK, ReSpect, and the PRIMe standard compound database, were employed for the structural elucidation. The outputs were retrieved using the CAS metabolite identifier for identification and putative annotation. A simple metabolite ontology system was also introduced to attain putative characterization of the metabolite signals. The automated method was applied for the metabolome data sets obtained from the rosette leaves of 20 Arabidopsis accessions. Phenotypic variations in novel Arabidopsis metabolites among these accessions could be investigated using this method. PMID:22645535
Development of the Orion Crew Module Static Aerodynamic Database. Part 2; Supersonic/Subsonic
NASA Technical Reports Server (NTRS)
Bibb, Karen L.; Walker, Eric L.; Brauckmann, Gregory J.; Robinson, Phil
2011-01-01
This work describes the process of developing the nominal static aerodynamic coefficients and associated uncertainties for the Orion Crew Module for Mach 8 and below. The database was developed from wind tunnel test data and computational simulations of the smooth Crew Module geometry, with no asymmetries or protuberances. The database covers the full range of Reynolds numbers seen in both entry and ascent abort scenarios. The basic uncertainties were developed as functions of Mach number and total angle of attack from variations in the primary data as well as computations at lower Reynolds numbers, on the baseline geometry, and using different flow solvers. The resulting aerodynamic database represents the Crew Exploration Vehicle Aerosciences Project's best estimate of the nominal aerodynamics for the current Crew Module vehicle.
The liver tissue bank and clinical database in China.
Yang, Yuan; Liu, Yi-Min; Wei, Ming-Yue; Wu, Yi-Fei; Gao, Jun-Hui; Liu, Lei; Zhou, Wei-Ping; Wang, Hong-Yang; Wu, Meng-Chao
2010-12-01
To develop a standardized and well-rounded resource for hepatology research, the National Liver Tissue Bank (NLTB) Project began in 2008 in China to assemble well-characterized, optimally preserved liver tumor tissue and an associated clinical database. From Dec 2008 to Jun 2010, over 3000 individuals were enrolled as liver tumor donors to the NLTB, including 2317 cases of newly diagnosed hepatocellular carcinoma (HCC) and about 1000 cases of other diagnosed benign or malignant liver tumors. The clinical database and sample store can be managed easily and correctly with the data management platform used. We believe that these high-quality samples, with their detailed information database, will become a cornerstone of hepatology research, especially in studies exploring the diagnosis and new treatments for HCC and other liver diseases.
77 FR 5023 - Agency Information Collection Activities: Proposed Collection; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2012-02-01
... proposed information collection project: ``Medical Office Survey on Patient Safety Culture Comparative... . SUPPLEMENTARY INFORMATION: Proposed Project Medical Office Survey on Patient Safety Culture Comparative Database... AHRQ Medical Office Survey on Patient Safety Culture (Medical Office SOPS) Comparative Database. The...
Gonzalez, Roxana; O'Brien-Barry, Patricia; Ancheta, Reginaldo; Razal, Rennuel; Clyne, Mary Ellen
A quasi-experimental study was conducted to determine which teaching modality, peer education or computer-based education, better improves utilization of the library's electronic databases and thereby evidence-based knowledge at the point of care. No significant differences were found between the teaching modalities. However, the study identified the need to explore professional development teaching modalities outside the traditional classroom to support an evidence-based practice healthcare environment.
Gnome View: A tool for visual representation of human genome data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pelkey, J.E.; Thomas, G.S.; Thurman, D.A.
1993-02-01
GnomeView is a tool for exploring data generated by the Human Genome Project. GnomeView provides both graphical and textual styles of data presentation; employs an intuitive window-based graphical query interface; and integrates its underlying genome databases in such a way that the user can navigate smoothly across databases and between different levels of data. This paper describes GnomeView and discusses how it addresses various genome informatics issues.
Smith, Steven M.; Neilson, Ryan T.; Giles, Stuart A.
2015-01-01
Government-sponsored, national-scale, soil and sediment geochemical databases are used to estimate regional and local background concentrations for environmental issues, identify possible anthropogenic contamination, estimate mineral endowment, explore for new mineral deposits, evaluate nutrient levels for agriculture, and establish concentration relationships with human or animal health. Because of these different uses, it is difficult for any single database to accommodate all the needs of each client. Smith et al. (2013, p. 168) reviewed six national-scale soil and sediment geochemical databases for the United States (U.S.) and, for each, evaluated “its appropriateness as a national-scale geochemical database and its usefulness for national-scale geochemical mapping.” Each of the evaluated databases has strengths and weaknesses that were listed in that review. Two of these U.S. national-scale geochemical databases are similar in their sample media and collection protocols but have different strengths—primarily sampling density and analytical consistency. This project was implemented to determine whether those databases could be merged to produce a combined dataset that could be used for mineral resource assessments. The utility of the merged database was tested to see whether mapped distributions could identify metalliferous black shales at a national scale.
Exploring Chemical Space for Drug Discovery Using the Chemical Universe Database
2012-01-01
Herein we review our recent efforts in searching for bioactive ligands by enumeration and virtual screening of the unknown chemical space of small molecules. Enumeration from first principles shows that almost all small molecules (>99.9%) have never been synthesized and are still available to be prepared and tested. We discuss open access sources of molecules, the classification and representation of chemical space using molecular quantum numbers (MQN), its exhaustive enumeration in form of the chemical universe generated databases (GDB), and examples of using these databases for prospective drug discovery. MQN-searchable GDB, PubChem, and DrugBank are freely accessible at www.gdb.unibe.ch. PMID:23019491
CHIP Demonstrator: Semantics-Driven Recommendations and Museum Tour Generation
NASA Astrophysics Data System (ADS)
Aroyo, Lora; Stash, Natalia; Wang, Yiwen; Gorgels, Peter; Rutledge, Lloyd
The main objective of the CHIP project is to demonstrate how Semantic Web technologies can be deployed to provide personalized access to digital museum collections. We illustrate our approach with the digital database ARIA of the Rijksmuseum Amsterdam. For the semantic enrichment of the Rijksmuseum ARIA database we collaborated with the CATCH STITCH project to produce mappings to Iconclass, and with the MultimediaN E-culture project to produce the RDF/OWL of the ARIA and Adlib databases. The main focus of CHIP is on exploring the potential of applying adaptation techniques to provide personalized experience for the museum visitors both on the Web site and in the museum.
gPhoton: Time-tagged GALEX photon events analysis tools
NASA Astrophysics Data System (ADS)
Million, Chase C.; Fleming, S. W.; Shiao, B.; Loyd, P.; Seibert, M.; Smith, M.
2016-03-01
Written in Python, gPhoton calibrates and sky-projects the ~1.1 trillion ultraviolet photon events detected by the microchannel plates on the Galaxy Evolution Explorer Spacecraft (GALEX), archives these events in a publicly accessible database at the Mikulski Archive for Space Telescopes (MAST), and provides tools for working with the database to extract scientific results, particularly over short time domains. The software includes a re-implementation of core functionality of the GALEX mission calibration pipeline to produce photon list files from raw spacecraft data as well as a suite of command line tools to generate calibrated light curves, images, and movies from the MAST database.
Consumer Attitudes About Renewable Energy. Trends and Regional Differences
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bird, Lori; Sumner, Jenny
2011-04-01
The data in this report are taken from Natural Marketing Institute's (NMI's) Lifestyles of Health and Sustainability Consumer Trends Database. Created in 2002, the syndicated consumer database contains responses from 2,000 to 4,000 nationally representative U.S. adults (meaning the demographics of the sample are consistent with U.S. Census findings) each year. NMI used the database to analyze consumer attitudes and behavior related to renewable energy and to update previously conducted related research. Specifically, this report will explore consumer awareness, concerns, perceived benefits, knowledge of purchase options, and usage of renewable energy as well as provide regional comparisons and trends over time.
Quality control of EUVE databases
NASA Technical Reports Server (NTRS)
John, L. M.; Drake, J.
1992-01-01
The publicly accessible databases for the Extreme Ultraviolet Explorer include: the EUVE Archive mailserver; the CEA ftp site; the EUVE Guest Observer Mailserver; and the Astronomical Data System node. The EUVE Performance Assurance team is responsible for verifying that these public EUVE databases are working properly, and that the public availability of EUVE data contained therein does not infringe any data rights which may have been assigned. In this poster, we describe the Quality Assurance (QA) procedures we have developed from the approach of QA as a service organization, thus reflecting the overall EUVE philosophy of Quality Assurance integrated into normal operating procedures, rather than imposed as an external, post facto, control mechanism.
Consumer Attitudes About Renewable Energy: Trends and Regional Differences
DOE Office of Scientific and Technical Information (OSTI.GOV)
Natural Marketing Institute, Harleysville, Pennsylvania
The data in this report are taken from Natural Marketing Institute's (NMI's) Lifestyles of Health and Sustainability Consumer Trends Database. Created in 2002, the syndicated consumer database contains responses from 2,000 to 4,000 nationally representative U.S. adults (meaning the demographics of the sample are consistent with U.S. Census findings) each year. NMI used the database to analyze consumer attitudes and behavior related to renewable energy and to update previously conducted related research. Specifically, this report will explore consumer awareness, concerns, perceived benefits, knowledge of purchase options, and usage of renewable energy as well as provide regional comparisons and trends over time.
This document may be of assistance in applying the Title V air operating permit regulations. This document is part of the Title V Petition Database available at www2.epa.gov/title-v-operating-permits/title-v-petition-database. Some documents in the database are a scanned or retyped version of a paper photocopy of the original. Although we have taken considerable effort to quality assure the documents, some may contain typographical errors. Contact the office that issued the document if you need a copy of the original.
Lee, Nathan J; Guzman, Javier Z; Kim, Jun; Skovrlj, Branko; Martin, Christopher T; Pugely, Andrew J; Gao, Yubo; Caridi, John M; Mendoza-Lattes, Sergio; Cho, Samuel K
2016-11-01
Retrospective cohort analysis. A growing number of publications have utilized the Scoliosis Research Society (SRS) Morbidity and Mortality (M&M) database, but none have compared it to other large databases. The objective of this study was to compare SRS complications with those in administrative databases. The Nationwide Inpatient Sample (NIS) and Kids' Inpatient Database (KID) captured a greater number of overall complications while the SRS M&M data provided a greater incidence of spine-related complications following adolescent idiopathic scoliosis (AIS) surgery. Chi-square tests were used to assess statistical significance, with p < .05 considered significant. The SRS 2004-2007 (9,904 patients), NIS 2004-2007 (20,441 patients) and KID 2003-2006 (10,184 patients) databases were analyzed for AIS patients who underwent fusion. Comparable variables were queried in all three databases, including patient demographics, surgical variables, and complications. Patients undergoing AIS surgery in the SRS database were slightly older (SRS 14.4 years vs. NIS 13.8 years, p < .0001; KID 13.9 years, p < .0001) and less likely to be male (SRS 18.5% vs. NIS 26.3%, p < .0001; KID 24.8%, p < .0001). Revision surgery (SRS 3.3% vs. NIS 2.4%, p < .0001; KID 0.9%, p < .0001) and osteotomy (SRS 8% vs. NIS 2.3%, p < .0001; KID 2.4%, p < .0001) were more commonly reported in the SRS database. The SRS database reported fewer overall complications (SRS 3.9% vs. NIS 7.3%, p < .0001; KID 6.6%, p < .0001). However, when respiratory complications (SRS 0.5% vs. NIS 3.7%, p < .0001; KID 4.4%, p < .0001) were excluded, medical complication rates were similar across databases. In contrast, SRS reported higher spine-specific complication rates. Mortality rates were similar between SRS versus NIS (p = .280) and SRS versus KID (p = .08) databases. There are similarities and differences between the three databases.
These discrepancies are likely due to the varying data-gathering methods each organization uses to collect their morbidity data. Level IV. Copyright © 2016 Scoliosis Research Society. Published by Elsevier Inc. All rights reserved.
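The chi-square comparisons reported above can be reproduced in miniature. The counts below are back-calculated from the reported overall complication rates (3.9% of 9,904 SRS patients vs. 7.3% of 20,441 NIS patients), so they are rounded approximations for illustration, not the study's raw data:

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]],
    without continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Approximate counts derived from the reported rates (illustrative).
srs_compl, srs_total = round(0.039 * 9904), 9904      # complications, total
nis_compl, nis_total = round(0.073 * 20441), 20441
chi2 = chi_square_2x2(srs_compl, srs_total - srs_compl,
                      nis_compl, nis_total - nis_compl)
# With 1 degree of freedom, the p = .05 critical value is 3.84; the statistic
# here lies far above it, consistent with the reported p < .0001.
print(chi2 > 3.84)  # -> True
```

The same 2x2 construction applies to each rate comparison in the abstract (revision surgery, osteotomy, respiratory complications, and so on).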
Does an Otolaryngology-Specific Database Have Added Value? A Comparative Feasibility Analysis.
Bellmunt, Angela M; Roberts, Rhonda; Lee, Walter T; Schulz, Kris; Pynnonen, Melissa A; Crowson, Matthew G; Witsell, David; Parham, Kourosh; Langman, Alan; Vambutas, Andrea; Ryan, Sheila E; Shin, Jennifer J
2016-07-01
There are multiple nationally representative databases that support epidemiologic and outcomes research, and it is unknown whether an otolaryngology-specific resource would prove indispensable or superfluous. Therefore, our objective was to determine the feasibility of analyses in the National Ambulatory Medical Care Survey (NAMCS) and National Hospital Ambulatory Medical Care Survey (NHAMCS) databases as compared with the otolaryngology-specific Creating Healthcare Excellence through Education and Research (CHEER) database. Parallel analyses in 2 data sets. Ambulatory visits in the United States. To test a fixed hypothesis that could be directly compared between data sets, we focused on a condition with expected prevalence high enough to substantiate availability in both. This query also encompassed a broad span of diagnoses to sample the breadth of available information. Specifically, we compared an assessment of suspected risk factors for sensorineural hearing loss in subjects 0 to 21 years of age, according to a predetermined protocol. We also assessed the feasibility of 6 additional diagnostic queries among all age groups. In the NAMCS/NHAMCS data set, the number of measured observations was not sufficient to support reliable numeric conclusions (percentage standard error among risk factors: 38.6-92.1). Analysis of the CHEER database demonstrated that age, sex, meningitis, and cytomegalovirus were statistically significant factors associated with pediatric sensorineural hearing loss (P < .01). Among the 6 additional diagnostic queries assessed, NAMCS/NHAMCS usage was also infeasible; the CHEER database contained 1585 to 212,521 more observations per annum. An otolaryngology-specific database has added utility when compared with already available national ambulatory databases. © American Academy of Otolaryngology—Head and Neck Surgery Foundation 2016.
77 FR 4038 - Agency Information Collection Activities: Proposed Collection; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2012-01-26
... proposed information collection project: ``Nursing Home Survey on Patient Safety Culture Comparative...: Proposed Project Nursing Home Survey on Patient Safety Culture Comparative Database The Agency for... Nursing Home Survey on Patient Safety Culture (Nursing Home SOPS) Comparative Database. The Nursing Home...
How people with serious mental illness use smartphones, mobile apps, and social media.
Naslund, John A; Aschbrenner, Kelly A; Bartels, Stephen J
2016-12-01
Research shows that people with serious mental illness are increasingly using mobile devices. Less is known about how these individuals use their mobile devices or whether they access social media. We surveyed individuals with serious mental illness to explore their use of these technologies. Individuals with serious mental illness engaged in lifestyle interventions through community mental health centers completed a survey about their use of mobile and online technologies. Responses were compared with data from the general population. Among respondents (n = 70), 93% owned cellphones, 78% used text messaging, 50% owned smartphones, and 71% used social media such as Facebook. Most respondents reported daily use of text messaging, mobile apps, and social media. Technology use was comparable to the general population, though smartphone ownership was lower. These findings can inform future interventions that fully leverage this group's use of popular digital technologies. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Breast cancer survival in New Zealand women.
Campbell, Ian D; Scott, Nina; Seneviratne, Sanjeewa; Kollias, James; Walters, David; Taylor, Corey; Webster, Fleur; Zorbas, Helen; Roder, David M
2015-01-01
The Quality Audit (BQA) of Breast Surgeons of Australia and New Zealand includes a broad range of data and is the largest New Zealand (NZ) breast cancer (BC) database outside the NZ Cancer Registry. We used BQA data to compare BC survival by ethnicity, deprivation, remoteness, clinical characteristic and case load. BQA and death data were linked using the National Health Index. Disease-specific survival for invasive cases was benchmarked against Australian BQA data and NZ population-based survivals. Validity was explored by comparison with expected survival by risk factor. Compared with 93% for Australian audit cases, 5-year survival was 90% for NZ audit cases overall, 87% for Maori, 84% for Pacific and 91% for other. BC survival in NZ appears lower than in Australia, with inequities by ethnicity. Differences may be due to access, timeliness and quality of health services, patient risk profiles, BQA coverage and death-record methodology. © 2014 Royal Australasian College of Surgeons.
Knechtle, William S; Perez, Sebastian D; Raval, Mehul V; Sullivan, Patrick S; Duwayri, Yazan M; Fernandez, Felix; Sharma, Joe; Sweeney, John F
Quality-cost diagrams have been used previously to assess interventions and their cost-effectiveness. This study explores the use of risk-adjusted quality-cost diagrams to compare the value provided by surgeons by presenting cost and outcomes simultaneously. Colectomy cases from a single institution captured in the National Surgical Quality Improvement Program database were linked to hospital cost-accounting data to determine costs per encounter. Risk adjustment models were developed and observed average cost and complication rates per surgeon were compared to expected cost and complication rates using the diagrams. Surgeons were surveyed to determine if the diagrams could provide information that would result in practice adjustment. Of 55 surgeons surveyed on the utility of the diagrams, 92% of respondents believed the diagrams were useful. The diagrams seemed intuitive to interpret, and making risk-adjusted comparisons accounted for patient differences in the evaluation.
Comparison of the Frontier Distributed Database Caching System to NoSQL Databases
NASA Astrophysics Data System (ADS)
Dykstra, Dave
2012-12-01
One of the main attractions of non-relational “NoSQL” databases is their ability to scale to large numbers of readers, including readers spread over a wide area. The Frontier distributed database caching system, used in production by the Large Hadron Collider CMS and ATLAS detector projects for Conditions data, is based on traditional SQL databases but also adds high scalability and the ability to be distributed over a wide area for an important subset of applications. This paper compares the major characteristics of the two different approaches and identifies the criteria for choosing which approach to prefer over the other. It also compares in some detail the NoSQL databases used by CMS and ATLAS: MongoDB, CouchDB, HBase, and Cassandra.
LinkedOmics: analyzing multi-omics data within and across 32 cancer types.
Vasaikar, Suhas V; Straub, Peter; Wang, Jing; Zhang, Bing
2018-01-04
The LinkedOmics database contains multi-omics data and clinical data for 32 cancer types and a total of 11,158 patients from The Cancer Genome Atlas (TCGA) project. It is also the first multi-omics database that integrates mass spectrometry (MS)-based global proteomics data generated by the Clinical Proteomic Tumor Analysis Consortium (CPTAC) on selected TCGA tumor samples. In total, LinkedOmics has more than a billion data points. To allow comprehensive analysis of these data, we developed three analysis modules in the LinkedOmics web application. The LinkFinder module allows flexible exploration of associations between a molecular or clinical attribute of interest and all other attributes, providing the opportunity to analyze and visualize associations between billions of attribute pairs for each cancer cohort. The LinkCompare module enables easy comparison of the associations identified by LinkFinder, which is particularly useful in multi-omics and pan-cancer analyses. The LinkInterpreter module transforms identified associations into biological understanding through pathway and network analysis. Using five case studies, we demonstrate that LinkedOmics provides a unique platform for biologists and clinicians to access, analyze and compare cancer multi-omics data within and across tumor types. LinkedOmics is freely available at http://www.linkedomics.org. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Zhou, Huaqiang; Zhang, Yuanzhe; Song, Yiyan; Tan, Wulin; Qiu, Zeting; Li, Si; Chen, Qinchang; Gao, Shaowei
2017-09-01
The prognostic impact of marital status on pancreatic neuroendocrine tumors (PNET) has not been rigorously studied. We aimed to explore the relationship between marital status and outcomes of PNET. We retrospectively investigated 2060 PNET cases diagnosed between 2004 and 2010 from the Surveillance, Epidemiology, and End Results (SEER) database. Variables were compared by chi-squared test or t-test as appropriate. Kaplan-Meier methods and Cox proportional hazards models were used to ascertain independent prognostic factors. Married patients had better 5-year overall survival (OS) (53.37% vs. 42.27%, P<0.001) and 5-year pancreatic neuroendocrine tumor-specific survival (PNSS) (67.76% vs. 59.82%, P=0.001) compared with unmarried patients. Multivariate analysis revealed marital status is an independent prognostic factor, with married patients showing better OS (HR=0.74; 95% CI: 0.65-0.84; P<0.001) and PNSS (HR=0.78; 95% CI: 0.66-0.92; P=0.004). Subgroup analysis suggested marital status plays a more important role in PNET patients with distant-stage disease than in those with regional or localized disease. Marital status is an independent prognostic factor for survival in PNET patients. The poorer prognosis in unmarried patients may be associated with delayed diagnosis at an advanced tumor stage, and with psychosocial and socioeconomic factors. Further studies are needed. Copyright © 2017. Published by Elsevier Masson SAS.
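The survival figures above come from Kaplan-Meier estimation. As an illustration of the estimator itself (not the SEER analysis, which used dedicated statistical software), a minimal pure-Python sketch:

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  -- follow-up time for each patient
    events -- 1 if the event (death) occurred, 0 if the patient was censored
    Returns a list of (time, survival probability) steps at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    s = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # group all subjects sharing this time point
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            s *= 1.0 - deaths / n_at_risk  # multiply by conditional survival
            curve.append((t, s))
        n_at_risk -= removed
    return curve
```

For example, `kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])` steps down at times 1, 3 and 4, with the censored subject at time 2 simply leaving the risk set.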
MIPS PlantsDB: a database framework for comparative plant genome research.
Nussbaumer, Thomas; Martis, Mihaela M; Roessner, Stephan K; Pfeifer, Matthias; Bader, Kai C; Sharma, Sapna; Gundlach, Heidrun; Spannagl, Manuel
2013-01-01
The rapidly increasing amount of plant genome (sequence) data enables powerful comparative analyses and integrative approaches and also requires structured and comprehensive information resources. Databases are needed for both model and crop plant organisms and both intuitive search/browse views and comparative genomics tools should communicate the data to researchers and help them interpret it. MIPS PlantsDB (http://mips.helmholtz-muenchen.de/plant/genomes.jsp) was initially described in NAR in 2007 [Spannagl,M., Noubibou,O., Haase,D., Yang,L., Gundlach,H., Hindemitt, T., Klee,K., Haberer,G., Schoof,H. and Mayer,K.F. (2007) MIPSPlantsDB-plant database resource for integrative and comparative plant genome research. Nucleic Acids Res., 35, D834-D840] and was set up from the start to provide data and information resources for individual plant species as well as a framework for integrative and comparative plant genome research. PlantsDB comprises database instances for tomato, Medicago, Arabidopsis, Brachypodium, Sorghum, maize, rice, barley and wheat. Building on that, state-of-the-art comparative genomics tools such as CrowsNest are integrated to visualize and investigate syntenic relationships between monocot genomes. Results from novel genome analysis strategies targeting the complex and repetitive genomes of Triticeae species (wheat and barley) are provided and cross-linked with model species. The MIPS Repeat Element Database (mips-REdat) and Catalog (mips-REcat) as well as tight connections to other databases, e.g. via web services, are further important components of PlantsDB.
DBMap: a TreeMap-based framework for data navigation and visualization of brain research registry
NASA Astrophysics Data System (ADS)
Zhang, Ming; Zhang, Hong; Tjandra, Donny; Wong, Stephen T. C.
2003-05-01
The purpose of this study is to investigate and apply a new, intuitive and space-conscious visualization framework to facilitate efficient data presentation and exploration of large-scale data warehouses. We have implemented the DBMap framework for the UCSF Brain Research Registry. Such a utility would help medical specialists and clinical researchers better explore and evaluate the many attributes organized in the brain research registry. The current UCSF Brain Research Registry consists of a federation of disease-oriented database modules, including Epilepsy, Brain Tumor, Intracerebral Hemorrhage, and CJD (Creutzfeldt-Jakob disease). These database modules organize large volumes of imaging and non-imaging data to support Web-based clinical research. While the data warehouse supports general information retrieval and analysis, it lacks an effective way to visualize and present the voluminous and complex data stored. This study investigates whether the TreeMap algorithm can be adapted to display and navigate a categorical biomedical data warehouse or registry. TreeMap is a space-constrained graphical representation of large hierarchical data sets, mapped to a matrix of rectangles whose size and color represent database fields of interest. It allows the display of a large amount of numerical and categorical information in the limited real estate of a computer screen with an intuitive user interface. The paper describes DBMap, the proposed new data visualization framework for large biomedical databases. Built upon XML, Java and JDBC technologies, the prototype system includes a set of software modules that reside in the application server tier and provide interfaces to the back-end database tier and the front-end Web tier of the brain registry.
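The TreeMap layouts that DBMap adapts can be illustrated with the simplest scheme, slice-and-dice, which splits a rectangle into strips proportional to the weights of the items it contains. The sketch below is a generic illustration of one nesting level, not DBMap's actual code:

```python
def slice_and_dice(weights, x, y, w, h, vertical=True):
    """Partition rectangle (x, y, w, h) into strips proportional to weights.

    Alternating the split direction at each nesting level gives the classic
    slice-and-dice treemap; a single level is shown here.
    Returns one (x, y, width, height) rectangle per weight.
    """
    total = sum(weights)
    rects = []
    offset = 0.0
    for wt in weights:
        frac = wt / total
        if vertical:
            # vertical strips: split along the x axis
            rects.append((x + offset, y, w * frac, h))
            offset += w * frac
        else:
            # horizontal strips: split along the y axis
            rects.append((x, y + offset, w, h * frac))
            offset += h * frac
    return rects
```

Each rectangle's area is proportional to its weight, so the whole canvas is always filled, which is what makes the layout space-conscious.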
Lessons Learned from Deploying an Analytical Task Management Database
NASA Technical Reports Server (NTRS)
O'Neil, Daniel A.; Welch, Clara; Arceneaux, Joshua; Bulgatz, Dennis; Hunt, Mitch; Young, Stephen
2007-01-01
Defining requirements, missions, technologies, and concepts for space exploration involves multiple levels of organizations, teams of people with complementary skills, and analytical models and simulations. Analytical activities range from filling a To-Be-Determined (TBD) in a requirement to creating animations and simulations of exploration missions. In a program as large as returning to the Moon, there are hundreds of simultaneous analysis activities. A way to manage and integrate efforts of this magnitude is to deploy a centralized database that provides the capability to define tasks, identify resources, describe products, schedule deliveries, and generate a variety of reports. This paper describes a web-accessible task management system and explains the lessons learned during the development and deployment of the database. Through the database, managers and team leaders can define tasks, establish review schedules, assign teams, link tasks to specific requirements, identify products, and link the task data records to external repositories that contain the products. Data filters and spreadsheet export utilities provide a powerful capability to create custom reports. Import utilities provide a means to populate the database from previously filled form files. Within a four-month period, a small team analyzed requirements, developed a prototype, conducted multiple system demonstrations, and deployed a working system supporting hundreds of users across the aerospace community. Open-source technologies and agile software development techniques, applied by a skilled team, enabled this impressive achievement. Topics in the paper cover the web application technologies, agile software development, an overview of the system's functions and features, dealing with increasing scope, and deploying new versions of the system.
Catalogue of UV sources in the Galaxy
NASA Astrophysics Data System (ADS)
Beitia-Antero, L.; Gómez de Castro, A. I.
2017-03-01
The Galaxy Evolution Explorer (GALEX) ultraviolet (UV) database contains the largest photometric catalogue in the ultraviolet range; as a result, the GALEX photometric bands, the near-UV band (NUV) and the far-UV band (FUV), have become standards. Nevertheless, the GALEX catalogue includes neither bright UV sources, owing to the high sensitivity of its detectors, nor sources in the Galactic plane. In order to extend the GALEX database for future UV missions, we have obtained synthetic FUV and NUV photometry using the database of UV spectra generated by the International Ultraviolet Explorer (IUE). This database contains 63,755 spectra in the low-dispersion mode (λ/δλ ∼ 300) obtained during its 18-year lifetime. For stellar sources in the IUE database, we have selected spectra with high signal-to-noise ratio (SNR) and computed FUV and NUV magnitudes using the GALEX transmission curves along with the conversion equations between flux and magnitudes provided by the mission. In addition, we have performed variability tests to determine whether the sources were variable during the IUE observations. As a result, we have generated two different catalogues: one for non-variable stars and another for variable sources. The former contains FUV and NUV magnitudes, while the latter gives the basic information and the FUV magnitude for each observation. The consistency of the magnitudes has been tested using white dwarfs contained in both the GALEX and IUE samples. The catalogues are available through the Centre de Données Stellaires. The sources are distributed throughout the whole sky, with special coverage of the Galactic plane.
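The synthetic-photometry step described above amounts to weighting each spectrum by a transmission curve and converting the band-averaged flux to a magnitude. A minimal sketch follows; the function names and the use of the AB zero point are illustrative assumptions, not the paper's exact conversion equations:

```python
import math

def band_average(wavelengths, flux, transmission):
    """Transmission-weighted mean flux across a bandpass (trapezoid rule).

    wavelengths  -- monotonically increasing grid
    flux         -- flux density sampled on that grid
    transmission -- bandpass throughput sampled on the same grid
    """
    num = 0.0
    den = 0.0
    for i in range(len(wavelengths) - 1):
        dw = wavelengths[i + 1] - wavelengths[i]
        num += 0.5 * (flux[i] * transmission[i]
                      + flux[i + 1] * transmission[i + 1]) * dw
        den += 0.5 * (transmission[i] + transmission[i + 1]) * dw
    return num / den

def ab_magnitude(mean_fnu):
    """AB magnitude from a mean flux density in erg s^-1 cm^-2 Hz^-1."""
    return -2.5 * math.log10(mean_fnu) - 48.60
```

A flat spectrum of 3.631e-20 erg s^-1 cm^-2 Hz^-1 has AB magnitude zero by definition, which makes a convenient sanity check for the conversion.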
Ghazi Mirsaeid, Seyed Javad; Motamedi, Nadia; Ramezan Ghorbani, Nahid
2015-09-01
In this study, the impact of self-citation (journal and author) on the impact factor of Iranian English-language medical journals in two international citation databases, Web of Science (WoS) and the Islamic World Science Citation Center (ISC), was compared by citation analysis. Twelve journals in the WoS and 26 journals in the ISC databases indexed between 2006 and 2009 were selected and compared. For comparison of self-citation rates in the two databases, we used Wilcoxon and Mann-Whitney tests. We used the Pearson test for the correlation of self-citation and impact factor (IF) in WoS, and Spearman's correlation coefficient for the ISC database. Covariance analysis was used for comparison of the two correlation tests. The significance level was 0.05 in all tests. There was no significant difference between self-citation rates in the two databases (P>0.05). Findings also showed no significant difference between the correlations of journal self-citation and impact factor in the two databases (P=0.526); however, there was a significant difference for author self-citation and impact factor in these databases (P<0.001). The impact of author self-citation on the impact factor was higher in WoS than in the ISC.
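The two correlation measures used above differ only in that Spearman's coefficient is Pearson's coefficient applied to ranks. A minimal pure-Python sketch (no tie handling, purely illustrative):

```python
def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson applied to ranks (ties ignored)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank + 1
        return r
    return pearson(ranks(x), ranks(y))
```

A monotone but non-linear relationship gives a Spearman coefficient of 1 while Pearson's falls below 1, which is why the choice of test matters when citation counts are skewed.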
Toward unification of taxonomy databases in a distributed computer environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kitakami, Hajime; Tateno, Yoshio; Gojobori, Takashi
1994-12-31
All the taxonomy databases constructed with the DNA databases of the international DNA data banks are powerful electronic dictionaries which aid in biological research by computer. The taxonomy databases are, however, not consistently unified with a relational format. If we can achieve consistent unification of the taxonomy databases, it will be useful in comparing many research results and in investigating future research directions from existing research results. In particular, it will be useful in comparing relationships between phylogenetic trees inferred from molecular data and those constructed from morphological data. The goal of the present study is to unify the existing taxonomy databases and eliminate inconsistencies (errors) that are present in them. Inconsistencies occur particularly in the restructuring of the existing taxonomy databases, since classification rules for constructing the taxonomy have changed rapidly with biological advancements. A repair system is needed to remove inconsistencies in each data bank and mismatches among data banks. This paper describes a new methodology for removing both inconsistencies and mismatches from the databases in a distributed computer environment. The methodology is implemented in a relational database management system, SYBASE.
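The mismatch detection described above, which the authors implement inside SYBASE, can be illustrated outside a relational engine. A hypothetical sketch comparing parent assignments for taxa shared by two data banks (names are invented examples, not records from the actual databases):

```python
def find_mismatches(bank_a, bank_b):
    """Compare two taxon -> parent mappings and report disagreements.

    bank_a, bank_b -- dicts mapping a taxon name to its parent taxon.
    Returns the taxa present in both banks but filed under different
    parents, with the conflicting parent pair for each.
    """
    shared = bank_a.keys() & bank_b.keys()
    return {t: (bank_a[t], bank_b[t])
            for t in shared if bank_a[t] != bank_b[t]}
```

In a real unification effort the same comparison would be expressed as a join over the banks' taxonomy tables, with the disagreements queued for manual curation.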
Digital Dental X-ray Database for Caries Screening
NASA Astrophysics Data System (ADS)
Rad, Abdolvahab Ehsani; Rahim, Mohd Shafry Mohd; Rehman, Amjad; Saba, Tanzila
2016-06-01
A standard database is an essential requirement for comparing the performance of image analysis techniques. A main obstacle in dental image analysis has been the lack of an available image database, which this paper provides. Periapical dental X-ray images, suitable for analysis and approved by many dental experts, were collected. This type of dental radiographic imaging is common and inexpensive, and is normally used for dental disease diagnosis and abnormality detection. The database contains 120 periapical X-ray images covering upper and lower jaws. This digital dental database was constructed to provide a source for researchers to use and compare image analysis techniques and to improve the performance of each technique.
CyanoClust: comparative genome resources of cyanobacteria and plastids.
Sasaki, Naobumi V; Sato, Naoki
2010-01-01
Cyanobacteria, which perform oxygen-evolving photosynthesis as do the chloroplasts of plants and algae, are one of the best-studied prokaryotic phyla and one from which many representative genomes have been sequenced. The lack of a suitable comparative genomic database has been a problem in cyanobacterial genomics, because many proteins involved in physiological functions such as photosynthesis and nitrogen fixation are not catalogued in commonly used databases such as Clusters of Orthologous Groups (COG). CyanoClust is a database of homolog groups in cyanobacteria and plastids that are produced by the program Gclust. We have developed a web-server system for the protein homology database featuring cyanobacteria and plastids. Database URL: http://cyanoclust.c.u-tokyo.ac.jp/.
DOE Office of Scientific and Technical Information (OSTI.GOV)
MILLS, EVAN; MATHEW, PAUL; STOUFER, MARTIN
2016-10-06
EnergyIQ, the first "action-oriented" benchmarking tool for non-residential buildings, provides a standardized opportunity assessment based on benchmarking results, along with decision-support information to help refine action plans. EnergyIQ offers a wide array of benchmark metrics, with visual as well as tabular display. These include energy, costs, greenhouse-gas emissions, and a large array of characteristics (e.g. building components or operational strategies). The tool supports cross-sectional benchmarking for comparing the user's building to its peers at one point in time, as well as longitudinal benchmarking for tracking the performance of an individual building or enterprise portfolio over time. Based on user inputs, the tool generates a list of opportunities and recommended actions. Users can then explore the "Decision Support" module for helpful information on how to refine action plans, create design-intent documentation, and implement improvements. This includes information on best practices, links to other energy analysis tools, and more. A variety of databases are available within EnergyIQ from which users can specify peer groups for comparison. Using the tool, these data can be visually browsed and used as a backdrop against which to view a variety of energy benchmarking metrics for the user's own building. Users can save their project information and return at a later date to continue their exploration. The initial database is the California Commercial End-Use Survey (CEUS), which provides details on energy use and characteristics for about 2,800 buildings (and 62 building types). CEUS is likely the most thorough survey of its kind ever conducted. The tool is built as a web service. The EnergyIQ web application is written in JSP with pervasive use of JavaScript and CSS2. EnergyIQ also supports a SOAP-based web service to allow the flow of queries and data to occur with non-browser implementations. Data are stored in an Oracle 10g database.
References: Mills, Mathew, Brook and Piette. 2008. "Action Oriented Benchmarking: Concepts and Tools." Energy Engineering, Vol. 105, No. 4, pp 21-40. LBNL-358E; Mathew, Mills, Bourassa, Brook. 2008. "Action-Oriented Benchmarking: Using the CEUS Database to Benchmark Commercial Buildings in California." Energy Engineering, Vol. 105, No. 5, pp 6-18. LBNL-502E.
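Cross-sectional benchmarking of the kind EnergyIQ performs reduces, at its simplest, to placing a building within a peer distribution. A minimal illustrative sketch (not the tool's actual metrics or data):

```python
def percentile_rank(value, peer_values):
    """Cross-sectional benchmark: the share of peers whose metric falls
    below the given building's value (e.g. annual energy use intensity),
    expressed in percent. Lower is better for consumption metrics."""
    below = sum(1 for v in peer_values if v < value)
    return 100.0 * below / len(peer_values)
```

A building at the 40th percentile of its peer group's energy use intensity consumes less per square meter than 60% of comparable buildings, which is the kind of comparison a peer-group database such as CEUS enables.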
Del Fiol, Guilherme; Butler, Jorie; Livnat, Yarden; Mayer, Jeanmarie; Samore, Matthew; Jones, Makoto; Weir, Charlene
2016-01-01
Objective: Big data or population-based information has the potential to reduce uncertainty in medicine by informing clinicians about individual patient care. The objectives of this study were: 1) to explore the feasibility of extracting and displaying population-based information from an actual clinical population's database records, 2) to explore specific design features for improving population displays, 3) to explore perceptions of population information displays, and 4) to explore the impact of population information displays on cognitive outcomes. Methods: We used the Veterans Affairs (VA) database to identify similar complex patients based on a similar complex patient case. Study outcome measures were: 1) preferences for population information display, 2) time looking at the population display, 3) time to read the chart, and 4) appropriateness of plans with pre- and post-presentation of population data. Finally, we redesigned the population information display based on our findings from this study. Results: The qualitative data analysis of preferences for population information display resulted in four themes: 1) trusting the big/population data can be an issue, 2) embedded analytics is necessary to explore patient similarities, 3) tools are needed to control the view (overview, zoom and filter), and 4) different presentations of the population display can be beneficial. We found that appropriateness of plans was at 60% for both groups (t9=-1.9; p=0.08), and overall time looking at the population information display was 2.3 minutes versus 3.6 minutes, with experts processing information faster than non-experts (t8=-2.3, p=0.04). Conclusion: A population database has great potential for reducing complexity and uncertainty in medicine to improve clinical care. The preferences identified for the population information display will guide future health information technology system designers toward better and more intuitive displays.
PMID:27437065
Project Ares: A Systems Engineering and Operations Architecture for the Exploration of Mars
1992-03-20
increased use of automation, experiential databases, expert systems, and 'fail-soft' configurations and designs (33:252-253). Automatic communication relay and... communications satellites' lifetimes, we assume that uplink data rates on the order of 10 Kbps should suffice for command and database uploads. Current... squashed, 20-sided polyhedron configuration which should be relatively easy to obtain. Thus, two extremes for configuration exist. At one end is the site
DOE Office of Scientific and Technical Information (OSTI.GOV)
Templin-Branner, W.
2010-10-20
The National Library of Medicine's Environmental Health and Toxicology Portal provides access to numerous databases that can help you explore environmental chemicals and risks. TOXNET and Beyond: Using NLM's Environmental Health and Toxicology Portal conveys the fundamentals of searching the NLM's TOXNET system of databases in chemistry, toxicology, environmental health, and related fields. In addition to TOXNET, the course will highlight various resources available through the Environmental Health and Toxicology Portal.
Duclay, E; Hardouin, J B; Sébille, V; Anthoine, E; Moret, L
2015-10-01
To explore the influence of staff absenteeism on patient satisfaction using the indicators available in management reports. Among factors explaining patient satisfaction, human resource indicators have been studied widely in terms of burnout or job satisfaction, but there have not been many studies related to absenteeism indicators. A multilevel analysis was conducted using two routinely compiled databases from 2010 in the clinical departments of a university hospital (France). The staff database monitored absenteeism for short-term medical reasons (5 days or less), non-medical reasons and absences starting at the weekend. The patient satisfaction database was established at the time of discharge. Patient satisfaction related to relationships with staff was significantly and negatively correlated with nurse absenteeism for non-medical reasons (P < 0.05) and with nurse absenteeism starting at weekends (P < 0.05). Patient satisfaction related to the hospital environment was significantly and negatively correlated with nurse assistant absenteeism for short-term medical reasons (P < 0.05). Our findings seem to indicate that patient satisfaction is linked to staff absenteeism and should lead to a better understanding of the impact of human resources on patient satisfaction. To enhance patient satisfaction, managers need to find a way to reduce staff absenteeism, in order to avoid burnout and to improve the atmosphere in the workplace. © 2014 John Wiley & Sons Ltd.
Exoplanet Orbit Database | Exoplanet Data Explorer
Qualitative and Quantitative Pedigree Analysis: Graph Theory, Computer Software, and Case Studies.
ERIC Educational Resources Information Center
Jungck, John R.; Soderberg, Patti
1995-01-01
Presents a series of elementary mathematical tools for re-representing pedigrees, pedigree generators, pedigree-driven database management systems, and case studies for exploring genetic relationships. (MKR)
Aerodynamic Analysis of Simulated Heat Shield Recession for the Orion Command Module
NASA Technical Reports Server (NTRS)
Bibb, Karen L.; Alter, Stephen J.; Mcdaniel, Ryan D.
2008-01-01
The aerodynamic effects of the recession of the ablative thermal protection system for the Orion Command Module of the Crew Exploration Vehicle are important for vehicle guidance. At present, the aerodynamic effects of recession are handled within the Orion aerodynamic database indirectly, with an additional safety factor placed on the uncertainty bounds. This study is an initial attempt to quantify the effects for a particular set of recessed geometry shapes, in order to provide more rigorous analysis for managing recession effects within the aerodynamic database. The aerodynamic forces and moments for the baseline and recessed geometries were computed at several trajectory points using multiple CFD codes, both viscous and inviscid. The resulting aerodynamics for the baseline and recessed geometries were compared. The forces (lift, drag) show negligible differences between baseline and recessed geometries. Generally, the moments show a difference between baseline and recessed geometries that correlates with the maximum amount of recession of the geometry. The difference between the pitching moments for the baseline and recessed geometries increases as Mach number decreases (and the recession is greater), reaching a value of -0.0026 at the lowest Mach number. The change in trim angle of attack increases from approx. 0.5 deg at M = 28.7 to approx. 1.3 deg at M = 6, and is consistent with a previous analysis using a lower-fidelity engineering tool. This correlation of the present results with the engineering-tool results supports the continued use of the engineering tool for future work. The present analysis suggests there does not need to be an uncertainty due to recession in the Orion aerodynamic database for the force quantities. The magnitude of the change in pitching moment due to recession is large enough to warrant inclusion in the aerodynamic database.
An increment in the uncertainty for pitching moment could be calculated from these results and included in the development of the aerodynamic database uncertainty for pitching moment.
NASA Astrophysics Data System (ADS)
Paiva, L. M. S.; Bodstein, G. C. R.; Pimentel, L. C. G.
2013-12-01
Large-eddy simulations are performed using the Advanced Regional Prediction System (ARPS) code at horizontal grid resolutions as fine as 300 m to assess the influence of detailed and updated surface databases on the modeling of local atmospheric circulation systems of urban areas with complex terrain. Applications to air pollution and wind energy are sought. These databases are comprised of 3 arc-sec topographic data from the Shuttle Radar Topography Mission, 10 arc-sec vegetation type data from the European Space Agency (ESA) GlobCover Project, and 30 arc-sec Leaf Area Index and Fraction of Absorbed Photosynthetically Active Radiation data from the ESA GlobCarbon Project. Simulations are carried out for the Metropolitan Area of Rio de Janeiro using six one-way nested-grid domains that allow the choice of distinct parametric models and vertical resolutions associated with each grid. ARPS is initialized using the Global Forecast System with 0.5°-resolution data from the National Center for Environmental Prediction, which is also used every 3 h as lateral boundary condition. Topographic shading is turned on and two soil layers with depths of 0.01 and 1.0 m are used to compute the soil temperature and moisture budgets in all runs. Results for two simulated runs covering the period from 6 to 7 September 2007 are compared to surface and upper-air observational data to explore the dependence of the simulations on initial and boundary conditions, topographic and land-use databases and grid resolution. Our comparisons show overall good agreement between simulated and observed data and also indicate that the low resolution of the 30 arc-sec soil database from the United States Geological Survey, the soil moisture and skin temperature initial conditions assimilated from the GFS analyses and the synoptic forcing on the lateral boundaries of the finer grids may affect an adequate spatial description of the meteorological variables.
Content based information retrieval in forensic image databases.
Geradts, Zeno; Bijhold, Jurrien
2002-03-01
This paper gives an overview of the various available image databases and of ways of searching these databases by image content. Research developments in searching image databases are evaluated and compared with existing forensic databases. Forensic image databases of fingerprints, faces, shoeprints, handwriting, cartridge cases, drug tablets, and tool marks are described. Developments in these fields appear to be valuable for forensic databases, especially the MPEG-7 framework, which standardizes searching in image databases. In the future, combining these databases (including DNA databases) may result in stronger forensic evidence.
75 FR 3908 - Agency Information Collection Activities: Proposed Collection; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2010-01-25
... Comparative Database.'' In accordance with the Paperwork Reduction Act, 44 U.S.C. 3501-3520, AHRQ invites the... Assessment of Healthcare Providers and Systems (CAHPS) Health Plan Survey Comparative Database. [[Page 3909..., and the Centers for Medicare & Medicaid Services (CMS) to provide comparative data to support public...
75 FR 16134 - Agency Information Collection Activities: Proposed Collection; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2010-03-31
... Survey Comparative Database.'' In accordance with the Paperwork Reduction Act, 44 U.S.C. 3501-3520, AHRQ... Comparative Database The Agency for Healthcare Research and Quality (AHRQ) requests that the Office of..., purchasers, and the Centers for Medicare & Medicaid Services (CMS) to provide comparative data to support...
78 FR 69088 - Agency Information Collection Activities: Proposed Collection; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2013-11-18
... Comparative Database.'' In accordance with the Paperwork Reduction Act, 44 U.S.C. 3501-3521, AHRQ invites the... Comparative Database Request for information collection approval. The Agency for Healthcare Research and..., purchasers, and the Centers for Medicare & Medicaid Services (CMS) to provide comparative data to support...
78 FR 49518 - Agency Information Collection Activities: Proposed Collection; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2013-08-14
... Comparative Database.'' In accordance with the Paperwork Reduction Act, 44 U.S.C. 3501-3521, AHRQ invites the... Assessment of Healthcare Providers and Systems (CAHPS) Health Plan Survey Comparative Database Request for... Medicare & Medicaid Services (CMS) to provide comparative data to support public reporting of health plan...
Impact of training sets on classification of high-throughput bacterial 16S rRNA gene surveys
Werner, Jeffrey J; Koren, Omry; Hugenholtz, Philip; DeSantis, Todd Z; Walters, William A; Caporaso, J Gregory; Angenent, Largus T; Knight, Rob; Ley, Ruth E
2012-01-01
Taxonomic classification of the thousands to millions of 16S rRNA gene sequences generated in microbiome studies is often achieved using a naïve Bayesian classifier (for example, the Ribosomal Database Project II (RDP) classifier), due to favorable trade-offs among automation, speed and accuracy. The resulting classification depends on the reference sequences and taxonomic hierarchy used to train the model; although the influence of primer sets and classification algorithms has been explored in detail, the influence of the training set has not been characterized. We compared classification results obtained using three different publicly available databases as training sets, applied to five different bacterial 16S rRNA gene pyrosequencing data sets (generated from human body, mouse gut, python gut, soil and anaerobic digester samples). We observed numerous advantages to using the largest, most diverse training set available, which we constructed from the Greengenes (GG) bacterial/archaeal 16S rRNA gene sequence database and the latest GG taxonomy. Phylogenetic clusters of previously unclassified experimental sequences were identified with notable improvements (for example, a 50% reduction in reads unclassified at the phylum level in mouse gut, soil and anaerobic digester samples), especially for phylotypes belonging to specific phyla (Tenericutes, Chloroflexi, Synergistetes and Candidate phyla TM6, TM7). Trimming the reference sequences to the primer region resulted in systematic improvements in classification depth, with the greatest gains at higher confidence thresholds. Phylotypes unclassified at the genus level represented a greater proportion of the total community variation than classified operational taxonomic units in mouse gut and anaerobic digester samples, underscoring the need for greater diversity in existing reference databases. PMID:21716311
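The naïve Bayesian classification the abstract above refers to can be sketched in miniature. The following Python sketch is illustrative only, not the RDP classifier itself: it uses 4-mer words with add-one smoothing where the RDP classifier uses 8-mers and bootstrap confidence estimates, and all taxon names and sequences are hypothetical.

```python
from collections import Counter
import math

def kmers(seq, k=4):
    """Overlapping k-mer 'words' of a DNA sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def train(refs, k=4):
    """refs: {taxon_name: [reference sequences]} -> per-taxon k-mer counts."""
    models = {}
    for taxon, seqs in refs.items():
        counts = Counter()
        for s in seqs:
            counts.update(kmers(s, k))
        models[taxon] = (counts, sum(counts.values()))
    return models

def classify(query, models, k=4, vocab=4 ** 4):
    """Assign the query to the taxon with the highest smoothed log-likelihood."""
    best, best_lp = None, float("-inf")
    for taxon, (counts, total) in models.items():
        lp = sum(math.log((counts[w] + 1) / (total + vocab))
                 for w in kmers(query, k))
        if lp > best_lp:
            best, best_lp = taxon, lp
    return best
```

Because classification always picks the highest-likelihood taxon among those present in the training set, a query from a lineage absent from the references is forced onto its nearest represented relative, which is one reason training-set composition matters so much in the study above.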
Meta-analysis on the effectiveness of team-based learning on medical education in China.
Chen, Minjian; Ni, Chunhui; Hu, Yanhui; Wang, Meilin; Liu, Lu; Ji, Xiaoming; Chu, Haiyan; Wu, Wei; Lu, Chuncheng; Wang, Shouyu; Wang, Shoulin; Zhao, Liping; Li, Zhong; Zhu, Huijuan; Wang, Jianming; Xia, Yankai; Wang, Xinru
2018-04-10
Team-based learning (TBL) has been adopted as a new medical pedagogical approach in China. However, no studies or reviews have summarized the effectiveness of TBL on medical education. This study aims to obtain an overall estimate of the effectiveness of TBL on outcomes of theoretical teaching in medical education in China. We retrieved studies from database inception through December 2015. The Chinese National Knowledge Infrastructure, Chinese Biomedical Literature Database, Chinese Wanfang Database, Chinese Scientific Journal Database, PubMed, EMBASE and the Cochrane Database were searched. The quality of included studies was assessed with the Newcastle-Ottawa scale. The standardized mean difference (SMD) was applied for the estimation of the pooled effects. Heterogeneity was detected with the I² statistic and further explored by meta-regression analysis. A total of 13 articles including 1545 participants were included in the meta-analysis. The quality scores of these studies ranged from 6 to 10. Overall, TBL significantly increased students' theoretical examination scores compared with lecture-based learning (LBL) (SMD = 2.46, 95% CI: 1.53-3.40). Additionally, TBL significantly improved students' learning attitude (SMD = 3.23, 95% CI: 2.27-4.20) and learning skill (SMD = 2.70, 95% CI: 1.33-4.07). The meta-regression results showed that randomization, education classification and gender diversity were factors contributing to heterogeneity. TBL in theoretical teaching of medical education appears more effective than LBL in improving the knowledge, attitude and skill of students in China, providing evidence for the implementation of TBL in medical education in China. Medical schools should implement TBL with consideration of practical teaching situations, such as students' education level.
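For readers unfamiliar with the pooling step behind the SMD figures above, a standardized mean difference and its fixed-effect inverse-variance pooling can be sketched as follows. This is a minimal illustrative sketch, not the meta-analysis above (which may have used a random-effects model), and all input numbers are hypothetical.

```python
import math

def smd(m1, s1, n1, m2, s2, n2):
    """Standardized mean difference (Cohen's d with pooled SD) and its variance."""
    sp = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp
    var = (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))
    return d, var

def pool_fixed(effects):
    """Inverse-variance (fixed-effect) pooled estimate with a 95% CI.

    effects: list of (effect, variance) pairs, one per study.
    """
    weights = [1.0 / v for _, v in effects]
    est = sum(w * d for (d, _), w in zip(effects, weights)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return est, (est - 1.96 * se, est + 1.96 * se)
```

A random-effects model (e.g. DerSimonian-Laird) would add a between-study variance component to each study's variance before weighting, widening the interval when heterogeneity is high, as the I² results above suggest it is here.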
Senatore, Adriano; Edirisinghe, Neranjan; Katz, Paul S.
2015-01-01
Background The sea slug Tritonia diomedea (Mollusca, Gastropoda, Nudibranchia) has a simple and highly accessible nervous system, making it useful for studying neuronal and synaptic mechanisms underlying behavior. Although many important contributions have been made using Tritonia, until now, a lack of genetic information has impeded exploration at the molecular level. Results We performed Illumina sequencing of central nervous system mRNAs from Tritonia, generating 133.1 million 100-base-pair, paired-end reads. De novo reconstruction of the RNA-Seq data yielded a total of 185,546 contigs, which partitioned into 123,154 non-redundant gene clusters (unigenes). BLAST comparison with RefSeq and Swiss-Prot protein databases, as well as mRNA data from other invertebrates (gastropod molluscs: Aplysia californica, Lymnaea stagnalis and Biomphalaria glabrata; cnidarian: Nematostella vectensis) revealed that up to 76,292 unigenes in the Tritonia transcriptome have putative homologues in other databases, 18,246 of which fall below a more stringent E-value cut-off of 1 × 10⁻⁶. In silico prediction of secreted proteins from the Tritonia transcriptome shotgun assembly (TSA) produced a database of 579 unique sequences of secreted proteins, which also exhibited markedly higher expression levels compared to other genes in the TSA. Conclusions Our efforts greatly expand the availability of gene sequences for Tritonia diomedea. We were able to extract full-length protein sequences for most queried genes, including those involved in electrical excitability, synaptic vesicle release and neurotransmission, thus confirming that the transcriptome will serve as a useful tool for probing the molecular correlates of behavior in this species. We also generated a neurosecretome database that will serve as a useful tool for probing peptidergic signalling systems in the Tritonia brain. PMID:25719197
Sammon, Cormac J; Petersen, Irene
2016-04-01
Studies using primary care databases often censor follow-up at the date data are last collected from clinical computer systems (last collection date (LCD)). We explored whether this results in the selective exclusion of events entered in the electronic health records after their date of occurrence, that is, backdated events. We used data from The Health Improvement Network (THIN). Using two versions of the database, we identified events that were entered into a later (THIN14) but not an earlier version of the database (THIN13) and investigated how the number of entries changed as a function of time since LCD. Times between events and the dates they were recorded were plotted as a function of time since the LCD in an effort to determine appropriate points at which to censor follow-up. There were 356 million eligible events in THIN14 and 355 million eligible events in THIN13. When comparing the two data sets, the proportion of missing events in THIN13 was highest in the month prior to the LCD (9.6%), decreasing to 5.2% at 6 months and 3.4% at 12 months. The proportion of missing events was largest for events typically diagnosed in secondary care such as neoplasms (28% in the month prior to LCD) and negligible for events typically diagnosed in primary care such as respiratory events (2% in the month prior to LCD). Studies using primary care databases, particularly those investigating events typically diagnosed outside primary care, should censor follow-up prior to the LCD to avoid underestimation of event rates. Copyright © 2016 John Wiley & Sons, Ltd.
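The backdating check described above (counting, in the later database version, events that a snapshot taken at the last collection date would have missed) can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions: grouping by whole calendar months is an assumption for illustration, and the function name and example dates are hypothetical, not THIN data.

```python
from datetime import date

def missing_fraction_by_month(events, lcd):
    """events: (occurred, entered) date pairs from a later database version.

    For each whole month before the last collection date (lcd), return the
    fraction of events that were entered only after the lcd, i.e. events a
    snapshot taken at the lcd would have missed.
    """
    buckets = {}
    for occurred, entered in events:
        months_before = (lcd.year - occurred.year) * 12 + (lcd.month - occurred.month)
        total, missing = buckets.get(months_before, (0, 0))
        buckets[months_before] = (total + 1, missing + (entered > lcd))
    return {m: miss / tot for m, (tot, miss) in sorted(buckets.items())}
```

Plotting these fractions against months before the LCD is one way to pick a censoring point: follow-up would be truncated where the missing fraction becomes negligible.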
PDXliver: a database of liver cancer patient derived xenograft mouse models.
He, Sheng; Hu, Bo; Li, Chao; Lin, Ping; Tang, Wei-Guo; Sun, Yun-Fan; Feng, Fang-You-Min; Guo, Wei; Li, Jia; Xu, Yang; Yao, Qian-Lan; Zhang, Xin; Qiu, Shuang-Jian; Zhou, Jian; Fan, Jia; Li, Yi-Xue; Li, Hong; Yang, Xin-Rong
2018-05-09
Liver cancer is the second leading cause of cancer-related deaths and is characterized by heterogeneity and drug resistance. Patient-derived xenograft (PDX) models have been widely used in cancer research because they reproduce the characteristics of original tumors. However, current studies of liver cancer PDX mice are scattered and the number of available PDX models is too small to represent the heterogeneity of liver cancer patients. To improve this situation and to complement available PDX model resources, we constructed a comprehensive database, PDXliver, to integrate and analyze liver cancer PDX models. Currently, PDXliver contains 116 PDX models from Chinese liver cancer patients; 51 of them were established by the in-house PDX platform and the rest were curated from the published literature. These models are annotated with complete information, including clinical characteristics of patients, genome-wide expression profiles, germline variations, somatic mutations and copy number alterations. Analysis of expression subtypes and mutated genes shows that PDXliver represents the diversity of human patients. Another feature of PDXliver is the storage of drug response data for PDX mice, which makes it possible to explore the association between molecular profiles and drug sensitivity. All data can be accessed via the Browse and Search pages. Additionally, two tools are provided to interactively visualize the omics data of selected PDXs or to compare two groups of PDXs. To our knowledge, PDXliver is the first public database of liver cancer PDX models. We hope that this comprehensive resource will accelerate the use of PDX models and facilitate liver cancer research. The PDXliver database is freely available online at: http://www.picb.ac.cn/PDXliver/.
Padliya, Neerav D; Garrett, Wesley M; Campbell, Kimberly B; Tabb, David L; Cooper, Bret
2007-11-01
LC-MS/MS has demonstrated potential for detecting plant pathogens. Unlike PCR or ELISA, LC-MS/MS does not require pathogen-specific reagents for the detection of pathogen-specific proteins and peptides. However, the MS/MS approach we and others have explored does require a protein sequence reference database and database-search software to interpret tandem mass spectra. To evaluate the limitations of database composition on pathogen identification, we analyzed proteins from cultured Ustilago maydis, Phytophthora sojae, Fusarium graminearum, and Rhizoctonia solani by LC-MS/MS. When the search database did not contain sequences for a target pathogen, or contained sequences to related pathogens, target pathogen spectra were reliably matched to protein sequences from nontarget organisms, giving an illusion that proteins from nontarget organisms were identified. Our analysis demonstrates that when database-search software is used as part of the identification process, a paradox exists whereby additional sequences needed to detect a wide variety of possible organisms may lead to more cross-species protein matches and misidentification of pathogens.
NASA Astrophysics Data System (ADS)
Christian, C. A.; Olson, E. C.
1993-01-01
The proposal database and scheduling system for the Extreme Ultraviolet Explorer is described. The proposal database has been implemented to take input for approved observations selected by the EUVE Peer Review Panel and output target information suitable for the scheduling system to digest. The scheduling system is a hybrid of the SPIKE program and EUVE software which checks spacecraft constraints, produces a proposed schedule and selects spacecraft orientations with optimal configurations for acquiring star trackers, etc. This system is used to schedule the In Orbit Calibration activities that took place this summer, following the EUVE launch in early June 1992. The strategy we have implemented has implications for the selection of approved targets, which have impacted the Peer Review process. In addition, we will discuss how the proposal database, founded on Sybase, controls the processing of EUVE Guest Observer data.
Carbonatites of the World, Explored Deposits of Nb and REE - Database and Grade and Tonnage Models
Berger, Vladimir I.; Singer, Donald A.; Orris, Greta J.
2009-01-01
This report is based on published tonnage and grade data for 58 Nb- and rare-earth-element (REE)-bearing carbonatite deposits that are mostly well explored and are partially mined or contain resources of these elements. The deposits represent only a part of the 527 known carbonatites around the world, but they are characterized by reliable quantitative data on ore tonnages and grades of niobium and REE. Grade and tonnage models are an important component of mineral resource assessments. Carbonatites are one of the main natural sources of niobium and rare-earth elements, whose economic importance continues to grow. One purpose of this report is to update earlier publications. New information about known deposits, as well as data on new deposits published during the last decade, is incorporated in the present paper. The compiled database (appendix 1) contains 60 explored Nb- and REE-bearing carbonatite deposits; resources of 55 of these deposits are taken from publications. In the present updated grade-tonnage model we have added 24 deposits compared with the previous model of Singer (1998). Resources of most deposits are residuum ores in the upper part of carbonatite bodies. Mineral-deposit models are important in exploration planning and quantitative resource assessments for two reasons: (1) grades and tonnages among deposit types vary significantly, and (2) deposits of different types are present in distinct geologic settings that can be identified from geologic maps. Mineral-deposit models combine the diverse geoscience information on geology, mineral occurrences, geophysics, and geochemistry used in resource assessments and mineral exploration. Globally based deposit models allow recognition of important features and demonstrate how common different features are.
Well-designed deposit models allow geologists to deduce possible mineral-deposit types in a given geologic environment, and the grade and tonnage models allow economists to estimate the possible economic viability of these resources. Thus, mineral-deposit models play a central role in presenting geoscience information in a useful form to policy makers. The foundation of mineral-deposit models is information about known deposits. This publication presents the latest geologic information and newly developed grade and tonnage models for Nb- and REE-carbonatite deposits in digital form. The publication contains computer files with information on deposits from around the world. It also contains a text file allowing locations of all deposits to be plotted in geographic information system (GIS) programs. The data are presented in FileMaker Pro as well as in .xls and text files to make the information available to a broadly based audience. The value of this information and any derived analyses depends critically on the consistent manner of data gathering. For this reason, we first discuss the rules used in this compilation. Next, the fields of the database are explained. Finally, we provide new grade and tonnage models and analysis of the information in the file.
Mirza, Shaher Bano; Bokhari, Habib; Fatmi, Muhammad Qaiser
2015-01-01
Pakistan possesses a rich and vast source of natural products (NPs). Some of these secondary metabolites have been identified as potent therapeutic agents. However, the medicinal usage of most of these compounds has not yet been fully explored. Discovery of new NP scaffolds as inhibitors of certain enzymes or receptors using advanced computational drug-discovery approaches is also limited by the unavailability of accurate 3D structures of NPs. An organized database incorporating all relevant information can therefore facilitate exploration of the medicinal importance of metabolites from Pakistani biodiversity. The Chemical Database of Pakistan (ChemDP; release 01) is a fully referenced, evolving, web-based, virtual database designed and developed to introduce natural products (NPs) and their derivatives from the biodiversity of Pakistan to the global scientific community. The prime aim is to provide quality structures of compounds, with relevant information, for computer-aided drug-discovery studies. For this purpose, over 1000 NPs have been identified from more than 400 published articles, for which 2D and 3D molecular structures have been generated with a special focus on their stereochemistry, where applicable. The PM7 semiempirical quantum chemistry method has been used to energy-optimize the 3D structures of the NPs. The 2D and 3D structures can be downloaded as .sdf, .mol, .sybyl, .mol2, and .pdb files, formats readable by many chemoinformatics/bioinformatics software packages. Each entry in ChemDP contains over 100 data fields representing various molecular, biological, physico-chemical and pharmacological properties, which have been properly documented in the database for end users. These pieces of information have been either manually extracted from the literature or computationally calculated using various computational tools. Cross-referencing to a major data repository, ChemSpider, has been made available for overlapping compounds. An Android application of ChemDP is available at its website. ChemDP is freely accessible at www.chemdp.com.
Mihalasky, Mark J.; Ludington, Stephen; Alexeiev, Dmitriy V.; Frost, Thomas P.; Light, Thomas D.; Briggs, Deborah A.; Hammarstrom, Jane M.; Wallis, John C.; Bookstrom, Arthur A.; Panteleyev, Andre
2015-01-01
The database of known deposits, significant prospects, and prospects includes an inventory of mineral resources in two known porphyry copper deposits, as well as key characteristics derived from available exploration reports for 70 significant porphyry copper prospects and 86 other prospects. Resource and exploration and development activity are updated with information current through February 2013.
Exploring Global Exposure Factors Resources URLs
The dataset is a compilation of hyperlinks (URLs) for resources (databases, compendia, published articles, etc.) useful for exposure assessment specific to consumer product use. This dataset is associated with the following publication: Zaleski, R., P. Egeghy, and P. Hakkinen. Exploring Global Exposure Factors Resources for Use in Consumer Exposure Assessments. International Journal of Environmental Research and Public Health. Molecular Diversity Preservation International, Basel, SWITZERLAND, 13(7): 744, (2016).
Innovative railroad information displays : executive summary
DOT National Transportation Integrated Search
1998-01-01
The objectives of this study were to explore the potential of advanced digital technology, novel concepts of information management, geographic information databases and display capabilities in order to enhance the planning and decision-making process...
Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Lozano-Rubí, Raimundo; Serrano-Balazote, Pablo; Castro, Antonio L; Moreno, Oscar; Pascual, Mario
2017-08-18
The objective of this research is to compare relational and non-relational (NoSQL) database systems for storing, recovering, querying and persisting standardized medical information in the form of ISO/EN 13606 normalized Electronic Health Record XML extracts, both in isolation and concurrently. NoSQL database systems have recently attracted much attention, but few studies in the literature address their direct comparison with relational databases when applied to build the persistence layer of a standardized medical information system. One relational and two NoSQL databases (one document-based and one native XML database) of three different sizes were created in order to evaluate and compare the response times (algorithmic complexity) of six queries of increasing complexity, which were performed on them. Similar appropriate results available in the literature have also been considered. Relational and non-relational NoSQL database systems show approximately linear query execution times, but with very different slopes, the relational slope being much steeper than the other two. Document-based NoSQL databases perform better in concurrency than in isolation, and also better than relational databases in concurrency. Non-relational NoSQL databases seem to be more appropriate than standard relational SQL databases when the database size is extremely high (secondary use, research applications). Document-based NoSQL databases perform in general better than native XML NoSQL databases. Visualization and editing of EHR extracts are also document-based tasks better suited to NoSQL database systems. However, the appropriate database solution depends largely on each particular situation and specific problem.
Clauson, Kevin A; Polen, Hyla H; Marsh, Wallace A
2007-12-01
To evaluate personal digital assistant (PDA) drug information databases used to support clinical decision-making, and to compare the performance of PDA databases with their online versions. Prospective evaluation with descriptive analysis. Five drug information databases available for PDAs and online were evaluated according to their scope (inclusion of correct answers), completeness (on a 3-point scale), and ease of use; 158 question-answer pairs across 15 weighted categories of drug information essential to health care professionals were used to evaluate these databases. An overall composite score integrating these three measures was then calculated. Scores for the PDA databases and for each PDA-online pair were compared. Among the PDA databases, composite rankings, from highest to lowest, were as follows: Lexi-Drugs, Clinical Pharmacology OnHand, Epocrates Rx Pro, mobileMicromedex (now called Thomson Clinical Xpert), and Epocrates Rx free version. When we compared database pairs, online databases that had greater scope than their PDA counterparts were Clinical Pharmacology (137 vs 100 answers, p<0.001), Micromedex (132 vs 96 answers, p<0.001), Lexi-Comp Online (131 vs 119 answers, p<0.001), and Epocrates Online Premium (103 vs 98 answers, p=0.001). Only Micromedex online was more complete than its PDA version (p=0.008). Regarding ease of use, the Lexi-Drugs PDA database was superior to Lexi-Comp Online (p<0.001); however, Epocrates Online Premium, Epocrates Online Free, and Micromedex online were easier to use than their PDA counterparts (p<0.001). In terms of composite scores, only the online versions of Clinical Pharmacology and Micromedex demonstrated superiority over their PDA versions (p>0.01). Online and PDA drug information databases assist practitioners in improving their clinical decision-making. Lexi-Drugs performed significantly better than all of the other PDA databases evaluated. 
No PDA database demonstrated superiority to its online counterpart; however, the online versions of Clinical Pharmacology and Micromedex were superior to their PDA versions in answering questions.
Chen, Jie; Fu, Ziyi; Ji, Chenbo; Gu, Pingqing; Xu, Pengfei; Yu, Ningzhu; Kan, Yansheng; Wu, Xiaowei; Shen, Rong; Shen, Yan
2015-05-01
Uterine cervix carcinoma is one of the best-known malignancies of the reproductive system and threatens women's health globally. However, the mechanisms of the oncogenesis and development of cervix carcinoma are not yet fully understood. Long non-coding RNAs (lncRNAs) have been shown to play key roles in various biological processes, especially the development of cancer. The function and mechanism of lncRNAs in cervix carcinoma are still rarely reported. We selected three cervix cancer tissues and three normal cervix tissues, then performed lncRNA microarray analysis to detect differentially expressed lncRNAs. Subsequently, we explored the potential function of these dysregulated lncRNAs through online bioinformatics databases. Finally, quantitative real-time PCR was carried out to confirm the expression levels of these dysregulated lncRNAs in cervix cancer and normal tissues. We uncovered the profiles of differentially expressed lncRNAs between normal and cervix carcinoma tissues using microarray techniques, and found 1622 upregulated and 3026 downregulated lncRNAs (fold-change > 2.0) in cervix carcinoma compared to normal cervical tissue. Furthermore, we found that HOXA11-AS might participate in cervix carcinogenesis by regulating HOXA11, which is involved in regulating biological processes of cervix cancer. This study provides expression profiles of lncRNAs in cervix carcinoma tissue and normal cervical tissue, which could serve as a database for further research on the function and mechanism of key lncRNAs in cervix carcinoma, and might help identify potential diagnostic factors and therapeutic targets for cervix carcinoma. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
Dipeptidyl Peptidase-4 Inhibitor-Associated Pancreatic Carcinoma: A Review of the FAERS Database.
Nagel, Angela K; Ahmed-Sarwar, Nabila; Werner, Paul M; Cipriano, Gabriela C; Van Manen, Robbert P; Brown, Jack E
2016-01-01
To date, there is limited literature regarding the association between dipeptidyl peptidase-4 (DPP-4) inhibitors and pancreatic carcinoma. To describe the comparative incidence of DPP-4 inhibitors and pancreatic carcinoma as reported in the Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS) database. The goal was to provide health care practitioners a general understanding of the drug-disease occurrence. This is a case/noncase study utilizing Empirica Signal software to query FAERS from November 1968 to December 31, 2013. The software was used to calculate a disproportionality statistic, the empirical Bayesian geometric mean (EBGM), for reports of DPP-4 inhibitor-associated pancreatic carcinoma. The FDA considers an EBGM significant if the fifth percentile of the distribution is at least 2, defined as an EB05 ≥ 2. Using a disproportionality analysis, DPP-4 inhibitors were compared with all agents listed in FAERS. A total of 156 patients experienced pancreatic carcinoma while receiving DPP-4 inhibitor therapy. An EB05 of 10.3 was determined for sitagliptin, 7.1 for saxagliptin, 4.9 for linagliptin, and 1.4 for alogliptin, compared with all other agents included in FAERS. Although an EB05 > 2 was achieved by 2 other antihyperglycemic agents, the findings were not consistent within their medication classes. There appears to be a statistical association between DPP-4 inhibitor use and pancreatic carcinoma. Causality cannot be inferred from the data provided. Additional clinical studies are needed to further explore this statistical association. © The Author(s) 2015.
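The EBGM used above requires fitting DuMouchel's empirical Bayes gamma-mixture model, which shrinks observed-to-expected reporting ratios toward 1. As a simpler illustration of the same disproportionality idea, the proportional reporting ratio (PRR) from a 2×2 table of report counts can be computed as follows; this is a stand-in sketch, not the Empirica Signal method, and the counts in the example are hypothetical, not FAERS data.

```python
def prr(a, b, c, d):
    """Proportional reporting ratio from a 2x2 table of report counts.

    a: reports mentioning both the drug (class) and the event
    b: reports mentioning the drug, with other events
    c: reports of the event with all other drugs
    d: all remaining reports
    """
    return (a / (a + b)) / (c / (c + d))
```

Unlike the raw PRR, the Bayesian shrinkage behind EBGM damps extreme ratios arising from small counts, which is why thresholds such as EB05 ≥ 2 are applied to the lower percentile of the posterior rather than to the point estimate.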
Weng, W; Liang, Y; Kimball, E S; Hobbs, T; Kong, S; Sakurada, B; Bouchard, J
2016-07-01
Objective To explore trends in demographics, comorbidities, anti-diabetic drug usage, and healthcare utilization costs in patients with newly-diagnosed type 2 diabetes mellitus (T2DM) using a large US claims database. Methods For the years 2007 and 2012, Truven Health Marketscan Research Databases were used to identify adults with newly-diagnosed T2DM and continuous 12-month enrollment with prescription benefits. Variables examined included patient demographics, comorbidities, inpatient utilization patterns, healthcare costs (inpatient and outpatient), drug costs, and diabetes drug claim patterns. Results Despite an increase in the overall database population between 2007-2012, the incidence of newly-diagnosed T2DM decreased from 1.1% (2007) to 0.65% (2012). Hyperlipidemia and hypertension were the most common comorbidities and increased in prevalence from 2007 to 2012. In 2007, 48.3% of newly-diagnosed T2DM patients had no claims for diabetes medications, compared with 36.2% of patients in 2012. The use of a single oral anti-diabetic drug (OAD) was the most common diabetes medication-related claim (46.2% of patients in 2007; 56.7% of patients in 2012). Among OAD monotherapy users, metformin was the most commonly used and increased from 2007 (74.7% of OAD monotherapy users) to 2012 (90.8%). Decreases were observed for sulfonylureas (14.1% to 6.2%) and thiazolidinediones (7.3% to 0.6%). Insulin, predominantly basal insulin, was used by 3.9% of patients in 2007 and 5.3% of patients in 2012. Mean total annual healthcare costs increased from $13,744 in 2007 to $15,175 in 2012, driven largely by outpatient services, although costs in all individual categories of healthcare services (inpatient and outpatient) increased. Conversely, total drug costs per patient were lower in 2012 compared with 2007. 
Conclusions Despite a drop in the rate of newly-diagnosed T2DM from 2007 to 2012 in the US, increased total medical costs and comorbidities per individual patient suggest that the clinical and economic trends for T2DM are not declining.
Multimedia explorer: image database, image proxy-server and search-engine.
Frankewitsch, T.; Prokosch, U.
1999-01-01
Multimedia plays a major role in medicine. Databases containing images, movies or other types of multimedia objects are increasing in number, especially on the WWW. However, no good retrieval mechanism or search engine currently exists to efficiently track down such multimedia sources in the vast amount of information provided by the WWW. Second, the tools for searching databases are usually not adapted to the properties of images. HTML pages do not allow complex searches. Establishing a more comfortable retrieval mechanism therefore requires a higher-level programming platform such as Java. With this platform-independent language it is possible to create extensions to commonly used web browsers. These applets offer a graphical user interface for high-level navigation. We implemented a database using Java objects as the primary storage containers, which are then stored in a Java-controlled Oracle8 database. Navigation depends on a structured vocabulary enhanced by a semantic network. With this approach, multimedia objects can be encapsulated within a logical module for quick data retrieval. PMID:10566463
De-identifying an EHR database - anonymity, correctness and readability of the medical record.
Pantazos, Kostas; Lauesen, Soren; Lippert, Soren
2011-01-01
Electronic health records (EHR) contain a large amount of structured data and free text. Exploring and sharing clinical data can improve healthcare and facilitate the development of medical software. However, revealing confidential information is against ethical principles and laws. We de-identified a Danish EHR database with 437,164 patients. The goal was to generate a version with real medical records, but related to artificial persons. We developed a de-identification algorithm that uses lists of named entities, simple language analysis, and special rules. Our algorithm consists of 3 steps: collect lists of identifiers from the database and external resources, define a replacement for each identifier, and replace identifiers in structured data and free text. Some patient records could not be safely de-identified, so the de-identified database has 323,122 patient records with an acceptable degree of anonymity, readability and correctness (F-measure of 95%). The algorithm has to be adjusted for each culture, language and database.
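The three-step replacement procedure described above can be sketched as follows. The identifier lists, names, and replacement scheme here are illustrative assumptions, not the authors' actual implementation, which also handled structured fields and language-specific rules.

```python
import re

# Step 1 (illustrative): an identifier-to-replacement map; in practice this is
# collected from the database itself and external resources such as name registries.
identifiers = {
    "Jens Hansen": "Karl Person",   # hypothetical patient name
    "Odense": "Newtown",            # hypothetical place name
}

def deidentify(text: str, identifiers: dict) -> str:
    """Steps 2-3: replace each known identifier wherever it occurs in free text."""
    for real, artificial in identifiers.items():
        # Word boundaries keep substrings inside other words untouched.
        text = re.sub(r"\b" + re.escape(real) + r"\b", artificial, text)
    return text

print(deidentify("Jens Hansen was admitted in Odense.", identifiers))
# -> Karl Person was admitted in Newtown.
```

A consistent map (the same replacement for every occurrence of an identifier) is what keeps the de-identified record readable as the history of a single artificial person.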
Applying manifold learning techniques to the CAESAR database
NASA Astrophysics Data System (ADS)
Mendoza-Schrock, Olga; Patrick, James; Arnold, Gregory; Ferrara, Matthew
2010-04-01
Understanding and organizing data is the first step toward exploiting sensor phenomenology for dismount tracking. What image features are good for distinguishing people, and what measurements, or combinations of measurements, can be used to classify the dataset by demographics including gender, age, and race? A particular technique, Diffusion Maps, has demonstrated the potential to extract features that intuitively make sense [1]. We want to develop an understanding of this tool by validating existing results on the Civilian American and European Surface Anthropometry Resource (CAESAR) database. This database, provided by the Air Force Research Laboratory (AFRL) Human Effectiveness Directorate and SAE International, is a rich dataset which includes 40 traditional anthropometric measurements of 4400 human subjects. If we can identify the defining features for classification from this database, the future question will be to determine a subset of these features that can be measured from imagery. This paper briefly describes the Diffusion Map technique, shows its potential for dimension reduction of the CAESAR database, and describes interesting problems to be further explored.
The Steward Observatory asteroid relational database
NASA Technical Reports Server (NTRS)
Sykes, Mark V.; Alvarezdelcastillo, Elizabeth M.
1992-01-01
The Steward Observatory Asteroid Relational Database (SOARD) was created as a flexible tool for undertaking studies of asteroid populations and sub-populations, to probe the biases intrinsic to asteroid databases, to ascertain the completeness of data pertaining to specific problems, to aid in the development of observational programs, and to develop pedagogical materials. To date, SOARD has compiled an extensive list of data available on asteroids and made it accessible through a single menu-driven database program. Users may obtain tailored lists of asteroid properties for any subset of asteroids, or output files suitable for plotting spectral data on individual asteroids. A browse capability allows the user to explore the contents of any data file. SOARD also offers an asteroid bibliography containing about 13,000 references. The program has online help as well as user and programmer documentation manuals. SOARD continues to provide data to fulfill requests by members of the astronomical community and will continue to grow as data are added to the database and new features are added to the program.
Exploring differences in inpatient drug purchasing cost between two pediatric hospitals.
Nydert, Per; Poole, Robert
2012-10-01
In this study, the hospital cost of purchasing drugs at two children's hospitals is explored with respect to high-cost drugs and drug classes and discussed with regard to differences in hospital setting, drug price, or number of treatments. The purchasing costs of drugs at the two hospitals were retrieved and analyzed. All information was connected to the Anatomic Therapeutic Chemical code and compared in a Microsoft Access database. The 6-month drug purchasing costs at Astrid Lindgren Children's Hospital (ALCH), Stockholm, Sweden, and Lucile Packard Children's Hospital at Stanford (LPCH), Palo Alto, California, are similar and result in a cost per patient day of US $149 and US $136, respectively. The hospital setting and choice of drug products are factors that influence the drug cost in product-specific ways. Several problems are highlighted when only drug costs are compared between hospitals. For example, the comparison does not take into account the amount of waste, risk of adverse drug events, local dosing strategies, disease prevalence, and national drug-pricing models. The difference in cost per inpatient day at ALCH may indicate that cost could be redistributed in Sweden to support pediatric pharmacy services. Also, when introducing new therapies seen at the comparison hospital, it may be possible to extrapolate the estimated increase in cost.
STCRDab: the structural T-cell receptor database
de Oliveira, Saulo H P; Krawczyk, Konrad
2018-01-01
Abstract The Structural T–cell Receptor Database (STCRDab; http://opig.stats.ox.ac.uk/webapps/stcrdab) is an online resource that automatically collects and curates TCR structural data from the Protein Data Bank. For each entry, the database provides annotations, such as the α/β or γ/δ chain pairings, major histocompatibility complex details, and where available, antigen binding affinities. In addition, the orientation between the variable domains and the canonical forms of the complementarity-determining region loops are also provided. Users can select, view, and download individual or bulk sets of structures based on these criteria. Where available, STCRDab also finds antibody structures that are similar to TCRs, helping users explore the relationship between TCRs and antibodies. PMID:29087479
DIGITAL CARTOGRAPHY OF THE PLANETS: NEW METHODS, ITS STATUS, AND ITS FUTURE.
Batson, R.M.
1987-01-01
A system has been developed that establishes a standardized cartographic database for each of the 19 planets and major satellites that have been explored to date. Compilation of the databases involves both traditional and newly developed digital image processing and mosaicking techniques, including radiometric and geometric corrections of the images. Each database, or digital image model (DIM), is a digital mosaic of spacecraft images that have been radiometrically and geometrically corrected and photometrically modeled. During compilation, ancillary data files such as radiometric calibrations and refined photometric values for all camera lens and filter combinations and refined camera-orientation matrices for all images used in the mapping are produced.
DIMA 3.0: Domain Interaction Map.
Luo, Qibin; Pagel, Philipp; Vilne, Baiba; Frishman, Dmitrij
2011-01-01
Domain Interaction MAp (DIMA, available at http://webclu.bio.wzw.tum.de/dima) is a database of predicted and known interactions between protein domains. It integrates 5807 structurally known interactions imported from the iPfam and 3did databases and 46,900 domain interactions predicted by four computational methods: domain phylogenetic profiling, the domain pair exclusion algorithm, correlated mutations, and domain interaction prediction in a discriminative way. Additionally, predictions are filtered to exclude those domain pairs that are reported as non-interacting by the Negatome database. The DIMA Web site allows users to calculate domain interaction networks either for a domain of interest or for entire organisms, and to explore them interactively using the Flash-based Cytoscape Web software.
Evaluating the Impact of Database Heterogeneity on Observational Study Results
Madigan, David; Ryan, Patrick B.; Schuemie, Martijn; Stang, Paul E.; Overhage, J. Marc; Hartzema, Abraham G.; Suchard, Marc A.; DuMouchel, William; Berlin, Jesse A.
2013-01-01
Clinical studies that use observational databases to evaluate the effects of medical products have become commonplace. Such studies begin by selecting a particular database, a decision that published papers invariably report but do not discuss. Studies of the same issue in different databases, however, can and do generate different results, sometimes with strikingly different clinical implications. In this paper, we systematically study heterogeneity among databases, holding other study methods constant, by exploring relative risk estimates for 53 drug-outcome pairs and 2 widely used study designs (cohort studies and self-controlled case series) across 10 observational databases. When holding the study design constant, our analysis shows that estimated relative risks range from a statistically significant decreased risk to a statistically significant increased risk in 11 of 53 (21%) of drug-outcome pairs that use a cohort design and 19 of 53 (36%) of drug-outcome pairs that use a self-controlled case series design. This exceeds the proportion of pairs that were consistent across databases in both direction and statistical significance, which was 9 of 53 (17%) for cohort studies and 5 of 53 (9%) for self-controlled case series. Our findings show that clinical studies that use observational databases can be sensitive to the choice of database. More attention is needed to consider how the choice of data source may be affecting results. PMID:23648805
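The cross-database tallies reported above (contradictory vs. consistent drug-outcome pairs) can be reproduced mechanically. This sketch classifies one drug-outcome pair from hypothetical per-database relative-risk estimates; the labels and the requirement that "consistent" means all databases significant in the same direction are assumptions for illustration, not the authors' exact criteria.

```python
def classify_pair(estimates):
    """Classify one drug-outcome pair across databases.

    estimates: list of (relative_risk, is_significant) tuples, one per database.
    """
    significant = [rr for rr, sig in estimates if sig]
    if any(rr < 1 for rr in significant) and any(rr > 1 for rr in significant):
        return "contradictory"  # significant decrease in one database, increase in another
    if len(significant) == len(estimates) and len({rr > 1 for rr in significant}) == 1:
        return "consistent"     # every database agrees in direction and significance
    return "mixed"

# Hypothetical estimates from three observational databases.
print(classify_pair([(0.7, True), (1.4, True), (1.1, False)]))  # contradictory
```

Counting "contradictory" and "consistent" labels over all 53 pairs is exactly the kind of summary the paper reports for each study design.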
Explore a Career in Health Sciences Information
... tools that range from traditional print journals to electronic databases and the latest mobile devices, health sciences ... an expert search of the literature. connecting licensed electronic resources and decision tools into a patient's electronic ...
ERIC Educational Resources Information Center
Bitter, Gary G., Ed.
1989-01-01
Describes three software packages: (1) "MacMendeleev"--database/graphic display for chemistry, grades 10-12, Macintosh; (2) "Geometry One: Foundations"--geometry tutorial, grades 7-12, IBM; (3) "Mathematics Exploration Toolkit"--algebra and calculus tutorial, grades 8-12, IBM. (MVL)
Innovative railroad information displays : video guide
DOT National Transportation Integrated Search
1998-01-01
The objectives of this study were to explore the potential of advanced digital technology, : novel concepts of information management, geographic information databases and : display capabilities in order to enhance planning and decision-making proces...
Analysis of crashes involving 15-passenger vans
DOT National Transportation Integrated Search
2004-05-01
This study explores the relationship between vehicle occupancy and several other variables in the National Highway Traffic Safety Administration's (NHTSA's) Fatality Analysis Reporting System (FARS) database and a 15-passenger van's risk of rollover....
NASA Video Catalog. Supplement 12
NASA Technical Reports Server (NTRS)
2002-01-01
This issue of the NASA Video Catalog lists 1878 video productions from the NASA STI Database. The videos listed have been developed by the NASA centers, covering Shuttle mission press conferences; fly-bys of planets; aircraft design, testing and performance; environmental pollution; lunar and planetary exploration; and many other categories related to manned and unmanned space exploration. Each entry in the publication consists of a standard bibliographic citation accompanied by an abstract. The listing of the entries is arranged by STAR categories. A complete Table of Contents describes the scope of each category. For users with specific information needs, a Title Index is available. A Subject Term Index, based on the NASA Thesaurus, is also included. Guidelines for usage of NASA audio/visual material, ordering information, and order forms are also available.
21SSD: a new public 21-cm EoR database
NASA Astrophysics Data System (ADS)
Eames, Evan; Semelin, Benoît
2018-05-01
With current efforts inching closer to detecting the 21-cm signal from the Epoch of Reionization (EoR), proper preparation will require publicly available simulated models of the various forms the signal could take. In this work we present a database of such models, available at 21ssd.obspm.fr. The models are created with a fully-coupled radiative hydrodynamic simulation (LICORICE) at high resolution (1024³). We also begin to analyse and explore the possible 21-cm EoR signals (with power spectra and pixel distribution functions), and study the effects of thermal noise on our ability to recover the signal out to high redshifts. Finally, we begin to explore the concept of 'distance' between different models, which represents a crucial step towards optimising parameter space sampling, training neural networks, and finally extracting parameter values from observations.
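One minimal notion of 'distance' between two simulated signals is the Euclidean distance between their binned power spectra, sketched below. The binning and amplitudes are hypothetical, and the authors may use a more refined metric; this only illustrates the idea of comparing models as points in a feature space.

```python
import math

def spectrum_distance(ps_a, ps_b):
    """Euclidean distance between two binned power spectra (one simple
    model-to-model 'distance' for comparing simulated 21-cm signals)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ps_a, ps_b)))

# Hypothetical power-spectrum amplitudes in four k-bins for two models.
print(spectrum_distance([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0]))
```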
Rot, Gregor; Parikh, Anup; Curk, Tomaz; Kuspa, Adam; Shaulsky, Gad; Zupan, Blaz
2009-08-25
Bioinformatics often leverages recent advancements in computer science to support biologists in their scientific discovery process. Such efforts include the development of easy-to-use web interfaces to biomedical databases. Recent advancements in interactive web technologies require us to rethink the standard submit-and-wait paradigm, and to craft bioinformatics web applications that share analytical and interactive power with their desktop relatives, while retaining simplicity and availability. We have developed dictyExpress, a web application that features a graphical, highly interactive explorative interface to our database, which consists of more than 1000 Dictyostelium discoideum gene expression experiments. In dictyExpress, the user can select experiments and genes, perform gene clustering, view gene expression profiles across time, view gene co-expression networks, perform analyses of Gene Ontology term enrichment, and simultaneously display expression profiles for a selected gene in various experiments. Most importantly, these tasks are achieved through web applications whose components are seamlessly interlinked and immediately respond to events triggered by the user, thus providing a powerful explorative data analysis environment. dictyExpress is a precursor for a new generation of web-based bioinformatics applications with simple but powerful interactive interfaces that resemble those of the modern desktop. While dictyExpress serves mainly the Dictyostelium research community, it is relatively easy to adapt it to other datasets. We propose that the design ideas behind dictyExpress will influence the development of similar applications for other model organisms.
Analysis and interpretation of diffuse x-ray emission using data from the Einstein satellite
NASA Technical Reports Server (NTRS)
Helfand, David J.
1991-01-01
An ambitious program to create a powerful and accessible archive of the HEAO-2 Imaging Proportional Counter (IPC) database was outlined. The scientific utility of that database for studies of diffuse x ray emissions was explored. Technical and scientific accomplishments are reviewed. Three papers were presented which have major new scientific findings relevant to the global structure of the interstellar medium and the origin of the cosmic x ray background. An all-sky map of diffuse x ray emission was constructed.
Managing Requirements-Documents to Data
NASA Technical Reports Server (NTRS)
Orr, Kevin; Hudson, Abe
2017-01-01
Managing requirements on long-term projects like the International Space Station (ISS) can go through many phases, from initial product development to over 20 years of operations and sustainment. Over that time, many authorized changes have been made to the requirement set that apply to any new systems that would visit the ISS today, such as commercial cargo/crew vehicles or payloads. We explore the benefits of managing requirements in a database while satisfying traditional document needs for contracts and for stakeholder/user consumption that are not tied into the database.
Heterogenous database integration in a physician workstation.
Annevelink, J; Young, C Y; Tang, P C
1991-01-01
We discuss the integration of a variety of data and information sources in a Physician Workstation (PWS), focusing on the integration of data from DHCP, the Veterans Administration's Distributed Hospital Computer Program. We designed a logically centralized, object-oriented data schema, used by end users and applications to explore the data accessible through an object-oriented database using a declarative query language. We emphasize the use of procedural abstraction to transparently integrate a variety of information sources into the data schema.
Oral cancer databases: A comprehensive review.
Sarode, Gargi S; Sarode, Sachin C; Maniyar, Nikunj; Anand, Rahul; Patil, Shankargouda
2017-11-29
A cancer database is a systematic collection and analysis of information on various human cancers at the genomic and molecular level that can be utilized to understand the steps of carcinogenesis and to advance cancer therapy. Oral cancer is one of the leading causes of morbidity and mortality all over the world. The current research efforts in this field are aimed at cancer etiology and therapy. Advanced genomic technologies, including microarrays, proteomics, transcriptomics, and gene sequencing, have generated extensive data on genes and microRNAs that are differentially expressed, and this information is stored in the form of various databases. Extensive data from various resources have brought the need for collaboration and data sharing to make effective use of this new knowledge. The current review provides comprehensive information on publicly accessible databases that contain information pertinent to oral squamous cell carcinoma (OSCC) and on databases designed exclusively for OSCC. The databases discussed in this paper are protein-coding gene databases and microRNA databases. This paper also describes gene overlap among the databases, which will help researchers reduce redundancy and focus only on those genes that are common to more than one database. We hope this introduction will promote awareness and facilitate the usage of these resources in the cancer research community, so that researchers can explore the molecular mechanisms involved in the development of cancer, which can help in the subsequent crafting of therapeutic strategies. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Liljekvist, Mads Svane; Andresen, Kristoffer; Pommergaard, Hans-Christian; Rosenberg, Jacob
2015-01-01
Background. Open access (OA) journals allow access to research papers free of charge to the reader. Traditionally, biomedical researchers use databases like MEDLINE and EMBASE to discover new advances. However, biomedical OA journals might not fulfill such databases' criteria, hindering dissemination. The Directory of Open Access Journals (DOAJ) is a database exclusively listing OA journals. The aim of this study was to investigate DOAJ's coverage of biomedical OA journals compared with the conventional biomedical databases. Methods. Information on all journals listed in four conventional biomedical databases (MEDLINE, PubMed Central, EMBASE and SCOPUS) and DOAJ was gathered. Journals were included if they were (1) actively publishing, (2) full OA, (3) prospectively indexed in one or more databases, and (4) of biomedical subject. Impact factor and journal language were also collected. DOAJ was compared with the conventional databases regarding the proportion of journals covered, along with their impact factor and publishing language. The proportion of journals with articles indexed by DOAJ was determined. Results. In total, 3,236 biomedical OA journals were included in the study. Of the included journals, 86.7% were listed in DOAJ. Combined, the conventional biomedical databases listed 75.0% of the journals: 18.7% in MEDLINE, 36.5% in PubMed Central, 51.5% in SCOPUS and 50.6% in EMBASE. Of the journals in DOAJ, 88.7% published in English and 20.6% had received an impact factor for 2012, compared with 93.5% and 26.0%, respectively, for journals in the conventional biomedical databases. A subset of 51.1% and 48.5% of the journals in DOAJ had articles indexed from 2012 and 2013, respectively. Of journals exclusively listed in DOAJ, one journal had received an impact factor for 2012, and 59.6% of the journals had no content from 2013 indexed in DOAJ. Conclusions. DOAJ is the most complete registry of biomedical OA journals compared with the four conventional biomedical databases studied.
However, DOAJ only indexes articles for half of the biomedical journals listed, making it an incomplete source for biomedical research papers in general.
Information Literacy Skills: Comparing and Evaluating Databases
ERIC Educational Resources Information Center
Grismore, Brian A.
2012-01-01
The purpose of this database comparison is to express the importance of teaching information literacy skills and to apply those skills to commonly used Internet-based research tools. This paper includes a comparison and evaluation of three databases (ProQuest, ERIC, and Google Scholar). It includes strengths and weaknesses of each database based…
Comparing Top-Down with Bottom-Up Approaches: Teaching Data Modeling
ERIC Educational Resources Information Center
Kung, Hsiang-Jui; Kung, LeeAnn; Gardiner, Adrian
2013-01-01
Conceptual database design is a difficult task for novice database designers, such as students, and is also therefore particularly challenging for database educators to teach. In the teaching of database design, two general approaches are frequently emphasized: top-down and bottom-up. In this paper, we present an empirical comparison of students'…
Mammography status using patient self-reports and computerized radiology database.
Thompson, B; Taylor, V; Goldberg, H; Mullen, M
1999-10-01
This study sought to compare self-reported mammography use of low-income women utilizing an inner-city public hospital with a computerized hospital database for tracking mammography use. A survey of all age-eligible women using the hospital's internal medicine clinic was done; responses were matched with the radiology database. We examined concordance among the two data sources. Concordance between self-report and the database was high (82%) when using "ever had a mammogram at the hospital," but low (58%) when comparing self-reported last mammogram with the information contained in the database. Disagreements existed between self-reports and the database. Because we sought to ensure that women would know exactly what a mammogram entailed by including a picture of a woman having a mammogram, it is possible that women's responses were accurate, leading to concerns that discrepancies might be present in the database. Physicians and staff must ensure that they understand the full history of a woman's experience with mammography before recommending for or against the procedure.
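Concordance of the kind reported above (82% on "ever had a mammogram at the hospital") is simple percent agreement between two binary sources. A minimal sketch with hypothetical yes/no records, not the study's data:

```python
def concordance(pairs):
    """Fraction of records where self-report and the radiology database agree."""
    return sum(1 for self_report, database in pairs if self_report == database) / len(pairs)

# Hypothetical (self-report, database) answers to "ever had a mammogram here".
records = [(True, True), (True, False), (False, False), (True, True)]
print(f"{concordance(records):.0%}")  # 75%
```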
Recent updates and developments to plant genome size databases
Garcia, Sònia; Leitch, Ilia J.; Anadon-Rosell, Alba; Canela, Miguel Á.; Gálvez, Francisco; Garnatje, Teresa; Gras, Airy; Hidalgo, Oriane; Johnston, Emmeline; Mas de Xaxars, Gemma; Pellicer, Jaume; Siljak-Yakovlev, Sonja; Vallès, Joan; Vitales, Daniel; Bennett, Michael D.
2014-01-01
Two plant genome size databases have been recently updated and/or extended: the Plant DNA C-values database (http://data.kew.org/cvalues), and GSAD, the Genome Size in Asteraceae database (http://www.asteraceaegenomesize.com). While the first provides information on nuclear DNA contents across land plants and some algal groups, the second is focused on one of the largest and most economically important angiosperm families, Asteraceae. Genome size data have numerous applications: they can be used in comparative studies on genome evolution, or as a tool to appraise the cost of whole-genome sequencing programs. The growing interest in genome size and increasing rate of data accumulation has necessitated the continued update of these databases. Currently, the Plant DNA C-values database (Release 6.0, Dec. 2012) contains data for 8510 species, while GSAD has 1219 species (Release 2.0, June 2013), representing increases of 17 and 51%, respectively, in the number of species with genome size data, compared with previous releases. Here we provide overviews of the most recent releases of each database, and outline new features of GSAD. The latter include (i) a tool to visually compare genome size data between species, (ii) the option to export data and (iii) a webpage containing information about flow cytometry protocols. PMID:24288377
GHAZI MIRSAEID, Seyed Javad; MOTAMEDI, Nadia; RAMEZAN GHORBANI, Nahid
2015-01-01
Background: In this study, the impact of self-citation (journal and author) on the impact factor of Iranian English-language medical journals in two international citation databases, Web of Science (WoS) and the Islamic World Science Citation Center (ISC), was compared by citation analysis. Methods: Twelve journals in the WoS and 26 journals in the ISC databases indexed between the years 2006–2009 were selected and compared. For comparison of self-citation rates in the two databases, we used Wilcoxon and Mann-Whitney tests. We used the Pearson test for the correlation of self-citation and IF in WoS, and the Spearman's correlation coefficient for the ISC database. Covariance analysis was used for comparison of the two correlation tests. The significance level was 0.05 for all tests. Results: There was no significant difference between self-citation rates in the two databases (P>0.05). Findings also showed no significant difference between the correlation of journal self-citation and impact factor in the two databases (P=0.526); however, there was a significant difference for author self-citation and impact factor in these databases (P<0.001). Conclusion: The impact of author self-citation on the impact factor in WoS was higher than in the ISC. PMID:26587498
Johnson, Kjell; Guo, Cen; Gosink, Mark; Wang, Vicky; Hauben, Manfred
2012-12-01
A principal objective of pharmacovigilance is to detect adverse drug reactions that are unknown or novel in terms of their clinical severity or frequency. One method is through inspection of spontaneous reporting system databases, which consist of millions of reports of patients experiencing adverse effects while taking one or more drugs. For such large databases, there is an increasing need for quantitative and automated screening tools to assist drug safety professionals in identifying drug-event combinations (DECs) worthy of further investigation. Existing algorithms can effectively identify problematic DECs when the frequencies are high. However these algorithms perform differently for low-frequency DECs. In this work, we provide a method based on the multinomial distribution that identifies signals of disproportionate reporting, especially for low-frequency combinations. In addition, we comprehensively compare the performance of commonly used algorithms with the new approach. Simulation results demonstrate the advantages of the proposed method, and analysis of the Adverse Event Reporting System data shows that the proposed method can help detect interesting signals. Furthermore, we suggest that these methods be used to identify DECs that occur significantly less frequently than expected, thus identifying potential alternative indications for these drugs. We provide an empirical example that demonstrates the importance of exploring underexpected DECs. Code to implement the proposed method is available in R on request from the corresponding authors. kjell@arboranalytics.com or Mark.M.Gosink@Pfizer.com Supplementary data are available at Bioinformatics online.
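The disproportionality idea that the proposed multinomial method builds on can be sketched minimally. This is not the authors' procedure, just the classic observed-versus-expected ratio computed on invented reports; an underexpected combination (ratio well below 1) is exactly the kind of signal the authors suggest mining for alternative indications.

```python
# Hedged sketch: relative reporting ratio (observed / expected-under-independence)
# for drug-event combinations in a toy spontaneous-report table.
from collections import Counter

# Hypothetical reports as (drug, adverse event) pairs.
reports = ([("drugA", "nausea")] * 30 + [("drugA", "rash")] * 2 +
           [("drugB", "nausea")] * 10 + [("drugB", "rash")] * 8)

n = len(reports)
drug_counts = Counter(d for d, _ in reports)
event_counts = Counter(e for _, e in reports)
pair_counts = Counter(reports)

def rrr(drug, event):
    """Observed count of the pair divided by its expected count under independence."""
    expected = drug_counts[drug] * event_counts[event] / n
    return pair_counts[(drug, event)] / expected

print(f"RRR(drugA, nausea) = {rrr('drugA', 'nausea'):.2f}")  # overreported
print(f"RRR(drugA, rash)   = {rrr('drugA', 'rash'):.2f}")    # underreported
```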
Utilization of tooth filling services by people with disabilities in Taiwan.
Chen, Ming-Chuan; Kung, Pei-Tseng; Su, Hsun-Pi; Yen, Suh-May; Chiu, Li-Ting; Tsai, Wen-Chen
2016-04-05
The oral condition of people with disabilities has considerable influence on their physical and mental health. However, nationwide surveys of this group have not been conducted. In this study, we used the National Health Insurance Research Database to explore tooth filling utilization among people with disabilities. Using the 2008 database of the Ministry of the Interior, in which people with disabilities are registered, we merged records with the 2008 medical claims database of the Bureau of National Health Insurance to calculate tooth filling utilization and analyze related factors. We recruited 993,487 people with disabilities as the research sample. The tooth filling utilization rate was 17.53 %. Multiple logistic regression showed that the utilization rate of men was lower than that of women (OR = 0.78, 95 % CI = 0.77-0.79) and that older people had lower utilization rates (aged over 75: OR = 0.22, 95 % CI = 0.22-0.23) compared with those under the age of 20. Other factors significantly associated with low tooth filling utilization included a low education level, living in less urbanized areas, low economic capacity, dementia, and severe disability. In summary, the factors associated with a lower tooth-filling service utilization rate were male sex, old age, low education level, being married, indigenous ethnicity, residing in a low-urbanization area, low income, chronic circulatory system diseases, dementia, and severe disability. We suggest establishing proper medical care environments for these high-risk groups to maintain their quality of life.
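As a reminder of what the reported odds ratios express, here is a minimal sketch with invented counts (the study's ORs came from multiple logistic regression, not a raw 2x2 table):

```python
# Toy odds-ratio calculation: odds of tooth-filling use for men vs. women,
# using hypothetical counts chosen only to illustrate an OR below 1.
used_m, not_m = 780, 3220      # hypothetical male counts (used / did not use)
used_f, not_f = 1000, 3000     # hypothetical female counts
odds_ratio = (used_m / not_m) / (used_f / not_f)
print(f"OR = {odds_ratio:.2f}")  # OR < 1: lower utilization among men
```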
3D multi-view convolutional neural networks for lung nodule classification
Kang, Guixia; Hou, Beibei; Zhang, Ningbo
2017-01-01
The 3D convolutional neural network (CNN) is able to make full use of the spatial 3D context information of lung nodules, and the multi-view strategy has been shown to be useful for improving the performance of 2D CNNs in classifying lung nodules. In this paper, we explore the classification of lung nodules using 3D multi-view convolutional neural networks (MV-CNN) with both chain architectures and directed acyclic graph architectures, including 3D Inception and 3D Inception-ResNet. All networks employ the multi-view-one-network strategy. We conduct a binary classification (benign and malignant) and a ternary classification (benign, primary malignant and metastatic malignant) on computed tomography (CT) images from the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI) database. All results are obtained via 10-fold cross-validation. For the MV-CNN with chain architecture, results show that the performance of the 3D MV-CNN surpasses that of the 2D MV-CNN by a significant margin. Finally, a 3D Inception network achieved an error rate of 4.59% for the binary classification and 7.70% for the ternary classification, both superior results for the corresponding tasks. We also compare the multi-view-one-network strategy with the one-view-one-network strategy. The results reveal that the multi-view-one-network strategy can achieve a lower error rate than the one-view-one-network strategy. PMID:29145492
Kokol, Peter; Vošner, Helena Blažun
2018-01-01
The overall aim of the present study was to compare the coverage of existing research funding information for articles indexed in Scopus, Web of Science, and PubMed databases. The numbers of articles with funding information published in 2015 were identified in the three selected databases and compared using bibliometric analysis of a sample of twenty-eight prestigious medical journals. Frequency analysis of the number of articles with funding information showed statistically significant differences between Scopus, Web of Science, and PubMed databases. The largest proportion of articles with funding information was found in Web of Science (29.0%), followed by PubMed (14.6%) and Scopus (7.7%). The results show that coverage of funding information differs significantly among Scopus, Web of Science, and PubMed databases in a sample of the same medical journals. Moreover, we found that, currently, funding data in PubMed is more difficult to obtain and analyze compared with that in the other two databases.
National Administrative Databases in Adult Spinal Deformity Surgery: A Cautionary Tale.
Buckland, Aaron J; Poorman, Gregory; Freitag, Robert; Jalai, Cyrus; Klineberg, Eric O; Kelly, Michael; Passias, Peter G
2017-08-15
Comparison between national administrative databases and a prospective multicenter physician-managed database. This study aims to assess the applicability of national administrative databases (NADs) in adult spinal deformity (ASD). Our hypothesis is that NADs do not include patients comparable to those in a physician-managed database (PMD) for surgical outcomes in adult spinal deformity. NADs such as the National Inpatient Sample (NIS) and the National Surgical Quality Improvement Program (NSQIP) yield large numbers of publications owing to ease of data access and the lack of an IRB approval requirement. These databases use billing codes, not clinical inclusion criteria, and have not been validated against PMDs in ASD surgery. The NIS was searched for the years 2002 to 2012 and NSQIP for the years 2006 to 2013 using validated spinal deformity diagnostic codes. Procedural codes (ICD-9 and CPT) were then applied to each database. A multicenter PMD covering the years 2008 to 2015 was used for comparison. Databases were assessed for levels fused, osteotomies, decompressed levels, and invasiveness. Database comparisons of surgical details were made for all patients, and also for patients with ≥5-level spinal fusions. In total, 37,368 NIS, 1291 NSQIP, and 737 PMD patients were identified. NADs showed an increased use of deformity billing codes over the study period (NIS use doubled and NSQIP use increased 68-fold, P < 0.001), but ASD case volume remained stable in the PMD. Surgical invasiveness, levels fused, and use of 3-column osteotomy (3-CO) were significantly lower for all patients in the NIS (11.4-13.7) and NSQIP (6.4-12.7) databases compared with the PMD (27.5-32.3). When limited to patients with ≥5 levels, invasiveness, levels fused, and use of 3-CO remained significantly higher in the PMD compared with the NADs (P < 0.001). The national databases NIS and NSQIP do not capture the same patient population as PMDs in ASD. Physicians should remain cautious in interpreting conclusions drawn from these databases. Level of evidence: 4.
Q2Stress: A database for multiple cues to stress assignment in Italian.
Spinelli, Giacomo; Sulpizio, Simone; Burani, Cristina
2017-12-01
In languages where the position of lexical stress within a word is not predictable from print, readers rely on distributional information extracted from the lexicon in order to assign stress. Lexical databases are thus especially important for researchers wishing to address stress assignment in those languages. Here we present Q2Stress, a new database that aims to fill the lack of such a resource for Italian. Q2Stress includes multiple cues readers may use in assigning stress, such as the type and token frequencies of stress patterns, as well as their distribution with respect to number of syllables, grammatical category, word beginnings, word endings, and consonant-vowel structures. Furthermore, for the first time, data for both adults and children are available. Q2Stress may help researchers answer empirical as well as theoretical questions about stress assignment and stress-related issues and, more generally, explore the orthography-to-phonology relation in reading. Q2Stress is designed as a user-friendly resource: it comes with scripts allowing researchers to explore and select their own stimuli according to several criteria, as well as summary tables for overall data analysis.
Hernandez-Prieto, Miguel A; Futschik, Matthias E
2012-01-01
Synechocystis sp. PCC6803 is one of the best studied cyanobacteria and an important model organism for our understanding of photosynthesis. The early availability of its complete genome sequence initiated numerous transcriptome studies, which have generated a wealth of expression data. Analysis of the accumulated data can be a powerful tool to study transcription in a comprehensive manner and to reveal underlying regulatory mechanisms, as well as to annotate genes whose functions are yet unknown. However, the use of divergent microarray platforms, as well as distributed data storage, makes meta-analyses of Synechocystis expression data highly challenging, especially for researchers with limited bioinformatic expertise and resources. To facilitate utilisation of the accumulated expression data by a wider research community, we have developed CyanoEXpress, a web database for interactive exploration and visualisation of transcriptional response patterns in Synechocystis. CyanoEXpress currently comprises expression data for 3073 genes and 178 environmental and genetic perturbations obtained in 31 independent studies. At present, CyanoEXpress constitutes the most comprehensive collection of expression data available for Synechocystis and can be freely accessed at http://cyanoexpress.sysbiolab.eu.
Global Distribution of Outbreaks of Water-Associated Infectious Diseases
Yang, Kun; LeJeune, Jeffrey; Alsdorf, Doug; Lu, Bo; Shum, C. K.; Liang, Song
2012-01-01
Background: Water plays an important role in the transmission of many infectious diseases, which pose a great burden on global public health. However, the global distribution of these water-associated infectious diseases and their underlying factors remain largely unexplored. Methods and Findings: Based on the Global Infectious Disease and Epidemiology Network (GIDEON), a global database of water-associated pathogens and diseases was developed. In this study, reported outbreak events associated with water-associated infectious diseases from 1991 to 2008 were extracted from the database. The location of each reported outbreak event was identified and geocoded into a GIS database. The GIS database also included geo-referenced socio-environmental information: population density (2000), annual accumulated temperature, surface water area, and average annual precipitation. Poisson models with Bayesian inference were developed to explore the association between these socio-environmental factors and the distribution of the reported outbreak events. Based on the model predictions, a global relative risk map was generated. A total of 1,428 reported outbreak events were retrieved from the database. The analysis suggested that outbreaks of water-associated diseases are significantly correlated with socio-environmental factors. Population density is a significant risk factor for all categories of reported outbreaks of water-associated diseases; water-related diseases (e.g., vector-borne diseases) are associated with accumulated temperature; water-washed diseases (e.g., conjunctivitis) are inversely related to surface water area; and both water-borne and water-related diseases are inversely related to average annual rainfall. Based on the model predictions, "hotspots" of risk for all categories of water-associated diseases were explored.
Conclusions: At the global scale, water-associated infectious diseases are significantly correlated with socio-environmental factors, and all regions are affected, though disproportionately, by different categories of water-associated infectious diseases. PMID:22348158
NASA Astrophysics Data System (ADS)
Myrbo, A.; Loeffler, S.; Ai, S.; McEwan, R.
2015-12-01
The ultimate EarthCube product has been described as a mobile app that provides all of the known geoscience data for a geographic point or polygon, from the top of the atmosphere to the core of the Earth, throughout geologic time. The database queries are hidden from the user, and the data are visually rendered for easy recognition of patterns and associations. This fanciful vision is not so remote: NSF EarthCube and Geoinformatics support has already fostered major advances in database interoperability and harmonization of APIs; numerous "domain repositories," databases curated by subject matter experts, now provide a vast wealth of open, easily-accessible georeferenced data on rock and sediment chemistry and mineralogy, paleobiology, stratigraphy, rock magnetics, and more. New datasets accrue daily, including many harvested from the literature by automated means. None of these constitute big data - all are part of the long tail of geoscience, heterogeneous data consisting of relatively small numbers of measurements made by a large number of people, typically on physical samples. This vision of mobile data discovery requires a software package to cleverly expose these domain repositories' holdings; currently, queries mainly come from single investigators to single databases. The NSF-funded mobile app Flyover Country (FC; fc.umn.edu), developed for geoscience outreach and education, has been welcomed by data curators and cyberinfrastructure developers as a testing ground for their API services, data provision, and scalability. FC pulls maps and data within a bounding envelope and caches them for offline use; location-based services alert users to nearby points of interest (POI). The incorporation of data from multiple databases across domains requires parsimonious data requests and novel visualization techniques, especially for mapping of data with a time or stratigraphic depth component. 
The preservation of data provenance and authority is critical for researcher buy-in to all community databases, and further allows exploration and suggestions of collaborators, based upon geography and topical relevance.
Sockeye: A 3D Environment for Comparative Genomics
Montgomery, Stephen B.; Astakhova, Tamara; Bilenky, Mikhail; Birney, Ewan; Fu, Tony; Hassel, Maik; Melsopp, Craig; Rak, Marcin; Robertson, A. Gordon; Sleumer, Monica; Siddiqui, Asim S.; Jones, Steven J.M.
2004-01-01
Comparative genomics techniques are used in bioinformatics analyses to identify the structural and functional properties of DNA sequences. As the amount of available sequence data steadily increases, the ability to perform large-scale comparative analyses has become increasingly relevant. In addition, the growing complexity of genomic feature annotation means that new approaches to genomic visualization need to be explored. We have developed a Java-based application called Sockeye that uses three-dimensional (3D) graphics technology to facilitate the visualization of annotation and conservation across multiple sequences. This software uses the Ensembl database project to import sequence and annotation information from several eukaryotic species. A user can additionally import their own custom sequence and annotation data. Individual annotation objects are displayed in Sockeye by using custom 3D models. Ensembl-derived and imported sequences can be analyzed by using a suite of multiple and pair-wise alignment algorithms. The results of these comparative analyses are also displayed in the 3D environment of Sockeye. By using the Java3D API to visualize genomic data in a 3D environment, we are able to compactly display cross-sequence comparisons. This provides the user with a novel platform for visualizing and comparing genomic feature organization. PMID:15123592
The Use of National Data Sets to Baseline Science Education Reform: Exploring Value-Added Approaches
ERIC Educational Resources Information Center
Homer, Matt; Ryder, Jim; Donnelly, Jim
2011-01-01
This paper uses data from the National Pupil Database to investigate the differences in "performance" across the range of science courses available following the 2006 Key Stage 4 (KS4) science reforms in England. This is a value-added exploration (from Key Stage 3 [KS3] to KS4) aimed not at the student or the school level, but rather at…
ERIC Educational Resources Information Center
Deasy, Michael Joseph
2012-01-01
Concern over worldwide literacy rates prompted the United Nations to establish the UN Literacy Decade (2003-2012) with one area of focus being to provide support to schools to develop effective literacy programs (UNESCO, 2005). This study addressed the area of providing support to schools to develop effective literacy programs by exploring the…
Automated Rendezvous and Docking: 1994-2004
NASA Technical Reports Server (NTRS)
2004-01-01
This custom bibliography from the NASA Scientific and Technical Information Program lists a sampling of records found in the NASA Aeronautics and Space Database. The scope of this topic includes technologies for human exploration and robotic sample return missions. This area of focus is one of the enabling technologies as defined by NASA's Report of the President's Commission on Implementation of United States Space Exploration Policy, published in June 2004.
Biewick, Laura
2008-01-01
This report contains maps and associated spatial data showing historical oil and gas exploration and production in the United States. Because of the proprietary nature of many oil and gas well databases, the United States was divided into cells of one-quarter square mile, and the production status of all wells in a given cell was aggregated. Base-map reference data are included, using the U.S. Geological Survey (USGS) National Map, the USGS and American Geological Institute (AGI) Global GIS, and a World Shaded Relief map service from the ESRI Geography Network. A hardcopy map was created to synthesize recorded exploration data from 1859, when the first oil well was drilled in the U.S., to 2005. In addition to the hardcopy map product, the data have been refined and made more accessible through the use of Geographic Information System (GIS) tools. The cell data are included in a GIS database constructed for spatial analysis via the USGS Internet Map Service or by importing the data into GIS software such as ArcGIS. The USGS Internet Map Service provides a number of useful and sophisticated geoprocessing and cartographic functions via an internet browser. Also included is a video clip of U.S. oil and gas exploration and production through time.
Meringer, Markus; Cleaves, H James
2017-12-13
The reverse tricarboxylic acid (rTCA) cycle has been explored from various standpoints as an idealized primordial metabolic cycle. Its simplicity and apparent ubiquity in diverse organisms across the tree of life have been used to argue for its antiquity and its optimality. In 2000 it was proposed that chemoinformatics approaches support some of these views. Specifically, defined queries of the Beilstein database showed that the molecules of the rTCA cycle are heavily represented in such compound databases. Here we explore the chemical structure "space" of the rTCA cycle's intermediates, i.e., the set of organic compounds possessing some minimal set of defining characteristics, using an exhaustive structure generation method. The rTCA cycle's chemical space, as defined by the original criteria and explored by our method, is some six to seven times larger than originally considered. Although each assumed defining criterion that makes the rTCA cycle special limits the possible generative outcomes, many unrealized compounds fulfill these criteria. That these compounds are unrealized could be due to evolutionary frozen accidents or optimization, though this optimization may also be for systems-level reasons, e.g., the way the pathway and its elements interface with other aspects of metabolism.
PDB explorer -- a web based algorithm for protein annotation viewer and 3D visualization.
Nayarisseri, Anuraj; Shardiwal, Rakesh Kumar; Yadav, Mukesh; Kanungo, Neha; Singh, Pooja; Shah, Pratik; Ahmed, Sheaza
2014-12-01
The PDB file format is a text format characterizing the three-dimensional structures of macromolecules held in the Protein Data Bank (PDB). Determined protein structures are often found in association with other molecules such as nucleic acids, water, ions, and drug molecules, which can likewise be described in the PDB format and deposited in the PDB database. A PDB file is machine-generated and not in a human-readable format; computational tools are needed to interpret it. The objective of the present study was to develop free online software for the retrieval, visualization, and annotation reading of protein 3D structures available in the PDB database. The main aim is to render the PDB file in a human-readable format, i.e., to convert the information in the PDB file into readable sentences. The tool displays all available information from a PDB file, including the 3D structure. Programming and scripting languages including Perl, CSS, JavaScript, Ajax, and HTML were used for the development of PDB Explorer. PDB Explorer directly parses the PDB file, calling methods for each parsed element: secondary structure elements, atoms, coordinates, etc. PDB Explorer is freely available at http://www.pdbexplorer.eminentbio.com/home, with no log-in required.
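The column-slicing at the heart of any PDB parser can be sketched as follows. Field positions follow the wwPDB layout for ATOM records (PDB is a fixed-column format, so fields are recovered by slicing, not splitting); the sample record is illustrative.

```python
# Minimal sketch of PDB ATOM-record parsing by fixed-column slicing.
def parse_atom_line(line):
    """Extract atom name, residue, chain and coordinates from one ATOM/HETATM record."""
    return {
        "name": line[12:16].strip(),    # atom name, columns 13-16
        "res_name": line[17:20].strip(),  # residue name, columns 18-20
        "chain": line[21],              # chain identifier, column 22
        "x": float(line[30:38]),        # orthogonal coordinates in angstroms
        "y": float(line[38:46]),
        "z": float(line[46:54]),
    }

record = "ATOM      1  N   MET A   1      38.428  13.104  23.061  1.00 54.69"
atom = parse_atom_line(record)
print(atom["name"], atom["res_name"], atom["chain"], atom["x"])
```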
Why do nursing homes close? An analysis of newspaper articles.
Fisher, Andrew; Castle, Nicholas
2012-01-01
Using Non-numerical Unstructured Data Indexing Searching and Theorizing (NUD'IST) software to extract and examine keywords from text, the authors explored the phenomenon of nursing home closure through an analysis of 30 major-market newspapers over a period of 66 months (January 1, 1999 to June 1, 2005). Newspaper articles typically represent a careful analysis of staff impressions via interviews, managerial perspectives, and financial records review. There is a current reliance on the synthesis of information from large regulatory databases such as the Online Survey Certification And Reporting database, the California Office of Statewide Healthcare Planning and Development database, and Area Resource Files. Although such databases permit the construction of studies capable of revealing some reasons for nursing home closure, they are hampered by the confines of the data entered. Through this analysis of newspaper articles, the authors are able to add further to the understanding of nursing home closures.
NCBI GEO: mining tens of millions of expression profiles--database and tools update.
Barrett, Tanya; Troup, Dennis B; Wilhite, Stephen E; Ledoux, Pierre; Rudnev, Dmitry; Evangelista, Carlos; Kim, Irene F; Soboleva, Alexandra; Tomashevsky, Maxim; Edgar, Ron
2007-01-01
The Gene Expression Omnibus (GEO) repository at the National Center for Biotechnology Information (NCBI) archives and freely disseminates microarray and other forms of high-throughput data generated by the scientific community. The database has a minimum information about a microarray experiment (MIAME)-compliant infrastructure that captures fully annotated raw and processed data. Several data deposit options and formats are supported, including web forms, spreadsheets, XML and Simple Omnibus Format in Text (SOFT). In addition to data storage, a collection of user-friendly web-based interfaces and applications are available to help users effectively explore, visualize and download the thousands of experiments and tens of millions of gene expression patterns stored in GEO. This paper provides a summary of the GEO database structure and user facilities, and describes recent enhancements to database design, performance, submission format options, data query and retrieval utilities. GEO is accessible at http://www.ncbi.nlm.nih.gov/geo/
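As a rough illustration of the SOFT text format mentioned above: SOFT files use caret-prefixed entity lines, bang-prefixed attribute lines, and hash-prefixed column descriptions, followed by a tab-separated data table. The toy parser below handles a simplified, invented record, not the full format.

```python
# Hedged sketch of SOFT-format line dispatch on a simplified, made-up record.
soft_lines = [
    "^SAMPLE = GSM0000",                   # entity indicator line
    "!Sample_title = hypothetical example",  # entity attribute line
    "#ID_REF = probe identifier",          # data-table column description
    "ID_REF\tVALUE",                       # data-table header
    "probe_1\t7.25",                       # data-table row
]

entities, attributes, table = [], {}, []
for line in soft_lines:
    if line.startswith("^"):
        key, _, value = line[1:].partition(" = ")
        entities.append((key, value))
    elif line.startswith("!"):
        key, _, value = line[1:].partition(" = ")
        attributes[key] = value
    elif line.startswith("#"):
        pass  # column descriptions ignored in this sketch
    else:
        table.append(line.split("\t"))

print(entities, attributes["Sample_title"], table[1])
```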
ADASS Web Database XML Project
NASA Astrophysics Data System (ADS)
Barg, M. I.; Stobie, E. B.; Ferro, A. J.; O'Neil, E. J.
In the spring of 2000, at the request of the ADASS Program Organizing Committee (POC), we began organizing information from previous ADASS conferences in an effort to create a centralized database. The beginnings of this database originated from data (invited speakers, participants, papers, etc.) extracted from HyperText Markup Language (HTML) documents from past ADASS host sites. Unfortunately, not all HTML documents are well formed and parsing them proved to be an iterative process. It was evident at the beginning that if these Web documents were organized in a standardized way, such as XML (Extensible Markup Language), the processing of this information across the Web could be automated, more efficient, and less error prone. This paper will briefly review the many programming tools available for processing XML, including Java, Perl and Python, and will explore the mapping of relational data from our MySQL database to XML.
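The relational-to-XML mapping described above can be sketched as follows. The rows are hard-coded stand-ins for what would come from the MySQL database, and the element names are hypothetical, not the project's actual schema.

```python
# Sketch: serialize relational conference rows as well-formed XML.
import xml.etree.ElementTree as ET

# Hypothetical rows as (year, title, authors) tuples, standing in for a SQL result set.
rows = [
    (1999, "ADASS Web Database XML Project", "Barg; Stobie"),
    (2000, "Another Paper", "Ferro; O'Neil"),
]

root = ET.Element("conference", name="ADASS")
for year, title, authors in rows:
    paper = ET.SubElement(root, "paper", year=str(year))
    ET.SubElement(paper, "title").text = title
    ET.SubElement(paper, "authors").text = authors

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Because the output is well-formed XML rather than hand-written HTML, downstream processing can rely on standard parsers instead of the iterative repair the abstract describes.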
MEIMAN: Database exploring Medicinal and Edible insects of Manipur
Shantibala, Tourangbam; Lokeshwari, Rajkumari; Thingnam, Gourshyam; Somkuwar, Bharat Gopalrao
2012-01-01
We have developed MEIMAN, a unique database on medicinal and edible insects of Manipur, which comprises 51 insect species collected through extensive surveys and questionnaires over two years. MEIMAN provides integrated access to insect species through a sophisticated web interface with the following capabilities: a) graphical interface of seasonality, b) method of preparation, c) form of use (edible and medicinal), d) habitat, e) medicinal uses, f) commercial importance, and g) economic status. This database will be useful for scientific validation and updating of traditional wisdom in bioprospecting, and in analyzing insect biodiversity for the development of untapped resources and their industrialization. Further, its features are suited to detailed investigation of potentially medicinal and edible insects, making MEIMAN a powerful tool for sustainable management. Availability: The database is freely available at www.ibsd.gov.in/meiman PMID:22715305
The quest for the perfect gravity anomaly: Part 1 - New calculation standards
Li, X.; Hildenbrand, T.G.; Hinze, W. J.; Keller, Gordon R.; Ravat, D.; Webring, M.
2006-01-01
The North American gravity database, together with the national databases of Canada, Mexico, and the United States, is being revised to improve coverage, versatility, and accuracy. An important part of this effort is the revision of procedures and standards for calculating gravity anomalies, taking into account our enhanced computational power, modern satellite-based positioning technology, improved terrain databases, and increased interest in more accurately defining different anomaly components. The most striking revision is the use of a single, internationally accepted reference ellipsoid both for the horizontal and vertical datums of gravity stations and for the computation of theoretical gravity. The new standards hardly impact the interpretation of local anomalies, but do improve regional anomalies. Most importantly, such new standards can be applied consistently to gravity database compilations of nations, continents, and even the entire world. © 2005 Society of Exploration Geophysicists.
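The "single internationally accepted reference ellipsoid" idea can be illustrated with the closed-form Somigliana formula for theoretical (normal) gravity on a reference ellipsoid. The constants below are the published GRS80 values; this is an illustrative sketch, not the revised standard's exact computation chain (which also involves datum shifts and terrain corrections).

```python
# Hedged example: Somigliana closed-form normal gravity on the GRS80 ellipsoid.
import math

def normal_gravity_grs80(lat_deg):
    """Theoretical gravity (m/s^2) on the GRS80 ellipsoid surface at geodetic latitude lat_deg."""
    gamma_e = 9.7803267715      # normal gravity at the equator (GRS80)
    k = 0.001931851353          # Somigliana's normal gravity constant (GRS80)
    e2 = 0.00669438002290       # first eccentricity squared (GRS80)
    s2 = math.sin(math.radians(lat_deg)) ** 2
    return gamma_e * (1 + k * s2) / math.sqrt(1 - e2 * s2)

print(f"equator: {normal_gravity_grs80(0):.7f} m/s^2")
print(f"pole:    {normal_gravity_grs80(90):.7f} m/s^2")
```

Gravity anomalies are then observed gravity minus this theoretical value (plus elevation-dependent corrections), which is why a single agreed ellipsoid makes compilations from different nations directly comparable.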
Data Structures in Natural Computing: Databases as Weak or Strong Anticipatory Systems
NASA Astrophysics Data System (ADS)
Rossiter, B. N.; Heather, M. A.
2004-08-01
Information systems anticipate the real world. Classical databases store, organise and search collections of data from that real world, but only as weak anticipatory information systems. This is because of the reductionism and normalisation needed to map the structuralism of natural data onto idealised machines with von Neumann architectures consisting of fixed instructions. Category theory, developed as a formalism to explore the theoretical concept of naturality, shows that methods such as sketches, which arise from graph theory as merely non-natural models of naturality, cannot capture real-world structures for strong anticipatory information systems. Databases need a schema of the natural world. Natural computing databases need that schema itself to be natural. Natural computing methods, including neural computers, evolutionary automata, molecular and nanocomputing, and quantum computation, have the potential to be strong. At present they are mainly at the stage of weak anticipatory systems.
Zhang, Qingzhou; Yang, Bo; Chen, Xujiao; Xu, Jing; Mei, Changlin; Mao, Zhiguo
2014-01-01
We present a bioinformatics database named the Renal Gene Expression Database (RGED), which contains comprehensive gene expression data sets from renal disease research. The web-based interface of RGED allows users to query gene expression profiles in various kidney-related samples, including renal cell lines, human kidney tissues and murine model kidneys. Researchers can explore the profiles of particular genes, examine the relationships between genes of interest, and identify biomarkers or even drug targets in kidney diseases. The aim of this work is to provide a user-friendly utility with which the renal disease research community can query the expression profiles of genes of interest without requiring advanced computational skills. The website is implemented in PHP, R, MySQL and Nginx and is freely available at http://rged.wall-eva.net. © The Author(s) 2014. Published by Oxford University Press.
Data Auditor: Analyzing Data Quality Using Pattern Tableaux
NASA Astrophysics Data System (ADS)
Srivastava, Divesh
Monitoring databases maintain configuration and measurement tables about computer systems, such as networks and computing clusters, and serve important business functions, such as troubleshooting customer problems, analyzing equipment failures, planning system upgrades, etc. These databases are prone to many data quality issues: configuration tables may be incorrect due to data entry errors, while measurement tables may be affected by incorrect, missing, duplicate and delayed polls. We describe Data Auditor, a tool for analyzing data quality and exploring data semantics of monitoring databases. Given a user-supplied constraint, such as a boolean predicate expected to be satisfied by every tuple, a functional dependency, or an inclusion dependency, Data Auditor computes "pattern tableaux", which are concise summaries of subsets of the data that satisfy or fail the constraint. We discuss the architecture of Data Auditor, including the supported types of constraints and the tableau generation mechanism. We also show the utility of our approach on an operational network monitoring database.
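The pattern-tableau idea can be sketched in a few lines: for each pattern (here, a single grouping-attribute value), compute its support and the fraction of its tuples satisfying a user-supplied constraint. This is a toy reconstruction with invented monitoring data, not Data Auditor's actual tableau-generation algorithm, which handles richer pattern languages and dependency constraints.

```python
# Hedged sketch: per-pattern (support, confidence) summary of a boolean constraint.
from collections import defaultdict

# Hypothetical monitoring tuples: (router_model, poll_arrived_on_time).
tuples = ([("m1", True)] * 9 + [("m1", False)] * 1 +
          [("m2", True)] * 2 + [("m2", False)] * 8)

groups = defaultdict(list)
for model, ok in tuples:
    groups[model].append(ok)

def tableau(groups, min_support=3):
    """One row per pattern: (pattern value, support, constraint confidence)."""
    return [(model, len(vals), sum(vals) / len(vals))
            for model, vals in sorted(groups.items())
            if len(vals) >= min_support]

for row in tableau(groups):
    print(row)
```

A row like ('m2', 10, 0.2) concisely flags the subset of the data where the constraint mostly fails, which is exactly the kind of summary an analyst would drill into.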
HLLV avionics requirements study and electronic filing system database development
NASA Technical Reports Server (NTRS)
1994-01-01
This final report provides a summary of achievements and activities performed under Contract NAS8-39215. The contract's objective was to explore a new way of delivering, storing, accessing, and archiving study products and information and to define top level system requirements for Heavy Lift Launch Vehicle (HLLV) avionics that incorporate Vehicle Health Management (VHM). This report includes technical objectives, methods, assumptions, recommendations, sample data, and issues as specified by DPD No. 772, DR-3. The report is organized into two major subsections, one specific to each of the two tasks defined in the Statement of Work: the Index Database Task and the HLLV Avionics Requirements Task. The Index Database Task resulted in the selection and modification of a commercial database software tool to contain the data developed during the HLLV Avionics Requirements Task. All summary information is addressed within each task's section.
SATRAT: Staphylococcus aureus transcript regulatory network analysis tool.
Gopal, Tamilselvi; Nagarajan, Vijayaraj; Elasri, Mohamed O
2015-01-01
Staphylococcus aureus is a commensal organism that primarily colonizes the nose of healthy individuals. S. aureus causes a spectrum of infections that range from skin and soft-tissue infections to fatal invasive diseases. S. aureus uses a large number of virulence factors that are regulated in a coordinated fashion. The complex regulatory mechanisms have been investigated in numerous high-throughput experiments. Access to these data is critical to studying this pathogen. Previously, we developed a compilation of microarray experimental data to enable researchers to search, browse, compare, and contrast transcript profiles. We have substantially updated this database and, based on it, have built a novel exploratory tool, SATRAT (the S. aureus transcript regulatory network analysis tool). This tool is capable of performing deep searches using a query and generating an interactive regulatory network based on associations among the regulators of any query gene. We believe this integrated regulatory network analysis tool will help researchers explore the missing links and identify novel pathways that regulate virulence in S. aureus. In addition, the data model and the network generation code used to build this resource are open source, enabling researchers to build similar resources for other bacterial systems.
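The network-generation step described above, deriving a regulatory neighborhood for a query gene from an edge list of regulator-to-target associations, can be sketched roughly as follows. The gene names are merely illustrative and this is not SATRAT's actual code:

```python
# Given directed regulator -> target edges, collect the regulators of a
# query gene plus the associations among those regulators themselves.
def regulatory_neighborhood(edges, query):
    regulators = {src for src, dst in edges if dst == query}
    # Keep edges into the query, plus edges among the regulators.
    sub = [(s, d) for s, d in edges
           if d == query or (s in regulators and d in regulators)]
    return regulators, sub

# Toy edge list using illustrative S. aureus regulator/gene names.
edges = [("sarA", "hla"), ("agr", "hla"), ("sarA", "agr"), ("agr", "spa")]
regs, sub = regulatory_neighborhood(edges, "hla")
print(sorted(regs))  # ['agr', 'sarA']
```

A browser front end would then lay out `sub` as an interactive graph; the point here is only the association-gathering logic.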
PGSB/MIPS Plant Genome Information Resources and Concepts for the Analysis of Complex Grass Genomes.
Spannagl, Manuel; Bader, Kai; Pfeifer, Matthias; Nussbaumer, Thomas; Mayer, Klaus F X
2016-01-01
PGSB (Plant Genome and Systems Biology; formerly MIPS, the Munich Institute for Protein Sequences) has been involved in developing, implementing and maintaining plant genome databases for more than a decade. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable datasets for model plant genomes as a backbone against which experimental data, e.g., from high-throughput functional genomics, can be organized and analyzed. In addition, genomes from both model and crop plants form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny) between related species on macro- and micro-levels. The genomes of many economically important Triticeae plants such as wheat, barley, and rye present a great challenge for sequence assembly and bioinformatic analysis due to their enormous complexity and large genome size. Novel concepts and strategies have been developed to deal with these difficulties and have been applied to the genomes of wheat, barley, rye, and other cereals. These include the GenomeZipper concept, reference-guided exome assembly, and "chromosome genomics" based on flow cytometry sorted chromosomes.
Comparing NetCDF and SciDB on managing and querying 5D hydrologic dataset
NASA Astrophysics Data System (ADS)
Liu, Haicheng; Xiao, Xiao
2016-11-01
Efficiently extracting information from high-dimensional hydro-meteorological modelling datasets requires smart solutions. Traditional methods are mostly file based, which makes the data easy to edit and access, but their contiguous storage structure limits efficiency. Databases have been proposed as an alternative, offering advantages such as native functionality for manipulating multidimensional (MD) arrays, smart caching strategies and scalability. In this research, NetCDF file-based solutions and the multidimensional array database management system (DBMS) SciDB, which applies a chunked storage structure, are benchmarked to determine the best solution for storing and querying a large 5D hydrologic modelling dataset. The effect of data storage configurations, including chunk size, dimension order and compression, on query performance is explored. Results indicate that the dimension order used to organize storage of the 5D data has a significant influence on query performance if the chunk size is very large, but the effect becomes insignificant when the chunk size is properly set. Compression in SciDB mostly has a negative influence on query performance. Caching is an advantage but may be influenced by the execution of different query processes. On the whole, the NetCDF solution without compression is in general more efficient than the SciDB DBMS.
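The chunk-size and dimension-order effects reported above come down to simple arithmetic: an axis-aligned range query must read every chunk it intersects, so a query shape that cuts across many chunks pays for all of them. A small stdlib sketch, with a hypothetical grid and chunk shapes rather than the paper's actual benchmark configuration:

```python
# Count how many chunks an axis-aligned range query intersects in a
# chunked multidimensional array.
def chunks_touched(query_start, query_stop, chunk_shape):
    """query_start inclusive, query_stop exclusive, per dimension."""
    n = 1
    for lo, hi, c in zip(query_start, query_stop, chunk_shape):
        n *= (hi - 1) // c - lo // c + 1
    return n

# A single-time-step slice over a (time, lat, lon) grid: small spatial
# chunks force many reads, a chunk matching the query shape needs one.
print(chunks_touched((0, 0, 0), (1, 400, 400), (1, 100, 100)))   # 16
print(chunks_touched((0, 0, 0), (1, 400, 400), (10, 400, 400)))  # 1
```

The second chunking reads only one (larger) chunk for this query but would be wasteful for a long time-series query at a single point, which is exactly the trade-off the benchmark explores.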
Zhu, JiangLing; Shi, Yue; Fang, LeQi; Liu, XingE; Ji, ChengJun
2015-06-01
The physical and mechanical properties of wood affect the growth and development of trees, and also act as the main criteria when determining wood usage. A better understanding of the patterns and controls of wood physical and mechanical properties could benefit forestry management and provide a basis for wood application and forest tree breeding. However, current studies on wood properties mainly focus on wood density and ignore other wood physical properties. In this study, we established a comprehensive database of wood physical properties across major tree species in China. Based on this database, we explored spatial patterns and driving factors of wood properties across major tree species in China. Our results showed that (i) compared with wood density, air-dried density, the tangential shrinkage coefficient and resilience provide more accuracy and higher explanatory power when used as evaluation indices of wood physical properties; and (ii) among life form, climatic and edaphic variables, life form is the dominant factor shaping spatial patterns of wood physical properties, followed by climatic factors, with edaphic factors having the least effect, suggesting that the effects of climatic factors on spatial variations of wood properties are indirectly induced by their effects on species distribution.
Representation and alignment of sung queries for music information retrieval
NASA Astrophysics Data System (ADS)
Adams, Norman H.; Wakefield, Gregory H.
2005-09-01
The pursuit of robust and rapid query-by-humming systems, which search melodic databases using sung queries, is a common theme in music information retrieval. The retrieval aspect of this database problem has received considerable attention, whereas the front-end processing of sung queries and the data structure used to represent melodies have been based on musical intuition and historical momentum. The present work explores three time series representations for sung queries: a sequence of notes, a "smooth" pitch contour, and a sequence of pitch histograms. The performance of the three representations is compared using a collection of naturally sung queries. It is found that the most robust performance is achieved by the representation with the highest dimension, the smooth pitch contour, but that this representation presents a formidable computational burden. For all three representations, it is necessary to align the query and target in order to achieve robust performance. The computational cost of the alignment is quadratic, hence it is necessary to keep the dimension small for rapid retrieval. Accordingly, iterative deepening is employed to achieve both robust performance and rapid retrieval. Finally, the conventional iterative framework is expanded to adapt the alignment constraints based on previous iterations, further expediting retrieval without degrading performance.
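The quadratic-cost alignment referred to above is typically a dynamic-time-warping recurrence; here is a minimal sketch over pitch contours stored as plain float lists. This illustrates the general technique, not the paper's exact algorithm or its iterative-deepening refinement:

```python
# Dynamic time warping between a sung-query pitch contour and a
# database melody; cost is the summed absolute pitch difference along
# the best monotonic alignment path. O(n*m) time and space.
def dtw(query, target):
    INF = float("inf")
    n, m = len(query), len(target)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(query[i - 1] - target[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # skip a query frame
                                 D[i][j - 1],      # skip a target frame
                                 D[i - 1][j - 1])  # match
    return D[n][m]

print(dtw([60.0, 62.0, 64.0], [60.0, 62.0, 64.0]))  # 0.0
```

Iterative deepening would run this recurrence at a coarse contour resolution first, pruning poor candidates before paying the quadratic cost at full resolution.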
View subspaces for indexing and retrieval of 3D models
NASA Astrophysics Data System (ADS)
Dutagaci, Helin; Godil, Afzal; Sankur, Bülent; Yemez, Yücel
2010-02-01
View-based indexing schemes for 3D object retrieval are gaining popularity since they provide good retrieval results. These schemes are coherent with the theory that humans recognize objects based on their 2D appearances. The view-based techniques also allow users to search with various queries such as binary images, range images and even 2D sketches. The previous view-based techniques use classical 2D shape descriptors such as Fourier invariants, Zernike moments, Scale Invariant Feature Transform-based local features and 2D Digital Fourier Transform coefficients. These methods describe each object independently of the others. In this work, we explore data-driven subspace models, such as Principal Component Analysis, Independent Component Analysis and Nonnegative Matrix Factorization, to describe the shape information of the views. We treat the depth images obtained from various points of the view sphere as 2D intensity images and train a subspace to extract the inherent structure of the views within a database. We also show the benefit of categorizing shapes according to their eigenvalue spread. Both the shape categorization and data-driven feature set conjectures are tested on the PSB database and compared with the competitor view-based 3D shape retrieval algorithms.
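The subspace-training step can be illustrated with a stdlib-only stand-in: power iteration recovering the first principal direction of a set of flattened depth-image vectors. The data is a toy; a real system would use an optimized linear-algebra library and keep many components, not one:

```python
# First principal direction of a set of flattened "views" via power
# iteration on the (implicit) covariance matrix.
import random

def first_pc(vectors, iters=200):
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[k] for v in vectors) / n for k in range(d)]
    X = [[v[k] - mean[k] for k in range(d)] for v in vectors]
    random.seed(0)
    w = [random.random() for _ in range(d)]
    for _ in range(iters):
        # w <- C @ w with C = X^T X / n, done as two mat-vec products
        proj = [sum(x[k] * w[k] for k in range(d)) for x in X]
        w = [sum(proj[i] * X[i][k] for i in range(n)) / n for k in range(d)]
        norm = sum(c * c for c in w) ** 0.5
        w = [c / norm for c in w]
    return w

# Toy 2-pixel "depth images": variance concentrated on coordinate 0.
views = [[5.0, 0.1], [-5.0, -0.1], [4.0, 0.0], [-4.0, 0.1]]
pc = first_pc(views)
print(abs(pc[0]) > 0.99)  # True: the subspace captures coordinate 0
```

Projecting each view onto the leading directions yields the compact descriptor the retrieval index actually stores.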
A bibliometric analysis of research in psychopharmacology by psychology departments (1987-2007).
Portillo-Salido, Enrique F
2010-05-01
From the very outset of scientific Psychology, psychologists have shown an interest in drugs and their effects on behavior. This has given rise to numerous contributions, mostly in the form of Psychopharmacology publications. The aim of this study was to quantitatively evaluate these contributions and compare them with those of other academic disciplines related to Psychopharmacology. Using the PubMed database, we retrieved information about articles from 15 journals included in the Pharmacology and Pharmacy category of the Journal Citation Reports database for a 21-year period (1987 to 2007). There were 37,540 articles, about 52% of which appeared in 3 journals. About 70% of psychology publications appeared in 2 of these journals. Psychology departments accounted for 11% of the published papers, placing Psychology third behind Psychiatry and Pharmacology, which contributed 22.69% and 13%, respectively. Psychology contributed the greatest number of studies in 3 journals, the second greatest in 3, and the third greatest in 8. This report represents the first effort to explore the contribution of academic Psychology to the multidisciplinary science of psychopharmacology. Although the leaders in production of psychopharmacology research were from Psychiatry and Pharmacology, Psychology departments are an important source of studies and thus of knowledge in the field of Psychopharmacology.
Biomedical Requirements for High Productivity Computing Systems
2005-04-01
server at http://www.ncbi.nlm.nih.gov/BLAST/. There are many variants of BLAST, including: 1. BLASTN, which compares a DNA query to a DNA database; ... a translated database (3 reading frames from each strand of the DNA); ... 4. TBLASTN, which compares a protein query to a DNA database translated in the 6 possible reading frames. [Extraction-damaged passage: a later section describes a chemical-structure search in which, after eliminating molecules that could not match the query, an atom-by-atom search of the remaining molecules is conducted.]
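The "6 possible reading frames" that TBLASTN-style searches work over can be enumerated directly: three codon offsets on the forward strand and three on its reverse complement. A small illustrative sketch (translation to amino acids omitted; this is the frame logic only, not BLAST itself):

```python
# Enumerate the six reading frames of a DNA sequence as codon lists:
# offsets 0, 1, 2 on the forward strand, then on the reverse complement.
def six_frames(dna):
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    rev = "".join(comp[b] for b in reversed(dna))
    frames = []
    for strand in (dna, rev):
        for offset in (0, 1, 2):
            codons = [strand[i:i + 3]
                      for i in range(offset, len(strand) - 2, 3)]
            frames.append(codons)
    return frames

frames = six_frames("ATGGCC")
print(len(frames))  # 6
print(frames[0])    # ['ATG', 'GCC']
```

TBLASTN conceptually translates each of these six codon streams and compares the protein query against every resulting peptide.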
Improved orthologous databases to ease protozoan targets inference.
Kotowski, Nelson; Jardim, Rodrigo; Dávila, Alberto M R
2015-09-29
Homology inference helps in identifying similarities, as well as differences, among organisms, which provides better insight into how closely related one might be to another. In addition, comparative genomics pipelines are widely adopted tools designed using different bioinformatics applications and algorithms. In this article, we propose a methodology to build improved orthologous databases with the potential to aid protozoan target identification, one of the many tasks that benefit from comparative genomics tools. Our analyses are based on OrthoSearch, a comparative genomics pipeline originally designed to infer orthologs through protein-profile comparison, supported by an HMM-based, reciprocal-best-hits approach. Our methodology allows OrthoSearch to confront two orthologous databases and to generate an improved new one, which can later be used to infer potential protozoan targets through a similarity analysis against the human genome. The protein sequences of the Cryptosporidium hominis, Entamoeba histolytica and Leishmania infantum genomes were comparatively analyzed against three orthologous databases: (i) EggNOG KOG, (ii) ProtozoaDB and (iii) KEGG Orthology (KO). That allowed us to create two new orthologous databases, "KO + EggNOG KOG" and "KO + EggNOG KOG + ProtozoaDB", with 16,938 and 27,701 orthologous groups, respectively. Such new orthologous databases were used for a regular OrthoSearch run. By confronting the "KO + EggNOG KOG" and "KO + EggNOG KOG + ProtozoaDB" databases with the protozoan species, we detected the following totals of orthologous groups and coverage (the ratio between the inferred orthologous groups and the species' total number of proteins): Cryptosporidium hominis: 1,821 (11%) and 3,254 (12%); Entamoeba histolytica: 2,245 (13%) and 5,305 (19%); Leishmania infantum: 2,702 (16%) and 4,760 (17%).
Using our HMM-based methodology and the largest created orthologous database, it was possible to infer 13 orthologous groups that represent potential protozoan targets; these were found because of our distant-homology approach. We also provide the number of species-specific, pair-to-pair and core groups from these analyses, depicted in Venn diagrams. The orthologous databases generated by our HMM-based methodology provide a broader dataset, with larger numbers of orthologous groups than the original databases used as input. These may be used for several homology inference analyses, annotation tasks and protozoan target identification.
Xia, Kai; Dong, Dong; Han, Jing-Dong J
2006-01-01
Background: Although protein-protein interaction (PPI) networks have been explored by various experimental methods, the maps so built are still limited in coverage and accuracy. To further expand the PPI network and to extract more accurate information from existing maps, studies have been carried out to integrate various types of functional relationship data. A frequently updated database of computationally analyzed potential PPIs that provides biological researchers with rapid and easy access to analyze original data as a biological network is still lacking. Results: By applying a probabilistic model, we integrated 27 heterogeneous genomic, proteomic and functional annotation datasets to predict PPI networks in human. In addition to previously studied data types, we show that phenotypic distances and genetic interactions can also be integrated to predict PPIs. We further built an easy-to-use, updatable integrated PPI database, the Integrated Network Database (IntNetDB), online, to provide automatic prediction and visualization of PPI networks among genes of interest. The networks can be visualized in SVG (Scalable Vector Graphics) format for zooming in or out. IntNetDB also provides a tool to extract topologically highly connected network neighborhoods from a specific network for further exploration and research. Using the MCODE (Molecular Complex Detection) algorithm, 190 such neighborhoods were detected among all the predicted interactions. The predicted PPIs can also be mapped to worm, fly and mouse interologs. Conclusion: IntNetDB includes 180,010 predicted protein-protein interactions among 9,901 human proteins and represents a useful resource for the research community. Our study has increased prediction coverage by five-fold. IntNetDB also provides easy-to-use network visualization and analysis tools that allow biological researchers unfamiliar with computational biology to access and analyze data over the internet.
The web interface of IntNetDB is freely accessible at . Visualization requires Mozilla version 1.8 (or higher) or Internet Explorer with installation of SVGviewer. PMID:17112386
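The probabilistic integration idea, in caricature: each evidence source contributes a likelihood ratio for "these two proteins interact", and under a naive independence assumption the ratios multiply into one interaction score. The source names and ratio values below are invented for illustration and are not IntNetDB's trained parameters:

```python
# Naive-Bayes-style evidence integration: multiply the likelihood
# ratios of the evidence sources observed for a candidate protein pair.
def combined_likelihood_ratio(evidence, source_lr):
    score = 1.0
    for source, observed in evidence.items():
        if observed:
            score *= source_lr[source]
    return score

# Illustrative (made-up) likelihood ratios per data type.
source_lr = {"coexpression": 3.0, "phenotype_distance": 2.0,
             "genetic_interaction": 5.0}
score = combined_likelihood_ratio(
    {"coexpression": True, "phenotype_distance": True,
     "genetic_interaction": False}, source_lr)
print(score)  # 6.0
```

Pairs whose combined score exceeds a chosen cutoff become predicted edges of the network; adding a new data type is just another factor in the product, which is what makes such a database easy to keep updated.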
rCAD: A Novel Database Schema for the Comparative Analysis of RNA.
Ozer, Stuart; Doshi, Kishore J; Xu, Weijia; Gutell, Robin R
2011-12-31
Beyond its direct involvement in protein synthesis with mRNA, tRNA, and rRNA, RNA is now being appreciated for its significance in the overall metabolism and regulation of the cell. Comparative analysis has been very effective in the identification and characterization of RNA molecules, including the accurate prediction of their secondary structure. We are developing an integrative scalable data management and analysis system, the RNA Comparative Analysis Database (rCAD), implemented with SQL Server to support RNA comparative analysis. The platform-agnostic database schema of rCAD captures the essential relationships between the different dimensions of information for RNA comparative analysis datasets. The rCAD implementation enables a variety of comparative analysis manipulations with multiple integrated data dimensions for advanced RNA comparative analysis workflows. In this paper, we describe details of the rCAD schema design and illustrate its usefulness with two usage scenarios.
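To make the relational framing concrete, here is a toy SQL sketch in the same spirit: sequences and per-column alignment residues as relational tables, with a conservation query grouped by alignment column. The table and column names are invented for illustration and are not the actual rCAD schema:

```python
# Toy relational model of a multiple sequence alignment, queried for
# per-column conservation with plain SQL (sqlite3 is in the stdlib).
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Sequence(seq_id INTEGER PRIMARY KEY, accession TEXT);
CREATE TABLE AlignmentColumn(
    seq_id INTEGER REFERENCES Sequence(seq_id),
    column_index INTEGER,
    nucleotide TEXT);
""")
con.executemany("INSERT INTO Sequence VALUES (?, ?)",
                [(1, "SEQ_A"), (2, "SEQ_B")])
con.executemany("INSERT INTO AlignmentColumn VALUES (?, ?, ?)",
                [(1, 0, "G"), (2, 0, "G"), (1, 1, "C"), (2, 1, "U")])
# Conservation: how many distinct residues appear in each column?
rows = con.execute("""
    SELECT column_index, COUNT(DISTINCT nucleotide)
    FROM AlignmentColumn GROUP BY column_index ORDER BY column_index
""").fetchall()
print(rows)  # [(0, 1), (1, 2)]
```

Keeping residues in rows rather than in flat sequence strings is what lets column-wise comparative statistics fall out of ordinary GROUP BY queries.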
Rahman, Md Motiur; Alatawi, Yasser; Cheng, Ning; Qian, Jingjing; Peissig, Peggy L; Berg, Richard L; Page, David C; Hansen, Richard A
2017-12-01
The US Food and Drug Administration Adverse Event Reporting System (FAERS), a post-marketing safety database, can be used to differentiate brand versus generic safety signals. To explore the methods for identifying and analyzing brand versus generic adverse event (AE) reports. Public release FAERS data from January 2004 to March 2015 were analyzed using alendronate and carbamazepine as examples. Reports were classified as brand, generic, and authorized generic (AG). Disproportionality analyses compared reporting odds ratios (RORs) of selected known labeled serious adverse events, stratifying by brand, generic, and AG. The homogeneity of these RORs was compared using the Breslow-Day test. The AG versus generic comparison was the primary focus, since the AG is identical to brand but marketed as a generic, thereby minimizing generic perception bias. Sensitivity analyses explored how methodological approach influenced results. Based on 17,521 US event reports involving alendronate and 3733 US event reports involving carbamazepine (immediate and extended release), no consistently significant differences were observed across RORs for the AGs versus generics. Similar results were obtained when comparing reporting patterns over all time and just after generic entry. The most restrictive approach for classifying AE reports yielded smaller report counts but similar results. Differentiation of FAERS reports as brand versus generic requires careful attention to the risk of product misclassification, but the relative stability of findings across varying assumptions supports the utility of these approaches for potential signal detection.
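The reporting odds ratio used in the disproportionality analyses above is computed from a 2x2 table of report counts; a minimal sketch with the standard 95% confidence interval (the counts are made up, not FAERS data):

```python
# Reporting odds ratio (ROR) for a drug/event 2x2 table, with a 95% CI
# on the log-odds scale.
import math

def reporting_odds_ratio(a, b, c, d):
    """a: drug + event, b: drug + other events,
    c: other drugs + event, d: other drugs + other events."""
    ror = (a / b) / (c / d)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lo = math.exp(math.log(ror) - 1.96 * se)
    hi = math.exp(math.log(ror) + 1.96 * se)
    return ror, (lo, hi)

ror, ci = reporting_odds_ratio(20, 980, 10, 990)  # made-up counts
print(round(ror, 2))  # 2.02
```

Stratifying by brand, generic, and AG simply means building one such table per product class and then testing the resulting RORs for homogeneity, as the abstract describes with the Breslow-Day test.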
Tri-Clustered Tensor Completion for Social-Aware Image Tag Refinement.
Tang, Jinhui; Shu, Xiangbo; Qi, Guo-Jun; Li, Zechao; Wang, Meng; Yan, Shuicheng; Jain, Ramesh
2017-08-01
Social image tag refinement, which aims to improve tag quality by automatically completing the missing tags and rectifying the noise-corrupted ones, is an essential component for social image search. Conventional approaches mainly focus on exploring the visual and tag information, without considering the user information, which often reveals important hints on the (in)correct tags of social images. Towards this end, we propose a novel tri-clustered tensor completion framework to collaboratively explore these three kinds of information to improve the performance of social image tag refinement. Specifically, the inter-relations among users, images and tags are modeled by a tensor, and the intra-relations between users, images and tags are explored by three regularizations respectively. To address the challenges of the super-sparse and large-scale tensor factorization that demands expensive computing and memory cost, we propose a novel tri-clustering method to divide the tensor into a certain number of sub-tensors by simultaneously clustering users, images and tags into a bunch of tri-clusters. And then we investigate two strategies to complete these sub-tensors by considering (in)dependence between the sub-tensors. Experimental results on a real-world social image database demonstrate the superiority of the proposed method compared with the state-of-the-art methods.
Xu, Yanjun; Yang, Haixiu; Wu, Tan; Dong, Qun; Sun, Zeguo; Shang, Desi; Li, Feng; Xu, Yingqi; Su, Fei; Liu, Siyao
2017-01-01
BioM2MetDisease is a manually curated database that aims to provide a comprehensive and experimentally supported resource of associations between metabolic diseases and various biomolecules. Recently, metabolic diseases such as diabetes have become one of the leading threats to people’s health. Metabolic diseases are associated with alterations of multiple types of biomolecules, such as miRNAs and metabolites. An integrated, high-quality data source collecting metabolic disease-associated biomolecules is essential for exploring the underlying molecular mechanisms and discovering novel therapeutics. Here, we developed the BioM2MetDisease database, which currently documents 2681 entries of relationships between 1147 biomolecules (miRNAs, metabolites and small molecules/drugs) and 78 metabolic diseases across 14 species. Each entry includes the biomolecule category, species, biomolecule name, disease name, dysregulation pattern, experimental technique, a brief description of the metabolic disease-biomolecule relationship, the reference, and additional annotation information. BioM2MetDisease provides a user-friendly interface to explore and retrieve all data conveniently. A submission page is also offered for researchers to submit new associations between biomolecules and metabolic diseases. BioM2MetDisease provides a comprehensive resource for studying how biomolecules act in metabolic diseases, and it is helpful for understanding the molecular mechanisms of and developing novel therapeutics for metabolic diseases. Database URL: http://www.bio-bigdata.com/BioM2MetDisease/ PMID:28605773
CIM explorer--intelligent tool for exploring the ICD Romanian version.
Filip, F; Haras, C
2000-01-01
The CIM Explorer software provides an intelligent interface for exploring the Romanian version of the International Classification of Diseases (in Romanian, Clasificarea Internationala a Maladiilor, CIM). The ICD was transposed from its initial appearance as a printed document into a database. The classification can be accessed in two modes, "Navigation" and "Code", and queried in the "Key words" mode. In the last mode, the CIM Explorer program searches for the right content of the ICD records starting from naturally written expressions, which it "understands". As a result, it returns all the records containing the key words regardless of their grammatical form. This program accommodates the specificity of the Romanian language, in which words are made up of a root and a flexional termination.
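The "Key words" behavior, matching records on word roots regardless of flexional endings, can be caricatured with a crude fixed-length prefix stem. A real system would use a proper Romanian stemmer; everything below (function names, the English toy records) is purely illustrative:

```python
# Root-based keyword lookup: a record matches when every query word
# shares a root (here, a fixed-length prefix stem) with some record word.
def stem(word, min_root=4):
    return word.lower()[:min_root]

def keyword_search(records, query):
    q_stems = {stem(w) for w in query.split()}
    hits = []
    for rec in records:
        r_stems = {stem(w) for w in rec.split()}
        if q_stems <= r_stems:  # every query stem found in the record
            hits.append(rec)
    return hits

records = ["acute viral hepatitis", "chronic hepatic failure"]
print(keyword_search(records, "hepatitis acute"))
# -> ['acute viral hepatitis']
```

Note that "hepatitis" and "hepatic" share the stem "hepa", so the single-word query "hepatitis" would match both records, which is the ending-insensitive behavior the abstract describes.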
Erich, Roger; Eaton, Melinda; Mayes, Ryan; Pierce, Lamar; Knight, Andrew; Genovesi, Paul; Escobar, James; Mychalczuk, George; Selent, Monica
2016-08-01
Preparing data for medical research can be challenging, detail oriented, and time consuming. Transcription errors, missing or nonsensical data, and records not applicable to the study population may hamper progress and, if unaddressed, can lead to erroneous conclusions. In addition, study data may be housed in multiple disparate databases and complex formats, and existing merging methods may be inadequate for obtaining temporally synchronized data elements. We created a comprehensive database to explore the general hypothesis that environmental and occupational factors influence health outcomes and risk-taking behavior among active duty Air Force personnel. Several databases containing demographics, medical records, health survey responses, and safety incident reports were cleaned, validated, and linked to form a comprehensive, relational database. The final step involved removing and transforming personally identifiable information to form a Health Insurance Portability and Accountability Act compliant limited database. Initial data consisted of over 62.8 million records containing 221 variables. When completed, approximately 23.9 million clean and valid records with 214 variables remained. With a clean, robust database, future analysis aims to identify high-risk career fields for targeted interventions or uncover potential protective factors in low-risk career fields. Reprint & Copyright © 2016 Association of Military Surgeons of the U.S.
Children's growth: a health indicator and a diagnostic tool.
Gelander, Lars
2006-05-01
The publication of Werner and Bodin in Acta Paediatrica should inspire countries to use the growth of children as an indicator of health. The development of databases that cover all measurements of all children that have contact with healthcare and medical care will provide new knowledge in this area. Such databases will give us the opportunity to explore health in different areas of the country and to evaluate community projects in order to prevent obesity. Growth charts that are used to identify sick children or children that have other causes for growth disturbances must reflect how a healthy child should grow. If such prescriptive growth charts are computerized together with regional databases, they will provide necessary growth data for descriptive health surveys.
Tagare, Hemant D.; Jaffe, C. Carl; Duncan, James
1997-01-01
Information contained in medical images differs considerably from that residing in alphanumeric format. The difference can be attributed to four characteristics: (1) the semantics of medical knowledge extractable from images is imprecise; (2) image information contains form and spatial data, which are not expressible in conventional language; (3) a large part of image information is geometric; (4) diagnostic inferences derived from images rest on an incomplete, continuously evolving model of normality. This paper explores the differentiating characteristics of text versus images and their impact on the design of a medical image database intended to allow content-based indexing and retrieval. One strategy for implementing medical image databases is presented, which employs object-oriented iconic queries, semantics by association with prototypes, and a generic schema. PMID:9147338
76 FR 72931 - Agency Information Collection Activities: Proposed Collection; Comment Request
Federal Register 2010, 2011, 2012, 2013, 2014
2011-11-28
... Systems (CAHPS) Clinician and Group Survey Comparative Database.'' In accordance with the Paperwork... Providers and Systems (CAHPS) Clinician and Group Survey Comparative Database The Agency for Healthcare..., and provided critical data illuminating key aspects of survey design and administration. In July 2007...
Butyaev, Alexander; Mavlyutov, Ruslan; Blanchette, Mathieu; Cudré-Mauroux, Philippe; Waldispühl, Jérôme
2015-09-18
Recent releases of genome three-dimensional (3D) structures have the potential to transform our understanding of genomes. Nonetheless, storage technology and visualization tools need to evolve to offer the scientific community fast and convenient access to these data. We introduce simultaneously a database system to store and query 3D genomic data (3DBG), and a 3D genome browser to visualize and explore 3D genome structures (3DGB). We benchmark 3DBG against state-of-the-art systems and demonstrate that it is faster than previous solutions and, importantly, scales gracefully with the size of the data. We also illustrate the usefulness of our 3D genome Web browser for exploring human genome structures. The 3D genome browser is available at http://3dgb.cs.mcgill.ca/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
NASA Astrophysics Data System (ADS)
Wantuch, Andrew C.; Vita, Joshua A.; Jimenez, Edward S.; Bray, Iliana E.
2016-10-01
Despite object detection, recognition, and identification being very active areas of computer vision research, many of the available tools to aid in these processes are designed with only photographs in mind. Although some algorithms used specifically for feature detection and identification may not take explicit advantage of the colors available in the image, they still under-perform on radiographs, which are grayscale images. We are especially interested in the robustness of these algorithms, specifically their performance on a preexisting database of X-ray radiographs in compressed JPEG form, with multiple ways of describing pixel information. We will review various aspects of the performance of available feature detection and identification systems, including MATLAB's Computer Vision Toolbox, VLFeat, and OpenCV, on our non-ideal database. In the process, we will explore possible reasons for the algorithms' lessened ability to detect and identify features in the X-ray radiographs.
Plans for the extreme ultraviolet explorer data base
NASA Technical Reports Server (NTRS)
Marshall, Herman L.; Dobson, Carl A.; Malina, Roger F.; Bowyer, Stuart
1988-01-01
The paper presents an approach for storage and fast access to data that will be obtained by the Extreme Ultraviolet Explorer (EUVE), a satellite payload scheduled for launch in 1991. The EUVE telescopes will be operated remotely from the EUVE Science Operation Center (SOC) located at the University of California, Berkeley. The EUVE science payload consists of three scanning telescopes carrying out an all-sky survey in the 80-800 A spectral region and a Deep Survey/Spectrometer telescope performing a deep survey in the 80-250 A spectral region. Guest Observers will remotely access the EUVE spectrometer database at the SOC. The EUVE database will consist of about 2 x 10^10 bytes of information in a very compact form, very similar to the raw telemetry data. A history file will be built concurrently giving telescope parameters, command history, attitude summaries, engineering summaries, anomalous events, and ephemeris summaries.
Towards G2G: Systems of Technology Database Systems
NASA Technical Reports Server (NTRS)
Maluf, David A.; Bell, David
2005-01-01
We present an approach and methodology for developing Government-to-Government (G2G) Systems of Technology Database Systems. G2G will deliver technologies for distributed and remote integration of technology data for internal use in analysis and planning as well as for external communications. G2G enables NASA managers, engineers, operational teams, and information systems to "compose" technology roadmaps and plans by selecting, combining, extending, specializing, and modifying components of technology database systems. G2G will interoperate information and knowledge distributed across the organizational entities involved, which is ideal for NASA's future Exploration Enterprise. Key contributions of the G2G system will include the creation of an integrated approach to sustain effective management of technology investments while supporting the ability of the various technology database systems to be independently managed. The integration technology will comply with emerging open standards. Applications can thus be customized for local needs while enabling an integrated management-of-technology approach that serves the global needs of NASA. The G2G capabilities will use NASA's breakthrough in database "composition" and integration technology, will use and advance emerging open standards, and will use commercial information technologies to enable effective Systems of Technology Database systems.
Andresen, Kristoffer; Pommergaard, Hans-Christian; Rosenberg, Jacob
2015-01-01
Background. Open access (OA) journals allow access to research papers free of charge to the reader. Traditionally, biomedical researchers use databases like MEDLINE and EMBASE to discover new advances. However, biomedical OA journals might not fulfill such databases' criteria, hindering dissemination. The Directory of Open Access Journals (DOAJ) is a database exclusively listing OA journals. The aim of this study was to investigate DOAJ's coverage of biomedical OA journals compared with the conventional biomedical databases. Methods. Information on all journals listed in four conventional biomedical databases (MEDLINE, PubMed Central, EMBASE, and SCOPUS) and DOAJ was gathered. Journals were included if they were (1) actively publishing, (2) full OA, (3) prospectively indexed in one or more databases, and (4) of biomedical subject. Impact factor and journal language were also collected. DOAJ was compared with the conventional databases regarding the proportion of journals covered, along with their impact factor and publishing language. The proportion of journals with articles indexed by DOAJ was determined. Results. In total, 3,236 biomedical OA journals were included in the study. Of the included journals, 86.7% were listed in DOAJ. Combined, the conventional biomedical databases listed 75.0% of the journals: 18.7% in MEDLINE, 36.5% in PubMed Central, 51.5% in SCOPUS, and 50.6% in EMBASE. Of the journals in DOAJ, 88.7% published in English and 20.6% had received an impact factor for 2012, compared with 93.5% and 26.0%, respectively, for journals in the conventional biomedical databases. A subset of 51.1% and 48.5% of the journals in DOAJ had articles indexed from 2012 and 2013, respectively. Of journals exclusively listed in DOAJ, one journal had received an impact factor for 2012, and 59.6% of the journals had no content from 2013 indexed in DOAJ. Conclusions. DOAJ is the most complete registry of biomedical OA journals compared with the four conventional biomedical databases. 
However, DOAJ only indexes articles for half of the biomedical journals listed, making it an incomplete source for biomedical research papers in general. PMID:26038727
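The coverage figures above reduce to set arithmetic over journal identifiers. A minimal sketch of that computation follows; the identifiers are invented for illustration and are not data from the study:

```python
# hypothetical journal identifiers per index (illustrative only)
doaj    = {"0001", "0002", "0003", "0004"}
medline = {"0001"}
pmc     = {"0001", "0002"}
scopus  = {"0001", "0002", "0003"}
embase  = {"0002", "0003"}

# study universe: every qualifying OA journal found in any database
universe = doaj | medline | pmc | scopus | embase

def coverage(index):
    """Proportion of the OA-journal universe listed in a given index."""
    return len(index & universe) / len(universe)

conventional = medline | pmc | scopus | embase   # combined conventional listing
```

In the study itself, the corresponding proportions came out to 86.7% for DOAJ and 75.0% for the conventional databases combined.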
NASA Astrophysics Data System (ADS)
Morrison, S. M.; Downs, R. T.; Golden, J. J.; Pires, A.; Fox, P. A.; Ma, X.; Zednik, S.; Eleish, A.; Prabhu, A.; Hummer, D. R.; Liu, C.; Meyer, M.; Ralph, J.; Hystad, G.; Hazen, R. M.
2016-12-01
We have developed a comprehensive database of copper (Cu) mineral characteristics. These data include crystallographic, paragenetic, chemical, locality, age, structural complexity, and physical property information for the 689 Cu mineral species approved by the International Mineralogical Association (rruff.info/ima). Synthesis of this large, varied dataset allows for in-depth exploration of statistical trends and visualization techniques. With social network analysis (SNA) and cluster analysis of minerals, we create sociograms and chord diagrams. SNA visualizations illustrate the relationships and connectivity between mineral species, which often form cliques associated with rock type and/or geochemistry. Using mineral ecology statistics, we analyze the mineral-locality frequency distribution and predict the number of missing mineral species, visualized with accumulation curves. By assembling two-dimensional KLEE diagrams of co-existing elements in minerals, we illustrate geochemical trends within a mineral system. To explore mineral age and chemical oxidation state, we create skyline diagrams and compare trends across varying chemistry. These trends illustrate mineral redox changes through geologic time and correlate with significant geologic events, such as the Great Oxidation Event (GOE) and Wilson Cycles.
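The sociograms and chord diagrams described above rest on a mineral co-occurrence network. A minimal sketch of how such a network can be built from mineral-locality records (the species-locality pairs below are invented for illustration, not drawn from the 689-species database):

```python
from itertools import combinations

# hypothetical mineral -> locality sets (illustrative only)
occurs = {
    "chalcopyrite": {"A", "B", "C"},
    "bornite":      {"A", "B"},
    "malachite":    {"B", "C"},
    "cuprite":      {"D"},
}

# edge weight = number of shared localities; weighted edges are the
# input to both sociogram layout and chord-diagram ribbon widths
edges = {
    (m1, m2): len(occurs[m1] & occurs[m2])
    for m1, m2 in combinations(sorted(occurs), 2)
    if occurs[m1] & occurs[m2]
}
```

Densely interconnected groups in this graph are the cliques the abstract associates with rock type and geochemistry; isolated species (here "cuprite") get no edges at all.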
Is that your pager or mine: a survey of women academic family physicians in dual physician families.
Schrager, Sarina; Kolan, Anne; Dottl, Susan L
2007-08-01
This study explored the unique challenges and strategies of women in academic family medicine who are in dual physician families. An e-mail survey was sent to all female physician members of the Society of Teachers of Family Medicine (STFM) who were listed in the online database. The survey collected demographic information and details of job descriptions and family life, and included 3 open-ended questions about the experiences of dual physician families. Over 1,200 surveys were sent to women physicians in academic family medicine; 159 were returned. Half of the women worked full time, compared to 87% of their partners. Most women reported benefits of having a physician partner, including support and an understanding person at home, though scheduling conflicts and childcare responsibilities contributed to the need for job compromises. Women prioritized finding work-life balance and having supportive partners and mentors as most important to their success as academic family physicians. Dual physician relationships involve rewards and conflicts. More research should explore the competing demands of family life and success in academic medicine.
Wang, Yao-Ting; Chen, Hsi-Han; Lin, Ching-Heng; Lee, Shih-Hsiung; Chan, Chin-Hong; Huang, Shiau-Shian
2016-10-30
Previous studies indicated that panic disorder is correlated with erectile dysfunction (ED). The primary aim of this study was to explore the incidence rate of ED among panic disorder patients in an Asian country. The secondary aims were to compare the risk of ED in panic disorder patients treated with different kinds of antidepressants and to explore the possible mechanism linking these two disorders. We identified 1,393 male patients with newly diagnosed panic disorder from Taiwan's National Health Insurance Database. Four matched controls per case were selected by propensity score. After adjusting for age, obesity, and comorbidities, the panic disorder patients had a higher hazard ratio for an ED diagnosis than the controls, especially among the untreated panic disorder patients. This retrospective dynamic cohort study supports the link between ED and prior panic disorder in a large sample of panic disorder patients, and points to the need for early antidepressant treatment of panic disorder to prevent subsequent ED.
Risk of large oil spills: a statistical analysis in the aftermath of Deepwater Horizon.
Eckle, Petrissa; Burgherr, Peter; Michaux, Edouard
2012-12-04
The oil spill in the Gulf of Mexico that followed the explosion of the exploration platform Deepwater Horizon on 20 April 2010 was the largest accidental oil spill so far. In this paper we evaluate the risk of such very severe oil spills based on global historical data from our Energy-Related Severe Accident Database (ENSAD) and investigate whether an accident of this size could have been "expected". We also compare the risk of oil spills from such accidents in exploration and production to accidental spills from other activities in the oil chain (tanker ship transport, pipelines, storage/refinery), and analyze the two components of risk, frequency and severity (quantity of oil spilled), separately. This detailed analysis reveals differences in the structure of risk between spill sources and in trends over time, and in particular allows assessment of the risk of very severe events such as the Deepwater Horizon spill. Such top-down risk assessment can serve as an important input to decision making by complementing bottom-up engineering risk assessment, and can be combined with impact assessment in environmental risk analysis.
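The frequency/severity decomposition described in the abstract can be sketched in a few lines; the spill records, observation window, and sizes below are invented for illustration, not ENSAD data:

```python
# hypothetical severe-spill records: (year, tonnes spilled), illustrative only
spills = [(1979, 287_000), (1983, 260_000), (1991, 260_000), (2010, 627_000)]
years_observed = 41   # e.g. a 1970-2010 observation window

# frequency component: severe events per year of observation
frequency = len(spills) / years_observed

# severity component: empirical chance that a spill, once it occurs,
# reaches at least a given size
def exceedance(tonnes):
    return sum(1 for _, t in spills if t >= tonnes) / len(spills)
```

The expected rate of spills above a given size is then frequency * exceedance(size), which is the kind of quantity one would compare against an event like Deepwater Horizon when asking whether it could have been "expected".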
WholePathwayScope: a comprehensive pathway-based analysis tool for high-throughput data
Yi, Ming; Horton, Jay D; Cohen, Jonathan C; Hobbs, Helen H; Stephens, Robert M
2006-01-01
Background Analysis of high-throughput (HTP) data such as microarray and proteomics data has provided a powerful methodology to study patterns of gene regulation at genome scale. A major unresolved problem in the post-genomic era is to assemble the large amounts of data generated into a meaningful biological context. We have developed a comprehensive software tool, WholePathwayScope (WPS), for deriving biological insights from analysis of HTP data. Results WPS extracts gene lists with shared biological themes through color cue templates. WPS statistically evaluates global functional category enrichment of gene lists and pathway-level pattern enrichment of data. WPS incorporates well-known biological pathways from KEGG (Kyoto Encyclopedia of Genes and Genomes) and BioCarta, GO (Gene Ontology) terms, as well as user-defined pathways or relevant gene clusters or groups, and explores gene-term relationships within the derived gene-term association networks (GTANs). WPS simultaneously compares multiple datasets within biological contexts, either as pathways or as association networks. WPS also integrates the Genetic Association Database and the Partial MedGene Database for disease-association information. We have used this program to analyze and compare microarray and proteomics datasets derived from a variety of biological systems. Application examples demonstrated the capacity of WPS to significantly facilitate the analysis of HTP data for integrative discovery. Conclusion This tool represents a pathway-based platform for discovery integration to maximize analysis power. The tool is freely available at . PMID:16423281
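Functional category enrichment of a gene list, as evaluated by tools of this kind, is commonly a hypergeometric tail test. The sketch below is a self-contained stand-in for that statistic, not WPS's actual implementation (which the abstract does not specify):

```python
from math import comb

def enrichment_p(N, K, n, k):
    """P(X >= k): chance of seeing at least k category members in a list of n
    genes drawn from a genome of N genes, K of which are in the category."""
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / comb(N, n)

# toy example: 5 of 20 genes belong to a pathway; a 5-gene hit list contains 4
p = enrichment_p(N=20, K=5, n=5, k=4)
```

A small p means the hit list is unlikely to contain that many pathway members by chance, i.e. the category is enriched in the list.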
Ishihara, Masaru; Onoguchi, Masahisa; Taniguchi, Yasuyo; Shibutani, Takayuki
2017-12-01
The aim of this study was to clarify the differences in thallium-201-chloride (thallium-201) myocardial perfusion imaging (MPI) scans evaluated by conventional Anger-type single-photon emission computed tomography (conventional SPECT) versus cadmium-zinc-telluride SPECT (CZT SPECT) imaging, using normal databases for different ethnic groups. MPI scans from 81 consecutive Japanese patients were examined using conventional SPECT and CZT SPECT and analyzed with the pre-installed quantitative perfusion SPECT (QPS) software. We compared the summed stress score (SSS), summed rest score (SRS), and summed difference score (SDS) for the two SPECT devices. For a normal MPI reference, we usually use the Japanese normal databases for MPI created by the Japanese Society of Nuclear Medicine, which can be used with conventional SPECT but not with CZT SPECT. In this study, we used new Japanese normal databases constructed at our institution to compare conventional and CZT SPECT. Compared with conventional SPECT, CZT SPECT showed lower SSS (p < 0.001), SRS (p = 0.001), and SDS (p = 0.189) using the pre-installed SPECT database. In contrast, CZT SPECT showed no significant difference from conventional SPECT in QPS analysis using the normal databases from our institution. Myocardial perfusion analyses by CZT SPECT should therefore be evaluated using normal databases matched to the ethnic group being evaluated.
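The QPS summed scores compared in this study are sums over the standard 17-segment model (segment scores from 0 = normal to 4 = absent uptake), with SDS defined as SSS minus SRS. A minimal sketch with invented segment scores:

```python
# hypothetical 17-segment perfusion scores (illustrative only)
stress = [0, 1, 2, 3, 2, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
rest   = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
assert len(stress) == len(rest) == 17

sss = sum(stress)  # summed stress score
srs = sum(rest)    # summed rest score
sds = sss - srs    # summed difference score (reversible component)
```

A systematic device-dependent shift in segment scores, like the one found here between Anger and CZT cameras, propagates directly into these sums, which is why device-matched normal databases matter.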
Gomez, Gabriela B.; Borquez, Annick; Case, Kelsey K.; Wheelock, Ana; Vassall, Anna; Hankins, Catherine
2013-01-01
Background Cost-effectiveness studies inform resource allocation, strategy, and policy development. However, due to their complexity, dependence on assumptions made, and inherent uncertainty, synthesising and generalising their results can be difficult. We assess cost-effectiveness models evaluating expected health gains and costs of HIV pre-exposure prophylaxis (PrEP) interventions. Methods and Findings We conducted a systematic review comparing epidemiological and economic assumptions of cost-effectiveness studies using various modelling approaches. The following databases were searched (until January 2013): PubMed/Medline, ISI Web of Knowledge, Centre for Reviews and Dissemination databases, EconLIT, and region-specific databases. We included modelling studies reporting both cost and expected impact of a PrEP roll-out. We explored five issues: prioritisation strategies, adherence, behaviour change, toxicity, and resistance. Of 961 studies retrieved, 13 were included. Studies modelled populations (heterosexual couples, men who have sex with men, people who inject drugs) in generalised and concentrated epidemics from Southern Africa (including South Africa), Ukraine, USA, and Peru. PrEP was found to have the potential to be a cost-effective addition to HIV prevention programmes in specific settings. The extent of the impact of PrEP depended upon assumptions made concerning cost, epidemic context, programme coverage, prioritisation strategies, and individual-level adherence. Delivery of PrEP to key populations at highest risk of HIV exposure appears the most cost-effective strategy. Limitations of this review include the partial geographical coverage, our inability to perform a meta-analysis, and the paucity of information available exploring trade-offs between early treatment and PrEP. 
Conclusions Our review identifies the main considerations to address in assessing cost-effectiveness analyses of a PrEP intervention: cost, epidemic context, individual adherence level, PrEP programme coverage, and prioritisation strategy. Cost-effectiveness studies indicating where resources can be applied for greatest impact are essential to guide resource allocation decisions; however, the results of such analyses must be considered within the context of the underlying assumptions made. Please see later in the article for the Editors' Summary. PMID:23554579
Zonal management of arsenic contaminated ground water in Northwestern Bangladesh.
Hill, Jason; Hossain, Faisal; Bagtzoglou, Amvrossios C
2009-09-01
This paper used ordinary kriging to spatially map arsenic contamination in shallow aquifers of Northwestern Bangladesh (total area approximately 35,000 km(2)). The Northwestern region was selected because it represents a relatively safer source of large-scale and affordable water supply for the rest of Bangladesh, which currently faces extensive arsenic contamination in drinking water (such as in the Southern regions). The work thus explored sustainability issues by building upon a previously published study (Hossain et al., 2007; Water Resources Management, vol. 21: 1245-1261) in which a more general nation-wide assessment afforded by kriging was presented. The arsenic database for reference comprised the nation-wide survey (of 3,534 drinking wells) completed in 1999 by the British Geological Survey (BGS) in collaboration with the Department of Public Health Engineering (DPHE) of Bangladesh. Randomly sampled networks of zones from this reference database were used to develop an empirical variogram and maps of zonal arsenic concentration for the Northwestern region. The remaining non-sampled zones from the reference database were used to assess the accuracy of the kriged maps. Two additional criteria were explored: (1) the ability of geostatistical interpolators such as kriging to extrapolate information on the spatial structure of arsenic contamination beyond small-scale exploratory domains; and (2) the impact of a priori knowledge of anisotropic variability on the effectiveness of geostatistically based management. On average, the kriging method was found to have a 90% probability of successfully predicting safe zones according to the WHO safe limit of 10 ppb, while for the Bangladesh safe limit of 50 ppb the safe-zone prediction probability was 97%. Compared to the previous study by Hossain et al. (2007) over the rest of the contaminated countryside, the probability of successful detection of safe zones in the Northwest was about 25% higher. A priori knowledge of anisotropy was found to have an inconclusive impact on the effectiveness of kriging. It was hypothesized, however, that a preferential sampling strategy honoring anisotropy may be necessary to reach a more definitive conclusion on this issue.
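Ordinary kriging of the kind used here predicts the value at an unsampled location as a weighted sum of nearby observations, with the weights solved from a variogram model under the constraint that they sum to one. A minimal self-contained sketch, assuming NumPy; the exponential variogram, its parameters, and the well data are illustrative, not those fitted to the BGS/DPHE survey:

```python
import numpy as np

def variogram(h, sill=1.0, rng=2.0):
    """Exponential variogram model (illustrative parameters, no nugget)."""
    return sill * (1.0 - np.exp(-h / rng))

def ordinary_krige(xy, z, x0):
    """Ordinary kriging estimate at location x0 from samples (xy, z)."""
    n = len(z)
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[n, n] = 0.0                     # Lagrange row/column: weights sum to 1
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(xy - x0, axis=1))
    w = np.linalg.solve(A, b)         # kriging weights plus Lagrange multiplier
    return float(w[:n] @ z)

# three hypothetical wells (km coordinates) with arsenic concentrations (ppb)
wells = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
arsenic = np.array([5.0, 60.0, 12.0])
estimate = ordinary_krige(wells, arsenic, np.array([0.2, 0.2]))
```

Because the variogram is zero at distance zero (no nugget here), the predictor honors the data exactly at sampled wells; classifying a zone as safe then amounts to thresholding the kriged concentration against the 10 ppb or 50 ppb limit.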
Selecting Data-Base Management Software for Microcomputers in Libraries and Information Units.
ERIC Educational Resources Information Center
Pieska, K. A. O.
1986-01-01
Presents a model for the evaluation of database management systems software from the viewpoint of librarians and information specialists. The properties of data management systems, database management systems, and text retrieval systems are outlined and compared. (10 references) (CLB)
Yonker, V A; Young, K P; Beecham, S K; Horwitz, S; Cousin, K
1990-01-01
This study was designed to make a comparative evaluation of the performance of MEDLINE in covering serial literature. Forensic medicine was chosen because it is an interdisciplinary subject area that would test MEDLARS at the periphery of the system. The evaluation of database coverage was based upon articles included in the bibliographies of scholars in the field of forensic medicine. This method was considered appropriate for characterizing work used by researchers in this field. The results of comparing MEDLINE to other databases evoked some concerns about the selective indexing policy of MEDLINE in serving the interests of those working in forensic medicine. PMID:2403829
High Acceleration, High Life Cycle, Reusable In-Space Main Engine: 2000-2004
NASA Technical Reports Server (NTRS)
2004-01-01
This custom bibliography from the NASA Scientific and Technical Information Program lists a sampling of records found in the NASA Aeronautics and Space Database. The scope of this topic includes technologies for the crew exploration vehicle. This area of focus is one of the enabling technologies as defined by NASA's Report of the President's Commission on Implementation of United States Space Exploration Policy, published in June 2004.
Burnham, Judy F
2006-03-08
The Scopus database provides access to STM journal articles and the references included in those articles, allowing the searcher to search both forward and backward in time. The database can be used for collection development as well as for research. This review provides information on the key points of the database and compares it to Web of Science. Neither database is all-inclusive; rather, the two complement each other. If a library can afford only one, the choice must be based on institutional needs.