DOT National Transportation Integrated Search
2004-05-01
For estimating system-total unlinked passenger trips and passenger miles of a fixed-route bus system for the National Transit Database (NTD), the FTA-approved sampling plans may either over-sample or fail to yield the FTA's required confidence and p...
NASA Astrophysics Data System (ADS)
Cheng, T.; Zhou, X.; Jia, Y.; Yang, G.; Bai, J.
2018-04-01
In the project of China's First National Geographic Conditions Census, millions of sample data records were collected across the country for interpreting land cover from remote sensing images; the number of data files exceeds 12,000,000 and has continued to grow in the follow-on project of National Geographic Conditions Monitoring. Storing such big data in a database such as Oracle is currently the most effective approach, but a well-suited method for managing and applying the sample data matters even more. This paper studies a database construction method based on a relational database combined with a distributed file system, in which the vector data and the file data are stored in different physical locations. The key issues and their solutions are discussed. On this basis, the paper studies application methods for the sample data and analyzes several use cases, laying a foundation for wider application of the data. In particular, sample data located in Shaanxi province were selected to verify the method. Taking the 10 first-level classes defined in the land cover classification system as an example, the spatial distribution and density characteristics of each kind of sample data are analyzed. The results verify that the construction method based on a relational database with a distributed file system is practical and effective for searching, analyzing and disseminating sample data. Furthermore, the sample data collected in China's First National Geographic Conditions Census could be useful for Earth observation and for assessing the quality of land cover products.
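The split-storage design described above can be illustrated with a short sketch: relational rows hold the queryable sample metadata while the bulky files stay on a distributed file system and are referenced by path. A minimal example with a hypothetical schema and paths, not the project's actual design:

```python
import sqlite3

# Minimal sketch of the split-storage idea: relational rows hold queryable
# metadata, while bulky sample files stay on a distributed file system (DFS)
# and are referenced by path. Table and column names are hypothetical.
conn = sqlite3.connect("samples.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS land_cover_sample (
        sample_id   TEXT PRIMARY KEY,
        province    TEXT,            -- e.g. 'Shaanxi'
        class_code  INTEGER,         -- one of the 10 first-level classes
        lon         REAL,
        lat         REAL,
        photo_path  TEXT             -- location of the raw file on the DFS
    )""")
conn.execute(
    "INSERT OR REPLACE INTO land_cover_sample VALUES (?,?,?,?,?,?)",
    ("S0001", "Shaanxi", 3, 108.95, 34.27, "hdfs://cluster/samples/S0001.jpg"),
)
conn.commit()

# Spatial/attribute queries run against the relational side only; the DFS is
# touched just to fetch the referenced file when needed.
for row in conn.execute(
        "SELECT sample_id, photo_path FROM land_cover_sample "
        "WHERE province = ? AND class_code = ?", ("Shaanxi", 3)):
    print(row)
```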
A dedicated database system for handling multi-level data in systems biology.
Pornputtapong, Natapol; Wanichthanarak, Kwanjeera; Nilsson, Avlant; Nookaew, Intawat; Nielsen, Jens
2014-01-01
Advances in high-throughput technologies have enabled extensive generation of multi-level omics data. These data are crucial for systems biology research, though they are complex, heterogeneous, highly dynamic, incomplete and distributed among public databases. This leads to difficulties in data accessibility and often results in errors when data are merged and integrated from varied resources. Therefore, integration and management of systems biological data remain very challenging. To overcome this, we designed and developed a dedicated database system that addresses the vital issues in data management and thereby facilitates data integration, modeling and analysis in systems biology within a single database. In addition, a yeast data repository was implemented as an integrated database environment operated by this database system. Two applications were implemented to demonstrate the extensibility and utilization of the system. Both illustrate how the user can access the database via the web query function and implemented scripts. These scripts cover two sample cases: 1) detecting the pheromone pathway in protein interaction networks; and 2) finding metabolic reactions regulated by Snf1 kinase. In this study we present the design of a database system which offers an extensible environment to efficiently capture the majority of biological entities and relations encountered in systems biology. Critical functions and control processes were designed and implemented to ensure consistent, efficient, secure and reliable transactions. The two sample cases on the integrated yeast data clearly demonstrate the value of a single database environment for systems biology research.
JAMSTEC DARWIN Database Assimilates GANSEKI and COEDO
NASA Astrophysics Data System (ADS)
Tomiyama, T.; Toyoda, Y.; Horikawa, H.; Sasaki, T.; Fukuda, K.; Hase, H.; Saito, H.
2017-12-01
Introduction: The Japan Agency for Marine-Earth Science and Technology (JAMSTEC) archives data and samples obtained by JAMSTEC research vessels and submersibles. As common property of human society, the JAMSTEC archive is open to public users for scientific/educational purposes [1]. For publicizing its data and samples online, JAMSTEC operates the NUUNKUI data sites [2], a group of several databases for various data and sample types. For years, data and metadata of JAMSTEC rock samples, sediment core samples and cruise/dive observations were publicized through databases named GANSEKI, COEDO, and DARWIN, respectively. However, because they had different user interfaces and data structures, these services were somewhat confusing for unfamiliar users. The maintenance costs of multiple hardware and software systems were also problematic for sustaining services and continuous improvement. Database Integration: In 2017, GANSEKI, COEDO and DARWIN were integrated into DARWIN+ [3]. The update also included implementation of a map-search function as a substitute for the closed portal site. Major functions of the previous systems were incorporated into the new system; users can perform complex searches by thumbnail browsing, map area, keyword filtering, and metadata constraints. As for data handling, the new system is more flexible, allowing the entry of a variety of additional data types. Data Management: Since the DARWIN major update, the JAMSTEC data & sample team has been dealing with minor issues of individual sample data/metadata, which sometimes need manual modification to be transferred to the new system. Some new data sets, such as onboard sample photos and surface close-up photos of rock samples, are becoming available online. Geochemical data of sediment core samples are expected to be added in the near future. References: [1] http://www.jamstec.go.jp/e/database/data_policy.html [2] http://www.godac.jamstec.go.jp/jmedia/portal/e/ [3] http://www.godac.jamstec.go.jp/darwin/e/
The Israel DNA database--the establishment of a rapid, semi-automated analysis system.
Zamir, Ashira; Dell'Ariccia-Carmon, Aviva; Zaken, Neomi; Oz, Carla
2012-03-01
The Israel Police DNA database, also known as IPDIS (Israel Police DNA Index System), has been operating since February 2007. During that time more than 135,000 reference samples have been uploaded and more than 2000 hits reported. We have developed an effective semi-automated system that includes two automated punchers, three liquid-handler robots and four genetic analyzers. An in-house LIMS program enables full tracking of every sample through the entire process of registration, pre-PCR handling, analysis of profiles, uploading to the database, hit reports and ultimately storage. The LIMS is also responsible for the future tracking of samples and their profiles to be expunged from the database according to the Israeli DNA legislation. The database is administered by an in-house developed software program, where reference and evidentiary profiles are uploaded, stored, searched and matched. The DNA database has proven to be an effective investigative tool which has gained the confidence of the Israeli public and on which the Israel National Police force has grown to rely. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Hoh, Siew Sin; Rapie, Nurul Nadiah; Lim, Edwin Suh Wen; Tan, Chun Yuan; Yavar, Alireza; Sarmani, Sukiman; Majid, Amran Ab.; Khoo, Kok Siong
2013-05-01
Instrumental Neutron Activation Analysis (INAA) is often used to determine and calculate elemental concentrations of a sample at The National University of Malaysia (UKM), typically in the Nuclear Science Programme, Faculty of Science and Technology. The objective of this study was to develop a database code-system based on Microsoft Access 2010 to help INAA users choose either the comparator method, the k0-method or the absolute method for calculating the elemental concentrations of a sample. This study also integrated k0data, Com-INAA, k0Concent, k0-Westcott and Abs-INAA to execute and complete the ECC-UKM database code-system. After the integration, a study was conducted to test the effectiveness of the ECC-UKM database code-system by comparing the concentrations between the experiments and the code-systems. 'Triple Bare Monitor' Zr-Au and Cr-Mo-Au were used in the k0Concent, k0-Westcott and Abs-INAA code-systems as monitors to determine the thermal-to-epithermal neutron flux ratio (f). Calculations involved in determining the concentration were the net peak area (Np), measurement time (tm), irradiation time (tirr), k-factor (k), thermal-to-epithermal neutron flux ratio (f), the epithermal neutron flux distribution parameter (α) and the detection efficiency (ɛp). For the Com-INAA code-system, certified reference material IAEA-375 Soil was used to calculate the concentrations of elements in a sample. Other CRMs and SRMs were also used in this database code-system. Later, a verification process to examine the effectiveness of the Abs-INAA code-system was carried out by comparing the sample concentrations between the code-system and the experiment. The concentration values produced by the ECC-UKM database code-system showed good accuracy.
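For orientation, the comparator (relative) method mentioned above compares the decay-corrected specific count rate of a sample against that of a co-irradiated standard of known concentration. A simplified sketch, reusing the abstract's symbols where possible; the decay handling is a textbook simplification, not the ECC-UKM implementation:

```python
import math

# Hedged sketch of the relative (comparator) method: the concentration of an
# element in a sample is obtained by comparing its decay-corrected specific
# count rate with that of a co-irradiated standard of known concentration.
# Np, w, t_m, t_decay follow the abstract's symbols; the simplified decay
# handling is a textbook assumption, not the ECC-UKM code.
def decay_corrected_rate(Np, w, t_m, t_decay, half_life):
    """Specific count rate corrected for decay during cooling and counting."""
    lam = math.log(2.0) / half_life
    s = (1.0 - math.exp(-lam * t_m)) / lam      # decay during the count
    return Np / (w * s * math.exp(-lam * t_decay))

def comparator_concentration(sample, standard, c_standard):
    """Element concentration via the comparator method (same nuclide/geometry)."""
    return c_standard * decay_corrected_rate(**sample) / decay_corrected_rate(**standard)

std = dict(Np=20000, w=0.05, t_m=600, t_decay=1800, half_life=8000.0)
sam = dict(Np=15000, w=0.10, t_m=600, t_decay=3600, half_life=8000.0)
print(comparator_concentration(sam, std, c_standard=12.5))  # illustrative numbers
```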
Su, Xiaoquan; Xu, Jian; Ning, Kang
2012-10-01
Scientists have long been interested in effectively comparing different microbial communities (also referred to as 'metagenomic samples' here) at a large scale: given a set of unknown samples, find similar metagenomic samples in a large repository and examine how similar these samples are. With the metagenomic samples accumulated to date, it is possible to build a database of metagenomic samples of interest. Any metagenomic sample could then be searched against this database to find the most similar sample(s). However, on one hand, current databases with large numbers of metagenomic samples mostly serve as data repositories that offer few analysis functionalities; on the other hand, methods that measure the similarity of metagenomic data work well only for small sets of samples compared pairwise. It is not yet clear how to efficiently search for metagenomic samples against a large metagenomic database. In this study, we propose a novel method, Meta-Storms, that can systematically and efficiently organize and search metagenomic data. It includes the following components: (i) creating a database of metagenomic samples based on their taxonomic annotations, (ii) efficient indexing of samples in the database based on a hierarchical taxonomy indexing strategy, (iii) searching for a metagenomic sample against the database with a fast scoring function based on quantitative phylogeny and (iv) managing the database by index export, index import, data insertion, data deletion and database merging. We collected more than 1300 metagenomic datasets from the public domain and in-house facilities, and tested the Meta-Storms method on them. Our experimental results show that Meta-Storms is capable of database creation and effective searching for a large number of metagenomic samples, achieving accuracy similar to that of the current popular significance testing-based methods. Meta-Storms would serve as a suitable database management and search system to quickly identify similar metagenomic samples from a large pool of samples. ningkang@qibebt.ac.cn Supplementary data are available at Bioinformatics online.
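The hierarchical, phylogeny-aware scoring idea can be sketched briefly: overlap between two abundance profiles is credited at each taxonomic level, with finer levels weighted more heavily. This illustrates the general idea only, not the actual Meta-Storms scoring function:

```python
# Hedged sketch of phylogeny-aware sample comparison in the spirit of the
# scoring function described above -- NOT the actual Meta-Storms algorithm.
# Profiles are {taxon: relative_abundance} dicts; each parent map climbs the
# taxonomy one level (species->genus, genus->family, ...).
def rollup(profile, parent):
    out = {}
    for taxon, value in profile.items():
        out[parent[taxon]] = out.get(parent[taxon], 0.0) + value
    return out

def similarity(a, b, parent_maps, weights):
    score = 0.0
    for parent, w in zip(parent_maps, weights):
        # shared abundance at the current level; finer levels get more weight
        score += w * sum(min(a[t], b[t]) for t in set(a) & set(b))
        a, b = rollup(a, parent), rollup(b, parent)
    return score

species_to_genus = {"E. coli": "Escherichia", "S. enterica": "Salmonella"}
genus_to_family = {"Escherichia": "Enterobacteriaceae", "Salmonella": "Enterobacteriaceae"}
a = {"E. coli": 0.7, "S. enterica": 0.3}
b = {"E. coli": 0.4, "S. enterica": 0.6}
print(similarity(a, b, [species_to_genus, genus_to_family], [1.0, 0.5]))
# Searching a database is then a max-scan of this score over stored profiles.
```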
Advanced Query Formulation in Deductive Databases.
ERIC Educational Resources Information Center
Niemi, Timo; Jarvelin, Kalervo
1992-01-01
Discusses deductive databases and database management systems (DBMS) and introduces a framework for advanced query formulation for end users. Recursive processing is described, a sample extensional database is presented, query types are explained, and criteria for advanced query formulation from the end user's viewpoint are examined. (31…
[Development of an analyzing system for soil parameters based on NIR spectroscopy].
Zheng, Li-Hua; Li, Min-Zan; Sun, Hong
2009-10-01
A rapid estimation system for soil parameters based on spectral analysis was developed using object-oriented (OO) technology. A SOIL class was designed; an instance of the SOIL class is a soil-sample object with a particular soil type, specific physical properties and spectral characteristics. By extracting the effective information from the modeling spectral data of a soil object, a mapping model was established between the soil parameters and the spectral data, and the mapping model parameters could be saved in the model database. When predicting the content of any soil parameter, the prediction model corresponding to the same soil type and similar physical properties can be selected; after the target soil-sample object is passed into the prediction model and processed by the system, an accurate prediction of the target sample's content is obtained. The system includes modules for file operations, spectral pretreatment, sample analysis, calibration and validation, and content prediction. The system was designed to run independently of the measurement equipment. Parameter and spectral data files (*.xls) of known soil samples can be imported into the system. Various data pretreatments can be selected according to the concrete conditions, the prediction results appear in the terminal, and the forecasting model can be stored in the model database. The system reads the prediction models and their parameters from the model database through the module interface and applies the selected model to the data of the tested samples, from which the soil parameter contents are predicted. The system was programmed with Visual C++ 6.0 and Matlab 7.0, and Access XP was used to create and manage the model database.
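The object-oriented design lends itself to a compact sketch: a soil-sample object carries its type, properties and spectrum, and a stored model matching that type maps the spectrum to a parameter content. The original system was written in Visual C++ and Matlab; the Python below, with assumed class and attribute names, is purely illustrative:

```python
# Illustrative Python re-sketch of the object-oriented design described
# above; the original system was implemented in Visual C++ 6.0 and Matlab
# 7.0. Class and attribute names are assumptions, not the paper's code.
from dataclasses import dataclass

@dataclass
class Soil:
    soil_type: str
    properties: dict        # physical properties, e.g. {"moisture": 0.12}
    spectrum: list          # NIR reflectance values

@dataclass
class PredictionModel:
    soil_type: str
    parameter: str          # e.g. "organic_matter"
    coef: list              # regression coefficients from calibration
    intercept: float

    def predict(self, sample: Soil) -> float:
        # linear mapping from the spectrum to the parameter content
        return self.intercept + sum(c * x for c, x in zip(self.coef, sample.spectrum))

def select_model(models, sample):
    # mirror the model-database lookup: match on soil type
    return next(m for m in models if m.soil_type == sample.soil_type)

models = [PredictionModel("loess", "organic_matter", [0.8, -0.2, 1.1], 0.3)]
sample = Soil("loess", {"moisture": 0.12}, [0.52, 0.48, 0.33])
print(select_model(models, sample).predict(sample))
```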
Burnett, Leslie; Barlow-Stewart, Kris; Proos, Anné L; Aizenberg, Harry
2003-05-01
This article describes a generic model for access to samples and information in human genetic databases. The model utilises a "GeneTrustee", a third-party intermediary independent of the subjects and of the investigators or database custodians. The GeneTrustee model has been implemented successfully in various community genetics screening programs and has facilitated research access to genetic databases while protecting the privacy and confidentiality of research subjects. The GeneTrustee model could also be applied to various types of non-conventional genetic databases, including neonatal screening Guthrie card collections, and to forensic DNA samples.
Generation of signature databases with fast codes
NASA Astrophysics Data System (ADS)
Bradford, Robert A.; Woodling, Arthur E.; Brazzell, James S.
1990-09-01
Using the FASTSIG signature code to generate optical signature databases for the Ground-based Surveillance and Tracking System (GSTS) Program has improved the efficiency of the database generation process. The goal of the current GSTS database is to provide standardized, threat-representative target signatures that can easily be used for acquisition and track studies, discrimination algorithm development, and system simulations. Large databases, with as many as eight interpolation parameters, are required to maintain the fidelity demands of discrimination and to generalize their application to other strategic systems. As the need increases for quick availability of long wave infrared (LWIR) target signatures for an evolving design-to-threat, FASTSIG has become a database generation alternative to using the industry standard Optical Signatures Code (OSC). FASTSIG, developed in 1985 to meet the unique strategic systems demands imposed by the discrimination function, has the significant advantage of being a faster running signature code than the OSC, typically requiring two percent of the CPU time. It uses analytical approximations to model axisymmetric targets, with the fidelity required for discrimination analysis. Access to the signature database is accomplished through use of the waveband integration and interpolation software, INTEG and SIGNAT. This paper gives details of this procedure as well as sample interpolated signatures, and also covers sample verification by comparison to the OSC, in order to establish the fidelity of the FASTSIG generated database.
The Web-Database Connection Tools for Sharing Information on the Campus Intranet.
ERIC Educational Resources Information Center
Thibeault, Nancy E.
This paper evaluates four tools for creating World Wide Web pages that interface with Microsoft Access databases: DB Gateway, Internet Database Assistant (IDBA), Microsoft Internet Database Connector (IDC), and Cold Fusion. The system requirements and features of each tool are discussed. A sample application, "The Virtual Help Desk"…
Advanced transportation system studies. Alternate propulsion subsystem concepts: Propulsion database
NASA Technical Reports Server (NTRS)
Levack, Daniel
1993-01-01
The Advanced Transportation System Studies alternate propulsion subsystem concepts propulsion database interim report is presented. The objective of the database development task is to produce a propulsion database which is easy to use and modify while also being comprehensive in the level of detail available. The database is to be available on the Macintosh computer system. The task is to extend across all three years of the contract. Consequently, a significant fraction of the effort in this first year of the task was devoted to the development of the database structure to ensure a robust base for the following years' efforts. Nonetheless, significant point design propulsion system descriptions and parametric models were also produced. Each of the two propulsion databases, parametric propulsion database and propulsion system database, are described. The descriptions include a user's guide to each code, write-ups for models used, and sample output. The parametric database has models for LOX/H2 and LOX/RP liquid engines, solid rocket boosters using three different propellants, a hybrid rocket booster, and a NERVA derived nuclear thermal rocket engine.
Development of a System Model for Non-Invasive Quantification of Bilirubin in Jaundice Patients
NASA Astrophysics Data System (ADS)
Alla, Suresh K.
Neonatal jaundice is a medical condition which occurs in newborns as a result of an imbalance between the production and elimination of bilirubin. Excess bilirubin in the blood stream diffuses into the surrounding tissue, leading to a yellowing of the skin. An optical system integrated with a signal processing system is used as a platform to noninvasively quantify bilirubin concentration through the measurement of diffuse skin reflectance. Initial studies have led to the generation of a clinical analytical model for neonatal jaundice which generates spectral reflectance data for jaundiced skin with varying levels of bilirubin concentration in the tissue. The spectral database built using the clinical analytical model is then used as a test database to validate the signal processing system in real time. This evaluation forms the basis for understanding the translation of this research to human trials. The clinical analytical model and signal processing system have been successfully validated on three spectral databases. The first spectral database was constructed using a porcine model as a surrogate for neonatal skin tissue. Samples of pig skin were soaked in bilirubin solutions of varying concentrations to simulate jaundiced skin conditions. The resulting skin samples were analyzed with our skin reflectance systems, producing bilirubin concentration values that show a high correlation (R2 = 0.94) to the concentration of the bilirubin solution in which each porcine tissue sample was soaked. The second spectral database consists of spectral measurements collected on human volunteers to quantify the different chromophores and other physical properties of the tissue, such as hematocrit and hemoglobin. The third spectral database is the spectral data collected at different time periods from the moment a bruise is induced.
Teaching Tip: Active Learning via a Sample Database: The Case of Microsoft's Adventure Works
ERIC Educational Resources Information Center
Mitri, Michel
2015-01-01
This paper describes the use and benefits of Microsoft's Adventure Works (AW) database to teach advanced database skills in a hands-on, realistic environment. Database management and querying skills are a key element of a robust information systems curriculum, and active learning is an important way to develop these skills. To facilitate active…
The Hawaiian Freshwater Algal Database (HfwADB): a laboratory LIMS and online biodiversity resource
2012-01-01
Background Biodiversity databases serve the important role of highlighting species-level diversity from defined geographical regions. Databases that are specially designed to accommodate the types of data gathered during regional surveys are valuable in allowing full data access and display to researchers not directly involved with the project, while serving as a Laboratory Information Management System (LIMS). The Hawaiian Freshwater Algal Database, or HfwADB, was modified from the Hawaiian Algal Database to showcase non-marine algal specimens collected from the Hawaiian Archipelago by accommodating the additional level of organization required for samples including multiple species. Description The Hawaiian Freshwater Algal Database is a comprehensive and searchable database containing photographs and micrographs of samples and collection sites, geo-referenced collecting information, taxonomic data and standardized DNA sequence data. All data for individual samples are linked through unique 10-digit accession numbers (“Isolate Accession”), the first five of which correspond to the collection site (“Environmental Accession”). Users can search online for sample information by accession number, various levels of taxonomy, habitat or collection site. HfwADB is hosted at the University of Hawaii, and was made publicly accessible in October 2011. At the present time the database houses data for over 2,825 samples of non-marine algae from 1,786 collection sites from the Hawaiian Archipelago. These samples include cyanobacteria, red and green algae and diatoms, as well as lesser representation from some other algal lineages. Conclusions HfwADB is a digital repository that acts as a Laboratory Information Management System for Hawaiian non-marine algal data. Users can interact with the repository through the web to view relevant habitat data (including geo-referenced collection locations) and download images of collection sites, specimen photographs and micrographs, and DNA sequences. It is publicly available at http://algae.manoa.hawaii.edu/hfwadb/. PMID:23095476
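The accession scheme is simple enough to show directly: a 10-digit Isolate Accession whose first five digits form the Environmental Accession of the collection site. A tiny sketch (the all-digit format check is an assumption):

```python
# Tiny sketch of the accession scheme described above: a 10-digit Isolate
# Accession whose first five digits are the Environmental Accession of the
# collection site. The all-digit format check is an assumption.
def split_accession(isolate_accession: str) -> dict:
    if len(isolate_accession) != 10 or not isolate_accession.isdigit():
        raise ValueError("expected a 10-digit Isolate Accession")
    return {"environmental_accession": isolate_accession[:5],
            "isolate_accession": isolate_accession}

print(split_accession("0012300456"))
# -> {'environmental_accession': '00123', 'isolate_accession': '0012300456'}
```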
The CIS Database: Occupational Health and Safety Information Online.
ERIC Educational Resources Information Center
Siegel, Herbert; Scurr, Erica
1985-01-01
Describes document acquisition, selection, indexing, and abstracting and discusses online searching of the CIS database, an online system produced by the International Occupational Safety and Health Information Centre. This database comprehensively covers information in the field of occupational health and safety. Sample searches and search…
Space Station Freedom environmental database system (FEDS) for MSFC testing
NASA Technical Reports Server (NTRS)
Story, Gail S.; Williams, Wendy; Chiu, Charles
1991-01-01
The Water Recovery Test (WRT) at Marshall Space Flight Center (MSFC) is the first demonstration of integrated water recovery systems for potable and hygiene water reuse as envisioned for Space Station Freedom (SSF). In order to satisfy the safety and health requirements placed on the SSF program and facilitate test data assessment, an extensive laboratory analysis database was established to provide a central archive and data retrieval function. The database is required to store analysis results for physical, chemical, and microbial parameters measured from water, air and surface samples collected at various locations throughout the test facility. The Oracle Relational Database Management System (RDBMS) was utilized to implement a secured on-line information system with the ECLSS WRT program as the foundation for this system. The database is supported on a VAX/VMS 8810 series mainframe and is accessible from the Marshall Information Network System (MINS). This paper summarizes the database requirements, system design, interfaces, and future enhancements.
EROS Main Image File: A Picture Perfect Database for Landsat Imagery and Aerial Photography.
ERIC Educational Resources Information Center
Jack, Robert F.
1984-01-01
Describes Earth Resources Observation System online database, which provides access to computerized images of Earth obtained via satellite. Highlights include retrieval system and commands, types of images, search strategies, other online functions, and interpretation of accessions. Satellite information, sources and samples of accessions, and…
ERIC Educational Resources Information Center
Gruner, Richard; Heron, Carol E.
1984-01-01
Examines usefulness of DIALOG as legal research tool through use of DIALOG's DIALINDEX database to identify those databases among almost 200 available that contain large numbers of records related to federal securities regulation. Eight databases selected for further study are detailed. Twenty-six footnotes, database statistics, and samples are…
The Hawaiian Algal Database: a laboratory LIMS and online resource for biodiversity data
Wang, Norman; Sherwood, Alison R; Kurihara, Akira; Conklin, Kimberly Y; Sauvage, Thomas; Presting, Gernot G
2009-01-01
Background Organization and presentation of biodiversity data is greatly facilitated by databases that are specially designed to allow easy data entry and organized data display. Such databases also have the capacity to serve as Laboratory Information Management Systems (LIMS). The Hawaiian Algal Database was designed to showcase specimens collected from the Hawaiian Archipelago, enabling users around the world to compare their specimens with our photographs and DNA sequence data, and to provide lab personnel with an organizational tool for storing various biodiversity data types. Description We describe the Hawaiian Algal Database, a comprehensive and searchable database containing photographs and micrographs, geo-referenced collecting information, taxonomic checklists and standardized DNA sequence data. All data for individual samples are linked through unique accession numbers. Users can search online for sample information by accession number, numerous levels of taxonomy, or collection site. At the present time the database contains data representing over 2,000 samples of marine, freshwater and terrestrial algae from the Hawaiian Archipelago. These samples are primarily red algae, although other taxa are being added. Conclusion The Hawaiian Algal Database is a digital repository for Hawaiian algal samples and acts as a LIMS for the laboratory. Users can make use of the online search tool to view and download specimen photographs and micrographs, DNA sequences and relevant habitat data, including georeferenced collecting locations. It is publicly available at . PMID:19728892
Bakshi, Sonal R; Shukla, Shilin N; Shah, Pankaj M
2009-01-01
We developed a Microsoft Access-based laboratory management system to facilitate database management of leukemia patients referred for cytogenetic tests in regards to karyotyping and fluorescence in situ hybridization (FISH). The database is custom-made for entry of patient data, clinical details, sample details, cytogenetics test results, and data mining for various ongoing research areas. A number of clinical research laboratory-related tasks are carried out faster using specific "queries." The tasks include tracking the clinical progression of a particular patient over multiple visits, treatment response, morphological and cytogenetics response, survival time, automatic grouping of patients by inclusion criteria in a research project, tracking various sample processing steps, turn-around time, and revenue generated. Since 2005 we have collected over 5,000 samples. The database is easily updated and is being adapted for various data maintenance and mining needs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Van Berkel, Gary J.; Kertesz, Vilmos
2016-11-15
An "Open Access"-like mass spectrometric platform to fully utilize the simplicity of the manual open port sampling interface for rapid characterization of unprocessed samples by liquid introduction atmospheric pressure ionization mass spectrometry has been lacking. The in-house developed integrated software with a simple, small and relatively low-cost mass spectrometry system introduced here fills this void. Software was developed to operate the mass spectrometer, to collect and process mass spectrometric data files, to build a database and to classify samples using such a database. These tasks were accomplished via the vendor-provided software libraries. Sample classification based on spectral comparison utilized the spectral contrast angle method. As a result, using the developed software platform, near real-time sample classification is exemplified using a series of commercially available blue ink rollerball pens and vegetable oils. In the case of the inks, full scan positive and negative ion ESI mass spectra were both used for database generation and sample classification. For the vegetable oils, full scan positive ion mode APCI mass spectra were recorded. The overall accuracy of the employed spectral contrast angle statistical model was 95.3% and 98% in the case of the inks and oils, respectively, using leave-one-out cross-validation. In conclusion, this work illustrates that an open port sampling interface/mass spectrometer combination, with appropriate instrument control and data processing software, is a viable direct liquid extraction sampling and analysis system suitable for the non-expert user and near real-time sample classification via database matching.
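The spectral contrast angle named in this abstract is a standard vector-similarity measure between spectra. A minimal sketch, assuming each spectrum is an intensity vector on a common m/z grid; this shows the method in general, not the authors' implementation:

```python
import math

# Minimal sketch of spectral-contrast-angle classification, assuming each
# spectrum is an intensity vector on a common m/z grid; this illustrates the
# method named in the abstract, not the authors' actual software.
def contrast_angle(a, b):
    """Angle between two spectra: 0 for identical shape, pi/2 for orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return math.acos(dot / (na * nb))

def classify(query, database):
    """Return the label of the database spectrum with the smallest angle."""
    return min(database, key=lambda label: contrast_angle(query, database[label]))

db = {"ink_A": [0.9, 0.1, 0.3], "ink_B": [0.1, 0.8, 0.4]}
print(classify([0.85, 0.15, 0.28], db))   # -> 'ink_A'
```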
A manual for a laboratory information management system (LIMS) for light stable isotopes
Coplen, Tyler B.
1997-01-01
The reliability and accuracy of isotopic data can be improved by utilizing database software to (i) store information about samples, (ii) store the results of mass spectrometric isotope-ratio analyses of samples, (iii) calculate analytical results using standardized algorithms stored in a database, (iv) normalize stable isotopic data to international scales using isotopic reference materials, and (v) generate multi-sheet paper templates for convenient sample loading of automated mass-spectrometer sample preparation manifolds. Such a database program is presented herein. Major benefits of this system include (i) an increase in laboratory efficiency, (ii) reduction in the use of paper, (iii) reduction in workload due to the elimination or reduction of retyping of data by laboratory personnel, and (iv) decreased errors in data reported to sample submitters. Such a database provides a complete record of when and how often laboratory reference materials have been analyzed and provides a record of what correction factors have been used through time. It provides an audit trail for stable isotope laboratories. Since the original publication of the manual for LIMS for Light Stable Isotopes, the isotopes 3H, 3He, and 14C, and the chlorofluorocarbons (CFCs), CFC-11, CFC-12, and CFC-113, have been added to this program.
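Step (iv), normalization to an international scale, is commonly a two-point linear calibration against two reference materials. A short sketch of that calculation (names are illustrative, not taken from the LIMS program):

```python
# Hedged sketch of two-point normalization to an international reference
# scale (step iv above): the measured deltas of two reference materials are
# mapped onto their accepted values, and the same line is applied to the
# unknowns. Names are illustrative, not taken from the LIMS program.
def normalize(delta_measured, ref1_measured, ref1_true, ref2_measured, ref2_true):
    slope = (ref2_true - ref1_true) / (ref2_measured - ref1_measured)
    return ref1_true + slope * (delta_measured - ref1_measured)

# Example: normalizing a d18O value against VSMOW (0.0 permil) and SLAP (-55.5 permil)
print(normalize(-10.2, 0.3, 0.0, -54.1, -55.5))
```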
The Philip Morris Information Network: A Library Database on an In-House Timesharing System.
ERIC Educational Resources Information Center
DeBardeleben, Marian Z.; And Others
1983-01-01
Outlines a database constructed at Philip Morris Research Center Library which encompasses holdings and circulation and acquisitions records for all items in the library. Host computer (DECSYSTEM-2060), software (BASIC), database design, search methodology, cataloging, and accessibility are noted; sample search, circ-in profile, end user profiles,…
Hudson, Lawrence N; Newbold, Tim; Contu, Sara; Hill, Samantha L L; Lysenko, Igor; De Palma, Adriana; Phillips, Helen R P; Alhusseini, Tamera I; Bedford, Felicity E; Bennett, Dominic J; Booth, Hollie; Burton, Victoria J; Chng, Charlotte W T; Choimes, Argyrios; Correia, David L P; Day, Julie; Echeverría-Londoño, Susy; Emerson, Susan R; Gao, Di; Garon, Morgan; Harrison, Michelle L K; Ingram, Daniel J; Jung, Martin; Kemp, Victoria; Kirkpatrick, Lucinda; Martin, Callum D; Pan, Yuan; Pask-Hale, Gwilym D; Pynegar, Edwin L; Robinson, Alexandra N; Sanchez-Ortiz, Katia; Senior, Rebecca A; Simmons, Benno I; White, Hannah J; Zhang, Hanbin; Aben, Job; Abrahamczyk, Stefan; Adum, Gilbert B; Aguilar-Barquero, Virginia; Aizen, Marcelo A; Albertos, Belén; Alcala, E L; Del Mar Alguacil, Maria; Alignier, Audrey; Ancrenaz, Marc; Andersen, Alan N; Arbeláez-Cortés, Enrique; Armbrecht, Inge; Arroyo-Rodríguez, Víctor; Aumann, Tom; Axmacher, Jan C; Azhar, Badrul; Azpiroz, Adrián B; Baeten, Lander; Bakayoko, Adama; Báldi, András; Banks, John E; Baral, Sharad K; Barlow, Jos; Barratt, Barbara I P; Barrico, Lurdes; Bartolommei, Paola; Barton, Diane M; Basset, Yves; Batáry, Péter; Bates, Adam J; Baur, Bruno; Bayne, Erin M; Beja, Pedro; Benedick, Suzan; Berg, Åke; Bernard, Henry; Berry, Nicholas J; Bhatt, Dinesh; Bicknell, Jake E; Bihn, Jochen H; Blake, Robin J; Bobo, Kadiri S; Bóçon, Roberto; Boekhout, Teun; Böhning-Gaese, Katrin; Bonham, Kevin J; Borges, Paulo A V; Borges, Sérgio H; Boutin, Céline; Bouyer, Jérémy; Bragagnolo, Cibele; Brandt, Jodi S; Brearley, Francis Q; Brito, Isabel; Bros, Vicenç; Brunet, Jörg; Buczkowski, Grzegorz; Buddle, Christopher M; Bugter, Rob; Buscardo, Erika; Buse, Jörn; Cabra-García, Jimmy; Cáceres, Nilton C; Cagle, Nicolette L; Calviño-Cancela, María; Cameron, Sydney A; Cancello, Eliana M; Caparrós, Rut; Cardoso, Pedro; Carpenter, Dan; Carrijo, Tiago F; Carvalho, Anelena L; Cassano, Camila R; Castro, Helena; Castro-Luna, Alejandro A; Rolando, Cerda B; Cerezo, Alexis; Chapman, Kim Alan; Chauvat, Matthieu; Christensen, Morten; Clarke, Francis M; Cleary, Daniel F R; Colombo, Giorgio; Connop, Stuart P; Craig, Michael D; Cruz-López, Leopoldo; Cunningham, Saul A; D'Aniello, Biagio; D'Cruze, Neil; da Silva, Pedro Giovâni; Dallimer, Martin; Danquah, Emmanuel; Darvill, Ben; Dauber, Jens; Davis, Adrian L V; Dawson, Jeff; de Sassi, Claudio; de Thoisy, Benoit; Deheuvels, Olivier; Dejean, Alain; Devineau, Jean-Louis; Diekötter, Tim; Dolia, Jignasu V; Domínguez, Erwin; Dominguez-Haydar, Yamileth; Dorn, Silvia; Draper, Isabel; Dreber, Niels; Dumont, Bertrand; Dures, Simon G; Dynesius, Mats; Edenius, Lars; Eggleton, Paul; Eigenbrod, Felix; Elek, Zoltán; Entling, Martin H; Esler, Karen J; de Lima, Ricardo F; Faruk, Aisyah; Farwig, Nina; Fayle, Tom M; Felicioli, Antonio; Felton, Annika M; Fensham, Roderick J; Fernandez, Ignacio C; Ferreira, Catarina C; Ficetola, Gentile F; Fiera, Cristina; Filgueiras, Bruno K C; Fırıncıoğlu, Hüseyin K; Flaspohler, David; Floren, Andreas; Fonte, Steven J; Fournier, Anne; Fowler, Robert E; Franzén, Markus; Fraser, Lauchlan H; Fredriksson, Gabriella M; Freire, Geraldo B; Frizzo, Tiago L M; Fukuda, Daisuke; Furlani, Dario; Gaigher, René; Ganzhorn, Jörg U; García, Karla P; Garcia-R, Juan C; Garden, Jenni G; Garilleti, Ricardo; Ge, Bao-Ming; Gendreau-Berthiaume, Benoit; Gerard, Philippa J; Gheler-Costa, Carla; Gilbert, Benjamin; Giordani, Paolo; Giordano, Simonetta; Golodets, Carly; Gomes, Laurens G L; Gould, Rachelle K; Goulson, Dave; Gove, Aaron D; Granjon, Laurent; Grass, Ingo; 
Gray, Claudia L; Grogan, James; Gu, Weibin; Guardiola, Moisès; Gunawardene, Nihara R; Gutierrez, Alvaro G; Gutiérrez-Lamus, Doris L; Haarmeyer, Daniela H; Hanley, Mick E; Hanson, Thor; Hashim, Nor R; Hassan, Shombe N; Hatfield, Richard G; Hawes, Joseph E; Hayward, Matt W; Hébert, Christian; Helden, Alvin J; Henden, John-André; Henschel, Philipp; Hernández, Lionel; Herrera, James P; Herrmann, Farina; Herzog, Felix; Higuera-Diaz, Diego; Hilje, Branko; Höfer, Hubert; Hoffmann, Anke; Horgan, Finbarr G; Hornung, Elisabeth; Horváth, Roland; Hylander, Kristoffer; Isaacs-Cubides, Paola; Ishida, Hiroaki; Ishitani, Masahiro; Jacobs, Carmen T; Jaramillo, Víctor J; Jauker, Birgit; Hernández, F Jiménez; Johnson, McKenzie F; Jolli, Virat; Jonsell, Mats; Juliani, S Nur; Jung, Thomas S; Kapoor, Vena; Kappes, Heike; Kati, Vassiliki; Katovai, Eric; Kellner, Klaus; Kessler, Michael; Kirby, Kathryn R; Kittle, Andrew M; Knight, Mairi E; Knop, Eva; Kohler, Florian; Koivula, Matti; Kolb, Annette; Kone, Mouhamadou; Kőrösi, Ádám; Krauss, Jochen; Kumar, Ajith; Kumar, Raman; Kurz, David J; Kutt, Alex S; Lachat, Thibault; Lantschner, Victoria; Lara, Francisco; Lasky, Jesse R; Latta, Steven C; Laurance, William F; Lavelle, Patrick; Le Féon, Violette; LeBuhn, Gretchen; Légaré, Jean-Philippe; Lehouck, Valérie; Lencinas, María V; Lentini, Pia E; Letcher, Susan G; Li, Qi; Litchwark, Simon A; Littlewood, Nick A; Liu, Yunhui; Lo-Man-Hung, Nancy; López-Quintero, Carlos A; Louhaichi, Mounir; Lövei, Gabor L; Lucas-Borja, Manuel Esteban; Luja, Victor H; Luskin, Matthew S; MacSwiney G, M Cristina; Maeto, Kaoru; Magura, Tibor; Mallari, Neil Aldrin; Malone, Louise A; Malonza, Patrick K; Malumbres-Olarte, Jagoba; Mandujano, Salvador; Måren, Inger E; Marin-Spiotta, Erika; Marsh, Charles J; Marshall, E J P; Martínez, Eliana; Martínez Pastur, Guillermo; Moreno Mateos, David; Mayfield, Margaret M; Mazimpaka, Vicente; McCarthy, Jennifer L; McCarthy, Kyle P; McFrederick, Quinn S; McNamara, Sean; Medina, Nagore G; Medina, Rafael; Mena, Jose L; Mico, Estefania; Mikusinski, Grzegorz; Milder, Jeffrey C; Miller, James R; Miranda-Esquivel, Daniel R; Moir, Melinda L; Morales, Carolina L; Muchane, Mary N; Muchane, Muchai; Mudri-Stojnic, Sonja; Munira, A Nur; Muoñz-Alonso, Antonio; Munyekenye, B F; Naidoo, Robin; Naithani, A; Nakagawa, Michiko; Nakamura, Akihiro; Nakashima, Yoshihiro; Naoe, Shoji; Nates-Parra, Guiomar; Navarrete Gutierrez, Dario A; Navarro-Iriarte, Luis; Ndang'ang'a, Paul K; Neuschulz, Eike L; Ngai, Jacqueline T; Nicolas, Violaine; Nilsson, Sven G; Noreika, Norbertas; Norfolk, Olivia; Noriega, Jorge Ari; Norton, David A; Nöske, Nicole M; Nowakowski, A Justin; Numa, Catherine; O'Dea, Niall; O'Farrell, Patrick J; Oduro, William; Oertli, Sabine; Ofori-Boateng, Caleb; Oke, Christopher Omamoke; Oostra, Vicencio; Osgathorpe, Lynne M; Otavo, Samuel Eduardo; Page, Navendu V; Paritsis, Juan; Parra-H, Alejandro; Parry, Luke; Pe'er, Guy; Pearman, Peter B; Pelegrin, Nicolás; Pélissier, Raphaël; Peres, Carlos A; Peri, Pablo L; Persson, Anna S; Petanidou, Theodora; Peters, Marcell K; Pethiyagoda, Rohan S; Phalan, Ben; Philips, T Keith; Pillsbury, Finn C; Pincheira-Ulbrich, Jimmy; Pineda, Eduardo; Pino, Joan; Pizarro-Araya, Jaime; Plumptre, A J; Poggio, Santiago L; Politi, Natalia; Pons, Pere; Poveda, Katja; Power, Eileen F; Presley, Steven J; Proença, Vânia; Quaranta, Marino; Quintero, Carolina; Rader, Romina; Ramesh, B R; Ramirez-Pinilla, Martha P; Ranganathan, Jai; Rasmussen, Claus; Redpath-Downing, Nicola A; Reid, J Leighton; Reis, Yana T; 
Rey Benayas, José M; Rey-Velasco, Juan Carlos; Reynolds, Chevonne; Ribeiro, Danilo Bandini; Richards, Miriam H; Richardson, Barbara A; Richardson, Michael J; Ríos, Rodrigo Macip; Robinson, Richard; Robles, Carolina A; Römbke, Jörg; Romero-Duque, Luz Piedad; Rös, Matthias; Rosselli, Loreta; Rossiter, Stephen J; Roth, Dana S; Roulston, T'ai H; Rousseau, Laurent; Rubio, André V; Ruel, Jean-Claude; Sadler, Jonathan P; Sáfián, Szabolcs; Saldaña-Vázquez, Romeo A; Sam, Katerina; Samnegård, Ulrika; Santana, Joana; Santos, Xavier; Savage, Jade; Schellhorn, Nancy A; Schilthuizen, Menno; Schmiedel, Ute; Schmitt, Christine B; Schon, Nicole L; Schüepp, Christof; Schumann, Katharina; Schweiger, Oliver; Scott, Dawn M; Scott, Kenneth A; Sedlock, Jodi L; Seefeldt, Steven S; Shahabuddin, Ghazala; Shannon, Graeme; Sheil, Douglas; Sheldon, Frederick H; Shochat, Eyal; Siebert, Stefan J; Silva, Fernando A B; Simonetti, Javier A; Slade, Eleanor M; Smith, Jo; Smith-Pardo, Allan H; Sodhi, Navjot S; Somarriba, Eduardo J; Sosa, Ramón A; Soto Quiroga, Grimaldo; St-Laurent, Martin-Hugues; Starzomski, Brian M; Stefanescu, Constanti; Steffan-Dewenter, Ingolf; Stouffer, Philip C; Stout, Jane C; Strauch, Ayron M; Struebig, Matthew J; Su, Zhimin; Suarez-Rubio, Marcela; Sugiura, Shinji; Summerville, Keith S; Sung, Yik-Hei; Sutrisno, Hari; Svenning, Jens-Christian; Teder, Tiit; Threlfall, Caragh G; Tiitsaar, Anu; Todd, Jacqui H; Tonietto, Rebecca K; Torre, Ignasi; Tóthmérész, Béla; Tscharntke, Teja; Turner, Edgar C; Tylianakis, Jason M; Uehara-Prado, Marcio; Urbina-Cardona, Nicolas; Vallan, Denis; Vanbergen, Adam J; Vasconcelos, Heraldo L; Vassilev, Kiril; Verboven, Hans A F; Verdasca, Maria João; Verdú, José R; Vergara, Carlos H; Vergara, Pablo M; Verhulst, Jort; Virgilio, Massimiliano; Vu, Lien Van; Waite, Edward M; Walker, Tony R; Wang, Hua-Feng; Wang, Yanping; Watling, James I; Weller, Britta; Wells, Konstans; Westphal, Catrin; Wiafe, Edward D; Williams, Christopher D; Willig, Michael R; Woinarski, John C Z; Wolf, Jan H D; Wolters, Volkmar; Woodcock, Ben A; Wu, Jihua; Wunderle, Joseph M; Yamaura, Yuichi; Yoshikura, Satoko; Yu, Douglas W; Zaitsev, Andrey S; Zeidler, Juliane; Zou, Fasheng; Collen, Ben; Ewers, Rob M; Mace, Georgina M; Purves, Drew W; Scharlemann, Jörn P W; Purvis, Andy
2017-01-01
The PREDICTS project (Projecting Responses of Ecological Diversity In Changing Terrestrial Systems; www.predicts.org.uk) has collated from published studies a large, reasonably representative database of comparable samples of biodiversity from multiple sites that differ in the nature or intensity of human impacts relating to land use. We have used this evidence base to develop global and regional statistical models of how local biodiversity responds to these measures. We describe and make freely available this 2016 release of the database, containing more than 3.2 million records sampled at over 26,000 locations and representing over 47,000 species. We outline how the database can help in answering a range of questions in ecology and conservation biology. To our knowledge, this is the largest and most geographically and taxonomically representative database of spatial comparisons of biodiversity that has been collated to date; it will be useful to researchers and international efforts wishing to model and understand the global status of biodiversity.
MoonDB — A Data System for Analytical Data of Lunar Samples
NASA Astrophysics Data System (ADS)
Lehnert, K.; Ji, P.; Cai, M.; Evans, C.; Zeigler, R.
2018-04-01
MoonDB is a data system that makes analytical data from the Apollo lunar sample collection and lunar meteorites accessible by synthesizing published and unpublished datasets in a relational database with an online search interface.
A comprehensive and scalable database search system for metaproteomics.
Chatterjee, Sandip; Stupp, Gregory S; Park, Sung Kyu Robin; Ducom, Jean-Christophe; Yates, John R; Su, Andrew I; Wolan, Dennis W
2016-08-16
Mass spectrometry-based shotgun proteomics experiments rely on accurate matching of experimental spectra against a database of protein sequences. Existing computational analysis methods are limited in the size of their sequence databases, which severely restricts the proteomic sequencing depth and functional analysis of highly complex samples. The growing amount of public high-throughput sequencing data will only exacerbate this problem. We designed a broadly applicable metaproteomic analysis method (ComPIL) that addresses protein database size limitations. Our approach to overcome this significant limitation in metaproteomics was to design a scalable set of sequence databases assembled for optimal library querying speeds. ComPIL was integrated with a modified version of the search engine ProLuCID (termed "Blazmass") to permit rapid matching of experimental spectra. Proof-of-principle analysis of human HEK293 lysate with a ComPIL database derived from high-quality genomic libraries was able to detect nearly all of the same peptides as a search with a human database (~500x fewer peptides in the database), with a small reduction in sensitivity. We were also able to detect proteins from the adenovirus used to immortalize these cells. We applied our method to a set of healthy human gut microbiome proteomic samples and showed a substantial increase in the number of identified peptides and proteins compared to previous metaproteomic analyses, while retaining a high degree of protein identification accuracy and allowing for a more in-depth characterization of the functional landscape of the samples. The combination of ComPIL with Blazmass allows proteomic searches to be performed with database sizes much larger than previously possible. These large database searches can be applied to complex meta-samples with unknown composition or proteomic samples where unexpected proteins may be identified. The protein database, proteomic search engine, and the proteomic data files for the 5 microbiome samples characterized and discussed herein are open source and available for use and additional analysis.
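The core idea of a sequence database organized for fast querying can be sketched as a peptide-to-protein index: every protein is digested in silico and its peptides are indexed, so a peptide identified from a spectrum maps back to candidate proteins. This illustrates the concept only; ComPIL's actual data structures differ:

```python
import re
from collections import defaultdict

# Hedged sketch of a scalable peptide-to-protein index (NOT ComPIL's actual
# data structures): proteins are digested in silico and every peptide is
# indexed, so a peptide identified from a spectrum maps back to candidates.
def tryptic_peptides(sequence, min_len=6):
    """Cleave after K/R unless followed by P (simplified trypsin rule)."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if len(p) >= min_len]

def build_index(proteins):
    index = defaultdict(set)
    for protein_id, seq in proteins.items():
        for peptide in tryptic_peptides(seq):
            index[peptide].add(protein_id)
    return index

proteins = {"P1": "MKWVTFISLLFLFSSAYSRGVFRR", "P2": "MSSHEGGKKKALKQPK"}
index = build_index(proteins)
print(index["WVTFISLLFLFSSAYSR"])   # -> {'P1'}
```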
Ameur, Adam; Bunikis, Ignas; Enroth, Stefan; Gyllensten, Ulf
2014-01-01
CanvasDB is an infrastructure for management and analysis of genetic variants from massively parallel sequencing (MPS) projects. The system stores SNP and indel calls in a local database, designed to handle very large datasets, to allow for rapid analysis using simple commands in R. Functional annotations are included in the system, making it suitable for direct identification of disease-causing mutations in human exome- (WES) or whole-genome sequencing (WGS) projects. The system has a built-in filtering function implemented to simultaneously take into account variant calls from all individual samples. This enables advanced comparative analysis of variant distribution between groups of samples, including detection of candidate causative mutations within family structures and genome-wide association by sequencing. In most cases, these analyses are executed within just a matter of seconds, even when there are several hundreds of samples and millions of variants in the database. We demonstrate the scalability of canvasDB by importing the individual variant calls from all 1092 individuals present in the 1000 Genomes Project into the system, over 4.4 billion SNPs and indels in total. Our results show that canvasDB makes it possible to perform advanced analyses of large-scale WGS projects on a local server. Database URL: https://github.com/UppsalaGenomeCenter/CanvasDB PMID:25281234
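The built-in cross-sample filtering can be illustrated with a small SQL sketch: select variants carried by every affected sample and by no unaffected sample. canvasDB exposes this kind of analysis through R commands; the schema and names below are assumptions for illustration:

```python
import sqlite3

# Hedged sketch of the kind of cross-sample filtering canvasDB performs
# (the real system exposes this through R commands); the schema and names
# here are illustrative assumptions. Find variants carried by all affected
# samples but by no unaffected sample.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE variant_call (sample TEXT, chrom TEXT, pos INTEGER, alt TEXT);
    INSERT INTO variant_call VALUES
        ('affected_1', 'chr1', 12345, 'T'),
        ('affected_2', 'chr1', 12345, 'T'),
        ('unaffected_1', 'chr2', 999, 'G');
""")
affected, unaffected = ('affected_1', 'affected_2'), ('unaffected_1',)
rows = conn.execute(f"""
    SELECT chrom, pos, alt FROM variant_call
    WHERE sample IN ({','.join('?' * len(affected))})
    GROUP BY chrom, pos, alt
    HAVING COUNT(DISTINCT sample) = ?
    EXCEPT
    SELECT chrom, pos, alt FROM variant_call
    WHERE sample IN ({','.join('?' * len(unaffected))})
""", (*affected, len(affected), *unaffected)).fetchall()
print(rows)   # -> [('chr1', 12345, 'T')]
```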
Physical Samples Linked Data in Action
NASA Astrophysics Data System (ADS)
Ji, P.; Arko, R. A.; Lehnert, K.; Bristol, S.
2017-12-01
Most data and metadata related to physical samples currently reside in isolated relational databases driven by diverse data models. The challenge of sharing, interchanging and integrating data from these different relational databases motivated us to publish Linked Open Data for collections of physical samples, using Semantic Web technologies including the Resource Description Framework (RDF), the RDF Query Language (SPARQL), and the Web Ontology Language (OWL). In the last few years, we have released four knowledge graphs concentrating on physical samples: the System for Earth Sample Registration (SESAR), the USGS National Geochemical Database (NGDC), the Ocean Biogeographic Information System (OBIS), and the EarthChem Database. Currently the four knowledge graphs contain over 12 million facts (triples) about objects of interest to the geoscience domain. Choosing appropriate domain ontologies for representing the context of the data is the core of the whole work. The GeoLink ontology, developed by the EarthCube GeoLink project, was used at the top level to represent common concepts like person, organization, cruise, etc. The physical sample ontology developed by the Interdisciplinary Earth Data Alliance (IEDA) and the Darwin Core vocabulary were used at the second level to describe details of geological samples and biological diversity. We also focused on finding and building the best tool chains to support the whole life cycle of publishing the linked data we have, including information retrieval, linked data browsing and data visualization. Currently, Morph, Virtuoso Server, LodView, LodLive, and YASGUI are employed for converting, storing, representing, and querying data in a knowledge base (RDF triplestore). Persistent digital identifiers are another main focus. Open Researcher & Contributor IDs (ORCIDs), International Geo Sample Numbers (IGSNs), the Global Research Identifier Database (GRID) and other persistent identifiers were used to link resources from the various graphs with persons, samples, organizations, cruises, etc. This work is supported by the EarthCube "GeoLink" project (NSF# ICER14-40221 and others) and the "USGS-IEDA Partnership to Support a Data Lifecycle Framework and Tools" project (USGS# G13AC00381).
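A record from such a knowledge graph can be sketched with rdflib. The vocabularies named above (GeoLink, IEDA, Darwin Core) are real, but the namespace and predicate URIs below are placeholders, not their actual terms:

```python
from rdflib import Graph, Literal, Namespace, URIRef

# Minimal sketch of exposing one physical-sample record as Linked Data.
# The namespace and predicate names below are placeholders for
# illustration, not the actual GeoLink/IEDA ontology terms.
EX = Namespace("http://example.org/geosample/")
g = Graph()
sample = URIRef(EX["IGSN-ABC123456"])      # hypothetical IGSN-based URI
g.add((sample, EX.material, Literal("basalt")))
g.add((sample, EX.collectedBy, URIRef("https://orcid.org/0000-0000-0000-0000")))
g.add((sample, EX.cruise, Literal("KN195-5")))

# The same graph can then be queried with SPARQL:
q = """
SELECT ?s ?material WHERE { ?s <http://example.org/geosample/material> ?material . }
"""
for row in g.query(q):
    print(row.s, row.material)
```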
NASA Technical Reports Server (NTRS)
Todd, Nancy S.
2016-01-01
The rock and soil samples returned from the Apollo missions from 1969-72 have supported 46 years of research leading to advances in our understanding of the formation and evolution of the inner Solar System. NASA has been engaged in several initiatives that aim to restore, digitize, and make available to the public existing published and unpublished research data for the Apollo samples. One of these initiatives is a collaboration with IEDA (Interdisciplinary Earth Data Alliance) to develop MoonDB, a lunar geochemical database modeled after PetDB (Petrological Database of the Ocean Floor). In support of this initiative, NASA has adopted the use of IGSN (International Geo Sample Number) to generate persistent, unique identifiers for lunar samples that scientists can use when publishing research data. To facilitate the IGSN registration of the original 2,200 samples and over 120,000 subdivided samples, NASA has developed an application that retrieves sample metadata from the Lunar Curation Database and uses the SESAR API to automate the generation of IGSNs and registration of samples into SESAR (System for Earth Sample Registration). This presentation will describe the work done by NASA to map existing sample metadata to the IGSN metadata and integrate the IGSN registration process into the sample curation workflow, the lessons learned from this effort, and how this work can be extended in the future to help deal with the registration of large numbers of samples.
An integrated data-analysis and database system for AMS 14C
NASA Astrophysics Data System (ADS)
Kjeldsen, Henrik; Olsen, Jesper; Heinemeier, Jan
2010-04-01
AMSdata is the name of a combined database and data-analysis system for AMS 14C and stable-isotope work that has been developed at Aarhus University. The system (1) contains routines for data analysis of AMS and MS data, (2) allows a flexible and accurate description of sample extraction and pretreatment, also when samples are split into several fractions, and (3) keeps track of all measured, calculated and attributed data. The structure of the database is flexible and allows an unlimited number of measurement and pretreatment procedures. The AMS 14C data analysis routine is fairly advanced and flexible, and it can be easily optimized for different kinds of measuring processes. Technically, the system is based on a Microsoft SQL server and includes stored SQL procedures for the data analysis. Microsoft Office Access is used for the (graphical) user interface, and in addition Excel, Word and Origin are exploited for input and output of data, e.g. for plotting data during data analysis.
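The abstract does not give AMSdata's exact algorithms, but the conventional radiocarbon age calculation that data-analysis routines of this kind typically implement follows Stuiver and Polach (1977). The Python sketch below shows only that standard relation; the function names are illustrative.

```python
# Sketch of the conventional radiocarbon age calculation (Stuiver & Polach, 1977)
# that an AMS 14C data-analysis routine typically implements. Names are illustrative.
import math

LIBBY_MEAN_LIFE = 8033.0  # years, derived from the Libby half-life of 5568 a

def conventional_age(fraction_modern: float) -> float:
    """Conventional 14C age (years BP) from the fraction of modern carbon F14C."""
    return -LIBBY_MEAN_LIFE * math.log(fraction_modern)

def age_uncertainty(fraction_modern: float, sigma_f: float) -> float:
    """1-sigma age uncertainty propagated from the uncertainty in F14C."""
    return LIBBY_MEAN_LIFE * sigma_f / fraction_modern

print(round(conventional_age(0.5000)))          # ~5568 years BP (one Libby half-life)
print(round(age_uncertainty(0.5000, 0.0030)))   # ~48 years
```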
An open experimental database for exploring inorganic materials
Zakutayev, Andriy; Wunder, Nick; Schwarting, Marcus; Perkins, John D.; White, Robert; Munch, Kristin; Tumas, William; Phillips, Caleb
2018-04-03
The use of advanced machine learning algorithms in experimental materials science is limited by the lack of sufficiently large and diverse datasets amenable to data mining. If publicly open, such data resources would also enable materials research by scientists without access to expensive experimental equipment. Here, we report on our progress towards a publicly open High Throughput Experimental Materials (HTEM) Database (htem.nrel.gov). This database currently contains 140,000 sample entries, characterized by structural (100,000), synthetic (80,000), chemical (70,000), and optoelectronic (50,000) properties of inorganic thin-film materials, grouped in >4,000 sample libraries across >100 materials systems; more than half of these data are publicly available. This article shows how the HTEM database enables scientists to explore materials via a web-based user interface and an application programming interface. This paper also describes the high-throughput experimental (HTE) approach to generating materials data and discusses the laboratory information management system (LIMS) that underpins the HTEM database. Finally, this manuscript illustrates how advanced machine learning algorithms can be applied to materials science problems using this open data resource.
GIS Methodic and New Database for Magmatic Rocks. Application for Atlantic Oceanic Magmatism.
NASA Astrophysics Data System (ADS)
Asavin, A. M.
2001-12-01
Several geochemical databases are now available on the Internet. One of the main peculiarities of the geochemical information they store is that each sample carries geographical coordinates, yet as a rule the database software uses this spatial information only in the search procedures of the user interface. On the other side, GIS software (Geographical Information System software), for example ARC/INFO, which is used for creating and analyzing special geological, geochemical, and geophysical e-maps, is deeply involved with the geographical coordinates of samples. We joined these peculiarities of GIS systems and a relational geochemical database through special software. Our geochemical information system was created at the Vernadsky State Geological Museum and the Institute of Geochemistry and Analytical Chemistry in Moscow. We have tested the system with geochemical data on oceanic rocks from the Atlantic and Pacific oceans, about 10,000 chemical analyses. The GIS information content consists of e-map covers of the world globe. Parts of these maps cover the Atlantic Ocean: a gravity map (with a 2'' grid), oceanic-bottom heat flow, altimetric maps, seismic activity, a tectonic map, and a geological map. Combining this information content makes it possible to create new geochemical maps and to couple spatial analysis with numerical geochemical modeling of volcanic processes in an ocean segment. We have tested the information system using thick-client technology. The interface between the GIS system ArcView and the database resides in a special sequence of multiple SQL queries; the result of these queries is a simple DBF file with geographical coordinates. This file is used at the moment of creating geochemical and other special e-maps for an oceanic region. For geophysical data we used a more complex method: from ArcView we created a grid cover for polygon spatial geophysical information.
Cowan, Dallas M; Cheng, Thales J; Ground, Matthew; Sahmel, Jennifer; Varughese, Allysha; Madl, Amy K
2015-08-01
The United States Occupational Safety and Health Administration (OSHA) maintains the Chemical Exposure Health Data (CEHD) and the Integrated Management Information System (IMIS) databases, which contain quantitative and qualitative data resulting from compliance inspections conducted from 1984 to 2011. This analysis aimed to evaluate trends in workplace asbestos concentrations over time and across industries by combining the samples from these two databases. From 1984 to 2011, personal air samples ranged from 0.001 to 175 f/cc. Asbestos compliance sampling data associated with the construction, automotive repair, manufacturing, and chemical/petroleum/rubber industries included measurements in excess of 10 f/cc, and above the permissible exposure limit, from 2001 to 2011. The utility of combining the databases was limited by the completeness and accuracy of the data recorded; in this analysis, 40% of the data overlapped between the two databases. Other limitations included sampling bias associated with compliance sampling and errors arising from user-entered data. A clear decreasing trend in both airborne fiber concentrations and the number of asbestos samples collected parallels historically decreasing trends in the consumption of asbestos and declining mesothelioma incidence rates. Although air sampling data indicated that airborne fiber exposure potential was high (>10 f/cc for short- and long-term samples) in some industries (e.g., construction, manufacturing), airborne concentrations have significantly declined over the past 30 years. Recommendations for improving the existing OSHA exposure databases are provided.
An indoor positioning technology in the BLE mobile payment system
NASA Astrophysics Data System (ADS)
Han, Tiantian; Ding, Lei
2017-05-01
In a mobile payment system for large supermarkets, the core function, settling the payment amount over Bluetooth Low Energy (BLE), can be complemented by an indoor positioning technology that enables value-added services. The technology collects Bluetooth RSSI values to establish a fingerprint database of the corresponding sampling points. The RSSI of the Bluetooth module is obtained through the AP, and the k-Nearest Neighbor algorithm then matches the observed values against the fingerprint database. This helps businesses locate customers within the mall and, combined with the settlement amount of the customer's purchases, analyze customer behavior. When the system collects signal strength, the distribution of RSSI at the sampling points is analyzed and the values are filtered. A system used in the laboratory was designed to demonstrate feasibility.
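A minimal sketch of the fingerprinting approach described above, assuming a small offline database of mean RSSI values per sampling point and a k-Nearest Neighbor match in RSSI space; the access-point names, coordinates, and signal values are invented for illustration.

```python
# Minimal sketch of RSSI-fingerprint positioning with k-nearest neighbours.
# Fingerprint values and access-point names are made up for illustration.
import math

# Offline phase: fingerprint database mapping a sampling point (x, y) to the
# mean RSSI (dBm) observed from each BLE access point.
fingerprints = {
    (0.0, 0.0): {"ap1": -48, "ap2": -67, "ap3": -71},
    (5.0, 0.0): {"ap1": -62, "ap2": -55, "ap3": -74},
    (0.0, 5.0): {"ap1": -66, "ap2": -72, "ap3": -52},
    (5.0, 5.0): {"ap1": -70, "ap2": -60, "ap3": -58},
}

def locate(observed: dict, k: int = 2) -> tuple:
    """Estimate (x, y) as the centroid of the k closest fingerprints in RSSI space."""
    def distance(ref: dict) -> float:
        return math.sqrt(sum((observed[ap] - ref[ap]) ** 2 for ap in observed))
    nearest = sorted(fingerprints, key=lambda pt: distance(fingerprints[pt]))[:k]
    return (sum(p[0] for p in nearest) / k, sum(p[1] for p in nearest) / k)

print(locate({"ap1": -50, "ap2": -65, "ap3": -70}))  # (2.5, 0.0): centroid of the 2 closest points
```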
Database Access Manager for the Software Engineering Laboratory (DAMSEL) user's guide
NASA Technical Reports Server (NTRS)
1990-01-01
Operating instructions for the Database Access Manager for the Software Engineering Laboratory (DAMSEL) system are presented. Step-by-step instructions for performing various data entry and report generation activities are included. Sample sessions showing the user interface display screens are also included. Instructions for generating reports are accompanied by sample outputs for each of the reports. The document groups the available software functions by the classes of users that may access them.
Granitto, Matthew; DeWitt, Ed H.; Klein, Terry L.
2010-01-01
This database was initiated, designed, and populated to collect and integrate geochemical data from central Colorado in order to facilitate geologic mapping, petrologic studies, mineral resource assessment, definition of geochemical baseline values and statistics, environmental impact assessment, and medical geology. The Microsoft Access database serves as a geochemical data warehouse in support of the Central Colorado Assessment Project (CCAP) and contains data tables describing historical and new quantitative and qualitative geochemical analyses determined by 70 analytical laboratory and field methods for 47,478 rock, sediment, soil, and heavy-mineral concentrate samples. Most samples were collected by U.S. Geological Survey (USGS) personnel and analyzed either in the analytical laboratories of the USGS or by contract with commercial analytical laboratories. These data represent analyses of samples collected as part of various USGS programs and projects. In addition, geochemical data from 7,470 sediment and soil samples collected and analyzed under the Atomic Energy Commission National Uranium Resource Evaluation (NURE) Hydrogeochemical and Stream Sediment Reconnaissance (HSSR) program (henceforth called NURE) have been included in this database. Besides the data from 2,377 samples collected and analyzed under CCAP, this dataset includes archived geochemical data originally entered into the in-house Rock Analysis Storage System (RASS) database (used by the USGS from the mid-1960s through the late 1980s) and the in-house PLUTO database (used by the USGS from the mid-1970s through the mid-1990s). All of these data are maintained in the Oracle-based National Geochemical Database (NGDB). Retrievals from the NGDB and from the NURE database were used to generate most of this dataset. Finally, USGS data that were previously excluded from the NGDB, either because they predate the earliest USGS geochemical databases or for programmatic reasons, have been included in the CCAP Geochemical Database and are planned to be added to the NGDB.
NASA Astrophysics Data System (ADS)
Poinsot, Audrey; Yang, Fan; Brost, Vincent
2011-02-01
Including multiple sources of information in personal identity recognition and verification gives the opportunity to greatly improve performance. We propose a contactless biometric system that combines two modalities: palmprint and face. Hardware implementations are proposed on the Texas Instruments Digital Signal Processor and Xilinx Field-Programmable Gate Array (FPGA) platforms. The algorithmic chain consists of preprocessing (which includes palm extraction from hand images), Gabor feature extraction, comparison by Hamming distance, and score fusion. Fusion possibilities are discussed and tested first using a bimodal database of 130 subjects that we designed (the uB database), and then two common public biometric databases (AR for face and PolyU for palmprint). High performance has been obtained for recognition and verification purposes: a recognition rate of 97.49% with the AR-PolyU database and an equal error rate of 1.10% on the uB database, using only two training samples per subject. Hardware results demonstrate that preprocessing can easily be performed during the acquisition phase, and multimodal biometric recognition can be performed almost instantly (0.4 ms on FPGA). We show the feasibility of a robust and efficient multimodal hardware biometric system that offers several advantages, such as user-friendliness and flexibility.
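The comparison-by-Hamming-distance and score-fusion steps can be sketched as follows; the code length, fusion weight, and sum-rule fusion here are illustrative assumptions, not the exact configuration reported in the paper.

```python
# Sketch of the matching stage: Hamming distance between binary Gabor feature
# codes, plus a simple weighted sum-rule score fusion. Parameters are illustrative.
import numpy as np

def hamming_score(code_a: np.ndarray, code_b: np.ndarray) -> float:
    """Normalised Hamming distance between two binary feature codes (0 = identical)."""
    return np.count_nonzero(code_a != code_b) / code_a.size

def fused_score(face_dist: float, palm_dist: float, w_face: float = 0.5) -> float:
    """Sum-rule fusion of the two modality scores; lower means a better match."""
    return w_face * face_dist + (1.0 - w_face) * palm_dist

rng = np.random.default_rng(0)
enrolled_face = rng.integers(0, 2, 2048)
probe_face = rng.integers(0, 2, 2048)      # impostor-like face: ~0.5 distance
enrolled_palm = rng.integers(0, 2, 2048)
probe_palm = enrolled_palm.copy()
probe_palm[:100] ^= 1                      # genuine palm sample with ~5% bit noise

print(fused_score(hamming_score(enrolled_face, probe_face),
                  hamming_score(enrolled_palm, probe_palm)))
```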
Expert Systems the Old Fashioned Way: Person to Person.
ERIC Educational Resources Information Center
McCleary, Hunter; Mayer, William J.
1988-01-01
Describes the services of Teltech, Inc., which mimic the desirable attributes of artificial intelligence and expert systems via a "database" of 5,000 experts in technical areas and interactive literature searches executed by staff. Advantages and shortcomings of the network are exemplified by sample searches. Several sample menus and…
Hadley, Heidi K.
2000-01-01
Selected nitrogen and phosphorus (nutrient), suspended-sediment and total suspended-solids surface-water data were compiled from January 1980 through December 1995 within the Great Salt Lake Basins National Water-Quality Assessment study unit, which extends from southeastern Idaho to west-central Utah and from Great Salt Lake to the Wasatch and western Uinta Mountains. The data were retrieved from the U.S. Geological Survey National Water Information System and the State of Utah, Department of Environmental Quality, Division of Water Quality database. The Division of Water Quality database includes data that are submitted to the U.S. Environmental Protection Agency STOrage and RETrieval system. Water-quality data included in this report were selected for surface-water sites (rivers, streams, and canals) that had three or more nutrient, suspended-sediment, or total suspended-solids analyses. Also, 33 percent or more of the measurements at a site had to include discharge, and, for non-U.S. Geological Survey sites, there had to be 2 or more years of data. Ancillary data for parameters such as water temperature, pH, specific conductance, streamflow (discharge), dissolved oxygen, biochemical oxygen demand, alkalinity, and turbidity also were compiled, as available. The compiled nutrient database contains 13,511 samples from 191 selected sites. The compiled suspended-sediment and total suspended-solids database contains 11,642 samples from 142 selected sites. For the nutrient database, the median (50th percentile) sample period for individual sites is 6 years, and the 75th percentile is 14 years. The median number of samples per site is 52 and the 75th percentile is 110 samples. For the suspended-sediment and total suspended-solids database, the median sample period for individual sites is 9 years, and the 75th percentile is 14 years. The median number of samples per site is 76 and the 75th percentile is 120 samples. The compiled historical data are being used in the basinwide sampling strategy to characterize the broad-scale geographic and seasonal water-quality conditions in relation to major contaminant sources and background conditions. Data for this report are stored on a compact disc.
Lawrence N. Hudson; Joseph Wunderle M.; And Others
2016-01-01
The PREDICTS project, Projecting Responses of Ecological Diversity In Changing Terrestrial Systems (www.predicts.org.uk), has collated from published studies a large, reasonably representative database of comparable samples of biodiversity from multiple sites that differ in the nature or intensity of human impacts relating to land use. We have used this evidence base to...
Jeddi, Fatemeh Rangraz; Farzandipoor, Mehrdad; Arabfard, Masoud; Hosseini, Azam Haj Mohammad
2014-04-01
The purpose of this study was to investigate the current situation and present a conceptual model for a clinical governance information system, using UML, in two sample hospitals. Although the use of information is one of the fundamental components of clinical governance, information management unfortunately receives little attention. A cross-sectional study was conducted from October 2012 to May 2013. Data were gathered through questionnaires and interviews in the two sample hospitals; face and content validity of the questionnaire was confirmed by experts. Data were first collected from a pilot hospital, revisions were made, and the final questionnaire was prepared. Data were analyzed with descriptive statistics in SPSS 16. From the scenarios derived from the questionnaires, UML diagrams were produced using Rational Rose 7. The results showed that only 32.14 percent of the indicators were calculated in the hospitals. No database had been designed, and 100 percent of the hospitals' clinical governance units required the creation of a database. The clinical governance units of the hospitals do not have access to all the indicators needed to perform their mission. Defining processes, drawing models, and creating a database are essential for designing such information systems.
Shuttle Hypervelocity Impact Database
NASA Technical Reports Server (NTRS)
Hyde, James L.; Christiansen, Eric L.; Lear, Dana M.
2011-01-01
With three missions outstanding, the Shuttle Hypervelocity Impact Database has nearly 3000 entries. The data are divided into tables for crew module windows, payload bay door radiators, and thermal protection system regions, with window impacts comprising just over half the records. In general, the database provides the dimensions of hypervelocity impact damage, a component-level location (i.e., window number or radiator panel number), and the orbiter mission when the impact occurred. Additional detail on the type of particle that produced the damage site is provided when sampling data and definitive analysis results are available. Details and insights on the contents of the database, including examples of descriptive statistics, will be provided. Post-flight impact damage inspection and sampling techniques that were employed during the different observation campaigns will also be discussed. Potential enhancements to the database structure and availability of the data for other researchers will be addressed in the Future Work section. A related database of returned surfaces from the International Space Station will also be introduced.
Gavrielides, Mike; Furney, Simon J; Yates, Tim; Miller, Crispin J; Marais, Richard
2014-01-01
Whole genomes, whole exomes and transcriptomes of tumour samples are sequenced routinely to identify the drivers of cancer. The systematic sequencing and analysis of tumour samples, as well as other oncogenomic experiments, necessitates the tracking of relevant sample information throughout the investigative process. These meta-data of the sequencing and analysis procedures include information about the samples and projects as well as the sequencing centres, platforms, data locations, results locations, alignments, analysis specifications and further information relevant to the experiments. The current work presents a sample tracking system for oncogenomic studies (Onco-STS) to store these data and make them easily accessible to the researchers who work with the samples. The system is a web application, which includes a database and a front-end web page that allows remote access, submission and updating of the sample data in the database. The Grails web application development framework was used to develop and implement the system. The resulting Onco-STS solution is efficient, secure and easy to use and is intended to replace the manual handling of text records. Onco-STS allows simultaneous remote access to the system, making collaboration among researchers more effective. The system stores both information on the samples in oncogenomic studies and details of the analyses conducted on the resulting data. Onco-STS is based on open-source software, is easy to develop and can be modified according to a research group's needs. Hence it is suitable for laboratories that do not require a commercial system.
Amelogenin test: From forensics to quality control in clinical and biochemical genomics.
Francès, F; Portolés, O; González, J I; Coltell, O; Verdú, F; Castelló, A; Corella, D
2007-01-01
The increasing number of samples in biomedical genetic studies, and the number of centers participating in them, raises the risk of mistakes at the various sample-handling stages. We evaluated the usefulness of the amelogenin test for quality control of sample identification. The amelogenin test (frequently used in forensics) was performed on 1224 individuals participating in a biomedical study, and concordance between the sex recorded in the database and the amelogenin result was estimated. Additional genetic systems for detecting sex errors were developed. The overall concordance rate was 99.84% (1222/1224). Two samples showed a female amelogenin result while being coded as males in the database. The first, after checking sex-specific biochemical and clinical profile data, was found to be a codification error in the database. For the second, no apparent error was discovered on checking the database, because a correct male profile was found; false negatives in amelogenin male sex determination were ruled out by additional tests, the feminine sex was confirmed, and a sample-labeling error was revealed after a new DNA extraction. The amelogenin test is a useful quality-control tool for detecting sex-identification errors in large genomic studies and can contribute to increasing their validity.
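The concordance check itself reduces to a simple comparison between the sex coded in the study database and the amelogenin result; a toy sketch, with invented record fields and sample IDs, might look like this.

```python
# Sketch of the concordance check: compare the sex recorded in the study
# database with the amelogenin result and flag discordant samples for follow-up.
samples = [
    {"id": "S0001", "db_sex": "M", "amelogenin_sex": "M"},
    {"id": "S0002", "db_sex": "M", "amelogenin_sex": "F"},  # possible mislabel
    {"id": "S0003", "db_sex": "F", "amelogenin_sex": "F"},
]

discordant = [s["id"] for s in samples if s["db_sex"] != s["amelogenin_sex"]]
concordance = 1 - len(discordant) / len(samples)

print(f"concordance rate: {concordance:.2%}")  # 66.67% in this toy set
print("flag for review:", discordant)          # ['S0002']
```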
Developmental validation of the PowerPlex(®) Fusion 6C System.
Ensenberger, Martin G; Lenz, Kristy A; Matthies, Learden K; Hadinoto, Gregory M; Schienman, John E; Przech, Angela J; Morganti, Michael W; Renstrom, Daniel T; Baker, Victoria M; Gawrys, Kori M; Hoogendoorn, Marlijn; Steffen, Carolyn R; Martín, Pablo; Alonso, Antonio; Olson, Hope R; Sprecher, Cynthia J; Storts, Douglas R
2016-03-01
The PowerPlex(®) Fusion 6C System is a 27-locus, six-dye multiplex that includes all markers in the expanded CODIS core loci and increases overlap with STR database standards throughout the world. Additionally, it contains two rapidly mutating Y-STRs and is capable of both casework and database workflows, including direct amplification. A multi-laboratory developmental validation study was performed on the PowerPlex(®) Fusion 6C System. Here, we report the results of that study, which followed SWGDAM guidelines and includes data for: species specificity, sensitivity, stability, precision, reproducibility and repeatability, case-type samples, concordance, stutter, DNA mixtures, and PCR-based procedures. Where appropriate, we report data from both extracted DNA samples and direct amplification samples from various substrates and collection devices. Samples from all studies were separated on both Applied Biosystems 3500 series and 6-dye capable 3130 series Genetic Analyzers, and data are reported for each. Together, the data validate the design and demonstrate the performance of the PowerPlex(®) Fusion 6C System.
Yager, Douglas B.; Hofstra, Albert H.; Granitto, Matthew
2012-01-01
This report emphasizes geographic information system analysis and the display of data stored in the legacy U.S. Geological Survey National Geochemical Database for use in mineral resource investigations. Geochemical analyses of soils, stream sediments, and rocks that are archived in the National Geochemical Database provide an extensive data source for investigating geochemical anomalies. A study area in the Egan Range of east-central Nevada was used to develop a geographic information system analysis methodology for two different geochemical datasets involving detailed (Bureau of Land Management Wilderness) and reconnaissance-scale (National Uranium Resource Evaluation) investigations. ArcGIS was used to analyze and thematically map geochemical information at point locations. Watershed-boundary datasets served as a geographic reference to relate potentially anomalous sample sites with hydrologic unit codes at varying scales. The National Hydrography Dataset was analyzed with Hydrography Event Management and ArcGIS Utility Network Analyst tools to delineate potential sediment-sample provenance along a stream network. These tools can be used to track potential upstream-sediment-contributing areas to a sample site. This methodology identifies geochemically anomalous sample sites, watersheds, and streams that could help focus mineral resource investigations in the field.
NASA Astrophysics Data System (ADS)
Hsu, L.; Bristol, S.; Lehnert, K. A.; Arko, R. A.; Peters, S. E.; Uhen, M. D.; Song, L.
2014-12-01
The U.S. Geological Survey (USGS) is an exemplar of the need for improved cyberinfrastructure for its vast holdings of invaluable physical geoscience data. Millions of discrete paleobiological and geological specimens lie in USGS warehouses and at the Smithsonian Institution. These specimens serve as the basis for many geologic maps and geochemical databases, and are a potential treasure trove of new scientific knowledge. The extent of this treasure is virtually unknown and inaccessible outside a small group of paleogeoscientists and geochemists. A team from the USGS, the Integrated Earth Data Applications (IEDA) facility, and the Paleobiology Database (PBDB) are working to expose information on paleontological and geochemical specimens for discovery by scientists and citizens. This project uses existing infrastructure of the System for Earth Sample Registration (SESAR) and PBDB, which already contains much of the fundamental data schemas that are necessary to accommodate USGS records. The project is also developing a new Linked Data interface for the USGS National Geochemical Database (NGDB). The International Geo Sample Number (IGSN) is the identifier that links samples between all systems. For paleontological specimens, SESAR and PBDB will be the primary repositories for USGS records, with a data syncing process to archive records within the USGS ScienceBase system. The process began with mapping the metadata fields necessary for USGS collections to the existing SESAR and PBDB data structures, while aligning them with the Observations & Measurements and Darwin Core standards. New functionality needed in SESAR included links to a USGS locality registry, fossil classifications, a spatial qualifier attribution for samples with sensitive locations, and acknowledgement of data and metadata licensing. The team is developing a harvesting mechanism to periodically transfer USGS records from within PBDB and SESAR to ScienceBase. For the NGDB, the samples are being registered with IGSNs in SESAR and the geochemical data are being published as Linked Data. This system allows the USGS collections to benefit from disciplinary and institutional strengths of the participating resources, while simultaneously increasing the discovery, accessibility, and citation of USGS physical collection holdings.
Recent advances on terrain database correlation testing
NASA Astrophysics Data System (ADS)
Sakude, Milton T.; Schiavone, Guy A.; Morelos-Borja, Hector; Martin, Glenn; Cortes, Art
1998-08-01
Terrain database correlation is a major requirement for interoperability in distributed simulation. There are numerous situations in which terrain database correlation problems can occur that, in turn, lead to a lack of interoperability in distributed training simulations. Examples are the use of different run-time terrain databases derived from inconsistent source data, the use of different resolutions, and the use of different data models between databases for both terrain and culture data. IST has been developing a suite of software tools, named ZCAP, to address terrain database interoperability issues. In this paper we discuss recent enhancements made to this suite, including improved algorithms for sampling and calculating line-of-sight, an improved method for measuring terrain roughness, and the application of a sparse matrix method to the terrain remediation solution developed at the Visual Systems Lab of the Institute for Simulation and Training. We review the application of some of these new algorithms to the terrain correlation measurement processes. The new algorithms improve support for very large terrain databases and provide the capability to perform test replications to estimate the sampling error of the tests. With this set of tools, a user can quantitatively assess the degree of correlation between large terrain databases.
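As an example of the kind of computation such a tool suite performs, here is a simplified line-of-sight test over a gridded height field; the bilinear interpolation and fixed sampling step are assumptions for illustration, not ZCAP's actual algorithms.

```python
# Minimal line-of-sight test over a gridded terrain height field, of the kind a
# terrain-database correlation tool samples when comparing two databases.
import numpy as np

def height_at(terrain: np.ndarray, x: float, y: float) -> float:
    """Bilinear interpolation of the height field at fractional grid coordinates."""
    x0, y0 = int(x), int(y)
    fx, fy = x - x0, y - y0
    h = terrain[y0:y0 + 2, x0:x0 + 2]
    return (h[0, 0] * (1 - fx) * (1 - fy) + h[0, 1] * fx * (1 - fy)
            + h[1, 0] * (1 - fx) * fy + h[1, 1] * fx * fy)

def line_of_sight(terrain, p1, p2, eye_height=1.5, steps=100) -> bool:
    """True if the ray from p1 to p2 (grid coordinates) clears the terrain everywhere."""
    z1 = height_at(terrain, *p1) + eye_height
    z2 = height_at(terrain, *p2) + eye_height
    for i in range(1, steps):
        t = i / steps
        x = p1[0] + t * (p2[0] - p1[0])
        y = p1[1] + t * (p2[1] - p1[1])
        if height_at(terrain, x, y) > z1 + t * (z2 - z1):
            return False
    return True

terrain = np.zeros((50, 50))
terrain[20:30, 20:30] = 40.0  # a hill between the two observers
print(line_of_sight(terrain, (5.0, 5.0), (45.0, 45.0)))  # False: the hill blocks
```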
Artist Material BRDF Database for Computer Graphics Rendering
NASA Astrophysics Data System (ADS)
Ashbaugh, Justin C.
The primary goal of this thesis was to create a physical library of artist material samples. This collection provides necessary data for the development of a gonio-imaging system for use in museums to more accurately document their collections. A sample set was produced consisting of 25 panels and containing nearly 600 unique samples. Selected materials are representative of those commonly used by artists both past and present. These take into account the variability in visual appearance resulting from the materials and application techniques used. Five attributes of variability were identified including medium, color, substrate, application technique and overcoat. Combinations of these attributes were selected based on those commonly observed in museum collections and suggested by surveying experts in the field. For each sample material, image data is collected and used to measure an average bi-directional reflectance distribution function (BRDF). The results are available as a public-domain image and optical database of artist materials at art-si.org. Additionally, the database includes specifications for each sample along with other information useful for computer graphics rendering such as the rectified sample images and normal maps.
Bannasch, Detlev; Mehrle, Alexander; Glatting, Karl-Heinz; Pepperkok, Rainer; Poustka, Annemarie; Wiemann, Stefan
2004-01-01
We have implemented LIFEdb (http://www.dkfz.de/LIFEdb) to link information regarding novel human full-length cDNAs generated and sequenced by the German cDNA Consortium with functional information on the encoded proteins produced in functional genomics and proteomics approaches. The database also serves as a sample-tracking system to manage the process from cDNA to experimental read-out and data interpretation. A web interface enables the scientific community to explore and visualize features of the annotated cDNAs and ORFs combined with experimental results, and thus helps to unravel new features of proteins with as yet unknown functions.
A database for propagation models
NASA Technical Reports Server (NTRS)
Kantak, Anil V.; Suwitra, Krisjani; Le, Choung
1994-01-01
A database of various propagation phenomena models that telecommunications systems engineers can use to obtain parameter values for systems design is presented. This easy-to-use tool is currently available for either a PC running Excel under Windows or a Macintosh running Excel for the Macintosh. All the steps necessary to use the software are simple and often self-explanatory; a sample run of the CCIR rain attenuation model is presented.
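Rain-attenuation models of the CCIR/ITU-R family generally reduce to a power-law specific attenuation, gamma = k * R^alpha (in dB/km), scaled by an effective path length. The sketch below shows only that generic relation; the k and alpha coefficients are placeholder values, not the tabulated frequency- and polarization-dependent coefficients the model actually uses.

```python
# Sketch of the power-law rain-attenuation relation used by CCIR/ITU-R style
# models. The k and alpha values below are illustrative placeholders only.
def specific_attenuation(rain_rate_mm_h: float, k: float, alpha: float) -> float:
    """Specific rain attenuation in dB/km for a given rain rate."""
    return k * rain_rate_mm_h ** alpha

def path_attenuation(gamma_db_km: float, effective_path_km: float) -> float:
    """Total attenuation over an effective path length."""
    return gamma_db_km * effective_path_km

gamma = specific_attenuation(rain_rate_mm_h=30.0, k=0.05, alpha=1.1)
print(f"{path_attenuation(gamma, effective_path_km=4.0):.1f} dB")  # ~8.4 dB
```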
Data Processing on Database Management Systems with Fuzzy Query
NASA Astrophysics Data System (ADS)
Şimşek, Irfan; Topuz, Vedat
In this study, a fuzzy query tool (SQLf) for non-fuzzy database management systems was developed, and sample fuzzy queries were run on real data with the tool. The performance of SQLf was tested with data about Marmara University students' food grants. The food-grant data were collected in a MySQL database through a form filled in on the web, in which students described their social and economic conditions for the food-grant request. The form consists of questions that have fuzzy and crisp answers. The main purpose of the fuzzy query is to determine the students who deserve the grant, and SQLf easily found the eligible students through predefined fuzzy values. The fuzzy query tool could be used just as easily with other database systems such as Oracle and SQL Server.
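A fuzzy predicate of the kind SQLf evaluates can be sketched as a membership function applied to crisp rows retrieved from the database; the trapezoidal "low income" function, the threshold, and the sample rows below are invented for illustration and are not the study's actual criteria.

```python
# Sketch of evaluating a fuzzy predicate ("income IS low") over crisp rows from
# a conventional database, in the spirit of SQLf. Values are illustrative.
def mu_low_income(income: float) -> float:
    """Trapezoidal membership: fully 'low' below 500, not 'low' above 1500."""
    if income <= 500:
        return 1.0
    if income >= 1500:
        return 0.0
    return (1500 - income) / 1000

rows = [("Ayse", 420.0), ("Mehmet", 900.0), ("Deniz", 1600.0)]

# Fuzzy WHERE clause: keep rows whose membership degree exceeds a threshold,
# ranked by degree, as a fuzzy query tool would after a crisp SQL pre-selection.
threshold = 0.3
eligible = sorted(((name, mu_low_income(inc)) for name, inc in rows
                   if mu_low_income(inc) > threshold),
                  key=lambda r: r[1], reverse=True)
print(eligible)  # [('Ayse', 1.0), ('Mehmet', 0.6)]
```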
Granitto, Matthew; Bailey, Elizabeth A.; Schmidt, Jeanine M.; Shew, Nora B.; Gamble, Bruce M.; Labay, Keith A.
2011-01-01
The Alaska Geochemical Database (AGDB) was created and designed to compile and integrate geochemical data from Alaska in order to facilitate geologic mapping, petrologic studies, mineral resource assessments, definition of geochemical baseline values and statistics, environmental impact assessments, and studies in medical geology. This Microsoft Access database serves as a data archive in support of present and future Alaskan geologic and geochemical projects, and contains data tables describing historical and new quantitative and qualitative geochemical analyses. The analytical results were determined by 85 laboratory and field analytical methods on 264,095 rock, sediment, soil, mineral and heavy-mineral concentrate samples. Most samples were collected by U.S. Geological Survey (USGS) personnel and analyzed in USGS laboratories or, under contracts, in commercial analytical laboratories. These data represent analyses of samples collected as part of various USGS programs and projects from 1962 to 2009. In addition, mineralogical data from 18,138 nonmagnetic heavy mineral concentrate samples are included in this database. The AGDB includes historical geochemical data originally archived in the USGS Rock Analysis Storage System (RASS) database, used from the mid-1960s through the late 1980s and the USGS PLUTO database used from the mid-1970s through the mid-1990s. All of these data are currently maintained in the Oracle-based National Geochemical Database (NGDB). Retrievals from the NGDB were used to generate most of the AGDB data set. These data were checked for accuracy regarding sample location, sample media type, and analytical methods used. This arduous process of reviewing, verifying and, where necessary, editing all USGS geochemical data resulted in a significantly improved Alaska geochemical dataset. USGS data that were not previously in the NGDB because the data predate the earliest USGS geochemical databases, or were once excluded for programmatic reasons, are included here in the AGDB and will be added to the NGDB. The AGDB data provided here are the most accurate and complete to date, and should be useful for a wide variety of geochemical studies. The AGDB data provided in the linked database may be updated or changed periodically. The data on the DVD and in the data downloads provided with this report are current as of date of publication.
NASA Astrophysics Data System (ADS)
Petpairote, Chayanut; Madarasmi, Suthep; Chamnongthai, Kosin
2018-01-01
The practical identification of individuals using facial recognition techniques requires the matching of faces with specific expressions to faces from a neutral face database. A method for facial recognition under varied expressions against neutral face samples of individuals via recognition of expression warping and the use of a virtual expression-face database is proposed. In this method, facial expressions are recognized and the input expression faces are classified into facial expression groups. To aid facial recognition, the virtual expression-face database is sorted into average facial-expression shapes and by coarse- and fine-featured facial textures. Wrinkle information is also employed in classification by using a process of masking to adjust input faces to match the expression-face database. We evaluate the performance of the proposed method using the CMU multi-PIE, Cohn-Kanade, and AR expression-face databases, and we find that it provides significantly improved results in terms of face recognition accuracy compared to conventional methods and is acceptable for facial recognition under expression variation.
INFOSAM: A Sample Database Management System.
1981-12-01
Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA 02139. The system is organized into three levels: the Internal level, the Conceptual level, and the External level. The Internal level represents a union of Hsu's proposed Unary and Binary levels.
Coplen, Tyler B.
2000-01-01
The reliability and accuracy of isotopic data can be improved by utilizing database software to (i) store information about samples, (ii) store the results of mass spectrometric isotope-ratio analyses of samples, (iii) calculate analytical results using standardized algorithms stored in a database, (iv) normalize stable isotopic data to international scales using isotopic reference materials, and (v) generate multi-sheet paper templates for convenient sample loading of automated mass-spectrometer sample preparation manifolds. Such a database program, the Laboratory Information Management System (LIMS) for Light Stable Isotopes, is presented herein. Major benefits of this system include (i) a dramatic improvement in quality assurance, (ii) an increase in laboratory efficiency, (iii) a reduction in workload due to the elimination or reduction of retyping of data by laboratory personnel, and (iv) a decrease in errors in data reported to sample submitters. Such a database provides a complete record of when and how often laboratory reference materials have been analyzed and provides a record of what correction factors have been used through time. It provides an audit trail for laboratories. LIMS for Light Stable Isotopes is available for both Microsoft Office 97 Professional and Microsoft Office 2000 Professional as versions 7 and 8, respectively. Both source code (mdb file) and precompiled executable files (mde) are available. Numerous improvements have been made for continuous flow isotopic analysis in this version (specifically 7.13 for Microsoft Access 97 and 8.13 for Microsoft Access 2000). It is much easier to import isotopic results from Finnigan ISODAT worksheets, even worksheets on which corrections for amount of sample (linearity corrections) have been added. The capability to determine blank corrections using isotope mass balance from analyses of elemental analyzer samples has been added. It is now possible to calculate and apply drift corrections to isotopic data based on the time of day of analysis. Whereas Finnigan ISODAT software is confined to using only a single peak for calculating delta values, LIMS now enables one to use the mean of two or more reference injections during a continuous flow analysis to calculate delta values. This is useful with Finnigan's GasBench II online sample preparation system. Concentrations of carbon, nitrogen, and sulfur can be calculated based on one or more isotopic reference materials analyzed with a group of samples. Both sample data and isotopic analysis data can now be exported to Excel files. A calculator for determining the amount of sample needed for isotopic analysis based on a previous amount of sample and continuous flow area is now an integral part of LIMS for Light Stable Isotopes. LIMS for Light Stable Isotopes can now assign an error code to Finnigan elemental analyzer analyses in which one of the electrometers has saturated due to analysis of too much sample material, giving rise to incorrect isotopic abundances. Information on downloading this report and downloading code and databases is provided at the Internet addresses: http://water.usgs.gov/software/geochemical.html or http://www.geogr.uni-jena.de/software/geochemical.html in the Eastern Hemisphere.
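Step (iv), normalization to international scales, is commonly implemented as a two-point linear mapping through measured isotopic reference materials. The sketch below makes that assumption; the reference values follow the VSMOW-SLAP convention for delta-18O, while the measured instrument readings are hypothetical.

```python
# Sketch of two-point normalisation of measured delta values to an international
# scale, one of the standardised calculations a LIMS of this kind stores.
def two_point_normalisation(measured, ref1_measured, ref1_true,
                            ref2_measured, ref2_true):
    """Linearly map a measured delta value onto the scale defined by two references."""
    slope = (ref2_true - ref1_true) / (ref2_measured - ref1_measured)
    return ref1_true + slope * (measured - ref1_measured)

# VSMOW (0.0 permil) and SLAP (-55.5 permil) define the delta-18O scale; the
# measured values below are hypothetical raw instrument readings.
print(two_point_normalisation(measured=-12.30,
                              ref1_measured=0.45, ref1_true=0.0,
                              ref2_measured=-54.10, ref2_true=-55.5))  # ~-12.97
```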
Granitto, Matthew; Schmidt, Jeanine M.; Shew, Nora B.; Gamble, Bruce M.; Labay, Keith A.
2013-01-01
The Alaska Geochemical Database Version 2.0 (AGDB2) contains new geochemical data compilations in which each geologic material sample has one “best value” determination for each analyzed species, greatly improving speed and efficiency of use. Like the Alaska Geochemical Database (AGDB, http://pubs.usgs.gov/ds/637/) before it, the AGDB2 was created and designed to compile and integrate geochemical data from Alaska in order to facilitate geologic mapping, petrologic studies, mineral resource assessments, definition of geochemical baseline values and statistics, environmental impact assessments, and studies in medical geology. This relational database, created from the Alaska Geochemical Database (AGDB) that was released in 2011, serves as a data archive in support of present and future Alaskan geologic and geochemical projects, and contains data tables in several different formats describing historical and new quantitative and qualitative geochemical analyses. The analytical results were determined by 85 laboratory and field analytical methods on 264,095 rock, sediment, soil, mineral and heavy-mineral concentrate samples. Most samples were collected by U.S. Geological Survey personnel and analyzed in U.S. Geological Survey laboratories or, under contracts, in commercial analytical laboratories. These data represent analyses of samples collected as part of various U.S. Geological Survey programs and projects from 1962 through 2009. In addition, mineralogical data from 18,138 nonmagnetic heavy-mineral concentrate samples are included in this database. The AGDB2 includes historical geochemical data originally archived in the U.S. Geological Survey Rock Analysis Storage System (RASS) database, used from the mid-1960s through the late 1980s and the U.S. Geological Survey PLUTO database used from the mid-1970s through the mid-1990s. All of these data are currently maintained in the National Geochemical Database (NGDB). Retrievals from the NGDB were used to generate most of the AGDB data set. These data were checked for accuracy regarding sample location, sample media type, and analytical methods used. This arduous process of reviewing, verifying and, where necessary, editing all U.S. Geological Survey geochemical data resulted in a significantly improved Alaska geochemical dataset. USGS data that were not previously in the NGDB because the data predate the earliest U.S. Geological Survey geochemical databases, or were once excluded for programmatic reasons, are included here in the AGDB2 and will be added to the NGDB. The AGDB2 data provided here are the most accurate and complete to date, and should be useful for a wide variety of geochemical studies. The AGDB2 data provided in the linked database may be updated or changed periodically.
Analysis of lane change crashes
DOT National Transportation Integrated Search
2003-03-01
This report defines the problem of lane change crashes in the United States (U.S.) based on data from the 1999 National Automotive Sampling System/General Estimates System (GES) crash database of the National Highway Traffic Safety Administration. Th...
2012-09-01
This study evaluates the relative performance of several conventional SQL and NoSQL databases with a set of one billion file block hashes. Keywords: Digital Forensics, Sector Hashing, Full... (NoSQL: "Not only SQL" model for non-relational database management; NSRL: National Software Reference Library.)
Publishing Linked Open Data for Physical Samples - Lessons Learned
NASA Astrophysics Data System (ADS)
Ji, P.; Arko, R. A.; Lehnert, K.; Bristol, S.
2016-12-01
Most data and information about physical samples and associated sampling features currently reside in relational databases. Integrating common concepts from various databases has motivated us to publish Linked Open Data for collections of physical samples, using Semantic Web technologies including the Resource Description Framework (RDF), RDF Query Language (SPARQL), and Web Ontology Language (OWL). The goal of our work is threefold: To evaluate and select ontologies in different granularities for common concepts; to establish best practices and develop a generic methodology for publishing physical sample data stored in relational database as Linked Open Data; and to reuse standard community vocabularies from the International Commission on Stratigraphy (ICS), Global Volcanism Program (GVP), General Bathymetric Chart of the Oceans (GEBCO), and others. Our work leverages developments in the EarthCube GeoLink project and the Interdisciplinary Earth Data Alliance (IEDA) facility for modeling and extracting physical sample data stored in relational databases. Reusing ontologies developed by GeoLink and IEDA has facilitated discovery and integration of data and information across multiple collections including the USGS National Geochemical Database (NGDB), System for Earth Sample Registration (SESAR), and Index to Marine & Lacustrine Geological Samples (IMLGS). We have evaluated, tested, and deployed Linked Open Data tools including Morph, Virtuoso Server, LodView, LodLive, and YASGUI for converting, storing, representing, and querying data in a knowledge base (RDF triplestore). Using persistent identifiers such as Open Researcher & Contributor IDs (ORCIDs) and International Geo Sample Numbers (IGSNs) at the record level makes it possible for other repositories to link related resources such as persons, datasets, documents, expeditions, awards, etc. to samples, features, and collections. This work is supported by the EarthCube "GeoLink" project (NSF# ICER14-40221 and others) and the "USGS-IEDA Partnership to Support a Data Lifecycle Framework and Tools" project (USGS# G13AC00381).
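Once the triples are in a SPARQL endpoint, retrieval might look like the following Python sketch using the SPARQLWrapper library; the endpoint URL and predicate IRIs are placeholders for whatever the deployed Virtuoso instance and ontologies actually define.

```python
# Sketch of querying a triplestore's SPARQL endpoint for samples by IGSN.
# The endpoint URL and predicate IRIs below are hypothetical placeholders.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://example.org/sparql")  # hypothetical Virtuoso endpoint
sparql.setQuery("""
    PREFIX gl: <http://example.org/geolink#>
    SELECT ?sample ?igsn ?collector
    WHERE {
        ?sample gl:hasIGSN ?igsn ;
                gl:hasCollector ?collector .
        FILTER(STRSTARTS(STR(?igsn), "IGSN:"))
    }
    LIMIT 10
""")
sparql.setReturnFormat(JSON)

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["igsn"]["value"], row["collector"]["value"])
```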
2010-01-01
Background A plant-based diet protects against chronic oxidative stress-related diseases. Dietary plants contain variable chemical families and amounts of antioxidants. It has been hypothesized that plant antioxidants may contribute to the beneficial health effects of dietary plants. Our objective was to develop a comprehensive food database consisting of the total antioxidant content of typical foods as well as other dietary items such as traditional medicine plants, herbs and spices and dietary supplements. This database is intended for use in a wide range of nutritional research, from in vitro and cell and animal studies, to clinical trials and nutritional epidemiological studies. Methods We procured samples from countries worldwide and assayed the samples for their total antioxidant content using a modified version of the FRAP assay. Results and sample information (such as country of origin, product and/or brand name) were registered for each individual food sample and constitute the Antioxidant Food Table. Results The results demonstrate that there are several thousand-fold differences in antioxidant content of foods. Spices, herbs and supplements include the most antioxidant-rich products in our study, some exceptionally high. Berries, fruits, nuts, chocolate, vegetables and products thereof constitute common foods and beverages with high antioxidant values. Conclusions This database is, to the best of our knowledge, the most comprehensive antioxidant food database published, and it shows that plant-based foods introduce significantly more antioxidants into the human diet than non-plant foods. Because of the large variations observed between otherwise comparable food samples, the study emphasizes the importance of using a comprehensive database combined with a detailed system for food registration in clinical and epidemiological studies. The present antioxidant database is therefore an essential research tool to further elucidate the potential health effects of phytochemical antioxidants in diet.
Bar-Code System for a Microbiological Laboratory
NASA Technical Reports Server (NTRS)
Law, Jennifer; Kirschner, Larry
2007-01-01
A bar-code system has been assembled for a microbiological laboratory that must examine a large number of samples. The system includes a commercial bar-code reader, computer hardware and software components, plus custom-designed database software. The software generates a user-friendly, menu-driven interface.
Palaeo-sea-level and palaeo-ice-sheet databases: problems, strategies, and perspectives
NASA Astrophysics Data System (ADS)
Düsterhus, André; Rovere, Alessio; Carlson, Anders E.; Horton, Benjamin P.; Klemann, Volker; Tarasov, Lev; Barlow, Natasha L. M.; Bradwell, Tom; Clark, Jorie; Dutton, Andrea; Gehrels, W. Roland; Hibbert, Fiona D.; Hijma, Marc P.; Khan, Nicole; Kopp, Robert E.; Sivan, Dorit; Törnqvist, Torbjörn E.
2016-04-01
Sea-level and ice-sheet databases have driven numerous advances in understanding the Earth system. We describe the challenges and offer best strategies that can be adopted to build self-consistent and standardised databases of geological and geochemical information used to archive palaeo-sea-levels and palaeo-ice-sheets. There are three phases in the development of a database: (i) measurement, (ii) interpretation, and (iii) database creation. Measurement should include the objective description of the position and age of a sample, description of associated geological features, and quantification of uncertainties. Interpretation of the sample may have a subjective component, but it should always include uncertainties and alternative or contrasting interpretations, with any exclusion of existing interpretations requiring a full justification. During the creation of a database, an approach based on accessibility, transparency, trust, availability, continuity, completeness, and communication of content (ATTAC3) must be adopted. It is essential to consider the community that creates and benefits from a database. We conclude that funding agencies should not only consider the creation of original data in specific research-question-oriented projects, but also include the possibility of using part of the funding for IT-related and database creation tasks, which are essential to guarantee accessibility and maintenance of the collected data.
One approach to design of speech emotion database
NASA Astrophysics Data System (ADS)
Uhrin, Dominik; Chmelikova, Zdenka; Tovarek, Jaromir; Partila, Pavol; Voznak, Miroslav
2016-05-01
This article describes a system for evaluating the credibility of recordings with emotional character. The sound recordings form a Czech-language database for training and testing speech emotion recognition systems, which are designed to detect human emotions in the speaker's voice. Information about a person's emotional state is useful to security forces and to emergency call services. A person in action (a soldier, police officer or firefighter) is often exposed to stress; information about the emotional state carried in the voice helps the dispatcher adapt control commands for the intervention procedure. Call agents of an emergency call service must recognize the mental state of the caller to adjust the mood of the conversation, and in this case the evaluation of the psychological state is the key factor for successful intervention. A quality database of sound recordings is essential for the creation of such systems. Quality databases exist, such as the Berlin Database of Emotional Speech or Humaine, but actors created these databases in an audio studio, which means that the recordings contain simulated emotions, not real ones. Our research aims at creating a database of Czech emotional recordings of real human speech. Collecting sound samples for the database is only one of the tasks; another, no less important, is to evaluate the significance of the recordings from the perspective of emotional states. The design of a methodology for evaluating the credibility of emotional recordings is described in this article. The results describe the advantages and applicability of the developed method.
The Brain Database: A Multimedia Neuroscience Database for Research and Teaching
Wertheim, Steven L.
1989-01-01
The Brain Database is an information tool designed to aid in the integration of clinical and research results in neuroanatomy and regional biochemistry. It can handle a wide range of data types including natural images, 2- and 3-dimensional graphics, video, numeric data and text. It is organized around three main entities: structures, substances and processes. The database will support a wide variety of graphical interfaces; two sample interfaces have been made. This tool is intended to serve as one component of a system that would allow neuroscientists and clinicians 1) to represent clinical and experimental data within a common framework, 2) to compare results precisely between experiments and among laboratories, 3) to use computing tools as an aid in collaborative work, and 4) to contribute to a shared and accessible body of knowledge about the nervous system.
GANSEKI: JAMSTEC Deep Seafloor Rock Sample Database Emerging to the New Phase
NASA Astrophysics Data System (ADS)
Tomiyama, T.; Ichiyama, Y.; Horikawa, H.; Sato, Y.; Soma, S.; Hanafusa, Y.
2013-12-01
The Japan Agency for Marine-Earth Science and Technology (JAMSTEC) collects a large number of physical samples, as well as various geophysical data, using its research vessels and submersibles. These samples and data, obtained at great expense of human and physical resources, are a precious asset for the world scientific community. For their better use, it is important that they are utilized not only for the initial purpose of each cruise but also for other general scientific and educational purposes by secondary users. Based on the JAMSTEC data and sample handling policies [1], JAMSTEC has systematically stored samples and data obtained during research cruises and provided them to domestic and foreign activities in research, education, and public relations. Highly valued for secondary usability, deep seafloor rock samples are among the most important types of samples obtained by JAMSTEC, along with oceanic biological samples and sediment core samples. Rock samples can be utilized for the natural history sciences and various other purposes, some of which are connected to socially important issues such as earthquake mechanisms and mineral resource development. Researchers and educators can access JAMSTEC rock samples and associated data through 'GANSEKI [2]', the JAMSTEC Deep Seafloor Rock Sample Database. GANSEKI was established on the Internet in 2006, and its contents and functions have been continuously enriched and upgraded since then. GANSEKI currently provides 19,000 sample metadata records, 9,000 collection inventory records, and 18,000 geochemical data records. Most of these samples were recovered from the north-western Pacific Ocean, although samples from other areas are also included. The major update of GANSEKI carried out in May 2013 involved a replacement of the database core system and a redesign of the user interface. In the new GANSEKI, users can select samples easily and precisely using multi-index search, numerical constraints on geochemical data, and thumbnail browsing of sample and thin-section photos. The 'MyList' function allows users to organize, compare, and download the data of selected samples. To develop a close network among online databases, the new GANSEKI allows multiple URL entries for individual samples. The curatorial staff are now working on maintaining references to other JAMSTEC databases such as 'DARWIN [3]' and 'J-EDI [4]'.
Alaska Geochemical Database - Mineral Exploration Tool for the 21st Century - PDF of presentation
Granitto, Matthew; Schmidt, Jeanine M.; Labay, Keith A.; Shew, Nora B.; Gamble, Bruce M.
2012-01-01
The U.S. Geological Survey has created a geochemical database of geologic material samples collected in Alaska. This database is readily accessible to anyone with access to the Internet. Designed as a tool for mineral or environmental assessment, land management, or mineral exploration, the initial version of the Alaska Geochemical Database - U.S. Geological Survey Data Series 637 - contains geochemical, geologic, and geospatial data for 264,158 samples collected from 1962-2009: 108,909 rock samples; 92,701 sediment samples; 48,209 heavy-mineral-concentrate samples; 6,869 soil samples; and 7,470 mineral samples. In addition, the Alaska Geochemical Database contains mineralogic data for 18,138 nonmagnetic-fraction heavy mineral concentrates, making it the first U.S. Geological Survey database of this scope that contains both geochemical and mineralogic data. Examples from the Alaska Range will illustrate potential uses of the Alaska Geochemical Database in mineral exploration. Data from the Alaska Geochemical Database have been extensively checked for accuracy of sample media description, sample site location, and analytical method using U.S. Geological Survey sample-submittal archives and U.S. Geological Survey publications (plus field notebooks and sample site compilation base maps from the Alaska Technical Data Unit in Anchorage, Alaska). The database is also the repository for nearly all previously released U.S. Geological Survey Alaska geochemical datasets. Although the Alaska Geochemical Database is a fully relational database in Microsoft® Access 2003 and 2010 formats, these same data are also provided as a series of spreadsheet files in Microsoft® Excel 2003 and 2010 formats, and as ASCII text files. A DVD version of the Alaska Geochemical Database was released in October 2011, as U.S. Geological Survey Data Series 637, and data downloads are available at http://pubs.usgs.gov/ds/637/. Also, all Alaska Geochemical Database data have been incorporated into the interactive U.S. Geological Survey Mineral Resource Data web portal, available at http://mrdata.usgs.gov/.
Research on computer virus database management system
NASA Astrophysics Data System (ADS)
Qi, Guoquan
2011-12-01
The growing proliferation of computer viruses has become a lethal threat to network information security and a major research focus. New viruses keep emerging, the total number of viruses keeps growing, and virus classification is becoming increasingly complex. Virus naming cannot be unified because agencies capture samples at different times. Although each agency has its own virus database, communication between agencies is lacking, virus information is often incomplete, and sample information may be sparse. This paper introduces the current state of virus database construction at home and abroad, analyzes how to standardize and completely describe virus characteristics, and then presents a computer virus database design scheme that addresses information integrity, storage security, and manageability.
Towards a Selenographic Information System: Apollo 15 Mission Digitization
NASA Astrophysics Data System (ADS)
Votava, J. E.; Petro, N. E.
2012-12-01
The Apollo missions represent some of the most technically complex and extensively documented explorations ever endeavored by mankind. The surface experiments performed and the lunar samples collected in situ have helped form our understanding of the Moon's geologic history and the history of our Solar System. Unfortunately, a complication exists in the analysis and accessibility of these large volumes of lunar data and historical Apollo Era documents due to their multiple formats and disconnected web and print locations. Described here is a project to modernize, spatially reference, and link the lunar data into a comprehensive SELENOGRAPHIC INFORMATION SYSTEM, starting with the Apollo 15 mission. Like its terrestrial counterparts, the Geographic Information System (GIS) programs such as ArcGIS, this system allows for easy integration, access, analysis, and display of large amounts of spatially related data. Documentation in this new database includes surface photographs, panoramas, samples and their laboratory studies (major element and rare earth element weight percent), planned and actual vehicle traverses, and field notes. Using high-resolution (<0.25 m/pixel) images from the Lunar Reconnaissance Orbiter Camera (LROC), the rover (LRV) tracks and astronaut surface activities, along with field sketches from the Apollo 15 Preliminary Science Report (Swann, 1972), were digitized and mapped in ArcMap. Point features were created for each documented sample within the Lunar Sample Compendium (Meyer, 2010) and hyperlinked to the appropriate Compendium file (.PDF) at the stable archive site: http://curator.jsc.nasa.gov/lunar/compendium.cfm. Historical Apollo Era photographs and assembled panoramas were included as point features at each station and have been hyperlinked to the Apollo Lunar Surface Journal (ALSJ) online image library. The database has been set up to allow for the easy display of spatial variation of select attributes between samples. Attributes of interest with data from the Compendium added directly into the database include age (Ga), mass, texture, major oxide elements (weight %), and Th and U (ppm). This project will produce an easily accessible and linked database that can offer technical and scientific information in its spatial context. While it is not possible, given the enormous amounts of data and the small allotment of time, to enter and/or link every detail to its map layer, the links that have been made here direct the user to rich, stable archive websites and web-based databases that are easy to navigate. While this project only created a product for the Apollo 15 mission, it is the model for spatially referencing the other Apollo missions. Such a comprehensive lunar surface-activities database, a Selenographic Information System, will likely prove invaluable for future lunar studies. References: Meyer, C. (2010), The lunar sample compendium, June 2012 to August 2012, http://curator.jsc.nasa.gov/lunar/compendium.cfm, Astromaterials Res. & Exploration Sci., NASA L. B. Johnson Space Cent., Houston, TX. Swann, G. A. (1972), Preliminary geologic investigation of the Apollo 15 landing site, in Apollo 15 Preliminary Science Report, [NASA SP-289], pp. 5-1 - 5-112, NASA Manned Spacecraft Cent., Washington, D.C.
Zilaout, Hicham; Vlaanderen, Jelle; Houba, Remko; Kromhout, Hans
2017-07-01
In 2000, a prospective Dust Monitoring Program (DMP) was started in which measurements of workers' exposure to respirable dust and quartz are collected in member companies of the European Industrial Minerals Association (IMA-Europe). After 15 years, the resulting IMA-DMP database allows a detailed overview of exposure levels of respirable dust and quartz over time within this industrial sector. Our aim is to describe the IMA-DMP and the current state of the corresponding database, which, owing to the continuation of the IMA-DMP, is still growing. The future use of the database is also highlighted, including its utility for the industrial minerals producing sector. Exposure data are obtained following a common protocol including a standardized sampling strategy, standardized sampling and analytical methods, and a data management system. Following strict quality control procedures, exposure data are added to a central database. The data comprise personal exposure measurements including auxiliary information on work and other conditions during sampling. Currently, the IMA-DMP database consists of almost 28,000 personal measurements performed from 2000 until 2015, representing 29 half-yearly sampling campaigns. The exposure data have been collected from 160 different worksites owned by 35 industrial mineral companies and come from 23 European countries and approximately 5000 workers. The IMA-DMP database provides the European minerals sector with reliable data regarding workers' personal exposures to respirable dust and quartz. The database can be used as a powerful tool to address outstanding scientific issues on long-term exposure trends and exposure variability and, importantly, as a surveillance tool to evaluate exposure control measures. The database will be valuable for future epidemiological studies on respiratory health effects and will allow for estimation of quantitative exposure-response relationships. Copyright © 2017 The Authors. Published by Elsevier GmbH. All rights reserved.
Al-Nasheri, Ahmed; Muhammad, Ghulam; Alsulaiman, Mansour; Ali, Zulfiqar; Mesallam, Tamer A; Farahat, Mohamed; Malki, Khalid H; Bencherif, Mohamed A
2017-01-01
Automatic voice-pathology detection and classification systems may help clinicians to detect the existence of any voice pathologies and the type of pathology from which patients suffer in the early stages. The main aim of this paper is to investigate Multidimensional Voice Program (MDVP) parameters to automatically detect and classify the voice pathologies in multiple databases, and then to find out which parameters performed well in these two processes. Samples of the sustained vowel /a/ of normal and pathological voices were extracted from three different databases, which have three voice pathologies in common. The selected databases in this study represent three distinct languages: (1) the Arabic voice pathology database; (2) the Massachusetts Eye and Ear Infirmary database (English database); and (3) the Saarbruecken Voice Database (German database). A computerized speech lab program was used to extract MDVP parameters as features, and an acoustical analysis was performed. The Fisher discrimination ratio was applied to rank the parameters. A t test was performed to highlight any significant differences in the means of the normal and pathological samples. The experimental results demonstrate a clear difference in the performance of the MDVP parameters using these databases. The highly ranked parameters also differed from one database to another. The best accuracies were obtained by using the three highest ranked MDVP parameters arranged according to the Fisher discrimination ratio: these accuracies were 99.68%, 88.21%, and 72.53% for the Saarbruecken Voice Database, the Massachusetts Eye and Ear Infirmary database, and the Arabic voice pathology database, respectively. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
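As an editorial aside, the ranking step above is easy to make concrete. A minimal sketch, assuming the common two-class form of the Fisher discrimination ratio, FDR = (mean1 - mean2)^2 / (var1 + var2); the parameter names and data are illustrative, not values from the study.

```python
import numpy as np

def fisher_discrimination_ratio(normal, pathological):
    """Two-class FDR = (mean1 - mean2)^2 / (var1 + var2) for one parameter."""
    m1, m2 = np.mean(normal), np.mean(pathological)
    v1, v2 = np.var(normal, ddof=1), np.var(pathological, ddof=1)
    return (m1 - m2) ** 2 / (v1 + v2)

# Illustrative (randomly generated) values for two MDVP-style parameters,
# one array of normal samples and one of pathological samples each.
rng = np.random.default_rng(1)
params = {
    "jitter_percent": (rng.normal(0.5, 0.2, 50), rng.normal(1.4, 0.5, 50)),
    "shimmer_dB": (rng.normal(0.3, 0.1, 50), rng.normal(0.5, 0.2, 50)),
}
ranked = sorted(params, key=lambda p: fisher_discrimination_ratio(*params[p]),
                reverse=True)
print(ranked)  # most discriminative parameter first
```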
MouseNet database: digital management of a large-scale mutagenesis project.
Pargent, W; Heffner, S; Schäble, K F; Soewarto, D; Fuchs, H; Hrabé de Angelis, M
2000-07-01
The Munich ENU Mouse Mutagenesis Screen is a large-scale mutant production, phenotyping, and mapping project. It encompasses two animal breeding facilities and a number of screening groups located in the general area of Munich. A central database is required to manage and process the immense amount of data generated by the mutagenesis project. This database, which we named MouseNet(c), runs on a Sybase platform and will finally store and process all data from the entire project. In addition, the system comprises a portfolio of functions needed to support the workflow management of the core facility and the screening groups. MouseNet(c) will make all of the data available to the participating screening groups, and later to the international scientific community. MouseNet(c) will consist of three major software components:
* Animal Management System (AMS)
* Sample Tracking System (STS)
* Result Documentation System (RDS)
MouseNet(c) provides the following major advantages:
* accessibility from different client platforms via the Internet
* a full-featured multi-user system (including access restriction and data locking mechanisms)
* reliance on a professional RDBMS (relational database management system) running on a UNIX server platform
* workflow functions and a variety of plausibility checks.
NASA Astrophysics Data System (ADS)
Bowring, J. F.; McLean, N. M.; Walker, J. D.; Gehrels, G. E.; Rubin, K. H.; Dutton, A.; Bowring, S. A.; Rioux, M. E.
2015-12-01
The Cyber Infrastructure Research and Development Lab for the Earth Sciences (CIRDLES.org) has worked collaboratively for the last decade with geochronologists from EARTHTIME and EarthChem to build cyberinfrastructure geared to ensuring transparency and reproducibility in geoscience workflows and is engaged in refining and extending that work to serve additional geochronology domains during the next decade. ET_Redux (formerly U-Pb_Redux) is a free open-source software system that provides end-to-end support for the analysis of U-Pb geochronological data. The system reduces raw mass spectrometer (TIMS and LA-ICPMS) data to U-Pb dates, allows users to interpret ages from these data, and then facilitates the seamless federation of the results from one or more labs into a community web-accessible database using standard and open techniques. This EarthChem database - GeoChron.org - depends on keyed references to the System for Earth Sample Registration (SESAR) database that stores metadata about registered samples. These keys are each a unique International Geo Sample Number (IGSN) assigned to a sample and to its derivatives. ET_Redux provides for interaction with this archive, allowing analysts to store, maintain, retrieve, and share their data and analytical results electronically with whomever they choose. This initiative has created an open standard for the data elements of a complete reduction and analysis of U-Pb data, and is currently working to complete the same for U-series geochronology. We have demonstrated the utility of interdisciplinary collaboration between computer scientists and geoscientists in achieving a working and useful system that provides transparency and supports reproducibility, allowing geochemists to focus on their specialties. The software engineering community also benefits by acquiring research opportunities to improve development process methodologies used in the design, implementation, and sustainability of domain-specific software.
Nakashima, Shinya; Hayashi, Yuzuru
2016-01-01
The aim of this paper is to propose a stochastic method for estimating the detection limits (DLs) and quantitation limits (QLs) of compounds registered in a database of a GC/MS system, and to prove its validity with experiments. The approach described in ISO 11843 Part 7 is adopted here for estimating DL and QL, and decafluorotriphenylphosphine (DFTPP) tuning and retention time locking are carried out to adjust the system. Coupled with the data obtained from the system adjustment experiments, the information stored in the database (noise and signal of chromatograms and calibration curves) is used for the stochastic estimation, dispensing with repeated measurements. For sixty-six pesticides, the DL values obtained by the ISO method were compared with those from the statistical approach, and the correlation between them was excellent, with a correlation coefficient of 0.865. The accuracy of the proposed method was also examined and found to be satisfactory as well. The samples used are commercial pesticide mixtures, and the uncertainty from sample preparation processes is not taken into account. PMID:27162706
Shah, Sachin D.; Quigley, Sean M.
2005-01-01
Air Force Plant 4 (AFP4) and adjacent Naval Air Station-Joint Reserve Base (NAS-JRB) at Fort Worth, Tex., constitute a government-owned, contractor-operated (GOCO) facility that has been in operation since 1942. Contaminants from the facility, primarily volatile organic compounds (VOCs) and metals, have entered the groundwater-flow system through leakage from waste-disposal sites (landfills and pits) and from manufacturing processes (U.S. Air Force, Aeronautical Systems Center, 1995). The U.S. Geological Survey (USGS), in cooperation with the U.S. Air Force (USAF), Aeronautical Systems Center, Environmental Management Directorate (ASC/ENVR), developed a comprehensive database (or geodatabase) of temporal and spatial environmental information associated with the geology, hydrology, and water quality at AFP4 and NAS-JRB. The database of this report provides information about the AFP4 and NAS-JRB study area including sample location names, identification numbers, locations, historical dates, and various measured hydrologic data. This database does not include every sample location at the site, but is limited to an aggregation of selected digital and hardcopy data of the USAF, USGS, and various consultants who have previously or are currently working at the site.
Windshear certification data base for forward-look detection systems
NASA Technical Reports Server (NTRS)
Switzer, George F.; Hinton, David A.; Proctor, Fred H.
1994-01-01
Described is an introduction to a comprehensive database that is to be used for certification testing of airborne forward-look windshear detection systems. The database was developed by NASA Langley Research Center, at the request of the Federal Aviation Administration (FAA), to support the industry initiative to certify and produce forward-looking windshear detection equipment. The database contains high-resolution three-dimensional fields for meteorological variables that may be sensed by forward-looking systems. The database is made up of seven case studies that were generated by the Terminal Area Simulation System, a state-of-the-art numerical system for the realistic modeling of windshear phenomena. The selected cases contained in the certification documentation represent a wide spectrum of windshear events. The database will be used with vendor-developed sensor simulation software and vendor-collected ground-clutter data to demonstrate detection performance in a variety of meteorological conditions using NASA/FAA pre-defined path scenarios for each of the certification cases. A brief outline of the contents and sample plots from the database documentation are included. These plots show fields of hazard factor, or F-factor (Bowles 1990), radar reflectivity, and velocity vectors on a horizontal plane overlaid with the applicable certification paths. In the plot of the F-factor field, regions of 0.105 and above signify hazardous, performance-decreasing windshear, while negative values indicate regions of performance-increasing windshear. The values of F-factor are based on 1-km averaged segments along horizontal flight paths, assuming an air speed of 150 knots (approx. 75 m/s). The database has been released to vendors participating in the certification process. The database and associated document have been transferred to the FAA for archival storage and distribution.
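For readers unfamiliar with the hazard factor, here is a minimal sketch of 1-km segment averaging along a horizontal path, assuming the commonly cited Bowles (1990) form F = (dWx/dt)/g - Wh/V with dWx/dt = V dWx/dx along the path; the microburst-like wind fields are illustrative, not taken from the certification cases.

```python
import numpy as np

G = 9.81   # gravity, m/s^2
V = 77.0   # airspeed, m/s (approx. 150 knots)

def f_factor(wx, wh, x):
    """Pointwise hazard factor F = (dWx/dt)/g - Wh/V, with the along-path
    substitution dWx/dt = V * dWx/dx (wx: horizontal wind, wh: vertical wind)."""
    return V * np.gradient(wx, x) / G - wh / V

def segment_average(f, x, seg_len=1000.0):
    """Average F over consecutive 1-km segments of the flight path."""
    idx = np.digitize(x, np.arange(x[0], x[-1], seg_len))
    return np.array([f[idx == k].mean() for k in np.unique(idx)])

# Illustrative microburst-like event: headwind-to-tailwind shear plus downdraft.
x = np.linspace(0.0, 5000.0, 501)                 # path coordinate, m
wx = 15.0 * np.tanh((x - 2500.0) / 800.0)         # horizontal wind, m/s
wh = -8.0 * np.exp(-((x - 2500.0) / 600.0) ** 2)  # vertical wind, m/s
f_avg = segment_average(f_factor(wx, wh, x), x)
print(np.round(f_avg, 3), "hazardous:", f_avg >= 0.105)
```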
NASA Technical Reports Server (NTRS)
Strahler, A. H.; Woodcock, C. E.; Logan, T. L.
1983-01-01
A timber inventory of the Eldorado National Forest, located in east-central California, provides an example of the use of a Geographic Information System (GIS) to stratify large areas of land for sampling and the collection of statistical data. The raster-based GIS format of the VICAR/IBIS software system allows simple and rapid tabulation of areas, and facilitates the selection of random locations for ground sampling. Algorithms that simplify the complex spatial pattern of raster-based information, and convert raster format data to strings of coordinate vectors, provide a link to conventional vector-based geographic information systems.
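A minimal sketch of that stratify-then-sample workflow on an illustrative raster of class labels (not the VICAR/IBIS implementation): tabulate cell counts per stratum, then draw random cells as candidate ground-sampling sites.

```python
import numpy as np

rng = np.random.default_rng(42)
raster = rng.integers(1, 5, size=(200, 200))   # illustrative strata labels 1-4

for stratum in np.unique(raster):
    rows, cols = np.nonzero(raster == stratum)          # all cells in stratum
    pick = rng.choice(rows.size, size=min(5, rows.size), replace=False)
    print(f"stratum {stratum}: {rows.size} cells; sample sites:",
          list(zip(rows[pick], cols[pick])))
```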
Abedini, Atosa A.; Hurwitz, S.; Evans, William C.
2006-01-01
The database (Version 1.0) is a MS-Excel file that contains close to 5,000 entries of published information on noble gas concentrations and isotopic ratios from volcanic systems in Mid-Ocean ridges, ocean islands, seamounts, and oceanic and continental arcs (location map). Where they were available we also included the isotopic ratios of strontium, neodymium, and carbon. The database is sub-divided both into material sampled (e.g., volcanic glass, different minerals, fumarole, spring), and into different tectonic settings (MOR, ocean islands, volcanic arcs). Included is also a reference list in MS-Word and pdf from which the data was derived. The database extends previous compilations by Ozima (1994), Farley and Neroda (1998), and Graham (2002). The extended database allows scientists to test competing hypotheses, and it provides a framework for analysis of noble gas data during periods of volcanic unrest.
HLLV avionics requirements study and electronic filing system database development
NASA Technical Reports Server (NTRS)
1994-01-01
This final report provides a summary of achievements and activities performed under Contract NAS8-39215. The contract's objective was to explore a new way of delivering, storing, accessing, and archiving study products and information and to define top level system requirements for Heavy Lift Launch Vehicle (HLLV) avionics that incorporate Vehicle Health Management (VHM). This report includes technical objectives, methods, assumptions, recommendations, sample data, and issues as specified by DPD No. 772, DR-3. The report is organized into two major subsections, one specific to each of the two tasks defined in the Statement of Work: the Index Database Task and the HLLV Avionics Requirements Task. The Index Database Task resulted in the selection and modification of a commercial database software tool to contain the data developed during the HLLV Avionics Requirements Task. All summary information is addressed within each task's section.
ASM Based Synthesis of Handwritten Arabic Text Pages
Dinges, Laslo; Al-Hamadi, Ayoub; Elzobi, Moftah; El-Etriby, Sherif; Ghoneim, Ahmed
2015-01-01
Document analysis tasks such as text recognition, word spotting, or segmentation are highly dependent on comprehensive and suitable databases for training and validation. However, their generation is expensive in terms of labor and time. As a matter of fact, there is a lack of such databases, which complicates research and development. This is especially true for Arabic handwriting recognition, which involves different preprocessing, segmentation, and recognition methods, each with its own demands on samples and ground truth. To bypass this problem, we present an efficient system that automatically turns Arabic Unicode text into synthetic images of handwritten documents with detailed ground truth. Active Shape Models (ASMs) based on 28046 online samples were used for character synthesis, and statistical properties were extracted from the IESK-arDB database to simulate baselines and word slant or skew. In the synthesis step, ASM-based representations are composed into words and text pages, smoothed by B-spline interpolation, and rendered considering writing speed and pen characteristics. Finally, we use the synthetic data to validate a segmentation method. An experimental comparison with the IESK-arDB database encourages training and testing document analysis methods on synthetic samples whenever insufficient naturally ground-truthed data is available. PMID:26295059
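The B-spline smoothing step is straightforward to illustrate. A minimal SciPy sketch, with a synthetic noisy trajectory standing in for an ASM-composed glyph contour:

```python
import numpy as np
from scipy.interpolate import splprep, splev

# Synthetic jagged pen trajectory standing in for a composed ASM shape.
t = np.linspace(0, 2 * np.pi, 40)
x = np.cos(t) + 0.05 * np.random.randn(40)
y = np.sin(2 * t) + 0.05 * np.random.randn(40)

# Fit a smoothing parametric B-spline and resample it densely for rendering.
tck, u = splprep([x, y], s=0.5)        # s controls the smoothing strength
xs, ys = splev(np.linspace(0, 1, 400), tck)
print(len(xs), "smoothed points")
```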
Domain Regeneration for Cross-Database Micro-Expression Recognition
NASA Astrophysics Data System (ADS)
Zong, Yuan; Zheng, Wenming; Huang, Xiaohua; Shi, Jingang; Cui, Zhen; Zhao, Guoying
2018-05-01
In this paper, we investigate the cross-database micro-expression recognition problem, where the training and testing samples come from two different micro-expression databases. Under this setting, the training and testing samples have different feature distributions, and hence the performance of most existing micro-expression recognition methods may decrease greatly. To solve this problem, we propose a simple yet effective method called the Target Sample Re-Generator (TSRG). Using TSRG, we are able to re-generate the samples from the target micro-expression database, and the re-generated target samples share the same or similar feature distributions as the original source samples. For this reason, we can then use the classifier learned from the labeled source samples to accurately predict the micro-expression categories of the unlabeled target samples. To evaluate the performance of the proposed TSRG method, extensive cross-database micro-expression recognition experiments designed based on the SMIC and CASME II databases are conducted. Compared with recent state-of-the-art cross-database emotion recognition methods, the proposed TSRG achieves more promising results.
High-precision positioning system of four-quadrant detector based on the database query
NASA Astrophysics Data System (ADS)
Zhang, Xin; Deng, Xiao-guo; Su, Xiu-qin; Zheng, Xiao-qiang
2015-02-01
The fine-pointing mechanism of the Acquisition, Pointing and Tracking (APT) system in free-space laser communication usually uses a four-quadrant detector (QD) to point and track the laser beam accurately. The positioning precision of the QD is one of the key factors in the pointing accuracy of the APT system. A positioning system based on an FPGA and a DSP is designed in this paper, which realizes A/D sampling, the positioning algorithm, and the control of the fast swing mirror. Starting from the working principle of the QD, we analyze the positioning error of the spot center calculated by the universal algorithm when the spot energy obeys a Gaussian distribution. A database is built by calculation and simulation with MATLAB software, in which the spot center calculated by the universal algorithm is mapped to the true center of the Gaussian beam; the database is stored in two pieces of E2PROM serving as the external memory of the DSP. The center of the Gaussian beam is then queried from the database on the basis of the spot center calculated by the universal algorithm in the DSP. The experimental results show that the positioning accuracy of this high-precision positioning system is much better than the accuracy of the universal algorithm alone.
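A minimal sketch of the two stages described above, assuming the standard normalized-difference ('universal') centroid estimate for a QD with quadrants A-D labeled counter-clockwise from the upper right; the correction curve is an illustrative stand-in for the simulated MATLAB table.

```python
import numpy as np

def qd_estimate(a, b, c, d):
    """Universal (normalized-difference) spot-center estimate from the four
    quadrant intensities; biased when the spot is a finite Gaussian beam."""
    s = a + b + c + d
    return ((a + d) - (b + c)) / s, ((a + b) - (c + d)) / s

# Offline-built database: universal-algorithm estimate -> true Gaussian center.
est_grid = np.linspace(-0.9, 0.9, 181)
true_grid = 0.5 * np.arctanh(0.99 * est_grid)   # illustrative correction curve

def corrected_x(x_est):
    """Query step: nearest-neighbour lookup of the true beam center."""
    return true_grid[np.argmin(np.abs(est_grid - x_est))]

x_est, y_est = qd_estimate(0.9, 0.5, 0.3, 0.7)
print(round(x_est, 3), "->", round(corrected_x(x_est), 3))
```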
NASA Astrophysics Data System (ADS)
Wu, A. M.; Nater, E. A.; Dalzell, B. J.; Perry, C. H.
2014-12-01
The USDA Forest Service's Forest Inventory Analysis (FIA) program is a national effort assessing current forest resources to ensure sustainable management practices, to assist planning activities, and to report critical status and trends. For example, estimates of carbon stocks and stock change in FIA are reported as the official United States submission to the United Nations Framework Convention on Climate Change. While the main effort in FIA has been focused on aboveground biomass, soil is a critical component of this system. FIA sampled forest soils in the early 2000s and has remeasurement now underway. However, soil sampling is repeated on a 10-year interval (or longer), and it is uncertain what magnitude of changes in soil organic carbon (SOC) may be detectable with the current sampling protocol. We aim to identify the sensitivity and variability of SOC in the FIA database, and to determine the amount of SOC change that can be detected with the current sampling scheme. For this analysis, we attempt to answer the following questions: 1) What is the sensitivity (power) of SOC data in the current FIA database? 2) How does the minimum detectable change in forest SOC respond to changes in sampling intervals and/or sample point density? Soil samples in the FIA database represent 0-10 cm and 10-20 cm depth increments with a 10-year sampling interval. We are investigating the variability of SOC and its change over time for composite soil data in each FIA region (Pacific Northwest, Interior West, Northern, and Southern). To guide future sampling efforts, we are employing statistical power analysis to examine the minimum detectable change in SOC storage. We are also investigating the sensitivity of SOC storage changes under various scenarios of sample size and/or sample frequency. This research will inform the design of future FIA soil sampling schemes and improve the information available to international policy makers, university and industry partners, and the public.
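A minimal sketch of the power-analysis step, using the standard minimum-detectable-change formula for a paired remeasurement design; the standard deviation and plot counts are illustrative, not FIA values.

```python
from math import sqrt
from scipy.stats import norm

def min_detectable_change(sd_diff, n, alpha=0.05, power=0.8):
    """Two-sided minimum detectable change in mean SOC for n remeasured plots:
    MDC = (z_{1-alpha/2} + z_power) * SD(change) / sqrt(n)."""
    return (norm.ppf(1 - alpha / 2) + norm.ppf(power)) * sd_diff / sqrt(n)

# Illustrative: SD of plot-level SOC change = 12 Mg C/ha over the interval.
for n in (100, 500, 1000):
    print(f"n={n}: MDC = {min_detectable_change(12.0, n):.2f} Mg C/ha")
```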
Synthesis of Common Arabic Handwritings to Aid Optical Character Recognition Research.
Dinges, Laslo; Al-Hamadi, Ayoub; Elzobi, Moftah; El-Etriby, Sherif
2016-03-11
Document analysis tasks such as pattern recognition, word spotting or segmentation require comprehensive databases for training and validation. Not only variations in writing style but also the list of words used is of importance in the case that training samples should reflect the input of a specific area of application. However, generation of training samples is expensive in terms of manpower and time, particularly if complete text pages including complex ground truth are required. This is why there is a lack of such databases, especially for Arabic, the second most popular language. Moreover, Arabic handwriting recognition involves different preprocessing, segmentation and recognition methods, each requiring particular ground truth or samples to enable optimal training and validation, which are often not covered by the currently available databases. To overcome this issue, we propose a system that synthesizes Arabic handwritten words and text pages and generates corresponding detailed ground truth. We use these syntheses to validate a new, segmentation-based system that recognizes handwritten Arabic words. We found that a modification of the Active Shape Model based character classifiers that we proposed earlier improves the word recognition accuracy. Further improvements are achieved by using a vocabulary of the 50,000 most common Arabic words for error correction. PMID:26978368
The U.S. Geological Survey coal quality (COALQUAL) database version 3.0
Palmer, Curtis A.; Oman, Charles L.; Park, Andy J.; Luppens, James A.
2015-12-21
Because of database size limits during the development of COALQUAL Version 1.3, many analyses of individual bench samples were merged into whole coal bed averages. The methodology for making these composite intervals was not consistent. Size limits also restricted the amount of georeferencing information and forced removal of qualifier notations such as "less than detection limit" (<) information, which can cause problems when using the data. A review of the original data sheets revealed that COALQUAL Version 2.0 was missing information that was needed for a complete understanding of a coal section. Another important database issue to resolve was the USGS "remnant moisture" problem. Prior to 1998, tests for remnant moisture (as-determined moisture in the sample at the time of analysis) were not performed on any USGS major, minor, or trace element coal analyses. Without the remnant moisture, it is impossible to convert the analyses to a usable basis (as-received, dry, etc.). Based on remnant moisture analyses of hundreds of samples of different ranks (and known residual moisture) reported after 1998, it was possible to develop a method to provide reasonable estimates of remnant moisture for older data to make it more useful in COALQUAL Version 3.0. In addition, COALQUAL Version 3.0 is improved by (1) adding qualifiers, including statistical programming to deal with the qualifiers; (2) clarifying the sample compositing problems; and (3) adding associated samples. Version 3.0 of COALQUAL also represents the first attempt to incorporate data verification by mathematically crosschecking certain analytical parameters. Finally, a new database system was designed and implemented to replace the outdated DOS program used in earlier versions of the database.
DOE Office of Scientific and Technical Information (OSTI.GOV)
NONE
The bibliography contains citations concerning chemiluminescence assays. The citations include sample system design, sample collection, measurement techniques, and sensitivity of the instrumentation. Applications in high altitude air pollution studies are emphasized. (Contains 50-250 citations and includes a subject term index and title list.) (Copyright NERAC, Inc. 1995)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Femec, D.A.
This report describes two code-generating tools used to speed design and implementation of relational databases and user interfaces: CREATE-SCHEMA and BUILD-SCREEN. CREATE-SCHEMA produces the SQL commands that actually create and define the database. BUILD-SCREEN takes templates for data entry screens and generates the screen management system routine calls to display the desired screen. Both tools also generate the related FORTRAN declaration statements and precompiled SQL calls. Included with this report is the source code for a number of FORTRAN routines and functions used by the user interface. This code is broadly applicable to a number of different databases.
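The report's tools emit SQL plus matching FORTRAN declarations; a minimal Python sketch of the CREATE-SCHEMA idea, generating only the SQL half from a hypothetical declarative table description:

```python
# Hypothetical table description; column names and types are illustrative.
TABLES = {
    "sample": [("sample_id", "INTEGER PRIMARY KEY"),
               ("collected_on", "DATE"),
               ("site_name", "VARCHAR(64)")],
}

def create_schema(tables):
    """Yield one CREATE TABLE statement per declared table."""
    for name, columns in tables.items():
        cols = ",\n  ".join(f"{col} {sqltype}" for col, sqltype in columns)
        yield f"CREATE TABLE {name} (\n  {cols}\n);"

print("\n".join(create_schema(TABLES)))
```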
NASA Astrophysics Data System (ADS)
Todd, N. S.
2016-12-01
The rock and soil samples returned from the Apollo missions from 1969-72 have supported 46 years of research leading to advances in our understanding of the formation and evolution of the inner Solar System. NASA has been engaged in several initiatives that aim to restore, digitize, and make available to the public existing published and unpublished research data for the Apollo samples. One of these initiatives is a collaboration with IEDA (Interdisciplinary Earth Data Alliance) to develop MoonDB, a lunar geochemical database modeled after PetDB. In support of this initiative, NASA has adopted the use of IGSN (International Geo Sample Number) to generate persistent, unique identifiers for lunar samples that scientists can use when publishing research data. To facilitate the IGSN registration of the original 2,200 samples and over 120,000 subdivided samples, NASA has developed an application that retrieves sample metadata from the Lunar Curation Database and uses the SESAR API to automate the generation of IGSNs and registration of samples into SESAR (System for Earth Sample Registration). This presentation will describe the work done by NASA to map existing sample metadata to the IGSN metadata and integrate the IGSN registration process into the sample curation workflow, the lessons learned from this effort, and how this work can be extended in the future to help deal with the registration of large numbers of samples.
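A heavily hedged sketch of what such automated registration can look like; the endpoint URL, payload fields, authentication scheme, and response field below are placeholders and do not reflect the actual SESAR API.

```python
import requests

SESAR_ENDPOINT = "https://example.org/sesar/samples"   # placeholder URL

def register_sample(metadata, token):
    """POST one sample's curation metadata and return the assigned IGSN
    (response field name assumed for illustration)."""
    resp = requests.post(SESAR_ENDPOINT, json=metadata,
                         headers={"Authorization": f"Bearer {token}"},
                         timeout=30)
    resp.raise_for_status()
    return resp.json()["igsn"]

curation_record = {"name": "15555,0", "sample_type": "Rock",
                   "mission": "Apollo 15"}              # illustrative fields
# igsn = register_sample(curation_record, token="...")  # requires credentials
```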
Visibiome: an efficient microbiome search engine based on a scalable, distributed architecture.
Azman, Syafiq Kamarul; Anwar, Muhammad Zohaib; Henschel, Andreas
2017-07-24
Given the current influx of 16S rRNA profiles of microbiota samples, it is conceivable that large numbers of them will eventually be available for search, comparison, and contextualization with respect to novel samples. This process facilitates the identification of similar compositional features in microbiota elsewhere and therefore can help to understand the driving factors for microbial community assembly. We present Visibiome, a microbiome search engine that can perform exhaustive, phylogeny-based similarity search and contextualization of user-provided samples against a comprehensive dataset of 16S rRNA profiles from diverse environments, while tackling several computational challenges. In order to scale to high demands, we developed a distributed system that combines web framework technology, task queueing and scheduling, cloud computing, and a dedicated database server. To further ensure speed and efficiency, we deployed nearest-neighbour search algorithms capable of sublinear searches in high-dimensional metric spaces, in combination with an optimized Earth Mover's Distance based implementation of weighted UniFrac. The search also incorporates pairwise (adaptive) rarefaction and, optionally, 16S rRNA copy number correction. The result of a query is the contextualization of the microbiome sample against a comprehensive database of microbiome samples from a diverse range of environments, visualized through a rich set of interactive figures and diagrams, including barchart-based compositional comparisons and a ranking of the closest matches in the database. Visibiome is a convenient, scalable, and efficient framework to search microbiomes against a comprehensive database of environmental samples. The search engine leverages a popular but computationally expensive, phylogeny-based distance metric, while providing numerous advantages over the current state-of-the-art tool.
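As an aside, the rarefaction step itself is simple to sketch: subsample each OTU count vector to a common read depth without replacement (counts below are illustrative).

```python
import numpy as np

def rarefy(counts, depth, rng):
    """Subsample an OTU count vector to a fixed read depth without replacement."""
    counts = np.asarray(counts)
    reads = np.repeat(np.arange(counts.size), counts)   # one entry per read
    keep = rng.choice(reads, size=depth, replace=False)
    return np.bincount(keep, minlength=counts.size)

rng = np.random.default_rng(0)
sample_a = [120, 30, 0, 50]       # illustrative OTU counts
sample_b = [10, 90, 5, 20]
depth = min(sum(sample_a), sum(sample_b))
print(rarefy(sample_a, depth, rng), rarefy(sample_b, depth, rng))
```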
Stucki, Sheldon Lee; Biss, David J.
2000-01-01
An analysis was performed using the National Automotive Sampling System Crashworthiness Data System (NASS-CDS) database to compare the injury/fatality rates of variously restrained driver occupants with those of unrestrained driver occupants, both in the total database of drivers in frontal crashes and by Delta-V. A structured search of the NASS-CDS was done using the SAS® statistical analysis software to extract the data for this analysis, and the SUDAAN software package was used to arrive at statistical significance indicators. In addition, this paper investigates different methods for presenting the results of accident database searches, including significance results; a risk versus Delta-V format for specific exposures; and a percent cumulative injury versus Delta-V format to characterize injury trends. These alternative presentation methods are then discussed by example using the present study results. PMID:11558105
Design and implementation of website information disclosure assessment system.
Cho, Ying-Chiang; Pan, Jen-Yi
2015-01-01
Internet application technologies, such as cloud computing and cloud storage, have increasingly changed people's lives. Websites contain vast amounts of personal privacy information. In order to protect this information, network security technologies, such as database protection and data encryption, attract many researchers. The most serious problems concerning web vulnerability are e-mail address and network database leakages. These leakages have many causes. For example, malicious users can steal database contents, taking advantage of mistakes made by programmers and administrators. In order to mitigate this type of abuse, a website information disclosure assessment system is proposed in this study. This system utilizes a series of technologies, such as web crawler algorithms, SQL injection attack detection, and web vulnerability mining, to assess a website's information disclosure. Thirty websites, randomly sampled from the top 50 world colleges, were used to collect leakage information. This testing showed the importance of increasing the security and privacy of website information for academic websites.
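A minimal sketch of the e-mail-disclosure half of such an assessment: regex-scanning fetched pages for addresses exposed in the HTML. The crawler and SQL-injection stages are not sketched, and crawled_urls is a hypothetical input.

```python
import re
import urllib.request

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def emails_on_page(url):
    """Fetch one page and report e-mail addresses disclosed in its HTML."""
    html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "ignore")
    return sorted(set(EMAIL_RE.findall(html)))

# for url in crawled_urls:                  # output of the crawler stage
#     print(url, emails_on_page(url))
```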
NASA Astrophysics Data System (ADS)
Stroker, K. J.; Jencks, J. H.; Eakins, B.
2016-12-01
The Index to Marine and Lacustrine Geological Samples (IMLGS) is a community designed and maintained resource enabling researchers to locate and request seafloor and lakebed geologic samples curated by partner institutions. The Index was conceived in the dawn of the digital age by representatives from U.S. academic and government marine core repositories and the NOAA National Geophysical Data Center, now the National Centers for Environmental Information (NCEI), at a 1977 meeting convened by the National Science Foundation (NSF). The Index is based on core concepts of community oversight, common vocabularies, consistent metadata, and a shared interface. The Curators Consortium, international in scope, meets biennially to share ideas and discuss best practices. NCEI serves the group by providing database access and maintenance, a list server, digitizing support, and long-term archival of sample metadata, data, and imagery. Over three decades, participating curators have performed the laborious task of creating and contributing metadata for over 205,000 seafloor and lakebed cores, grabs, and dredges archived in their collections. Some partners use the Index for primary web access to their collections while others use it to increase exposure of more in-depth institutional systems. The IMLGS has a persistent URL/Digital Object Identifier (DOI), as well as DOIs assigned to partner collections for citation and to provide a persistent link to curator collections. The Index is currently a geospatially-enabled relational database, publicly accessible via Web Feature and Web Map Services, and text- and ArcGIS map-based web interfaces. To provide as much knowledge as possible about each sample, the Index includes curatorial contact information and links to related data, information, and images: 1) at participating institutions, 2) in the NCEI archive, and 3) through a Linked Data interface maintained by the Rolling Deck to Repository (R2R). Over 43,000 International Geo Sample Numbers (IGSNs) linking to the System for Earth Sample Registration (SESAR) are included in anticipation of opportunities for interconnectivity with Integrated Earth Data Applications (IEDA) systems. The paper will discuss the database with the goal of increasing connections and links to related data at partner institutions.
The volatile compound BinBase mass spectral database.
Skogerson, Kirsten; Wohlgemuth, Gert; Barupal, Dinesh K; Fiehn, Oliver
2011-08-04
Volatile compounds comprise diverse chemical groups with wide-ranging sources and functions. These compounds originate from major pathways of secondary metabolism in many organisms and play essential roles in chemical ecology in both the plant and animal kingdoms. In past decades, sampling methods and instrumentation for the analysis of complex volatile mixtures have improved; however, design and implementation of database tools to process and store the complex datasets have lagged behind. The volatile compound BinBase (vocBinBase) is an automated peak annotation and database system developed for the analysis of GC-TOF-MS data derived from complex volatile mixtures. The vocBinBase DB is an extension of the previously reported metabolite BinBase software developed to track and identify derivatized metabolites. The BinBase algorithm uses deconvoluted spectra and peak metadata (retention index, unique ion, spectral similarity, peak signal-to-noise ratio, and peak purity) from the Leco ChromaTOF software, and annotates peaks using a multi-tiered filtering system with stringent thresholds. The vocBinBase algorithm assigns the identity of compounds existing in the database. Volatile compound assignments are supported by the Adams mass spectral-retention index library, which contains over 2,000 plant-derived volatile compounds. Novel molecules that are not found within vocBinBase are automatically added using strict mass spectral and experimental criteria. Users obtain fully annotated data sheets with quantitative information for all volatile compounds for studies that may consist of thousands of chromatograms. The vocBinBase database may also be queried across different studies, currently comprising 1,537 unique mass spectra generated from 1.7 million deconvoluted mass spectra of 3,435 samples (18 species). Mass spectra with retention indices and volatile profiles are available as a free download under the CC-BY agreement (http://vocbinbase.fiehnlab.ucdavis.edu). The BinBase database algorithms have been successfully modified to allow for tracking and identification of volatile compounds in complex mixtures. The database is capable of annotating large datasets (hundreds to thousands of samples) and is well-suited for between-study comparisons such as chemotaxonomy investigations. This novel volatile compound database tool is applicable to research fields spanning chemical ecology to human health. The BinBase source code is freely available at http://binbase.sourceforge.net/ under the LGPL 2.0 license agreement. PMID:21816034
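A minimal sketch of a tiered annotation filter in the spirit described above; the thresholds and the cosine-based similarity score are illustrative stand-ins for vocBinBase's actual settings.

```python
import numpy as np

def spectral_similarity(a, b):
    """Cosine similarity of two aligned intensity vectors, scaled to 0-1000."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return 1000.0 * a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))

def annotate(peak, library, ri_window=2.0, min_similarity=900.0, min_snr=5.0):
    """Tiered filter: signal-to-noise, retention-index window, then spectral
    similarity; returns the best-matching library entry, or None (new Bin)."""
    if peak["snr"] < min_snr:
        return None
    scored = [(spectral_similarity(e["spectrum"], peak["spectrum"]), e)
              for e in library if abs(e["ri"] - peak["ri"]) <= ri_window]
    scored = [se for se in scored if se[0] >= min_similarity]
    return max(scored, key=lambda se: se[0])[1] if scored else None

library = [{"name": "limonene", "ri": 1030.0, "spectrum": [5, 80, 100, 10]}]
peak = {"ri": 1031.2, "snr": 12.0, "spectrum": [6, 75, 100, 12]}
hit = annotate(peak, library)
print(hit["name"] if hit else "no match -> candidate new Bin")
```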
Fish Karyome: A karyological information network database of Indian Fishes.
Nagpure, Naresh Sahebrao; Pathak, Ajey Kumar; Pati, Rameshwar; Singh, Shri Prakash; Singh, Mahender; Sarkar, Uttam Kumar; Kushwaha, Basdeo; Kumar, Ravindra
2012-01-01
'Fish Karyome', a database of karyological information on Indian fishes, has been developed; it serves as a central source for karyotype data about Indian fishes compiled from the published literature. Fish Karyome is intended to serve as a liaison tool for researchers; it contains karyological information for 171 of the 2,438 finfish species reported in India and is publicly available via the World Wide Web. The database provides information on chromosome number, morphology, sex chromosomes, karyotype formula, and cytogenetic markers. Additionally, it provides phenotypic information that includes the species name, its classification, locality of sample collection, common name, local name, sex, geographical distribution, and IUCN Red List status. In addition, fish and karyotype images and references for the 171 finfish species have been included in the database. Fish Karyome has been developed using SQL Server 2008, a relational database management system, Microsoft's ASP.NET 2008, and Macromedia's Flash technology under the Windows 7 operating environment. The system also enables users to input new information and images into the database, and to search and view the information and images of interest using various search options. Fish Karyome has a wide range of applications in species characterization and identification, sex determination, chromosomal mapping, karyo-evolution, and the systematics of fishes.
Jha, Ashish Kumar
2015-01-01
Glomerular filtration rate (GFR) estimation by the plasma sampling method is considered the gold standard. However, this method is not widely used because the technique is complex, the calculations are cumbersome, and user-friendly software is not readily available. The routinely used serum creatinine method (SrCrM) of GFR estimation also requires online calculators, which cannot be used without internet access. We have developed user-friendly software, "GFR estimation software", which offers the options to estimate GFR by the plasma sampling method as well as by SrCrM. We used Microsoft Windows® as the operating system, Visual Basic 6.0 as the front end, and Microsoft Access® as the database tool to develop this software. We used Russell's formula for GFR calculation by the plasma sampling method. GFR calculations using serum creatinine are done using the MIRD, Cockcroft-Gault, Schwartz, and Counahan-Barratt methods. The developed software performs the mathematical calculations correctly and is user-friendly. The software also enables storage and easy retrieval of the raw data, patient information, and calculated GFR for further processing and comparison. This is user-friendly software to calculate GFR by various plasma sampling methods and blood parameters, and it is also a good system for storing raw and processed data for future analysis.
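Of the named formulas, Cockcroft-Gault and the bedside Schwartz method are easy to sketch; Russell's plasma-sampling formula is not reproduced here, and the Schwartz constant k = 0.413 (the commonly cited updated value) is an assumption, as the paper may use a different one.

```python
def cockcroft_gault(age_yr, weight_kg, scr_mg_dl, female=False):
    """Creatinine clearance (mL/min) by the Cockcroft-Gault formula."""
    crcl = (140 - age_yr) * weight_kg / (72.0 * scr_mg_dl)
    return crcl * 0.85 if female else crcl

def schwartz_bedside(height_cm, scr_mg_dl, k=0.413):
    """Pediatric eGFR (mL/min/1.73 m^2); k = 0.413 assumed (updated bedside
    Schwartz constant), which may differ from the paper's implementation."""
    return k * height_cm / scr_mg_dl

print(round(cockcroft_gault(55, 70, 1.1, female=True), 1))  # e.g. adult female
print(round(schwartz_bedside(140, 0.6), 1))                 # e.g. child
```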
A flexible system to capture sample vials in a storage box - the box vial scanner.
Nowakowski, Steven E; Kressin, Kenneth R; Deick, Steven D
2009-01-01
Tracking sample vials in a research environment is a critical task, and doing so efficiently can have a large impact on productivity, especially in high-volume laboratories. There are several challenges to automating the capture process, including the variety of containers used to store samples. We developed a fast and robust system to capture the location of sample vials being placed in storage that allows laboratories the flexibility to use sample containers of varying dimensions. With a single scan, the device captures the box identifier, the vial identifier and the location of each vial within a freezer storage box. The sample vials are tracked through a barcode label affixed to the cap, while the boxes are tracked by a barcode label on the side of the box. Scanning units are placed at the point of use and forward data to a server application for processing. Each scanning unit consists of an industrial barcode reader mounted in a fixture that positions the box for scanning and provides lighting during the scan. The server application transforms the scan data into a list of storage locations holding vial identifiers, and the list is then transferred to the laboratory database. The box vial scanner thus captures the IDs and location information for an entire box of sample vials into the laboratory database in a single scan. The system accommodates a wide variety of vial sizes by inserting risers under the sample box, and a variety of storage box layouts are supported via the processing algorithm on the server.
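A minimal sketch of the server-side transform described above, assuming the reader reports each cap barcode together with pixel coordinates; the grid-snapping logic and the 9x9 default layout are illustrative assumptions, not the published algorithm.

```python
def assign_slots(reads, box_id, rows=9, cols=9, width=900, height=900):
    """Map scanned (vial_id, x, y) reads to (row, col) slots of a storage box.

    `reads` is a list of (vial_id, x_px, y_px) tuples from one scan; the box
    layout (rows x cols) and image size are configurable per container type.
    """
    cell_w, cell_h = width / cols, height / rows
    locations = {}
    for vial_id, x, y in reads:
        row = min(int(y // cell_h), rows - 1)   # snap to the nearest grid cell
        col = min(int(x // cell_w), cols - 1)
        locations[(row, col)] = vial_id
    return {"box": box_id, "slots": locations}  # ready for the laboratory DB

print(assign_slots([("V001", 55, 40), ("V002", 155, 40)], box_id="BX-17"))
```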
General, database-driven fast-feedback system for the Stanford Linear Collider
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rouse, F.; Allison, S.; Castillo, S.
A new feedback system has been developed for stabilizing the SLC beams at many locations. The feedback loops are designed to sample and correct at the 60 Hz repetition rate of the accelerator. Each loop can be distributed across several of the standard 80386 microprocessors which control the SLC hardware. A new communications system, KISNet, has been implemented to pass signals between the microprocessors at this rate. The software is written in a general fashion using the state-space formalism of digital control theory. This allows a new loop to be implemented by just setting up the online database and perhaps installing a communications link. 3 refs., 4 figs.
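The state-space formalism mentioned above can be illustrated in a few lines of Python. The matrices below are arbitrary placeholders, not SLC loop parameters, and the fixed-gain law is a generic sketch of a 60 Hz sampled feedback loop.

```python
import numpy as np

# x[k+1] = A x[k] + B u[k];  y[k] = C x[k]  (sampled at 60 Hz)
A = np.array([[1.0, 1/60.0], [0.0, 1.0]])   # placeholder plant dynamics
B = np.array([[0.0], [1/60.0]])
C = np.array([[1.0, 0.0]])
K = np.array([[8.0, 4.0]])                  # placeholder feedback gains

x = np.array([[1.0], [0.0]])                # initial beam offset
for k in range(120):                        # two seconds of 60 Hz ticks
    u = -K @ x                              # correction from the measured state
    x = A @ x + B @ u
print(x[0, 0])                              # residual offset after 2 s
```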
Internet-based profiler system as integrative framework to support translational research
Kim, Robert; Demichelis, Francesca; Tang, Jeffery; Riva, Alberto; Shen, Ronglai; Gibbs, Doug F; Mahavishno, Vasudeva; Chinnaiyan, Arul M; Rubin, Mark A
2005-01-01
Background Translational research requires taking basic science observations and developing them into clinically useful tests and therapeutics. We have developed a process for developing molecular biomarkers for diagnosis and prognosis by integrating tissue microarray (TMA) technology and an internet-database tool, Profiler. TMA technology allows investigators to study hundreds of patient samples on a single glass slide, resulting in the conservation of tissue and the reduction of inter-experimental variability. The Profiler system allows investigators to reliably track, store, and evaluate TMA experiments. Herein we describe the process that has evolved on an empirical basis over the past 5 years at two academic institutions. Results The generic design of this system makes it compatible with multiple organ systems (e.g., prostate, breast, lung, renal, and hematopoietic). Studies and folders are restricted to authorized users as required. Over the past 5 years, investigators at the two academic institutions have scanned 656 TMA experiments and collected 63,311 digital images of these tissue samples. Sixty-eight pathologists from 12 major user groups have accessed the system. Two groups directly link clinical data from over 500 patients for immediate access, and the remaining groups choose to maintain clinical and pathology data on separate systems. Profiler currently holds 170,000 data points such as staining intensity, tumor grade, and nuclear size. Due to the relational database structure, analysis can easily be performed on single or multiple TMA experimental results. The TMA module of Profiler can maintain images acquired from multiple systems. Conclusion We have developed a robust process to develop molecular biomarkers using TMA technology and an internet-based database system to track all steps of this process. This system is extendable to other types of molecular data as separate modules and is freely available to academic institutions for licensing. PMID:16364175
Development of a 20-locus fluorescent multiplex system as a valuable tool for national DNA database.
Jiang, Xianhua; Guo, Fei; Jia, Fei; Jin, Ping; Sun, Zhu
2013-02-01
The multiplex system allows the detection of 19 autosomal short tandem repeat (STR) loci [including all Combined DNA Index System (CODIS) STR loci as well as D2S1338, D6S1043, D12S391, D19S433, Penta D and Penta E] plus the sex-determining locus Amelogenin in a single reaction, comprising all STR loci in the various commercial kits used in the China national DNA database (NDNAD). Primers are designed so that the amplicons range from 90 base pairs (bp) to 450 bp within a five-dye fluorescent design, with the fifth dye reserved for the internal size standard. With 30 cycles, 125 pg to 2 ng of DNA template gave optimal profiling results, while robust profiles could also be achieved by adjusting the cycle number for DNA templates beyond that optimal input range. Mixture studies showed that 83% and 87% of minor alleles were detected at 9:1 and 1:9 ratios, respectively. Complete profiles were still observed when 4 ng of degraded DNA was digested with DNase for 2 min and when 1 ng of undegraded DNA was spiked with 400 μM haematin. Polymerase chain reaction (PCR)-based procedures were examined and optimized, including the concentrations of the primer set, magnesium and Taq polymerase as well as reaction volume, cycle number and annealing temperature. In addition, the system has been validated on 3,000 bloodstain samples and 35 common case samples in line with the Chinese National Standards and the Scientific Working Group on DNA Analysis Methods (SWGDAM) guidelines. The total probability of identity (TPI) reaches 8×10⁻²⁴, so the DNA database can grow to 10 million profiles or more, because the expected number of adventitious matches (4×10⁻¹⁰) is far below one and can be considered negligible. Our system also demonstrated good performance on case samples and should be an ideal tool for forensic DNA typing and databasing. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
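The headline TPI figure is the product of per-locus match probabilities. A minimal sketch, assuming Hardy-Weinberg equilibrium and using the standard match-probability formula PI = 2(Σp_i²)² − Σp_i⁴ for a codominant locus; the allele frequencies below are invented for illustration.

```python
from math import prod

def locus_match_probability(allele_freqs):
    """Probability two random individuals share a genotype at one locus,
    assuming Hardy-Weinberg equilibrium: PI = 2*(sum p_i^2)^2 - sum p_i^4."""
    s2 = sum(p**2 for p in allele_freqs)
    s4 = sum(p**4 for p in allele_freqs)
    return 2 * s2**2 - s4

# Invented allele-frequency tables for a few hypothetical STR loci
loci = [
    [0.18, 0.22, 0.25, 0.20, 0.15],
    [0.10, 0.30, 0.25, 0.20, 0.15],
    [0.05, 0.15, 0.20, 0.30, 0.30],
]
tpi = prod(locus_match_probability(f) for f in loci)
print(f"combined match probability over {len(loci)} loci: {tpi:.2e}")
```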
P2P proteomics -- data sharing for enhanced protein identification
2012-01-01
Background In order to tackle the important and challenging problem in proteomics of identifying known and new protein sequences using high-throughput methods, we propose a data-sharing platform that uses fully distributed P2P technologies to share specifications of peer-interaction protocols and service components. By using such a platform, the information to be searched is no longer centralised in a few repositories but gathered from experiments in peer proteomics laboratories, which can subsequently be searched by fellow researchers. Methods The system distributively runs a data-sharing protocol, specified in the Lightweight Communication Calculus underlying the system, through which researchers interact via message passing. For this, researchers interact with the system through particular components that link to database querying systems based on BLAST and/or OMSSA and to GUI-based visualisation environments. We have tested the proposed platform with data drawn from pre-existing MS/MS data reservoirs from the 2006 ABRF (Association of Biomolecular Resource Facilities) test sample, which was extensively tested during the ABRF Proteomics Standards Research Group 2006 worldwide survey. In particular, we have taken the data available from a subset of proteomics laboratories of Spain's National Institute for Proteomics, ProteoRed, a network for the coordination, integration and development of the Spanish proteomics facilities. Results and Discussion We performed queries against nine databases, including seven ProteoRed proteomics laboratories, the NCBI Swiss-Prot database and the local database of the CSIC/UAB Proteomics Laboratory. A detailed analysis of the results indicated the presence of a protein that was supported by other NCBI matches and by highly scoring matches in several proteomics labs. The analysis clearly indicated that the protein was a contaminant present at relatively high concentration in the ABRF sample. This fact is evident from the information that could be derived from the proposed P2P proteomics system; however, it is not straightforward to arrive at the same conclusion by conventional means, as organic contamination of samples is difficult to rule out. The actual presence of this contaminant was confirmed only after the ABRF study of all the identifications reported by the laboratories. PMID:22293032
Amadoz, Alicia; González-Candelas, Fernando
2007-04-20
Most research scientists working in the fields of molecular epidemiology, population and evolutionary genetics are confronted with the management of large volumes of data. Moreover, the data used in studies of infectious diseases are complex and usually derive from different institutions such as hospitals or laboratories. Since no public database scheme incorporating clinical and epidemiological information about patients and molecular information about pathogens is currently available, we have developed an information system, composed of a main database and a web-based interface, which integrates both types of data and satisfies the requirements of good organization, simple accessibility, data security and multi-user support. From the moment a patient arrives at a hospital or health centre until the processing and analysis of molecular sequences obtained from infectious pathogens in the laboratory, a large amount of information is collected from different sources. We have divided the most relevant data into 12 conceptual modules around which we have organized the database schema. The schema is comprehensive, covering many aspects of sample sources, samples, laboratory processes, molecular sequences, phylogenetic results, clinical tests and results, clinical information, treatments, pathogens, transmissions, outbreaks and bibliographic information. Communication between end-users and the selected Relational Database Management System (RDBMS) is carried out by default through a command-line window or through a user-friendly, web-based interface which provides access and management tools for the data. epiPATH is an information system for managing clinical and molecular information on infectious diseases. It facilitates daily work related to infectious pathogens and the sequences obtained from them. This software is intended for local installation in order to safeguard private data, and provides advanced SQL users the flexibility to adapt it to their needs. The database schema, tool scripts and web-based interface are free software, but data stored on our database server are not publicly available. epiPATH is distributed under the terms of the GNU General Public License. More details about epiPATH can be found at http://genevo.uv.es/epipath.
Favorable Geochemistry from Springs and Wells in Colorado
Richard E. Zehner
2012-02-01
This layer contains favorable geochemistry for high-temperature geothermal systems, as interpreted by Richard "Rick" Zehner. The data were compiled from USGS sources. The original data set combines 15,622 samples collected in the State of Colorado from several sources, including 1) the original Geotherm geochemical database, 2) USGS NWIS (National Water Information System), 3) Colorado Geological Survey geothermal sample data, and 4) original samples collected by R. Zehner at various sites during the 2011 field season. These samples are also available in a separate shapefile, FlintWaterSamples.shp. Data from all samples were reportedly collected using standard water sampling protocols (filtering through a 0.45 micron filter, etc.). Sample information was standardized to ppm (milligrams per liter) in spreadsheet columns. Commonly used cation and silica geothermometer temperature estimates are included.
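As an illustration of the silica geothermometry mentioned above, here is a short Python sketch using the commonly cited Fournier (1977) quartz geothermometer (no steam loss); this is one of several geothermometers, and the sample concentration below is invented.

```python
from math import log10

def quartz_no_steam_loss_temp(sio2_mg_per_kg):
    """Estimated reservoir temperature (deg C) from dissolved silica,
    using the commonly cited Fournier (1977) quartz geothermometer."""
    return 1309.0 / (5.19 - log10(sio2_mg_per_kg)) - 273.15

# A spring water carrying ~250 mg/kg dissolved SiO2
print(round(quartz_no_steam_loss_temp(250.0), 1))   # ~195.7 deg C
```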
McIlroy, Simon Jon; Kirkegaard, Rasmus Hansen; McIlroy, Bianca; Nierychlo, Marta; Kristensen, Jannie Munk; Karst, Søren Michael; Albertsen, Mads
2017-01-01
Abstract Wastewater is increasingly viewed as a resource, with anaerobic digester technology being routinely implemented for biogas production. Characterising the microbial communities involved in wastewater treatment facilities and their anaerobic digesters is considered key to their optimal design and operation. Amplicon sequencing of the 16S rRNA gene allows high-throughput monitoring of these systems. The MiDAS field guide is a public resource providing amplicon sequencing protocols and an ecosystem-specific taxonomic database optimized for use with wastewater treatment facility samples. The curated taxonomy endeavours to provide a genus-level classification for abundant phylotypes, and the online field guide links this identity to published information regarding their ecology, function and distribution. This article describes the expansion of the database resources to cover the organisms of anaerobic digester systems fed primary sludge and surplus activated sludge. The updated database includes descriptions of the abundant genus-level taxa in influent wastewater, activated sludge and anaerobic digesters. Abundance information is also included to allow assessment of the role of emigration in the ecology of each phylotype. MiDAS is intended as a collaborative resource for the progression of research into the ecology of wastewater treatment, providing a public repository of knowledge that is accessible to all interested in these biotechnologically important systems. Database URL: http://www.midasfieldguide.org PMID:28365734
Forensic Tools to Track and Connect Physical Samples to Related Data
NASA Astrophysics Data System (ADS)
Molineux, A.; Thompson, A. C.; Baumgardner, R. W.
2016-12-01
Identifiers, such as local sample numbers, are critical to successfully connecting physical samples and related data; however, identifiers must be globally unique. The International Geo Sample Number (IGSN), generated when registering a sample in the System for Earth Sample Registration (SESAR), provides a globally unique alphanumeric code associated with basic metadata, related samples and their current physical storage location. When registered samples are published, users can link the figured samples to the basic metadata held at SESAR. The use cases we discuss include plant specimens from a Permian core, Holocene corals and derived powders, and thin sections with SEM stubs. Much of this material is now published. The plant taxonomic study from the core is a digital PDF, and samples can be directly linked from the captions to the SESAR record. The study of stable isotopes from the corals is not yet digitally available, but individual samples are accessible. Full data and media records for both studies are located in our database, where higher-quality images, field notes, and section diagrams may exist. Georeferences permit mapping in current and deep-time plate configurations. Several recommendations emerged during this study. First, ensure adequate and consistent details are registered with SESAR. Second, educate and encourage researchers to obtain IGSNs. Third, publish the archive numbers, assigned prior to publication, alongside the IGSN; this provides access to further data through an Integrated Publishing Toolkit (IPT), aggregators, or online repository databases, placing the initial sample in a much richer context for future studies. Fourth, encourage software developers to customize community software to extract data from a database and use it to register samples in bulk, which would improve workflow and provide a path for registering large legacy collections.
Towards the implementation of a spectral database for the detection of biological warfare agents
NASA Astrophysics Data System (ADS)
Carestia, M.; Pizzoferrato, R.; Gelfusa, M.; Cenciarelli, O.; D'Amico, F.; Malizia, A.; Scarpellini, D.; Murari, A.; Vega, J.; Gaudio, P.
2014-10-01
The deliberate use of biological warfare agents (BWA) and other pathogens can jeopardize the safety of populations, fauna and flora, and represents a concrete concern from both military and civil perspectives. At present, the only commercially available tools for fast warning of a biological attack perform point detection and require active or passive sample collection. The development of a stand-off detection system would be extremely valuable for minimizing the risk and the possible consequences of the release of biological aerosols in the atmosphere. Biological samples can be analyzed by means of several optical techniques covering a broad region of the electromagnetic spectrum. Strong evidence has shown that the informative content of fluorescence spectra can provide good preliminary discrimination among such agents, and that it can also be obtained through stand-off measurements. Such a system requires a database and a mathematical method for discriminating the spectral signatures. In this work, we collected fluorescence emission spectra of the main BWA simulants to implement a spectral signature database and apply the Universal Multi Event Locator (UMEL) statistical method. Our preliminary analysis, conducted under laboratory conditions with a standard UV lamp source, considers the main experimental setups influencing the fluorescence signatures of some of the most commonly used BWA simulants. This work represents a first step towards the implementation of a spectral database and a laser-based biological stand-off detection and identification technique.
Nebert, Douglas; Anderson, Dean
1987-01-01
The U. S. Geological Survey (USGS) in cooperation with the U. S. Environmental Protection Agency Office of Pesticide Programs and several State agencies in Oregon has prepared a digital spatial database at 1:500,000 scale to be used as a basis for evaluating the potential for ground-water contamination by pesticides and other agricultural chemicals. Geographic information system (GIS) software was used to assemble, analyze, and manage spatial and tabular environmental data in support of this project. Physical processes were interpreted relative to published spatial data and an integrated database to support the appraisal of regional ground-water contamination was constructed. Ground-water sampling results were reviewed relative to the environmental factors present in several agricultural areas to develop an empirical knowledge base which could be used to assist in the selection of future sampling or study areas.
A k-Vector Approach to Sampling, Interpolation, and Approximation
NASA Astrophysics Data System (ADS)
Mortari, Daniele; Rogers, Jonathan
2013-12-01
The k-vector search technique is a method designed to perform extremely fast range searching of large databases at computational cost independent of the size of the database. k-vector search algorithms have historically found application in satellite star-tracker navigation systems which index very large star catalogues repeatedly in the process of attitude estimation. Recently, the k-vector search algorithm has been applied to numerous other problem areas including non-uniform random variate sampling, interpolation of 1-D or 2-D tables, nonlinear function inversion, and solution of systems of nonlinear equations. This paper presents algorithms in which the k-vector search technique is used to solve each of these problems in a computationally-efficient manner. In instances where these tasks must be performed repeatedly on a static (or nearly-static) data set, the proposed k-vector-based algorithms offer an extremely fast solution technique that outperforms standard methods.
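A minimal sketch of the k-vector idea applied to range searching, assuming a static sorted dataset: a straight line from the minimum to the maximum element is precomputed, and k[i] counts the sorted elements that fall below the line at index i, so a range query reduces to two line evaluations plus a small local refinement. This toy version is illustrative, not Mortari's published algorithm, and assumes at least two distinct data values.

```python
import bisect

class KVector:
    """Toy k-vector for fast repeated range searches on a static dataset."""
    def __init__(self, data):
        self.s = sorted(data)
        n = len(self.s)
        # Line through (1, min) and (n, max): z(i) = m*i + q
        self.m = (self.s[-1] - self.s[0]) / (n - 1)
        self.q = self.s[0] - self.m
        # k[i] = number of sorted elements <= z(i), precomputed once
        self.k = [bisect.bisect_right(self.s, self.m * i + self.q)
                  for i in range(n + 1)]

    def range_search(self, lo, hi):
        """Return all elements x with lo <= x <= hi."""
        n = len(self.s)
        ia = max(0, min(n, int((lo - self.q) / self.m)))
        ib = max(0, min(n, int((hi - self.q) / self.m) + 1))
        start, end = self.k[ia], self.k[ib]
        # Local refinement around the candidate window
        while start > 0 and self.s[start - 1] >= lo:
            start -= 1
        while start < n and self.s[start] < lo:
            start += 1
        while end < n and self.s[end] <= hi:
            end += 1
        while end > start and self.s[end - 1] > hi:
            end -= 1
        return self.s[start:end]

kv = KVector([0.3, 7.2, 2.5, 9.9, 4.4, 1.1, 6.6])
print(kv.range_search(2.0, 7.0))   # -> [2.5, 4.4, 6.6]
```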
The Young Visual Binary Database
NASA Astrophysics Data System (ADS)
Prato, Lisa A.; Avilez, Ian; Allen, Thomas; Zoonematkermani, Saeid; Biddle, Lauren; Muzzio, Ryan; Wittal, Matthew; Schaefer, Gail; Simon, Michal
2017-01-01
We have obtained adaptive optics imaging and high-resolution H-band and in some cases K-band spectra of each component in close to 100 young multiple systems in the nearby star forming regions of Taurus, Ophiuchus, TW Hya, and Orion. The binary separations for the pairs in our sample range from 30 mas to 3 arcseconds. The imaging and most of our spectra were obtained with instruments behind adaptive optics systems in order to resolve even the closest companions. We are in the process of determining fundamental stellar and circumstellar properties, such as effective temperature, Vsin(i), veiling, and radial velocity, for each component in the entire sample. The beta version of our database includes systems in the Taurus region and provides plots, downloadable ascii spectra, and values of the stellar and circumstellar properties for both stars in each system. This resource is openly available to the community at http://jumar.lowell.edu/BinaryStars/. In this poster we describe initial results from our analysis of the survey data. Support for this research was provided in part by NSF award AST-1313399 and by NASA Keck KPDA funding.
NASA Astrophysics Data System (ADS)
Pavlov, S. S.; Dmitriev, A. Yu.; Chepurchenko, I. A.; Frontasyeva, M. V.
2014-11-01
An automation system for the measurement of induced-activity gamma-ray spectra for multi-element, high-volume neutron activation analysis (NAA) was designed, developed and implemented at the IBR-2 reactor of the Frank Laboratory of Neutron Physics. The system consists of three automatic sample changers for three Canberra HPGe detector-based gamma spectrometry systems. Each sample changer consists of a two-axis linear positioning module (M202A by DriveSet) and a disk with 45 slots for containers with samples. The automatic sample changer is controlled by a Systec Xemo S360U controller, and the positioning accuracy can reach 0.1 mm. Special software performs automatic sample changing and measurement of gamma spectra in constant interaction with the NAA database.
Souza, C A; Oliveira, T C; Crovella, S; Santos, S M; Rabêlo, K C N; Soriano, E P; Carvalho, M V D; Junior, A F Caldas; Porto, G G; Campello, R I C; Antunes, A A; Queiroz, R A; Souza, S M
2017-04-28
The use of Y chromosome haplotypes, important for the detection of sexual crimes in forensics, has gained prominence with the use of databases that incorporate these genetic profiles. Here, we optimized and validated an amplification protocol for Y chromosome profiling of reference samples using less material than specified for commercial kits. FTA® cards (Flinders Technology Associates) were used to support the oral cells of male individuals, which were amplified directly using the SwabSolution reagent (Promega). First, we optimized and validated the process to define the volume and cycling conditions. Three reference samples and nineteen 1.2-mm-diameter perforated discs were used per sample. Amplification of one or two discs (samples) with the PowerPlex® Y23 kit (Promega) was performed using 25, 26, and 27 thermal cycles, with 20%, 32%, and 100% reagent volumes; one disc and 26 cycles were used for the per-sample control. Thereafter, all samples (N = 270) were amplified using 27 cycles, one disc, and 32% reagents (the optimized conditions). Data were analyzed through a study of the balance between fluorophore colors. In the samples analyzed with the 20% volume, an imbalance was observed in peak heights, both within and between dyes. In samples amplified with 32% reagents, the intra-color and inter-color balance values used to verify the quality of the analyzed peaks were similar to those of samples amplified with 100% of the recommended volume. The quality of the profiles obtained with 32% reagents was suitable for insertion into databases.
Asadi, S S; Vuppala, Padmaja; Reddy, M Anji
2005-01-01
A preliminary survey of the area under Zone III of MCH was undertaken to assess ground water quality, demonstrate its spatial distribution and correlate it with land use patterns using advanced techniques of remote sensing and geographic information systems (GIS). Twenty-seven ground water samples were collected and chemically analyzed to form the attribute database. A water quality index was calculated from the measured parameters, based on which the study area was classified into five groups with respect to the suitability of water for drinking purposes. Thematic maps, viz., base map, road network, drainage and land use/land cover, were prepared from IRS 1D PAN + LISS III merged satellite imagery, forming the spatial database. The attribute database was integrated with the spatial sampling-locations map in Arc/Info, and maps showing the spatial distribution of water quality parameters were prepared in ArcView. Results indicated high concentrations of total dissolved solids (TDS), nitrates, fluorides and total hardness in a few industrial and densely populated areas, indicating deteriorated water quality, while the other areas exhibited moderate to good water quality.
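The water quality index step can be sketched briefly. Below is one common formulation (the weighted arithmetic WQI with weights inversely proportional to the permissible limits); the abstract does not state which formulation was used, and the concentrations and limits below are invented for illustration.

```python
def weighted_arithmetic_wqi(measured, standards, ideal=None):
    """Weighted arithmetic water quality index (one common formulation).

    measured/standards: dicts of parameter -> concentration and permissible
    limit; quality ratings q_i = 100*(V_i - V0)/(S_i - V0), weights w_i ~ 1/S_i.
    """
    ideal = ideal or {}
    weights = {p: 1.0 / s for p, s in standards.items()}
    total_w = sum(weights.values())
    wqi = 0.0
    for p, v in measured.items():
        v0 = ideal.get(p, 0.0)                       # ideal value (0 for most)
        q = 100.0 * (v - v0) / (standards[p] - v0)   # quality rating
        wqi += weights[p] * q
    return wqi / total_w

sample = {"TDS": 900.0, "nitrate": 60.0, "fluoride": 1.8}   # mg/L (invented)
limits = {"TDS": 500.0, "nitrate": 45.0, "fluoride": 1.5}   # permissible limits
print(round(weighted_arithmetic_wqi(sample, limits), 1))    # >100 => unfit
```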
DOE Office of Scientific and Technical Information (OSTI.GOV)
The system is developed to collect, process, store and present the information provided by radio frequency identification (RFID) devices. The system contains three parts: the application software, the database and the web page. The application software manages multiple RFID devices, such as readers and portals, simultaneously. It communicates with the devices through the application programming interface (API) provided by the device vendor, and converts the data collected by the RFID readers and portals into readable information. It is capable of encrypting data using the 256-bit Advanced Encryption Standard (AES). The application software has a graphical user interface (GUI) that mimics the configuration of the nuclear material storage sites or transport vehicles, giving users and system administrators an intuitive way to read the information and/or configure the devices. The application software is capable of sending the information to a remote, dedicated and secured web and database server. Two captured screen samples, one each for storage and transport, are attached. The database is constructed to handle a large number of RFID tag readers and portals; a SQL server is employed for this purpose, and an XML script updates the database whenever information is sent from the application software. The design of the web page imitates the design of the application software: the web page retrieves data from the database and presents it in different panels. Users need a user name and password to access the web page, which is also capable of sending e-mail and text messages based on preset criteria, such as when alarm thresholds are exceeded. A captured screen sample is attached. The application software is designed to be installed on a local computer that is directly connected to the RFID devices and can be controlled locally or remotely; multiple local computers manage different sites or transport vehicles. Control from remote sites and transmission of information to the central database server take place over a secured internet connection, and the information stored in the central database server is shown on the web page, which users can view on the internet. A dedicated and secured web and database server (https) is used to provide information security.
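As an illustration of the 256-bit AES encryption step, here is a minimal sketch using the Python `cryptography` package's AES-GCM primitive; the event payload and key handling are invented for illustration and say nothing about how the described system actually manages keys.

```python
import json, os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)   # 256-bit AES key (demo only)
aesgcm = AESGCM(key)

# A hypothetical tag-read event produced by the reader API
event = json.dumps({"tag": "A1B2C3", "portal": 4, "ts": "2010-01-01T12:00:00Z"})

nonce = os.urandom(12)                      # unique 96-bit nonce per message
ciphertext = aesgcm.encrypt(nonce, event.encode(), b"site-7")  # AAD = site id

# The server decrypts with the same key, nonce, and associated data
print(aesgcm.decrypt(nonce, ciphertext, b"site-7").decode())
```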
R2 Water Quality Portal Monitoring Stations
The Water Quality Portal (WQP) provides an easy way to access data stored in various large water quality databases. The WQP provides various input parameters on the form, including location, site, sampling, and date parameters, to filter and customize the returned results. The WQP is a cooperative service sponsored by the United States Geological Survey (USGS), the Environmental Protection Agency (EPA) and the National Water Quality Monitoring Council (NWQMC) that integrates publicly available water quality data from the USGS National Water Information System (NWIS), the EPA STOrage and RETrieval (STORET) Data Warehouse, and the USDA ARS Sustaining The Earth's Watersheds - Agricultural Research Database System (STEWARDS).
Houssaini, Allal; Assoumou, Lambert; Miller, Veronica; Calvez, Vincent; Marcelin, Anne-Geneviève; Flandre, Philippe
2013-01-01
Background Several attempts have been made to determine HIV-1 resistance from genotype resistance testing. We compare scoring methods for building weighted genotyping scores with commonly used systems for determining whether the virus of an HIV-infected patient is resistant. Methods and Principal Findings Three statistical methods (linear discriminant analysis, support vector machines and logistic regression) are used to determine the weights of mutations involved in HIV resistance. We compared these weighted scores with known interpretation systems (ANRS, REGA and Stanford HIV-db) for classifying patients as resistant or not. Our methodology is illustrated using the Forum for Collaborative HIV Research didanosine database (N = 1453). The database was divided into four samples according to the country of enrolment (France, USA/Canada, Italy and Spain/UK/Switzerland). The total sample and the four country-based samples allow external validation (one sample is used to estimate a score and the other samples are used to validate it). We used the observed precision to compare the performance of the newly derived scores with the other interpretation systems. Our results show that the newly derived scores performed better than or similarly to existing interpretation systems, even with external validation sets. No difference was found between the three methods investigated. Our analysis identified four new mutations associated with didanosine resistance: D123S, Q207K, H208Y and K223Q. Conclusions We explored the potential of three statistical methods to construct weighted scores for didanosine resistance. Our proposed scores performed at least as well as existing interpretation systems, and previously unrecognized didanosine-resistance-associated mutations were identified. This approach could be used to build genotypic resistance scores for other antiretroviral drugs. PMID:23555613
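Of the three weighting methods, logistic regression is the easiest to sketch. Below is a minimal, hypothetical example with scikit-learn: rows are patient isolates encoded as binary mutation indicators, the label is resistant/susceptible, and the fitted coefficients serve as mutation weights. The data are random placeholders, not the Forum didanosine database.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
mutations = ["D123S", "Q207K", "H208Y", "K223Q", "M184V"]

# Placeholder data: 200 isolates x 5 binary mutation indicators
X = rng.integers(0, 2, size=(200, len(mutations)))
true_w = np.array([1.5, 1.0, 0.8, 0.6, 0.0])          # invented ground truth
logit = X @ true_w - 1.2
y = rng.random(200) < 1 / (1 + np.exp(-logit))        # resistant = 1

model = LogisticRegression(penalty="l2", C=1.0).fit(X, y)
weights = dict(zip(mutations, model.coef_[0].round(2)))
print(weights)   # fitted per-mutation weights for the genotypic score

# Score a new isolate against a chosen resistance cutoff
new = np.array([[1, 0, 1, 0, 0]])
print((new @ model.coef_[0] + model.intercept_)[0])   # weighted score
```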
Web-based biobank system infrastructure monitoring using Python, Perl, and PHP.
Norling, Martin; Kihara, Absolomon; Kemp, Steve
2013-12-01
The establishment and maintenance of biobanks is only as worthwhile as the security and logging of the biobank contents. We have designed a monitoring system that continuously measures temperature and gas content, records the movement of samples into and out of the biobank, and records the opening and closing of the freezers, storing the results and images in a database. We have also incorporated an early-warning feature that sends alerts, via SMS and email, to responsible persons if any measurement is recorded outside the acceptable limits, guaranteeing the integrity of biobanked samples as well as of reagents used in sample analysis. A surveillance system like this increases the value of any biobank, as the initial investment is small and the value of having trustworthy samples for future research is high.
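A minimal sketch of the early-warning check, assuming sensor readings arrive as (freezer id, temperature) pairs and alerts go out through a local SMTP relay; the thresholds, addresses, and host names are placeholders, not details of the published system.

```python
import smtplib
from email.message import EmailMessage

LIMITS = {"freezer-1": (-86.0, -70.0)}   # acceptable temperature band, deg C

def check_reading(freezer_id, temp_c):
    """Send an email alert if a reading falls outside the acceptable band."""
    lo, hi = LIMITS[freezer_id]
    if lo <= temp_c <= hi:
        return False                      # within limits, nothing to do
    msg = EmailMessage()
    msg["Subject"] = f"ALERT: {freezer_id} at {temp_c:.1f} C (band {lo}..{hi})"
    msg["From"] = "biobank-monitor@example.org"
    msg["To"] = "responsible-person@example.org"
    msg.set_content("Out-of-range reading recorded; please investigate.")
    with smtplib.SMTP("localhost") as relay:   # placeholder SMTP relay
        relay.send_message(msg)
    return True

check_reading("freezer-1", -62.5)   # triggers an alert
```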
Applying the archetype approach to the database of a biobank information management system.
Späth, Melanie Bettina; Grimson, Jane
2011-03-01
The purpose of this study is to investigate the feasibility of applying the openEHR archetype approach to modelling the data in the database of an existing proprietary biobank information management system. A biobank information management system stores the clinical/phenotypic data of the sample donor and sample related information. The clinical/phenotypic data is potentially sourced from the donor's electronic health record (EHR). The study evaluates the reuse of openEHR archetypes that have been developed for the creation of an interoperable EHR in the context of biobanking, and proposes a new set of archetypes specifically for biobanks. The ultimate goal of the research is the development of an interoperable electronic biomedical research record (eBMRR) to support biomedical knowledge discovery. The database of the prostate cancer biobank of the Irish Prostate Cancer Research Consortium (PCRC), which supports the identification of novel biomarkers for prostate cancer, was taken as the basis for the modelling effort. First the database schema of the biobank was analyzed and reorganized into archetype-friendly concepts. Then, archetype repositories were searched for matching archetypes. Some existing archetypes were reused without change, some were modified or specialized, and new archetypes were developed where needed. The fields of the biobank database schema were then mapped to the elements in the archetypes. Finally, the archetypes were arranged into templates specifically to meet the requirements of the PCRC biobank. A set of 47 archetypes was found to cover all the concepts used in the biobank. Of these, 29 (62%) were reused without change, 6 were modified and/or extended, 1 was specialized, and 11 were newly defined. These archetypes were arranged into 8 templates specifically required for this biobank. A number of issues were encountered in this research. Some arose from the immaturity of the archetype approach, such as immature modelling support tools, difficulties in defining high-quality archetypes and the problem of overlapping archetypes. In addition, the identification of suitable existing archetypes was time-consuming and many semantic conflicts were encountered during the process of mapping the PCRC BIMS database to existing archetypes. These include differences in the granularity of documentation, in metadata-level versus data-level modelling, in terminologies and vocabularies used, and in the amount of structure imposed on the information to be recorded. Furthermore, the current way of modelling the sample entity was found to be cumbersome in the sample-centric activity of biobanking. The archetype approach is a promising approach to create a shareable eBMRR based on the study participant/donor for biobanks. Many archetypes originally developed for the EHR domain can be reused to model the clinical/phenotypic and sample information in the biobank context, which validates the genericity of these archetypes and their potential for reuse in the context of biomedical research. However, finding suitable archetypes in the repositories and establishing an exact mapping between the fields in the PCRC BIMS database and the elements of existing archetypes that have been designed for clinical practice can be challenging and time-consuming and involves resolving many common system integration conflicts. These may be attributable to differences in the requirements for information documentation between clinical practice and biobanking. 
This research also recognized the need for better support tools, modelling guidelines and best practice rules and reconfirmed the need for better domain knowledge governance. Furthermore, the authors propose that the establishment of an independent sample record with the sample as record subject should be investigated. The research presented in this paper is limited by the fact that the new archetypes developed during this research are based on a single biobank instance. These new archetypes may not be complete, representing only those subsets of items required by this particular database. Nevertheless, this exercise exposes some of the gaps that exist in the archetype modelling landscape and highlights the concepts that need to be modelled with archetypes to enable the development of an eBMRR. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
Design and Implementation of Website Information Disclosure Assessment System
Cho, Ying-Chiang; Pan, Jen-Yi
2015-01-01
Internet application technologies, such as cloud computing and cloud storage, have increasingly changed people's lives. Websites contain vast amounts of personal privacy information. In order to protect this information, network security technologies, such as database protection and data encryption, have attracted many researchers. The most serious problems concerning web vulnerability are e-mail address and network database leakages. These leakages have many causes; for example, malicious users can steal database contents by taking advantage of mistakes made by programmers and administrators. In order to mitigate this type of abuse, a website information disclosure assessment system is proposed in this study. This system utilizes a series of technologies, such as web crawler algorithms, SQL injection attack detection, and web vulnerability mining, to assess a website's information disclosure. Thirty websites, randomly sampled from the world's top 50 colleges, were used to collect leakage information. This testing showed the importance of increasing the security and privacy of website information for academic websites. PMID:25768434
Measuring health system resource use for economic evaluation: a comparison of data sources.
Pollicino, Christine; Viney, Rosalie; Haas, Marion
2002-01-01
A key challenge for evaluators and health system planners is the identification, measurement and valuation of resource use for economic evaluation. Accurately capturing all significant resource use is particularly difficult in the Australian context, where there is no comprehensive database from which researchers can draw. Evaluators and health system planners need to consider different approaches to data collection for estimating resource use for economic evaluation, and the relative merits of the different data sources available. This paper illustrates the issues that arise with different data sources, using a sub-sample of the data collected for an economic evaluation. Specifically, it compares the use of Australia's largest administrative database on resource use, the Health Insurance Commission database, with the use of patient-supplied data. The extent of agreement and the discrepancies between the two data sources are investigated. Findings from this study and recommendations on how to deal with different data sources are presented.
NASA Technical Reports Server (NTRS)
VanHeel, Nancy; Pettit, Janet; Rice, Barbara; Smith, Scott M.
2003-01-01
Food and nutrient databases are populated with data obtained from a variety of sources including USDA Reference Tables, scientific journals, food manufacturers and foreign food tables. The food and nutrient database maintained by the Nutrition Coordinating Center (NCC) at the University of Minnesota is continually updated with current nutrient data and continues to be expanded with additional nutrient fields to meet diverse research endeavors. Data are strictly evaluated for reliability and relevance before incorporation into the database; however, the values are obtained from various sources and food samples rather than from direct chemical analysis of specific foods. Precise nutrient values for specific foods are essential to the nutrition program at the National Aeronautics and Space Administration (NASA). Specific foods to be included in the menus of astronauts are chemically analyzed at the Johnson Space Center for selected nutrients. A request from NASA for a method to enter the chemically analyzed nutrient values for these space flight food items into the Nutrition Data System for Research (NDS-R) software resulted in modification of the database and interview system for use by NASA, with further modification to extend the method for related uses by more typical research studies.
MGIS: managing banana (Musa spp.) genetic resources information and high-throughput genotyping data
Guignon, V.; Sempere, G.; Sardos, J.; Hueber, Y.; Duvergey, H.; Andrieu, A.; Chase, R.; Jenny, C.; Hazekamp, T.; Irish, B.; Jelali, K.; Adeka, J.; Ayala-Silva, T.; Chao, C.P.; Daniells, J.; Dowiya, B.; Effa effa, B.; Gueco, L.; Herradura, L.; Ibobondji, L.; Kempenaers, E.; Kilangi, J.; Muhangi, S.; Ngo Xuan, P.; Paofa, J.; Pavis, C.; Thiemele, D.; Tossou, C.; Sandoval, J.; Sutanto, A.; Vangu Paka, G.; Yi, G.; Van den houwe, I.; Roux, N.
2017-01-01
Abstract Large-scale unraveling of the genetic diversity held in genebanks is underway, owing to advances in next-generation sequencing (NGS)-based technologies that produce high-density genetic markers for a large number of samples at low cost. Genebank users should be in a position to identify and select germplasm from the global genepool based on a combination of passport, genotypic and phenotypic data. To facilitate this, a new generation of information systems is being designed to efficiently handle data and link it with other external resources such as genome or breeding databases. The Musa Germplasm Information System (MGIS), the database for global ex situ banana genetic resources, has been developed to address those needs in a user-friendly way. In developing MGIS, we selected a generic database schema (Chado), the robust content management system Drupal for the user interface, and Tripal, a set of Drupal modules that links the Chado schema to Drupal. MGIS allows germplasm collection examination, accession browsing, advanced search functions, and germplasm orders. Additionally, we developed unique graphical interfaces to compare accessions and to explore them based on their taxonomic information. Accession-based data have been enriched with publications, genotyping studies and associated genotyping datasets reporting on germplasm use. Finally, an interoperability layer has been implemented to facilitate links with complementary databases like the Banana Genome Hub and the MusaBase breeding database. Database URL: https://www.crop-diversity.org/mgis/ PMID:29220435
Akram, M; Qureshi, Riffat M; Ahmad, Nasir; Solaija, Tariq Jamal
2007-01-01
Natural radionuclide contents of 226Ra, 228Ra and 40K were studied in inter-tidal sediments collected from selected locations along the 745 km long Balochistan Coast using an HPGe detector-based gamma-spectrometry system. The sampling zone extends from the beaches of Sonmiani (near the Karachi metropolis) to Jiwani (close to the border with Iran). The natural radioactivity levels detected in the various sediment samples range from 14.4 ± 2.5 to 36.6 ± 3.8 Bq kg⁻¹ for 226Ra, 9.8 ± 1.2 to 35.2 ± 2.0 Bq kg⁻¹ for 228Ra, and 144.6 ± 9.4 to 610.5 ± 23.9 Bq kg⁻¹ for 40K. No artificial radionuclide was detected in any of the marine coastal sediment samples; 137Cs, 60Co, 106Ru and 144Ce contents were below the limit of detection. The measured radioactivity levels are compared with those reported in the literature for coastal sediments in other parts of the world. The information presented in this paper will serve as the first local radioactivity database for the Balochistan/Makran coastal belt of Pakistan. The presented data will also contribute to the IAEA's Asia-Pacific Marine Radioactivity Database (ASPAMARD) and the Global Marine Radioactivity Database (GLOMARD).
Núñez, Carolina; Baeta, Miriam; Ibarbia, Nerea; Ortueta, Urko; Jiménez-Moreno, Susana; Blazquez-Caeiro, José Luis; Builes, Juan José; Herrera, Rene J; Martínez-Jarreta, Begoña; de Pancorbo, Marian M
2017-04-01
A Y-STR multiplex system has been developed with the purpose of complementing the widely used 17 Y-STR haplotyping (AmpFlSTR Y Filer® PCR Amplification kit) routinely employed in forensic and population genetic studies. This new multiplex system includes six additional STR loci (DYS576, DYS481, DYS549, DYS533, DYS570, and DYS643) to reach the 23 Y-STRs of the PowerPlex® Y23 System; in addition, it includes the DYS456 and DYS385 loci for traceability purposes. Male samples from 625 individuals from ten worldwide populations were genotyped, including three sample sets from populations previously published with the 17 Y-STR system, to expand their current data. Validation studies demonstrated good performance of the panel in terms of concordance, sensitivity, and stability in the presence of inhibitors and artificially degraded DNA. The haplotype diversity and discrimination capacity obtained with this multiplex system were considerably high, providing further evidence of the suitability of this novel Y-STR system for forensic purposes. Thus, the use of this multiplex for samples previously genotyped with 17 Y-STRs will be an efficient and low-cost alternative to complete the set of 23 Y-STRs and improve allele databases for population and forensic purposes. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Sub-Audible Speech Recognition Based upon Electromyographic Signals
NASA Technical Reports Server (NTRS)
Jorgensen, Charles C. (Inventor); Agabon, Shane T. (Inventor); Lee, Diana D. (Inventor)
2012-01-01
Method and system for processing and identifying a sub-audible signal formed by a source of sub-audible sounds. Sequences of samples of sub-audible sound patterns ("SASPs") for known words/phrases in a selected database are received for overlapping time intervals, and Signal Processing Transforms ("SPTs") are formed for each sample, as part of a matrix of entry values. The matrix is decomposed into contiguous, non-overlapping two-dimensional cells of entries, and neural net analysis is applied to estimate reference sets of weight coefficients that provide sums with optimal matches to reference sets of values. The reference sets of weight coefficients are used to determine a correspondence between a new (unknown) word/phrase and a word/phrase in the database.
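The pipeline described (overlapping windows, a transform per window, then cell-wise decomposition of the resulting matrix) can be sketched briefly; the window length, overlap, and cell grid below are invented parameters, and a plain FFT magnitude stands in for whatever signal-processing transform an actual implementation would use.

```python
import numpy as np

def sasp_features(signal, win=256, hop=128, cell_rows=4, cell_cols=4):
    """Overlapping windows -> per-window FFT magnitudes -> matrix of entries,
    then average over non-overlapping 2-D cells for a compact feature vector."""
    frames = [signal[i:i + win] for i in range(0, len(signal) - win + 1, hop)]
    spt = np.abs(np.fft.rfft(np.array(frames), axis=1))   # time x frequency
    r = spt.shape[0] // cell_rows * cell_rows             # trim to tile evenly
    c = spt.shape[1] // cell_cols * cell_cols
    tiles = spt[:r, :c].reshape(cell_rows, r // cell_rows,
                                cell_cols, c // cell_cols)
    return tiles.mean(axis=(1, 3)).ravel()                # one value per cell

emg = np.random.default_rng(1).standard_normal(2048)      # stand-in EMG trace
print(sasp_features(emg).shape)                           # (16,) feature vector
```

The resulting per-cell values are the kind of input a neural net would take when fitting the reference weight coefficients described in the abstract.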
Consensus for second-order multi-agent systems with position sampled data
NASA Astrophysics Data System (ADS)
Wang, Rusheng; Gao, Lixin; Chen, Wenhai; Dai, Dameng
2016-10-01
In this paper, the consensus problem with position sampled data for second-order multi-agent systems is investigated. The interaction topology among the agents is depicted by a directed graph. Full-order and reduced-order observers with position sampled data are proposed, by which two kinds of sampled-data-based consensus protocols are constructed. With the proposed sampled protocols, the consensus convergence analysis of a continuous-time multi-agent system is equivalently transformed into that of a discrete-time system. Then, by using matrix theory and a sampled control analysis method, some sufficient and necessary consensus conditions based on the coupling parameters, the spectrum of the Laplacian matrix and the sampling period are obtained. As the sampling period tends to zero, the established necessary and sufficient conditions degenerate to the continuous-time protocol case, consistent with existing results for the continuous-time case. Finally, the effectiveness of the established results is illustrated by a simple simulation example. Project supported by the Natural Science Foundation of Zhejiang Province, China (Grant No. LY13F030005) and the National Natural Science Foundation of China (Grant No. 61501331).
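The flavor of such a sampled-data protocol can be sketched in a few lines. The toy below uses an undirected line graph for simplicity and estimates the unmeasured velocities from consecutive position samples (a finite-difference stand-in for the paper's observers); the graph, gains, and sampling period are invented for illustration.

```python
import numpy as np

# Undirected line graph on 4 agents (Laplacian L); double-integrator agents.
L = np.array([[ 1, -1,  0,  0],
              [-1,  2, -1,  0],
              [ 0, -1,  2, -1],
              [ 0,  0, -1,  1]], float)

h, k1, k2 = 0.05, 1.0, 2.0             # sampling period and gains (invented)
x = np.array([3.0, -1.0, 0.5, 2.0])    # positions (the only sampled quantity)
v = np.zeros(4)                         # velocities (not measured)
x_prev = x.copy()

for _ in range(2000):
    # Only sampled positions are used: the difference of consecutive samples
    # stands in for the unmeasured relative velocities.
    u = -k1 * (L @ x) - k2 * (L @ (x - x_prev)) / h
    x_prev = x.copy()
    x, v = x + h * v + 0.5 * h**2 * u, v + h * u   # exact hold over one period
print(np.round(x, 3), np.round(v, 3))   # positions agree (~1.125), v -> 0
```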
MetPetDB: A database for metamorphic geochemistry
NASA Astrophysics Data System (ADS)
Spear, Frank S.; Hallett, Benjamin; Pyle, Joseph M.; Adalı, Sibel; Szymanski, Boleslaw K.; Waters, Anthony; Linder, Zak; Pearce, Shawn O.; Fyffe, Matthew; Goldfarb, Dennis; Glickenhouse, Nickolas; Buletti, Heather
2009-12-01
We present a data model for the initial implementation of MetPetDB, a geochemical database specific to metamorphic rock samples. The database is designed around the concept of preservation of spatial relationships, at all scales, of chemical analyses and their textural setting. Objects in the database (samples) represent physical rock samples; each sample may contain one or more subsamples with associated geochemical and image data. Samples, subsamples, geochemical data, and images are described with attributes (some required, some optional); these attributes also serve as search delimiters. All data in the database are classified as published (i.e., archived or published data), public or private. Public and published data may be freely searched and downloaded. All private data is owned; permission to view, edit, download and otherwise manipulate private data may be granted only by the data owner; all such editing operations are recorded by the database to create a data version log. The sharing of data permissions among a group of collaborators researching a common sample is done by the sample owner through the project manager. User interaction with MetPetDB is hosted by a web-based platform based upon the Java servlet application programming interface, with the PostgreSQL relational database. The database web portal includes modules that allow the user to interact with the database: registered users may save and download public and published data, upload private data, create projects, and assign permission levels to project collaborators. An Image Viewer module provides for spatial integration of image and geochemical data. A toolkit consisting of plotting and geochemical calculation software for data analysis and a mobile application for viewing the public and published data is being developed. Future issues to address include population of the database, integration with other geochemical databases, development of the analysis toolkit, creation of data models for derivative data, and building a community-wide user base. It is believed that this and other geochemical databases will enable more productive collaborations, generate more efficient research efforts, and foster new developments in basic research in the field of solid earth geochemistry.
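The sample/subsample/image/analysis hierarchy with preserved spatial context can be sketched as a small relational schema. The table and column names below are illustrative guesses expressed against SQLite for self-containment, not MetPetDB's actual PostgreSQL schema.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE sample (
    id INTEGER PRIMARY KEY, owner TEXT, visibility TEXT,  -- public/published/private
    rock_type TEXT, latitude REAL, longitude REAL);
CREATE TABLE subsample (
    id INTEGER PRIMARY KEY, sample_id INTEGER REFERENCES sample(id),
    kind TEXT);                                           -- e.g. thin section
CREATE TABLE image (
    id INTEGER PRIMARY KEY, subsample_id INTEGER REFERENCES subsample(id),
    path TEXT);
CREATE TABLE chemical_analysis (
    id INTEGER PRIMARY KEY, subsample_id INTEGER REFERENCES subsample(id),
    image_id INTEGER REFERENCES image(id),
    mineral TEXT, oxide TEXT, wt_percent REAL,
    spot_x REAL, spot_y REAL);                            -- textural setting
""")
con.execute("INSERT INTO sample VALUES (1,'demo','public','pelite',44.3,-71.8)")
con.execute("INSERT INTO subsample VALUES (1,1,'thin section')")
con.execute("INSERT INTO image VALUES (1,1,'ts1_bse.png')")
con.execute("INSERT INTO chemical_analysis VALUES (1,1,1,'garnet','MgO',3.1,120.5,88.0)")
for row in con.execute("""SELECT s.rock_type, a.mineral, a.oxide, a.wt_percent,
                                 a.spot_x, a.spot_y
                          FROM chemical_analysis a
                          JOIN subsample ss ON a.subsample_id = ss.id
                          JOIN sample s ON ss.sample_id = s.id"""):
    print(row)
```

Keeping the analysis spot coordinates alongside the image reference is what lets each measurement be placed back into its textural setting, which is the central design goal stated above.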
Selected Geochemical Data for Modeling Near-Surface Processes in Mineral Systems
Giles, Stuart A.; Granitto, Matthew; Eppinger, Robert G.
2009-01-01
The database herein was initiated, designed, and populated to collect and integrate geochemical, geologic, and mineral deposit data in an organized manner to facilitate geoenvironmental mineral deposit modeling. The Microsoft Access database contains data on a variety of mineral deposit types that have variable environmental effects when exposed at the ground surface by mining or natural processes. The data tables describe quantitative and qualitative geochemical analyses determined by 134 analytical laboratory and field methods for over 11,000 heavy-mineral concentrate, rock, sediment, soil, vegetation, and water samples. The database also provides geographic information on geology, climate, ecoregion, and site contamination levels for over 3,000 field sites in North America.
Validity of a computerized population registry of dementia based on clinical databases.
Mar, J; Arrospide, A; Soto-Gordoa, M; Machón, M; Iruin, Á; Martinez-Lage, P; Gabilondo, A; Moreno-Izco, F; Gabilondo, A; Arriola, L
2018-05-08
The handling of information through digital media allows innovative approaches for identifying cases of dementia through computerized searches of clinical databases that include systems for coding diagnoses. The aim of this study was to analyze the validity of a dementia registry in Gipuzkoa based on the administrative and clinical databases existing in the Basque Health Service. This is a descriptive study based on the evaluation of available data sources. First, through review of medical records, diagnostic validity was evaluated in two samples of cases identified and not identified as dementia, measuring the sensitivity, specificity and positive and negative predictive values of the diagnosis of dementia. Subsequently, the cases of dementia alive on December 31, 2016 were identified in the entire Gipuzkoa population in order to collect sociodemographic and clinical variables. The validation samples included 986 cases and 327 non-cases. The calculated sensitivity was 80.2% and the specificity was 99.9%; the negative predictive value was 99.4% and the positive predictive value was 95.1%. There were 10,551 cases in Gipuzkoa, representing 65% of the cases predicted according to the literature. Antipsychotic medication was taken by 40% of the cases, and 25% of the cases were institutionalized. A registry of dementias based on clinical and administrative databases is valid and feasible; its main contribution is to show the dimension of dementia in the health system. Copyright © 2018 Sociedad Española de Neurología. Publicado por Elsevier España, S.L.U. All rights reserved.
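The four validity measures reported above come straight from a 2x2 confusion matrix. A minimal sketch follows; the cell counts are invented to roughly reproduce the reported rates, since the abstract gives only the rates themselves.

```python
def diagnostic_validity(tp, fn, fp, tn):
    """Sensitivity, specificity, PPV and NPV from 2x2 confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true cases the registry catches
        "specificity": tn / (tn + fp),   # non-cases correctly excluded
        "ppv": tp / (tp + fp),           # flagged cases that are real
        "npv": tn / (tn + fn),           # unflagged records truly negative
    }

# Invented counts for illustration only (approximate the reported rates)
print(diagnostic_validity(tp=791, fn=195, fp=41, tn=32659))
```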
Database Resources of the BIG Data Center in 2018
Xu, Xingjian; Hao, Lili; Zhu, Junwei; Tang, Bixia; Zhou, Qing; Song, Fuhai; Chen, Tingting; Zhang, Sisi; Dong, Lili; Lan, Li; Wang, Yanqing; Sang, Jian; Hao, Lili; Liang, Fang; Cao, Jiabao; Liu, Fang; Liu, Lin; Wang, Fan; Ma, Yingke; Xu, Xingjian; Zhang, Lijuan; Chen, Meili; Tian, Dongmei; Li, Cuiping; Dong, Lili; Du, Zhenglin; Yuan, Na; Zeng, Jingyao; Zhang, Zhewen; Wang, Jinyue; Shi, Shuo; Zhang, Yadong; Pan, Mengyu; Tang, Bixia; Zou, Dong; Song, Shuhui; Sang, Jian; Xia, Lin; Wang, Zhennan; Li, Man; Cao, Jiabao; Niu, Guangyi; Zhang, Yang; Sheng, Xin; Lu, Mingming; Wang, Qi; Xiao, Jingfa; Zou, Dong; Wang, Fan; Hao, Lili; Liang, Fang; Li, Mengwei; Sun, Shixiang; Zou, Dong; Li, Rujiao; Yu, Chunlei; Wang, Guangyu; Sang, Jian; Liu, Lin; Li, Mengwei; Li, Man; Niu, Guangyi; Cao, Jiabao; Sun, Shixiang; Xia, Lin; Yin, Hongyan; Zou, Dong; Xu, Xingjian; Ma, Lina; Chen, Huanxin; Sun, Yubin; Yu, Lei; Zhai, Shuang; Sun, Mingyuan; Zhang, Zhang; Zhao, Wenming; Xiao, Jingfa; Bao, Yiming; Song, Shuhui; Hao, Lili; Li, Rujiao; Ma, Lina; Sang, Jian; Wang, Yanqing; Tang, Bixia; Zou, Dong; Wang, Fan
2018-01-01
Abstract The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides freely open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. PMID:29036542
Analysis of aggregates and binders used for the ODOT chip seal program.
DOT National Transportation Integrated Search
2010-11-30
This project compared the results of laboratory characterization of chip seal aggregate samples for Oklahoma DOT Divisions 1, 2, 3, 5, and 6 with performance data from the Pavement Management System (PMS) database. Binder evaluation was limited to identi...
University Learning Systems for Participative Courses.
ERIC Educational Resources Information Center
Billingham, Carol J.; Harper, William W.
1980-01-01
Describes the instructional development of a course for advanced finance students on the use of data files and/or databases for solving complex finance problems. Areas covered include course goals and design. The course class schedule and sample learning assessment assignments are provided. (JD)
Establishment and maintenance of a standardized glioma tissue bank: Huashan experience.
Aibaidula, Abudumijiti; Lu, Jun-feng; Wu, Jin-song; Zou, He-jian; Chen, Hong; Wang, Yu-qian; Qin, Zhi-yong; Yao, Yu; Gong, Ye; Che, Xiao-ming; Zhong, Ping; Li, Shi-qi; Bao, Wei-min; Mao, Ying; Zhou, Liang-fu
2015-06-01
Cerebral glioma is the most common brain tumor as well as one of the top ten malignant tumors in human beings. In spite of great progress in chemotherapy, radiotherapy, and surgical strategies during the past decades, mortality and morbidity remain high. One of the major challenges is to explore the pathogenesis and invasion of glioma at various "omics" levels (such as proteomics or genomics) and the clinical implications of biomarkers for the diagnosis, prognosis, or treatment of glioma patients. Establishment of a standardized tissue bank with high-quality biospecimens annotated with clinical information is pivotal to answering these questions, as well as to the drug development process and translational research on glioma. Therefore, based on previous experience of tissue banks, standardized protocols for sample collection and storage were developed. We also developed two systems for glioma patient and sample management: a local database for medical records and a local image database for medical images. For future set-up of a regional biobank network in Shanghai, we also founded a centralized database for medical records. Hence we established a standardized glioma tissue bank with sufficient clinical data and medical images in Huashan Hospital. By September 2013, tissue samples from 1,326 cases had been collected. Histological diagnosis revealed that 73% were astrocytic tumors, 17% were oligodendroglial tumors, 2% were oligoastrocytic tumors, 4% were ependymal tumors and 4% were other central nervous system neoplasms.
Generation of large scale urban environments to support advanced sensor and seeker simulation
NASA Astrophysics Data System (ADS)
Giuliani, Joseph; Hershey, Daniel; McKeown, David, Jr.; Willis, Carla; Van, Tan
2009-05-01
One of the key aspects for the design of a next generation weapon system is the need to operate in cluttered and complex urban environments. Simulation systems rely on accurate representation of these environments and require automated software tools to construct the underlying 3D geometry and associated spectral and material properties, which are then formatted for various objective seeker simulation systems. Under an Air Force Small Business Innovative Research (SBIR) contract, we have developed an automated process to generate 3D urban environments with user defined properties. These environments can be composed from a wide variety of source materials, including vector source data, pre-existing 3D models, and digital elevation models, and rapidly organized into a geo-specific visual simulation database. This intermediate representation can be easily inspected in the visible spectrum for content and organization and interactively queried for accuracy. Once the database contains the required contents, it can be exported into specific synthetic scene generation runtime formats, preserving the relationship between geometry and material properties. To date, an exporter for the Irma simulation system developed and maintained by AFRL/Eglin has been created, and a second exporter to the Real Time Composite Hardbody and Missile Plume (CHAMP) simulation system for real-time use is currently being developed. This process supports significantly more complex target environments than previous approaches to database generation. In this paper we describe the capabilities for content creation for advanced seeker processing algorithm simulation and sensor stimulation, including the overall database compilation process and sample databases produced and exported for the Irma runtime system. We also discuss the addition of object dynamics and viewer dynamics within the visual simulation into the Irma runtime environment.
Integral nuclear data validation using experimental spent nuclear fuel compositions
Gauld, Ian C.; Williams, Mark L.; Michel-Sendis, Franco; ...
2017-07-19
Measurements of the isotopic contents of spent nuclear fuel provide experimental data that are a prerequisite for validating computer codes and nuclear data for many spent fuel applications. Under the auspices of the Organisation for Economic Co-operation and Development (OECD) Nuclear Energy Agency (NEA) and guidance of the Expert Group on Assay Data of Spent Nuclear Fuel of the NEA Working Party on Nuclear Criticality Safety, a new database of expanded spent fuel isotopic compositions has been compiled. The database, Spent Fuel Compositions (SFCOMPO) 2.0, includes measured data for more than 750 fuel samples acquired from 44 different reactors and representing eight different reactor technologies. Measurements for more than 90 isotopes are included. This new database provides data essential for establishing the reliability of code systems for inventory predictions, but it also has broader potential application to nuclear data evaluation. The database is described together with adjoint-based sensitivity and uncertainty tools for transmutation systems, developed to quantify the importance of nuclear data on nuclide concentrations.
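A compositions database of this shape lends itself to a simple relational layout: one table of fuel samples, one of per-isotope measurements. The sketch below is illustrative only; the table and column names are assumptions, not SFCOMPO's actual schema, and the values are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sample  (id INTEGER PRIMARY KEY, reactor TEXT, technology TEXT, burnup_gwd_t REAL);
CREATE TABLE measure (sample_id INTEGER REFERENCES sample(id), isotope TEXT, conc_mg_per_g REAL);
""")
conn.execute("INSERT INTO sample VALUES (1, 'Takahama-3', 'PWR', 47.3)")
conn.executemany("INSERT INTO measure VALUES (?, ?, ?)",
                 [(1, 'U-235', 8.47), (1, 'Pu-239', 5.72)])  # illustrative values only

# All measured isotopes for PWR samples above a burnup cutoff
rows = conn.execute("""
    SELECT s.reactor, m.isotope, m.conc_mg_per_g
    FROM sample s JOIN measure m ON m.sample_id = s.id
    WHERE s.technology = 'PWR' AND s.burnup_gwd_t > 40
""").fetchall()
print(rows)
```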
Bailey, Sarah F; Scheible, Melissa K; Williams, Christopher; Silva, Deborah S B S; Hoggan, Marina; Eichman, Christopher; Faith, Seth A
2017-11-01
Next-generation Sequencing (NGS) is a rapidly evolving technology with demonstrated benefits for forensic genetic applications, and the strategies to analyze and manage the massive NGS datasets are currently in development. Here, the computing, data storage, connectivity, and security resources of the Cloud were evaluated as a model for forensic laboratory systems that produce NGS data. A complete front-to-end Cloud system was developed to upload, process, and interpret raw NGS data using a web browser dashboard. The system was extensible, demonstrating analysis capabilities of autosomal and Y-STRs from a variety of NGS instrumentation (Illumina MiniSeq and MiSeq, and Oxford Nanopore MinION). NGS data for STRs were concordant with standard reference materials previously characterized with capillary electrophoresis and Sanger sequencing. The computing power of the Cloud was implemented with on-demand auto-scaling to allow multiple file analysis in tandem. The system was designed to store resulting data in a relational database, amenable to downstream sample interpretations and databasing applications following the most recent guidelines in nomenclature for sequenced alleles. Lastly, a multi-layered Cloud security architecture was tested and showed that industry standards for securing data and computing resources were readily applied to the NGS system without disadvantageous effects for bioinformatic analysis, connectivity or data storage/retrieval. The results of this study demonstrate the feasibility of using Cloud-based systems for secured NGS data analysis, storage, databasing, and multi-user distributed connectivity. Copyright © 2017 Elsevier B.V. All rights reserved.
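The concordance testing described here reduces to comparing allele calls per locus between the NGS pipeline and the capillary-electrophoresis reference. A minimal sketch; the locus names are real CODIS loci, but the genotypes shown are invented for illustration:

```python
# Compare NGS-derived STR genotypes against CE reference calls, locus by locus.
def concordance(ngs: dict, reference: dict) -> float:
    """Fraction of reference loci whose allele pair matches the NGS call."""
    matched = sum(
        1 for locus, alleles in reference.items()
        if sorted(ngs.get(locus, ())) == sorted(alleles)
    )
    return matched / len(reference)

reference = {"D8S1179": (13, 14), "TH01": (6, 9.3), "FGA": (21, 24)}
ngs_calls = {"D8S1179": (14, 13), "TH01": (6, 9.3), "FGA": (21, 22)}
print(f"concordance: {concordance(ngs_calls, reference):.2%}")  # 66.67% here
```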
Database resources of the National Center for Biotechnology Information
2015-01-01
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (Bookshelf, PubMed Central (PMC) and PubReader); medical genetics (ClinVar, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen); genes and genomics (BioProject, BioSample, dbSNP, dbVar, Epigenomics, Gene, Gene Expression Omnibus (GEO), Genome, HomoloGene, the Map Viewer, Nucleotide, PopSet, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser, Trace Archive and UniGene); and proteins and chemicals (Biosystems, COBALT, the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB), Protein Clusters, Protein and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for many of these databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. PMID:25398906
Database resources of the National Center for Biotechnology Information
2016-01-01
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank® nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (PubMed Central (PMC), Bookshelf and PubReader), health (ClinVar, dbGaP, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen), genomes (BioProject, Assembly, Genome, BioSample, dbSNP, dbVar, Epigenomics, the Map Viewer, Nucleotide, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser and the Trace Archive), genes (Gene, Gene Expression Omnibus (GEO), HomoloGene, PopSet and UniGene), proteins (Protein, the Conserved Domain Database (CDD), COBALT, Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB) and Protein Clusters) and chemicals (Biosystems and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for most of these databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. PMID:26615191
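Programmatic search against these databases goes through the Entrez E-utilities. A minimal sketch using the public esearch endpoint; the query term is arbitrary, and error handling and an API key (which NCBI recommends for higher rate limits) are omitted:

```python
import json
import urllib.parse
import urllib.request

BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
params = urllib.parse.urlencode({
    "db": "pubmed",
    "term": "systems biology database integration",
    "retmax": 5,
    "retmode": "json",
})

with urllib.request.urlopen(f"{BASE}?{params}") as resp:
    result = json.load(resp)["esearchresult"]

print(result["count"], result["idlist"])  # total hits and the first five PMIDs
```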
Abugessaisa, Imad; Gomez-Cabrero, David; Snir, Omri; Lindblad, Staffan; Klareskog, Lars; Malmström, Vivianne; Tegnér, Jesper
2013-04-02
Sequencing of the human genome and the subsequent analyses have produced immense volumes of data. The technological advances have opened new windows into genomics beyond the DNA sequence. In parallel, clinical practice generates large amounts of data. This represents an underused data source that has much greater potential in translational research than is currently realized. This research aims at implementing a translational medicine informatics platform to integrate clinical data (disease diagnosis, disease activity and treatment) of Rheumatoid Arthritis (RA) patients from Karolinska University Hospital and their research database (biobanks, genotype variants and serology) at the Center for Molecular Medicine, Karolinska Institutet. Requirements engineering methods were utilized to identify user requirements. Unified Modeling Language and data modeling methods were used to model the universe of discourse and data sources. Oracle 11g was used as the database management system, and the clinical development center (CDC) was used as the application interface. Patient data were anonymized, and we employed authorization and security methods to protect the system. We developed a user requirement matrix, which provided a framework for evaluating three translational informatics systems. The implementation of the CDC successfully integrated the biological research database (15,172 DNA, serum and synovial samples, 1,436 cell samples and 65 SNPs per patient) and the clinical database (5,652 clinical visits) for a cohort of 379 patients presenting three profiles. Basic functionalities provided by the translational medicine platform are research data management, development of bioinformatics workflow and analysis, sub-cohort selection, and re-use of clinical data in research settings. Finally, the system allowed researchers to extract subsets of attributes from cohorts according to specific biological, clinical, or statistical features. Research and clinical database integration is a real challenge and a road-block in translational research. Through this research we addressed the challenges and demonstrated the usefulness of the CDC. We adhered to ethical regulations pertaining to patient data, and we determined that the existing software solutions cannot meet the translational research needs at hand. We used RA as a test case since we have ample data on an active, longitudinal cohort.
NASA Astrophysics Data System (ADS)
Hsu, L.; Lehnert, K. A.; Carbotte, S. M.; Arko, R. A.; Ferrini, V.; O'hara, S. H.; Walker, J. D.
2012-12-01
The Integrated Earth Data Applications (IEDA) facility maintains multiple data systems with a wide range of solid earth data types from the marine, terrestrial, and polar environments. Examples of the different data types include syntheses of ultra-high resolution seafloor bathymetry collected on large collaborative cruises and analytical geochemistry measurements collected by single investigators in small, unique projects. These different data types have historically been channeled into separate, discipline-specific databases with search and retrieval tailored for the specific data type. However, a current major goal is to integrate data from different systems to allow interdisciplinary data discovery and scientific analysis. To increase discovery and access across these heterogeneous systems, IEDA employs several unique IDs, including sample IDs (International Geo Sample Number, IGSN), person IDs (GeoPass ID), funding award IDs (NSF Award Number), cruise IDs (from the Marine Geoscience Data System Expedition Metadata Catalog), dataset IDs (DOIs), and publication IDs (DOIs). These IDs allow linking of a sample registry (System for Earth SAmple Registration), data libraries and repositories (e.g. Geochemical Research Library, Marine Geoscience Data System), integrated synthesis databases (e.g. EarthChem Portal, PetDB), and investigator services (IEDA Data Compliance Tool). The linked systems allow efficient discovery of related data across different levels of granularity. In addition, IEDA data systems maintain links with several external data systems, including digital journal publishers. Links have been established between the EarthChem Portal and ScienceDirect through publication DOIs, returning sample-level objects and geochemical analyses for a particular publication. Linking IEDA-hosted data to digital publications with IGSNs at the sample level and with IEDA-allocated dataset DOIs are under development. As an example, an individual investigator could sign up for a GeoPass account ID, write a proposal to NSF and create a data plan using the IEDA Data Management Plan Tool. Having received the grant, the investigator then collects rock samples on a scientific cruise from dredges and registers the samples with IGSNs. The investigator then performs analytical geochemistry on the samples, and submits the full dataset to the Geochemical Resource Library for a dataset DOI. Finally, the investigator writes an article that is published in Science Direct. Knowing any of the following IDs: Investigator GeoPass ID, NSF Award Number, Cruise ID, Sample IGSNs, dataset DOI, or publication DOI, a user would be able to navigate to all samples, datasets, and publications in IEDA and external systems. Use of persistent identifiers to link heterogeneous data systems in IEDA thus increases access, discovery, and proper citation of hard-earned investigator datasets.
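The cross-system discovery described above is essentially a traversal over a graph whose nodes are persistent identifiers (GeoPass ID, award number, cruise ID, IGSN, DOI) and whose edges are the registered links. A toy sketch; the identifiers and link records below are invented for illustration:

```python
from collections import defaultdict, deque

# Undirected links between persistent identifiers, as (id_a, id_b) pairs
links = [
    ("geopass:jdoe", "award:NSF-1234567"),
    ("award:NSF-1234567", "cruise:EX2024-07"),
    ("cruise:EX2024-07", "igsn:IEXYZ0001"),
    ("igsn:IEXYZ0001", "dataset:doi:10.0000/demo-data"),
    ("dataset:doi:10.0000/demo-data", "paper:doi:10.0000/demo-pub"),
]

graph = defaultdict(set)
for a, b in links:
    graph[a].add(b)
    graph[b].add(a)

def related(start: str) -> set:
    """Breadth-first walk: every record reachable from one known identifier."""
    seen, queue = {start}, deque([start])
    while queue:
        for nxt in graph[queue.popleft()] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return seen - {start}

# From a single sample IGSN, recover the person, award, cruise, dataset, paper
print(sorted(related("igsn:IEXYZ0001")))
```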
DNA Profiling of Convicted Offender Samples for the Combined DNA Index System
ERIC Educational Resources Information Center
Millard, Julie T
2011-01-01
The cornerstone of forensic chemistry is that a perpetrator inevitably leaves trace evidence at a crime scene. One important type of evidence is DNA, which has been instrumental in both the implication and exoneration of thousands of suspects in a wide range of crimes. The Combined DNA Index System (CODIS), a network of DNA databases, provides…
Moreno, Lilliana I; Brown, Alice L; Callaghan, Thomas F
2017-07-01
Rapid DNA platforms are fully integrated systems capable of producing and analyzing short tandem repeat (STR) profiles from reference sample buccal swabs in less than two hours. The technology requires minimal user interaction and experience, making it possible for high quality profiles to be generated outside an accredited laboratory. The automated production of point-of-collection reference STR profiles could eliminate the time delay for shipment and analysis of arrestee samples at centralized laboratories. Furthermore, point-of-collection analysis would allow searching against profiles from unsolved crimes during the normal booking process once the infrastructure to immediately search the Combined DNA Index System (CODIS) database from the booking station is established. The DNAscan/ANDE™ Rapid DNA Analysis™ System developed by Network Biosystems was evaluated for robustness and reliability in the production of high quality reference STR profiles for database enrollment and searching applications. A total of 193 reference samples were assessed for concordance at the 13 CODIS core loci. Studies to evaluate contamination, reproducibility, precision, stutter, peak height ratio, noise and sensitivity were also performed. The system proved to be robust, consistent and dependable. Results indicated an overall success rate of 75% for the 13 CODIS core loci and, more importantly, no incorrect calls were identified. The DNAscan/ANDE™ could be confidently used without human interaction in both laboratory and non-laboratory settings to generate reference profiles. Published by Elsevier B.V.
Importance of Data Management in a Long-term Biological Monitoring Program
DOE Office of Scientific and Technical Information (OSTI.GOV)
Christensen, Sigurd W; Brandt, Craig C; McCracken, Kitty
2011-01-01
The long-term Biological Monitoring and Abatement Program (BMAP) has always needed to collect and retain high-quality data on which to base its assessments of ecological status of streams and their recovery after remediation. Its formal quality assurance, data processing, and data management components all contribute to this need. The Quality Assurance Program comprehensively addresses requirements from various institutions, funders, and regulators, and includes a data management component. Centralized data management began a few years into the program. An existing relational database was adapted and extended to handle biological data. Data modeling enabled the program's database to process, store, and retrieve its data. The database's main data tables and several key reference tables are described. One of the most important related activities supporting long-term analyses was the establishment of standards for sampling site names, taxonomic identification, flagging, and other components. There are limitations: some types of program data were not easily accommodated in the central systems, and many possible data-sharing and integration options are not easily accessible to investigators. The implemented relational database supports the transmittal of data to the Oak Ridge Environmental Information System (OREIS) as the permanent repository. From our experience we offer data management advice to other biologically oriented long-term environmental sampling and analysis programs.
Importance of Data Management in a Long-Term Biological Monitoring Program
NASA Astrophysics Data System (ADS)
Christensen, Sigurd W.; Brandt, Craig C.; McCracken, Mary K.
2011-06-01
The long-term Biological Monitoring and Abatement Program (BMAP) has always needed to collect and retain high-quality data on which to base its assessments of ecological status of streams and their recovery after remediation. Its formal quality assurance, data processing, and data management components all contribute to meeting this need. The Quality Assurance Program comprehensively addresses requirements from various institutions, funders, and regulators, and includes a data management component. Centralized data management began a few years into the program when an existing relational database was adapted and extended to handle biological data. The database's main data tables and several key reference tables are described. One of the most important related activities supporting long-term analyses was the establishment of standards for sampling site names, taxonomic identification, flagging, and other components. The implemented relational database supports the transmittal of data to the Oak Ridge Environmental Information System (OREIS) as the permanent repository. We also discuss some limitations to our implementation. Some types of program data were not easily accommodated in the central systems, and many possible data-sharing and integration options are not easily accessible to investigators. From our experience we offer data management advice to other biologically oriented long-term environmental sampling and analysis programs.
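The pattern described in both versions of this abstract, main data tables keyed to controlled reference tables for sites and taxa, is straightforward to express relationally. A minimal sketch; the table and column names (and the sample row) are assumptions for illustration, not BMAP's actual schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Reference tables hold the standardized vocabularies the program relies on
CREATE TABLE site  (site_code TEXT PRIMARY KEY, stream TEXT, river_km REAL);
CREATE TABLE taxon (taxon_code TEXT PRIMARY KEY, scientific_name TEXT);

-- Main data table: one row per specimen count in a sampling event
CREATE TABLE sample_event (
    event_id   INTEGER PRIMARY KEY,
    site_code  TEXT NOT NULL REFERENCES site(site_code),
    taxon_code TEXT NOT NULL REFERENCES taxon(taxon_code),
    sampled_on TEXT NOT NULL,   -- ISO date
    count      INTEGER,
    qa_flag    TEXT             -- standardized flagging code
);
""")
conn.execute("INSERT INTO site VALUES ('EFK24', 'East Fork', 24.1)")
conn.execute("INSERT INTO taxon VALUES ('LEPMAC', 'Lepomis macrochirus')")
conn.execute("INSERT INTO sample_event VALUES (1, 'EFK24', 'LEPMAC', '1999-05-12', 37, NULL)")
# A misspelled site or taxon code is now rejected by the foreign-key constraints,
# which is exactly what the standardized naming effort is meant to guarantee.
```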
The impact of database quality on keystroke dynamics authentication
NASA Astrophysics Data System (ADS)
Panasiuk, Piotr; Rybnik, Mariusz; Saeed, Khalid; Rogowski, Marcin
2016-06-01
This paper concerns keystroke dynamics, also partially in the context of touchscreen devices. The authors concentrate on the impact of database quality and propose their algorithm to test database quality issues. The algorithm is used on their own database.
A new database sub-system for grain-size analysis
NASA Astrophysics Data System (ADS)
Suckow, Axel
2013-04-01
Detailed grain-size analyses of large depth profiles for palaeoclimate studies create large amounts of data. For instance, Novothny et al. (2011) presented a depth profile of grain-size analyses with 2 cm resolution and a total depth of more than 15 m, where each sample was measured with 5 repetitions on a Beckman Coulter LS13320 with 116 channels. This adds up to a total of more than four million numbers. Such amounts of data are not easily post-processed by spreadsheets or standard software; MS Access databases would also face serious performance problems. The poster describes a database sub-system dedicated to grain-size analyses. It expands the LabData database and laboratory management system published by Suckow and Dumke (2001). Compatibility with this very flexible database system makes it easy to import the grain-size data and provides the overall infrastructure for storing geographic context and organizing content, such as grouping several samples into one set or project. It also allows easy export and direct plot generation of final data in MS Excel. The sub-system allows automated import of raw data from the Beckman Coulter LS13320 Laser Diffraction Particle Size Analyzer. During post-processing, MS Excel is used as a data display, but no number crunching is implemented in Excel. Raw grain-size spectra can be exported and checked as number, surface and volume fractions, while single spectra can be locked for further post-processing. From the spectra, the usual statistical values (e.g., mean, median) can be computed, as well as fractions larger than a grain size, smaller than a grain size, fractions between any two grain sizes, or any ratio of such values. These derived values can be easily exported into Excel for one or more depth profiles. However, such reprocessing of large amounts of data also opens new display possibilities: normally, depth profiles of grain-size data are displayed only with summarized parameters such as clay content or sand content, which shows only part of the available information at each depth, or alternatively as full spectra at single depths. The new software now allows the whole grain-size spectrum to be displayed at each depth in a three-dimensional plot. LabData and the grain-size sub-system are based on MS Access as the front-end and MS SQL Server as the back-end database system. The SQL code for the data model, SQL Server procedures and triggers, and the MS Access Basic code for the front end are public domain code, published under the GNU GPL license agreement, and are available free of charge. References: Novothny, Á., Frechen, M., Horváth, E., Wacha, L., Rolf, C., 2011. Investigating the penultimate and last glacial cycles of the Süttő loess section (Hungary) using luminescence dating, high-resolution grain size, and magnetic susceptibility data. Quaternary International 234, 75-85. Suckow, A., Dumke, I., 2001. A database system for geochemical, isotope hydrological and geochronological laboratories. Radiocarbon 43, 325-337.
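The derived quantities mentioned above (mean, median, and fractions between grain-size bounds) are all simple reductions over the measured volume-fraction spectrum. A minimal sketch, assuming a spectrum given as channel midpoints in µm with volume percentages (values invented):

```python
import numpy as np

# Illustrative spectrum: channel midpoints (µm) and volume fractions summing to 100%
size_um = np.array([2.0, 6.3, 20.0, 63.0, 200.0])
vol_pct = np.array([12.0, 18.0, 30.0, 28.0, 12.0])

def fraction_between(lo_um: float, hi_um: float) -> float:
    """Volume percent between two grain-size bounds (channels treated as points)."""
    mask = (size_um >= lo_um) & (size_um < hi_um)
    return vol_pct[mask].sum()

# Geometric (log-scale) mean, the usual convention for grain-size statistics
log_mean = np.exp(np.average(np.log(size_um), weights=vol_pct))

# Median: size at which the cumulative volume curve crosses 50%
median = np.interp(50.0, np.cumsum(vol_pct), size_um)

print(f"fraction below ~2 µm: {fraction_between(0.0, 2.1):.1f}%")
print(f"geometric mean: {log_mean:.1f} µm, median: {median:.1f} µm")
```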
ERIC Educational Resources Information Center
Tenopir, Carol; Barry, Jeff
1997-01-01
Profiles 25 database distribution and production companies, all of which responded to a 1997 survey with information on 54 separate online, Web-based, or CD-ROM systems. Highlights increased competition, distribution formats, Web versions versus local area networks, full-text delivery, and pricing policies. Tables present a sampling of customers…
Black, J A; Waggamon, K A
1992-01-01
An isoelectric focusing method using thin-layer agarose gel has been developed for wheat gliadin. Using flat-bed units with a third electrode, up to 72 samples per gel may be analyzed. Advantages over traditional acid polyacrylamide gel electrophoresis methodology include faster run times, nontoxic media, and greater sample capacity. The method is suitable for fingerprinting or purity testing of wheat varieties. Using digital images captured by a flat-bed scanner, a 4-band reference system using isoelectric points was devised. Software enables separated bands to be assigned pI values based upon reference tracks. Precision of assigned isoelectric points is shown to be on the order of 0.02 pH units. Captured images may be stored in a computer database and compared to unknown patterns to enable an identification. Parameters for a match with a stored pattern may be adjusted for the pI interval required for a match and the number of best matches.
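Matching an unknown lane against stored patterns, as described, amounts to counting band pI values that agree within a tolerance and ranking the candidates. A toy sketch; the band values and variety names are invented, and the tolerance echoes the reported ~0.02 pH-unit precision:

```python
def band_matches(unknown: list, stored: list, tol: float = 0.02) -> int:
    """Count bands in the unknown lane with a stored band within +/- tol pH units."""
    return sum(any(abs(u - s) <= tol for s in stored) for u in unknown)

database = {
    "variety_A": [5.12, 5.48, 6.03, 6.77],
    "variety_B": [5.12, 5.61, 6.03, 7.02],
}
unknown = [5.13, 5.47, 6.04, 6.78]

# Rank stored patterns by matched-band count; report the best candidates
ranked = sorted(database.items(),
                key=lambda kv: band_matches(unknown, kv[1]),
                reverse=True)
for name, bands in ranked:
    print(name, band_matches(unknown, bands), "of", len(unknown), "bands matched")
```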
Rattner, B.A.; Pearson, J.L.; Golden, N.H.; Erwin, R.M.; Ottinger, M.A.
1998-01-01
The Biomonitoring of Environmental Status and Trends (BEST) program of the Department of the Interior is focused on identifying and understanding the effects of contaminant stressors on biological resources under its stewardship. One BEST program activity involves evaluation of retrospective data to assess and predict the condition of biota in Atlantic coast estuaries. A 'Contaminant Exposure and Effects--Terrestrial Vertebrates' database (CEE-TV) has been compiled through computerized literature searches of Fish and Wildlife Reviews, BIOSIS, AGRICOLA, and TOXLINE, review of existing databases (e.g., US EPA Ecological Incident Information System, USGS Diagnostic and Epizootic Databases), and solicitation of unpublished reports from conservation agencies, private groups, and universities. Summary information has been entered into the CEE-TV database, including species, collection date (1965-present), site coordinates, sample matrix, contaminant concentrations, biomarker and bioindicator responses, and reference source, utilizing a 96-field dBase format. Currently, the CEE-TV database contains 3,500 georeferenced records representing >200 vertebrate species and >100,000 individuals residing in estuaries from Maine through Florida. This relational database can be directly queried, imported into the ARC/INFO geographic information system (GIS) to examine spatial tendencies, and used to identify 'hot spots', generate hypotheses, and focus ecotoxicological assessments. An overview of temporal, phylogenetic, and geographic contaminant exposure and effects information, trends, and data gaps will be presented for terrestrial vertebrates residing in estuaries in the northeast United States.
Bohl, Daniel D; Russo, Glenn S; Basques, Bryce A; Golinvaux, Nicholas S; Fu, Michael C; Long, William D; Grauer, Jonathan N
2014-12-03
There has been an increasing use of national databases to conduct orthopaedic research. Questions regarding the validity and consistency of these studies have not been fully addressed. The purpose of this study was to test for similarity in reported measures between two national databases commonly used for orthopaedic research. A retrospective cohort study of patients undergoing lumbar spinal fusion procedures during 2009 to 2011 was performed in two national databases: the Nationwide Inpatient Sample and the National Surgical Quality Improvement Program. Demographic characteristics, comorbidities, and inpatient adverse events were directly compared between databases. The total numbers of patients included were 144,098 from the Nationwide Inpatient Sample and 8434 from the National Surgical Quality Improvement Program. There were only small differences in demographic characteristics between the two databases. There were large differences between databases in the rates at which specific comorbidities were documented. Non-morbid obesity was documented at rates of 9.33% in the Nationwide Inpatient Sample and 36.93% in the National Surgical Quality Improvement Program (relative risk, 0.25; p < 0.05). Peripheral vascular disease was documented at rates of 2.35% in the Nationwide Inpatient Sample and 0.60% in the National Surgical Quality Improvement Program (relative risk, 3.89; p < 0.05). Similarly, there were large differences between databases in the rates at which specific inpatient adverse events were documented. Sepsis was documented at rates of 0.38% in the Nationwide Inpatient Sample and 0.81% in the National Surgical Quality Improvement Program (relative risk, 0.47; p < 0.05). Acute kidney injury was documented at rates of 1.79% in the Nationwide Inpatient Sample and 0.21% in the National Surgical Quality Improvement Program (relative risk, 8.54; p < 0.05). As database studies become more prevalent in orthopaedic surgery, authors, reviewers, and readers should view these studies with caution. This study shows that two commonly used databases can identify demographically similar patients undergoing a common orthopaedic procedure; however, the databases document markedly different rates of comorbidities and inpatient adverse events. The differences are likely the result of the very different mechanisms through which the databases collect their comorbidity and adverse event data. Findings highlight concerns regarding the validity of orthopaedic database research. Copyright © 2014 by The Journal of Bone and Joint Surgery, Incorporated.
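The relative risks quoted here are ratios of the documentation rates in the two databases, e.g. 9.33% / 36.93% ≈ 0.25 for non-morbid obesity. A quick check of the reported figures:

```python
# Documentation rates (percent) as reported in the abstract: (NIS, NSQIP)
rates = {
    "non-morbid obesity":          (9.33, 36.93),
    "peripheral vascular disease": (2.35, 0.60),
    "sepsis":                      (0.38, 0.81),
    "acute kidney injury":         (1.79, 0.21),
}

for condition, (nis, nsqip) in rates.items():
    print(f"{condition}: RR = {nis / nsqip:.2f}")
# 0.25 and 0.47 match the paper exactly; 3.92 and 8.52 differ slightly from the
# published 3.89 and 8.54, presumably because the paper used unrounded rates.
```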
Data-Based Predictive Control with Multirate Prediction Step
NASA Technical Reports Server (NTRS)
Barlow, Jonathan S.
2010-01-01
Data-based predictive control is an emerging control method that stems from Model Predictive Control (MPC). MPC computes current control action based on a prediction of the system output a number of time steps into the future and is generally derived from a known model of the system. Data-based predictive control has the advantage of deriving predictive models and controller gains from input-output data. Thus, a controller can be designed from the outputs of complex simulation code or a physical system where no explicit model exists. If the output data happens to be corrupted by periodic disturbances, the designed controller will also have the built-in ability to reject these disturbances without the need to know them. When data-based predictive control is implemented online, it becomes a version of adaptive control. One challenge of MPC is computational requirements increasing with prediction horizon length. This paper develops a closed-loop dynamic output feedback controller that minimizes a multi-step-ahead receding-horizon cost function with multirate prediction step. One result is a reduced influence of prediction horizon and the number of system outputs on the computational requirements of the controller. Another result is an emphasis on portions of the prediction window that are sampled more frequently. A third result is the ability to include more outputs in the feedback path than in the cost function.
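For context, a generic multi-step-ahead receding-horizon cost with prediction horizon p, control horizon c, and a multirate prediction step m (predictions evaluated only every m time steps) can be written as below. This is a standard MPC form, not necessarily the paper's exact formulation; sampling the horizon every m steps cuts the number of prediction terms from p·m to p, which is the computational saving the abstract describes.

```latex
J(k) = \sum_{i=1}^{p} \big\| \hat{y}(k + i m) - r(k + i m) \big\|_{Q}^{2}
     + \sum_{j=0}^{c-1} \big\| \Delta u(k + j) \big\|_{R}^{2}
```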
Patterns, biases and prospects in the distribution and diversity of Neotropical snakes.
Guedes, Thaís B; Sawaya, Ricardo J; Zizka, Alexander; Laffan, Shawn; Faurby, Søren; Pyron, R Alexander; Bérnils, Renato S; Jansen, Martin; Passos, Paulo; Prudente, Ana L C; Cisneros-Heredia, Diego F; Braz, Henrique B; Nogueira, Cristiano de C; Antonelli, Alexandre; Meiri, Shai
2018-01-01
We generated a novel database of Neotropical snakes (one of the world's richest herpetofaunas) combining the most comprehensive, manually compiled distribution dataset with publicly available data. We assess, for the first time, the diversity patterns for all Neotropical snakes as well as sampling density and sampling biases. We compiled three databases of species occurrences: a dataset downloaded from the Global Biodiversity Information Facility (GBIF), a verified dataset built through taxonomic work and specialized literature, and a combined dataset comprising a cleaned version of the GBIF dataset merged with the verified dataset. Location: Neotropics, Behrmann projection equivalent to 1° × 1°. Time period: specimens housed in museums during the last 150 years. Major taxa studied: Squamata: Serpentes. Methods: geographical information system (GIS). The combined dataset provides the most comprehensive distribution database for Neotropical snakes to date. It contains 147,515 records for 886 species across 12 families, representing 74% of all species of snakes, spanning 27 countries in the Americas. Species richness and phylogenetic diversity show overall similar patterns. Amazonia is the least sampled Neotropical region, whereas most well-sampled sites are located near large universities and scientific collections. We provide a list and updated maps of the geographical distribution of all snake species surveyed. The biodiversity metrics of Neotropical snakes reflect patterns previously documented for other vertebrates, suggesting that similar factors may determine the diversity of both ectothermic and endothermic animals. We suggest conservation strategies for high-diversity areas and that sampling efforts be directed towards Amazonia and poorly known species.
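The combined dataset described above is essentially a cleaned GBIF download merged with the expert-verified records, deduplicated on species and locality. A minimal pandas sketch; the column names and records are assumptions based on typical occurrence data, not the authors' actual pipeline:

```python
import pandas as pd

gbif = pd.DataFrame({
    "species": ["Bothrops atrox", "Bothrops atrox", "Corallus hortulana"],
    "lat": [-3.10, -3.10, -14.80], "lon": [-60.02, -60.02, -39.05],
})
verified = pd.DataFrame({
    "species": ["Bothrops atrox", "Micrurus lemniscatus"],
    "lat": [-9.97, -3.13], "lon": [-67.81, -60.01],
})

# Basic cleaning: drop records without coordinates or outside plausible bounds
gbif = gbif.dropna(subset=["lat", "lon"])
gbif = gbif[gbif["lat"].between(-90, 90) & gbif["lon"].between(-180, 180)]

# Merge, then deduplicate on species + rounded coordinates (~1 km at the equator)
combined = pd.concat([verified, gbif], ignore_index=True)
combined["key"] = list(zip(combined["species"],
                           combined["lat"].round(2), combined["lon"].round(2)))
combined = combined.drop_duplicates(subset="key").drop(columns="key")
print(len(combined), "occurrence records after cleaning and deduplication")
```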
Shackleton, David; Pagram, Jenny; Ives, Lesley; Vanhinsbergh, Des
2018-06-02
The RapidHIT™ 200 System is a fully automated sample-to-DNA-profile system designed to produce high quality DNA profiles within 2 h. The use of the RapidHIT™ 200 System within the United Kingdom Criminal Justice System (UKCJS) has required extensive development and validation of methods, with a focus on the AmpFℓSTR® NGMSElect™ Express PCR kit, to comply with specific regulations for loading to the UK National DNA Database (NDNAD). These studies have been carried out using single source reference samples to simulate live reference samples taken from arrestees and victims for elimination. The studies have shown that the system is capable of generating high quality profiles and has achieved the accreditations necessary to load to the NDNAD; a first for the UK. Copyright © 2018 Elsevier B.V. All rights reserved.
O'Reilly, Christian; Gosselin, Nadia; Carrier, Julie; Nielsen, Tore
2014-12-01
Manual processing of sleep recordings is extremely time-consuming. Efforts to automate this process have shown promising results, but automatic systems are generally evaluated on private databases, not allowing accurate cross-validation with other systems. Lacking a common benchmark, the relative performances of different systems are not easily compared and advances are compromised. To address this fundamental methodological impediment to sleep study, we propose an open-access database of polysomnographic biosignals. To build this database, whole-night recordings from 200 participants [97 males (aged 42.9 ± 19.8 years) and 103 females (aged 38.3 ± 18.9 years); age range: 18-76 years] were pooled from eight different research protocols performed in three different hospital-based sleep laboratories. All recordings feature a sampling frequency of 256 Hz and an electroencephalography (EEG) montage of 4-20 channels plus standard electro-oculography (EOG), electromyography (EMG), electrocardiography (ECG) and respiratory signals. Access to the database can be obtained through the Montreal Archive of Sleep Studies (MASS) website (http://www.ceams-carsm.ca/en/MASS), and requires only affiliation with a research institution and prior approval by the applicant's local ethical review board. Providing the research community with access to this free and open sleep database is expected to facilitate the development and cross-validation of sleep analysis automation systems. It is also expected that such a shared resource will be a catalyst for cross-centre collaborations on difficult topics such as improving inter-rater agreement on sleep stage scoring. © 2014 European Sleep Research Society.
An environmental database for Venice and tidal zones
NASA Astrophysics Data System (ADS)
Macaluso, L.; Fant, S.; Marani, A.; Scalvini, G.; Zane, O.
2003-04-01
The natural environment is a complex, highly variable and physically non-reproducible system (neither in the laboratory nor in a confined territory). Environmental experimental studies are thus necessarily based on field measurements distributed in time and space. Only extensive data collections can provide the representative samples of the system behavior which are essential for scientific advancement. The assimilation of large data collections into accessible archives must necessarily be implemented in electronic databases. In the case of tidal environments in general, and of the Venice lagoon in particular, it is useful to establish a database, freely accessible to the scientific community, documenting the dynamics of such systems and their response to anthropic pressures and climatic variability. At the Istituto Veneto di Scienze, Lettere ed Arti in Venice (Italy), two internet environmental databases have been developed: one collects information regarding the Venice lagoon in detail; the other co-ordinates the research consortium of the "TIDE" EU RTD project, which attends to three different tidal areas: Venice Lagoon (Italy), Morecambe Bay (England), and Forth Estuary (Scotland). The archives may be accessed through the URL: www.istitutoveneto.it. The first is freely available and open to anyone interested. It is continuously updated and has been structured to promote documentation concerning the Venetian environment and to disseminate this information for educational purposes (see "Dissemination" section). The second is supplied by scientists and engineers working on these tidal systems for various purposes (scientific, management, conservation, etc.); it is aimed at interested researchers and grows with their own contributions. Both intend to promote scientific communication, contribute to the realization of a distributed information system collecting homogeneous themes, and initiate interconnection among databases regarding different kinds of environments.
NASA Aviation Safety Reporting System
NASA Technical Reports Server (NTRS)
1980-01-01
A comprehensive study of near midair collisions in terminal airspace, derived from the ASRS database is presented. A selection of controller and pilot reports on airport perimeter security, unauthorized takeoffs and landings, and on winter operations is presented. A sampling of typical Alert Bulletins and their responses is presented.
NASA Astrophysics Data System (ADS)
Veneranda, M.; Negro, J. I.; Medina, J.; Rull, F.; Lantz, C.; Poulet, F.; Cousin, A.; Dypvik, H.; Hellevang, H.; Werner, S. C.
2018-04-01
The PTAL website will store multispectral analyses of samples collected from several terrestrial analogue sites and aims to become a cornerstone tool for the scientific community interested in deepening knowledge of geological processes on Mars.
Manifold Regularized Experimental Design for Active Learning.
Zhang, Lining; Shum, Hubert P H; Shao, Ling
2016-12-02
Various machine learning and data mining tasks in classification require abundant data samples to be labeled for training. Conventional active learning methods aim at labeling the most informative samples to alleviate the labor of the user. Many previous studies in active learning select one sample after another in a greedy manner. However, this is not very effective because the classification model has to be retrained for each newly labeled sample. Moreover, many popular active learning approaches utilize the most uncertain samples by leveraging the classification hyperplane of the classifier, which is not appropriate since the classification hyperplane is inaccurate when the training data are small-sized. The problem of insufficient training data in real-world systems limits the potential applications of these approaches. This paper presents a novel method of active learning called manifold regularized experimental design (MRED), which can label multiple informative samples at one time for training. In addition, MRED gives an explicit geometric explanation for the selected samples to be labeled by the user. Different from existing active learning methods, our method avoids the intrinsic problems caused by insufficiently labeled samples in real-world applications. Various experiments on synthetic datasets, the Yale face database and the Corel image database have been carried out to show how MRED outperforms existing methods.
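As a rough illustration of batch-mode, manifold-aware selection, the sketch below greedily picks the samples that most reduce the total predictive variance of a Laplacian-regularized least-squares model. This is a simplified variant written for this summary under stated assumptions, not the paper's exact MRED algorithm:

```python
import numpy as np

def greedy_selection(X, L, batch, lam=0.1, mu=1e-3):
    """Greedily pick samples minimizing the total predictive variance
    Tr[X (X_S^T X_S + lam * X^T L X + mu I)^-1 X^T] of a
    Laplacian-regularized least-squares model (simplified criterion)."""
    n, d = X.shape
    manifold = lam * (X.T @ L @ X) + mu * np.eye(d)
    selected = []
    for _ in range(batch):
        best_j, best_score = None, np.inf
        for j in range(n):
            if j in selected:
                continue
            Xs = X[selected + [j]]
            A = np.linalg.inv(Xs.T @ Xs + manifold)
            score = np.trace(X @ A @ X.T)
            if score < best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# Toy data: 40 points in 2-D and a kNN-graph Laplacian
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
W = (D < np.sort(D, axis=1)[:, [6]]).astype(float)  # roughly 5-6 nearest neighbours
W = np.maximum(W, W.T)
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(axis=1)) - W
print("samples to label:", greedy_selection(X, L, batch=5))
```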
Aegerter, Philippe; Bendersky, Noelle; Tran, Thi-Chien; Ropers, Jacques; Taright, Namik; Chatellier, Gilles
2014-01-01
Recruitment of large samples of patients is crucial for the evidence level and efficacy of clinical trials (CT). Clinical Trial Recruitment Support Systems (CTRSS) used to estimate patient recruitment are generally specific to particular Hospital Information Systems, and few have been evaluated on a large number of trials. Our aim was to assess, on a large number of CTs, the usefulness of commonly available data such as Diagnosis Related Groups (DRG) databases for estimating potential recruitment. We used the DRG database of a large French multicenter medical institution (1.2 million inpatient stays and 400 new trials each year). Eligibility criteria of protocols were broken down into atomic entities (diagnosis, procedures, treatments...) then translated into codes and operators recorded in a standardized form. A program parsed the forms and generated requests on the DRG database. A large majority of selection criteria could be coded, and final estimates of the number of eligible patients were close to observed ones (median difference = 25). Such a system could be part of the feasibility evaluation and center selection process before the start of a clinical trial.
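The pipeline sketched in this abstract, atomic criteria recorded as code/operator pairs and then compiled into database queries, can be illustrated as follows. The criteria form and schema are invented for illustration; real DRG extracts differ by country and institution:

```python
import sqlite3

# Standardized form: each eligibility criterion as (field, operator, value)
criteria = [
    ("diagnosis_icd10", "LIKE", "I21%"),   # acute myocardial infarction
    ("age",             ">=",   18),
    ("age",             "<",    80),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stay (patient_id INTEGER, age INTEGER, diagnosis_icd10 TEXT)")
conn.executemany("INSERT INTO stay VALUES (?, ?, ?)",
                 [(1, 64, "I21.0"), (2, 85, "I21.4"), (3, 55, "J18.9")])

# Compile the form into one parameterized COUNT query
where = " AND ".join(f"{field} {op} ?" for field, op, _ in criteria)
sql = f"SELECT COUNT(DISTINCT patient_id) FROM stay WHERE {where}"
params = [value for _, _, value in criteria]
print(conn.execute(sql, params).fetchone()[0], "potentially eligible patients")
```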
NASA Astrophysics Data System (ADS)
Sun, Ziheng; Fang, Hui; Di, Liping; Yue, Peng
2016-09-01
It was an untouchable dream for remote sensing experts to realize totally automatic image classification without inputting any parameter values. Experts usually spend hours and hours tuning the input parameters of classification algorithms in order to obtain the best results. With the rapid development of knowledge engineering and cyberinfrastructure, many data processing and knowledge reasoning capabilities have become accessible online, shareable and interoperable. Based on these recent improvements, this paper presents an idea of parameterless automatic classification which only requires an image and automatically outputs a labeled vector. No parameters and operations are needed from endpoint consumers. An approach is proposed to realize the idea. It adopts an ontology database to store the experience of tuning values for classifiers. A sample database is used to record training samples of image segments. Geoprocessing Web services are used as functionality blocks to perform the basic classification steps. Workflow technology is involved to turn the overall image classification into a totally automatic process. A Web-based prototype system named PACS (Parameterless Automatic Classification System) has been implemented. A number of images were fed into the system for evaluation. The results show that the approach can automatically classify remote sensing images with fairly good average accuracy, and indicate that the classified results will be more accurate if the two databases have higher quality. Once the experience and samples in the databases are accumulated to the level a human expert has, the approach should be able to produce results of similar quality to those a human expert can achieve. Since the approach is totally automatic and parameterless, it can not only relieve remote sensing workers from the heavy and time-consuming parameter-tuning work, but also significantly shorten the waiting time for consumers and help them engage in image classification activities. Currently, the approach is used only on high resolution optical three-band remote sensing imagery. The feasibility of using the approach on other kinds of remote sensing images or involving additional bands in classification will be studied in the future.
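The core idea, a knowledge base of previously tuned parameter values consulted instead of a human, can be reduced to a lookup keyed on image characteristics with a fallback to accumulated defaults. A toy sketch; the experience records, keys, and parameter names are invented:

```python
# "Experience" store: tuned classifier parameters keyed by image characteristics
experience = {
    ("optical", "high-res", "urban"): {"segments": 800, "min_size": 40},
    ("optical", "high-res", "rural"): {"segments": 300, "min_size": 120},
}

def choose_parameters(sensor: str, resolution: str, scene: str) -> dict:
    """Return tuned parameters if a matching experience exists, else defaults."""
    default = {"segments": 500, "min_size": 80}
    return experience.get((sensor, resolution, scene), default)

params = choose_parameters("optical", "high-res", "urban")
print("classifier will run with:", params)
# A workflow engine would now invoke the segmentation and classification
# services with these values; no parameters are requested from the end user.
```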
MARS: bringing the automation of small-molecule bioanalytical sample preparations to a new frontier.
Li, Ming; Chou, Judy; Jing, Jing; Xu, Hui; Costa, Aldo; Caputo, Robin; Mikkilineni, Rajesh; Flannelly-King, Shane; Rohde, Ellen; Gan, Lawrence; Klunk, Lewis; Yang, Liyu
2012-06-01
In recent years, there has been a growing interest in automating small-molecule bioanalytical sample preparations specifically using the Hamilton MicroLab(®) STAR liquid-handling platform. In the most extensive work reported thus far, multiple small-molecule sample preparation assay types (protein precipitation extraction, SPE and liquid-liquid extraction) have been integrated into a suite that is composed of graphical user interfaces and Hamilton scripts. Using that suite, bioanalytical scientists have been able to automate various sample preparation methods to a great extent. However, there are still areas that could benefit from further automation, specifically, the full integration of analytical standard and QC sample preparation with study sample extraction in one continuous run, real-time 2D barcode scanning on the Hamilton deck and direct Laboratory Information Management System database connectivity. We developed a new small-molecule sample-preparation automation system that improves in all of the aforementioned areas. The improved system presented herein further streamlines the bioanalytical workflow, simplifies batch run design, reduces analyst intervention and eliminates sample-handling error.
NASA Technical Reports Server (NTRS)
Mohr, Karen I.; Molinari, John; Thorncroft, Chris D,
2010-01-01
The characteristics of convective system populations in West Africa and the western Pacific tropical cyclone basin were analyzed to investigate whether interannual variability in convective activity in tropical continental and oceanic environments is driven by variations in the number of events during the wet season or by conditions favoring large and/or intense convective systems. Convective systems were defined from TRMM data as a cluster of pixels with an 85 GHz polarization-corrected brightness temperature below 255 K and with an area of at least 64 km². The study database consisted of convective systems in West Africa from May-Sep 1998-2007 and in the western Pacific from May-Nov 1998-2007. Annual cumulative frequency distributions for system minimum brightness temperature and system area were constructed for both regions. For both regions, there were no statistically significant differences among the annual curves for system minimum brightness temperature. There were two groups of system area curves, split by the TRMM altitude boost in 2001. Within each set, there was no statistically significant interannual variability. Subsetting the database revealed some sensitivity in distribution shape to the size of the sampling area, length of sample period, and climate zone. From a regional perspective, the stability of the cumulative frequency distributions implies that the probability that a convective system attains a particular size or intensity does not change interannually. Variability in the number of convective events appears to be more important in determining whether a year is wetter or drier than normal.
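Defining convective systems as contiguous clusters of cold 85-GHz pixels is a connected-component labeling problem. A minimal sketch using scipy; the brightness-temperature array is synthetic, and the pixel footprint of 16 km² is an assumption for illustration only:

```python
import numpy as np
from scipy import ndimage

PCT_THRESHOLD_K = 255.0   # 85-GHz polarization-corrected brightness temperature
MIN_AREA_KM2 = 64.0
PIXEL_AREA_KM2 = 16.0     # assumed pixel footprint for this illustration

rng = np.random.default_rng(1)
pct85 = rng.uniform(220.0, 290.0, size=(50, 50))   # synthetic swath

cold = pct85 < PCT_THRESHOLD_K
labels, n = ndimage.label(cold)                    # 4-connected pixel clusters
sizes = ndimage.sum(cold, labels, index=range(1, n + 1)) * PIXEL_AREA_KM2

systems = [
    {"label": i + 1,
     "area_km2": sizes[i],
     "min_tb_k": pct85[labels == i + 1].min()}     # system minimum brightness temp
    for i in range(n) if sizes[i] >= MIN_AREA_KM2
]
print(len(systems), "convective systems meet the area threshold")
```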
Nationwide Databases in Orthopaedic Surgery Research.
Bohl, Daniel D; Singh, Kern; Grauer, Jonathan N
2016-10-01
The use of nationwide databases to conduct orthopaedic research has expanded markedly in recent years. Nationwide databases offer large sample sizes, sampling of patients who are representative of the country as a whole, and data that enable investigation of trends over time. The most common use of nationwide databases is to study the occurrence of postoperative adverse events. Other uses include the analysis of costs and the investigation of critical hospital metrics, such as length of stay and readmission rates. Although nationwide databases are powerful research tools, readers should be aware of the differences between them and their limitations. These include variations and potential inaccuracies in data collection, imperfections in patient sampling, insufficient postoperative follow-up, and lack of orthopaedic-specific outcomes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dunn, W.N.
1998-03-01
LUG and Sway brace ANalysis (LUGSAN) II is an analysis and database computer program that is designed to calculate store lug and sway brace loads for aircraft captive carriage. LUGSAN II combines the rigid body dynamics code, SWAY85, with a Macintosh Hypercard database to function both as an analysis and archival system. This report describes the LUGSAN II application program, which operates on the Macintosh System (Hypercard 2.2 or later) and includes function descriptions, layout examples, and sample sessions. Although this report is primarily a user's manual, a brief overview of the LUGSAN II computer code is included with suggested resources for programmers.
163 years of refinement: the British Geological Survey sample registration scheme
NASA Astrophysics Data System (ADS)
Howe, M. P.
2011-12-01
The British Geological Survey manages the largest UK geoscience samples collection, including: - 15,000 onshore boreholes, including over 250 km of drillcore - Vibrocores, gravity cores and grab samples from over 32,000 UK marine sample stations and 640 boreholes - Over 3 million UK fossils, including a "type and stratigraphic" reference collection of 250,000 fossils, 30,000 of which are "type, figured or cited" - A comprehensive microfossil collection, including many borehole samples - 290 km of drillcore and 4.5 million cuttings samples from over 8,000 UK continental shelf hydrocarbon wells - Over one million mineralogical and petrological samples, including 200,000 thin sections. The current registration scheme was introduced in 1848 and is similar to that used by Charles Darwin on the Beagle. Every Survey collector or geologist has been issued with a unique prefix code of one or more letters, handwritten onto preprinted numbers arranged in books of 1-5,000 and 5,001-10,000. Similar labels are now computer printed. Other prefix codes are used for corporate collections, such as borehole samples, thin sections, microfossils, macrofossil sections, museum reference fossils, display quality rock samples and fossil casts. Such numbers convey significant information immediately to the curator, without the need to consult detailed registers. The registration numbers have been recorded in a series of over 1,000 registers, complete with metadata including sample ID, locality, horizon, collector and date. Citations are added as appropriate. Parent-child relationships are noted when re-registering subsamples. For example, a borehole sample BDA1001 could have been subsampled for a petrological thin section and off-cut (E14159), a fossil thin section (PF365), and micropalynological slides (MPA273), one of which included a new holotype (MPK111), and a figured macrofossil (GSE1314). All main corporate collections now have publicly available online databases, such as PalaeoSaurus (fossils), Britrocks (mineralogy and petrology) and ComBo (combined onshore and offshore boreholes). ComBo links to core images, when available. Similar links are under development for Britrocks and PalaeoSaurus, with the latter also to include high-resolution laser-scanned digital models. These databases also link to internal and public GIS systems and to the BGS digital field data capture system. PalaeoSaurus holds an identification/authority/date history for each specimen, as well as recording type status, and figure and citation details. Similar comments can be added to Britrocks and ComBo. For several years, the BGS has provided online web access to the databases for the discovery of physical samples, including parent-child links and citation information. Regrettably, authors frequently fail to cite sample registration numbers (nineteenth century geologists were sometimes better than their twenty-first century counterparts), or to supply copies of, or links to, the data generated, despite it being a condition of sample access. The need for editors and referees to enforce the inclusion of sample registration numbers, and for authors to lodge copies of papers, reports and data with the sample providers, is more important than yet another new database.
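The prefix-plus-number scheme and its parent-child re-registration map naturally onto a small data model. A sketch; the registration numbers follow the example given in the text, while the class and field names are illustrative:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class RegisteredSample:
    number: str                      # e.g. "BDA1001": prefix identifies the collection
    description: str
    parent: Optional["RegisteredSample"] = None
    children: list = field(default_factory=list)

    def derive(self, number: str, description: str) -> "RegisteredSample":
        """Re-register a subsample, preserving the parent-child relationship."""
        child = RegisteredSample(number, description, parent=self)
        self.children.append(child)
        return child

borehole = RegisteredSample("BDA1001", "borehole core sample")
thin_section = borehole.derive("E14159", "petrological thin section and off-cut")
palyno = borehole.derive("MPA273", "micropalynological slide")
holotype = palyno.derive("MPK111", "new holotype from slide")

# Walk back up from any subsample to its original registered specimen
node = holotype
while node.parent:
    node = node.parent
print("root registration:", node.number)   # BDA1001
```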
Greenspoon, Susan A; Ban, Jeffrey D; Sykes, Karen; Ballard, Elizabeth J; Edler, Shelley S; Baisden, Melissa; Covington, Brian L
2004-01-01
Robotic systems are commonly utilized for the extraction of database samples. However, the application of robotic extraction to forensic casework samples is a more daunting task. Such a system must be versatile enough to accommodate a wide range of samples that may contain greatly varying amounts of DNA, but it must also pose no more risk of contamination than the manual DNA extraction methods. This study demonstrates that the BioMek 2000 Laboratory Automation Workstation, used in combination with the DNA IQ System, is versatile enough to accommodate the wide range of samples typically encountered by a crime laboratory. The use of a silica-coated paramagnetic resin, as with the DNA IQ System, facilitates the adaptation of an open-well, hands-off robotic system to the extraction of casework samples, since no filtration or centrifugation steps are needed. Moreover, the DNA remains tightly coupled to the silica-coated paramagnetic resin for the entire process until the elution step. A short pre-extraction incubation step is necessary prior to loading samples onto the robot, and it is at this step that most modifications are made to accommodate the different sample types and substrates commonly encountered with forensic evidentiary samples. Sexual assault (mixed stain) samples, cigarette butts, blood stains, buccal swabs, and various tissue samples were successfully extracted with the BioMek 2000 Laboratory Automation Workstation and the DNA IQ System, with no evidence of contamination throughout the extensive validation studies reported here.
[A rapid method for drug intelligence using profiling of illicit heroin samples].
Zhang, Jianxin; Chen, Cunyi
2012-07-01
The aim of the paper was to evaluate the link between two heroin seizures using a descriptive method. The system involved the derivatization and gas chromatographic separation of samples, followed by fully automatic data analysis and transfer to a database. Comparisons used the squared cosine function between two chromatograms treated as vectors. The method showed good discriminatory capability, and the probability of false positives was extremely low. In conclusion, this method proved to be efficient and reliable, and appears suitable for establishing links between illicit heroin samples.
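The squared-cosine comparison described above lends itself to a compact illustration. The following is a minimal sketch, assuming two seizures have already been reduced to peak-area vectors over the same retention-time windows; the vectors and the "close to 1 means linked" reading are illustrative assumptions, not values from the study.

```python
import numpy as np

def squared_cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Squared cosine of the angle between two chromatogram vectors.

    Values near 1 suggest the two samples share a profile;
    values near 0 suggest unrelated seizures.
    """
    num = np.dot(a, b) ** 2
    den = np.dot(a, a) * np.dot(b, b)
    return float(num / den)

# Hypothetical peak-area vectors for two seizures (same target compounds,
# same retention-time windows); the figures are illustrative only.
seizure_1 = np.array([0.12, 3.4, 0.8, 1.1, 0.05])
seizure_2 = np.array([0.10, 3.1, 0.9, 1.0, 0.07])
print(squared_cosine(seizure_1, seizure_2))  # close to 1.0 -> likely linked
```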
Advanced Technologies for the Study of Earth Systems.
ERIC Educational Resources Information Center
Sproull, Jim
1991-01-01
Describes the Joint Education Initiative (JEdI) project designed to instruct teachers how to access scientific data and images for classroom instruction. Presents a sample CD-ROM classroom computer activity that illustrates how CD images and databases can be combined for a science investigation comparing topography to gravity anomalies. (MCO)
Using dBASE II for Bibliographic Files.
ERIC Educational Resources Information Center
Sullivan, Jeanette
1985-01-01
Describes use of a database management system (dBASE II, produced by Ashton-Tate), noting best features and disadvantages. Highlights include data entry, multiple access points available, training requirements, use of dBASE for a bibliographic application, auxiliary software, and dBASE updates. Sample searches, auxiliary programs, and requirements…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Femec, D.A.
This report discusses the sample tracking database in use at the Idaho National Engineering Laboratory (INEL) by the Radiation Measurements Laboratory (RML) and Analytical Radiochemistry. The database was designed in-house to meet the specific needs of the RML and Analytical Radiochemistry. The report consists of two parts, a user's guide and a reference guide. The user's guide presents some of the fundamentals needed by anyone who will be using the database via its user interface. The reference guide describes the design of both the database and the user interface. Briefly mentioned in the reference guide are the code-generating tools, CREATE-SCHEMA and BUILD-SCREEN, written to automatically generate code for the database and its user interface. The appendices contain the input files used by these tools to create code for the sample tracking database. The output files generated by these tools are also included in the appendices.
Dynamic delivery of the National Transit Database Sampling Manual.
DOT National Transportation Integrated Search
2013-02-01
This project improves the National Transit Database (NTD) Sampling Manual and develops an Internet-based, WordPress-powered interactive Web tool to deliver the new NTD Sampling Manual dynamically. The new manual adds guidance and a tool for transit a...
Systematic review for geo-authentic Lonicerae Japonicae Flos.
Yang, Xingyue; Liu, Yali; Hou, Aijuan; Yang, Yang; Tian, Xin; He, Liyun
2017-06-01
In traditional Chinese medicine, Lonicerae Japonicae Flos is commonly used as an anti-inflammatory, antiviral, and antipyretic herbal medicine, and geo-authentic herbs are believed to present the highest quality among all samples from different regions. To assess the current situation and trends in research on geo-authentic Lonicerae Japonicae Flos, we searched the Chinese Biomedicine Literature Database, Chinese Journal Full-text Database, Chinese Scientific Journal Full-text Database, Cochrane Central Register of Controlled Trials, Wanfang, and PubMed. We investigated all studies up to November 2015 pertaining to quality assessment, discrimination, pharmacological effects, planting or processing, or the ecological system of geo-authentic Lonicerae Japonicae Flos. Sixty-five studies, mainly discussing chemical fingerprints, component analysis, planting and processing, discrimination between varieties, ecological systems, pharmacological effects, and safety, were systematically reviewed. By analyzing these studies, we found that the key points of geo-authentic Lonicerae Japonicae Flos research were quality and application. Further studies should focus on improving quality by selecting the most superior varieties and evaluating clinical effectiveness.
Charles, Isabel; Sinclair, Ian; Addison, Daniel H
2014-04-01
A new approach to the storage, processing, and interrogation of the quality data for screening samples has improved analytical throughput and confidence and enhanced the opportunities for learning from the accumulating records. The approach has entailed the design, development, and implementation of a database-oriented system, capturing information from the liquid chromatography-mass spectrometry capabilities used for assessing the integrity of samples in AstraZeneca's screening collection. A Web application has been developed to enable the visualization and interactive annotation of the analytical data, monitor the current sample queue, and report the throughput rate. Sample purity and identity are certified automatically on the chromatographic peaks of interest if predetermined thresholds are reached on key parameters. Using information extracted in parallel from the compound registration and container inventory databases, the chromatographic and spectroscopic profiles for each vessel are linked to the sample structures and storage histories. A search engine facilitates the direct comparison of results for multiple vessels of the same or similar compounds, for single vessels analyzed at different time points, or for vessels related by their origin or process flow. Access to this network of information has provided a deeper understanding of the multiple factors contributing to sample quality assurance.
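The automatic certification step described above can be pictured as a small rule over a few key parameters. The following is a minimal sketch under stated assumptions: the purity and mass-error thresholds, the field names, and the three-way outcome are invented for illustration and are not the published details of the AstraZeneca system.

```python
# Rule-based certification of an LC-MS result for one chromatographic peak.
from dataclasses import dataclass

@dataclass
class PeakResult:
    purity_pct: float      # purity of the main chromatographic peak
    mass_error_ppm: float  # observed vs. expected mass for the structure

def certify(peak: PeakResult,
            min_purity: float = 85.0,       # assumed threshold
            max_mass_error: float = 10.0) -> str:  # assumed threshold
    """Auto-certify, auto-reject, or queue the peak for manual annotation."""
    if peak.purity_pct >= min_purity and abs(peak.mass_error_ppm) <= max_mass_error:
        return "certified"
    if peak.purity_pct < 50.0:
        return "rejected"
    return "manual_review"

print(certify(PeakResult(purity_pct=92.3, mass_error_ppm=3.1)))  # certified
```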
Maschi, Tina; Dennis, Kelly Sullivan; Gibson, Sandy; MacMillan, Thalia; Sternberg, Susan; Hom, Maryann
2011-05-01
The purpose of this article was to review the empirical literature that investigated trauma and stress among older adults in the criminal justice system. Nineteen journal articles published between 1988 and 2010 were identified and extracted via research databases and included mixed age samples of adjudicated older and younger adults (n = 11) or older adult only samples (n = 8). Findings revealed past and current trauma and stress, consequences and/or correlates, and internal and external coping resources among aging offenders. The implications and future directions for gerontological social work, research, and policy with older adults in the criminal justice system are advanced.
Metabolonote: A Wiki-Based Database for Managing Hierarchical Metadata of Metabolome Analyses
Ara, Takeshi; Enomoto, Mitsuo; Arita, Masanori; Ikeda, Chiaki; Kera, Kota; Yamada, Manabu; Nishioka, Takaaki; Ikeda, Tasuku; Nihei, Yoshito; Shibata, Daisuke; Kanaya, Shigehiko; Sakurai, Nozomu
2015-01-01
Metabolomics – technology for comprehensive detection of small molecules in an organism – lags behind the other “omics” in terms of publication and dissemination of experimental data. Among the reasons for this are difficulty precisely recording information about complicated analytical experiments (metadata), existence of various databases with their own metadata descriptions, and low reusability of the published data, resulting in submitters (the researchers who generate the data) being insufficiently motivated. To tackle these issues, we developed Metabolonote, a Semantic MediaWiki-based database designed specifically for managing metabolomic metadata. We also defined a metadata and data description format, called “Togo Metabolome Data” (TogoMD), with an ID system that is required for unique access to each level of the tree-structured metadata such as study purpose, sample, analytical method, and data analysis. Separation of the management of metadata from that of data and permission to attach related information to the metadata provide advantages for submitters, readers, and database developers. The metadata are enriched with information such as links to comparable data, thereby functioning as a hub of related data resources. They also enhance not only readers’ understanding and use of data but also submitters’ motivation to publish the data. The metadata are computationally shared among other systems via APIs, which facilitate the construction of novel databases by database developers. A permission system that allows publication of immature metadata and feedback from readers also helps submitters to improve their metadata. Hence, this aspect of Metabolonote, as a metadata preparation tool, is complementary to high-quality and persistent data repositories such as MetaboLights. A total of 808 metadata for analyzed data obtained from 35 biological species are published currently. Metabolonote and related tools are available free of cost at http://metabolonote.kazusa.or.jp/. PMID:25905099
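The tree-structured, per-level ID scheme that TogoMD prescribes can be made concrete with a small sketch. The nested record and lookup below are a loose Python illustration; the ID syntax and field names are assumptions, not the published TogoMD specification.

```python
# Illustrative tree-structured metadata with a unique ID at every level
# (study purpose -> sample -> analytical method -> data analysis).
metadata = {
    "id": "STUDY001",                 # study purpose level (hypothetical ID)
    "title": "Leaf metabolome under drought stress",
    "samples": [{
        "id": "STUDY001_S1",          # sample level
        "organism": "Arabidopsis thaliana",
        "methods": [{
            "id": "STUDY001_S1_M1",   # analytical method level
            "instrument": "LC-MS",
            "analyses": [{
                "id": "STUDY001_S1_M1_D1",  # data analysis level
                "software": "peak alignment pipeline",
            }],
        }],
    }],
}

def resolve(node, target_id):
    """Depth-first lookup so each level is addressable by its unique ID."""
    if node.get("id") == target_id:
        return node
    for value in node.values():
        if isinstance(value, list):
            for child in value:
                found = resolve(child, target_id)
                if found:
                    return found
    return None

print(resolve(metadata, "STUDY001_S1_M1")["instrument"])  # LC-MS
```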
A scalable database model for multiparametric time series: a volcano observatory case study
NASA Astrophysics Data System (ADS)
Montalto, Placido; Aliotta, Marco; Cassisi, Carmelo; Prestifilippo, Michele; Cannata, Andrea
2014-05-01
The variables collected by a sensor network constitute a heterogeneous data source that needs to be properly organized in order to be used in research and geophysical monitoring. With the term time series we refer to a set of observations of a given phenomenon acquired sequentially in time. When the time intervals are equally spaced, one speaks of the period or sampling frequency. Our work describes in detail a possible methodology for the storage and management of time series using a specific data structure. We designed a framework, hereinafter called TSDSystem (Time Series Database System), to acquire time series from different data sources and standardize them within a relational database. The standardization operation provides the ability to perform operations, such as query and visualization, on many measures, synchronizing them using a common time scale. The proposed architecture follows a multiple-layer paradigm (Loaders layer, Database layer and Business Logic layer). Each layer is specialized in performing particular operations for the reorganization and archiving of data from different sources such as ASCII, Excel, ODBC (Open DataBase Connectivity), and files accessible from the Internet (web pages, XML). In particular, the Loaders layer performs a security check of the working status of each running software component through a heartbeat system, in order to automate the discovery of acquisition issues and other warning conditions. Although our system has to manage huge amounts of data, performance is guaranteed by using a smart table partitioning strategy that keeps the percentage of data stored in each database table balanced. TSDSystem also contains modules for the visualization of acquired data, which provide the possibility to query different time series on a specified time range, or to follow the real-time signal acquisition, according to a data access policy for the users.
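The "common time scale" standardization that TSDSystem performs can be sketched compactly. Below is a minimal illustration in Python with pandas, assuming two invented signals with different native sampling intervals; the column names and values are placeholders, not TSDSystem data.

```python
# Resample two series with different sampling frequencies onto one shared
# grid so they can be queried and visualized together.
import pandas as pd

seismic = pd.Series(
    [1.0, 1.2, 0.9, 1.1],
    index=pd.date_range("2014-05-01 00:00", periods=4, freq="30s"),
)
temperature = pd.Series(
    [18.2, 18.4],
    index=pd.date_range("2014-05-01 00:00", periods=2, freq="60s"),
)

common = pd.DataFrame({
    "seismic_rms": seismic.resample("60s").mean(),
    "temp_c": temperature.resample("60s").mean(),
})
print(common)  # both signals aligned on a shared one-minute time scale
```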
A multidisciplinary database for geophysical time series management
NASA Astrophysics Data System (ADS)
Montalto, P.; Aliotta, M.; Cassisi, C.; Prestifilippo, M.; Cannata, A.
2013-12-01
The variables collected by a sensor network constitute a heterogeneous data source that needs to be properly organized in order to be used in research and geophysical monitoring. With the term time series we refer to a set of observations of a given phenomenon acquired sequentially in time. When the time intervals are equally spaced, one speaks of the period or sampling frequency. Our work describes in detail a possible methodology for the storage and management of time series using a specific data structure. We designed a framework, hereinafter called TSDSystem (Time Series Database System), to acquire time series from different data sources and standardize them within a relational database. The standardization operation provides the ability to perform operations, such as query and visualization, on many measures, synchronizing them using a common time scale. The proposed architecture follows a multiple-layer paradigm (Loaders layer, Database layer and Business Logic layer). Each layer is specialized in performing particular operations for the reorganization and archiving of data from different sources such as ASCII, Excel, ODBC (Open DataBase Connectivity), and files accessible from the Internet (web pages, XML). In particular, the Loaders layer performs a security check of the working status of each running software component through a heartbeat system, in order to automate the discovery of acquisition issues and other warning conditions. Although our system has to manage huge amounts of data, performance is guaranteed by using a smart table partitioning strategy that keeps the percentage of data stored in each database table balanced. TSDSystem also contains modules for the visualization of acquired data, which provide the possibility to query different time series on a specified time range, or to follow the real-time signal acquisition, according to a data access policy for the users.
Using SIR (Scientific Information Retrieval System) for data management during a field program
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tichler, J.L.
As part of the US Department of Energy's program, PRocessing of Emissions by Clouds and Precipitation (PRECP), a team of scientists from four laboratories conducted a study in north central New York State, to characterize the chemical and physical processes occurring in winter storms. Sampling took place from three aircraft, two instrumented motor homes and a network of 26 surface precipitation sampling sites. Data management personnel were part of the field program, using a portable IBM PC-AT computer to enter information as it became available during the field study. Having the same database software on the field computer and on the cluster of VAX 11/785 computers in use aided database development and the transfer of data between machines. 2 refs., 3 figs., 5 tabs.
Pemberton, T J; Jakobsson, M; Conrad, D F; Coop, G; Wall, J D; Pritchard, J K; Patel, P I; Rosenberg, N A
2008-07-01
When performing association studies in populations that have not been the focus of large-scale investigations of haplotype variation, it is often helpful to rely on genomic databases in other populations for study design and analysis - such as in the selection of tag SNPs and in the imputation of missing genotypes. One way of improving the use of these databases is to rely on a mixture of database samples that is similar to the population of interest, rather than using the single most similar database sample. We demonstrate the effectiveness of the mixture approach in the application of African, European, and East Asian HapMap samples for tag SNP selection in populations from India, a genetically intermediate region underrepresented in genomic studies of haplotype variation.
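The mixture idea, approximating a target population as a weighted combination of reference panels, can be sketched as a generic constrained least-squares fit over allele frequencies. This is an illustration of the concept only, not the authors' estimator; the frequencies and panel labels below are invented.

```python
# Choose nonnegative panel weights whose weighted allele frequencies best
# approximate the target population's frequencies, then renormalize to 1.
import numpy as np
from scipy.optimize import nnls

# Rows = SNPs, columns = reference panels (e.g., YRI, CEU, JPT+CHB).
panel_freqs = np.array([
    [0.10, 0.40, 0.70],
    [0.80, 0.30, 0.20],
    [0.50, 0.60, 0.10],
    [0.20, 0.70, 0.60],
])
target_freqs = np.array([0.35, 0.42, 0.52, 0.55])  # e.g., a study population

weights, _ = nnls(panel_freqs, target_freqs)  # nonnegative least squares
weights /= weights.sum()                      # mixture weights sum to 1
print(dict(zip(["YRI", "CEU", "JPT+CHB"], weights.round(3))))
```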
A Modular Low-Complexity ECG Delineation Algorithm for Real-Time Embedded Systems.
Bote, Jose Manuel; Recas, Joaquin; Rincon, Francisco; Atienza, David; Hermida, Roman
2018-03-01
This work presents a new modular and low-complexity algorithm for the delineation of the different ECG waves (QRS, P and T peaks, onsets, and ends). Involving a reduced number of operations per second and having a small memory footprint, this algorithm is intended to perform real-time delineation on resource-constrained embedded systems. The modular design allows the algorithm to automatically adjust the delineation quality at runtime across a wide range of modes and sampling rates, from an ultralow-power mode when no arrhythmia is detected, in which the ECG is sampled at low frequency, to a complete high-accuracy delineation mode in the case of arrhythmia, in which the ECG is sampled at high frequency and all the ECG fiducial points are detected. The delineation algorithm has been adjusted using the QT database, providing very high sensitivity and positive predictivity, and validated with the MIT database. The errors in the delineation of all the fiducial points are below the tolerances given by the Common Standards for Electrocardiography Committee in the high-accuracy mode, except for the P wave onset, for which the algorithm exceeds the agreed tolerances by only a fraction of the sample duration. The computational load for the ultralow-power 8-MHz TI MSP430 series microcontroller ranges from 0.2% to 8.5% according to the mode used.
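The runtime mode adjustment can be pictured as a small decision rule. The sketch below is a loose illustration under stated assumptions: the RR-interval bounds used as the arrhythmia trigger and the two sampling rates are invented, and the paper's actual detector and mode set are more elaborate.

```python
# Switch between an ultralow-power mode (low sampling rate, QRS only) and a
# high-accuracy mode (high sampling rate, all fiducial points) based on
# whether the recent RR intervals look arrhythmic.
def select_mode(rr_intervals_ms, low=600.0, high=1200.0):
    """Return (sampling_rate_hz, full_delineation) from recent RR intervals."""
    arrhythmia_suspected = any(rr < low or rr > high for rr in rr_intervals_ms)
    if arrhythmia_suspected:
        return 250, True    # high-accuracy mode: delineate all fiducial points
    return 50, False        # ultralow-power mode: QRS detection only

print(select_mode([810, 795, 820]))   # (50, False)
print(select_mode([810, 1450, 500]))  # (250, True)
```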
Diway, Bibian; Khoo, Eyen
2017-01-01
The development of timber tracking methods based on genetic markers can provide scientific evidence to verify the origin of timber products and fulfill the growing requirement for sustainable forestry practices. In this study, the origin of an important Dark Red Meranti wood, Shorea platyclados, was studied by using the combination of seven chloroplast DNA and 15 short tandem repeats (STRs) markers. A total of 27 natural populations of S. platyclados were sampled throughout Malaysia to establish population level and individual level identification databases. A haplotype map was generated from chloroplast DNA sequencing for population identification, resulting in 29 multilocus haplotypes, based on 39 informative intraspecific variable sites. Subsequently, a DNA profiling database was developed from 15 STRs allowing for individual identification in Malaysia. Cluster analysis divided the 27 populations into two genetic clusters, corresponding to the region of Eastern and Western Malaysia. The conservativeness tests showed that the Malaysia database is conservative after removal of bias from population subdivision and sampling effects. Independent self-assignment tests correctly assigned individuals to the database in an overall 60.60−94.95% of cases for identified populations, and in 98.99−99.23% of cases for identified regions. Both the chloroplast DNA database and the STRs appear to be useful for tracking timber originating in Malaysia. Hence, this DNA-based method could serve as an effective addition tool to the existing forensic timber identification system for ensuring the sustainably management of this species into the future. PMID:28430826
Guo, Ye; Chen, Qian; Wu, Wei; Cui, Wei
2015-03-31
To establish a system for monitoring key indicators of quality for inspection (KIQI) on a laboratory information system (LIS), and thereby manage KIQI better. Clinical samples processed at PUMCH were collected throughout 2014. First, interactive input programs were designed to collect data on the sample disqualification rate, the sample mistake rate, and occasions of lost samples, etc. Then, a series of time points (sample collection, sample arrival at the laboratory, sample testing, sample checking, and response to critical values), namely the trajectory information left on the LIS, were recorded, and the turnaround time (TAT) qualification rate and the critical-result notification rate were calculated. Finally, quality control information was collected to build an internal quality control database, and KIQI such as the out-of-control rate of quality control and the total error of test items were monitored. Inspection of sample management shows that the disqualification rates in 2014 were all below the target, although the rates in January and February were somewhat high and the rates of four wards were above 2%. The sample mistake rate was 0.47 cases/10,000 cases, attaining the target (< 2 cases/10,000 cases). There were also no occasions of lost samples in 2014, again attaining the target. Inspection of laboratory reports shows that the TAT qualification rates were within the acceptable range (> 95%), except that the rate for routine blood tests in November (94.75%) was out of range; we solved the problem by optimizing the processes. The notification rate of critical (endangering) results attained the target (≥ 98%), while the rate of timely notification still needs improvement. Quality inspection shows that the CV of APTT in August (5.02%) rose significantly, beyond the accepted CV (5.0%); we solved the problem by changing the reagent. The CVs of TT in 2014 were all below the allowable CV, so the allowable CV for the next year was lowered to 10%. Managing KIQI with the powerful database management and information-processing capabilities of a LIS is an objective and effective method.
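The TAT qualification rate described above is computed from the timestamp trail the LIS records. Here is a minimal sketch, assuming an invented 60-minute TAT target and three toy samples; the real targets and volumes are those reported in the study.

```python
# Compute a turnaround-time (TAT) qualification rate from LIS timestamps.
from datetime import datetime

samples = [  # (sample arrival at laboratory, report checked/released)
    (datetime(2014, 11, 3, 8, 0),  datetime(2014, 11, 3, 8, 45)),
    (datetime(2014, 11, 3, 8, 10), datetime(2014, 11, 3, 9, 40)),
    (datetime(2014, 11, 3, 8, 20), datetime(2014, 11, 3, 9, 5)),
]

target_minutes = 60  # assumed TAT target for illustration
within = sum(
    (done - arrived).total_seconds() / 60 <= target_minutes
    for arrived, done in samples
)
rate = 100 * within / len(samples)
print(f"TAT qualification rate: {rate:.2f}%")  # flag if below 95%
```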
NASA Technical Reports Server (NTRS)
1993-01-01
All the options in the NASA VEGetation Workbench (VEG) make use of a database of historical cover types. This database contains results from experiments by scientists on a wide variety of different cover types. The learning system uses the database to provide positive and negative training examples of classes that enable it to learn distinguishing features between classes of vegetation. All the other VEG options use the database to estimate the error bounds involved in the results obtained when various analysis techniques are applied to the sample of cover type data that is being studied. In the previous version of VEG, the historical cover type database was stored as part of the VEG knowledge base. This database was removed from the knowledge base. It is now stored as a series of flat files that are external to VEG, and an interface between VEG and these files was provided. The interface allows the user to select which files of historical data to use. The files are then read, and the data are stored in Knowledge Engineering Environment (KEE) units using the same organization of units as in the previous version of VEG. The interface also allows the user to delete some or all of the historical database units from VEG and load new historical data from a file. This report summarizes the use of the historical cover type database in VEG. It then describes the new interface to the files containing the historical data, along with the minor changes that were made to VEG to enable the externally stored database to be used. Test runs verifying the operation of the new interface, and of VEG using historical data loaded from external files, are described. Task F was completed. A Sun cartridge tape containing the KEE and Common Lisp code for the new interface and the modified version of the VEG knowledge base was delivered to the NASA GSFC technical representative.
JAX Colony Management System (JCMS): an extensible colony and phenotype data management system.
Donnelly, Chuck J; McFarland, Mike; Ames, Abigail; Sundberg, Beth; Springer, Dave; Blauth, Peter; Bult, Carol J
2010-04-01
The Jackson Laboratory Colony Management System (JCMS) is a software application for managing data and information related to research mouse colonies, associated biospecimens, and experimental protocols. JCMS runs directly on computers that run one of the PC Windows operating systems, but can be accessed via web browser interfaces from any computer running a Windows, Macintosh, or Linux operating system. JCMS can be configured for a single user or multiple users in small- to medium-size work groups. The target audience for JCMS includes laboratory technicians, animal colony managers, and principal investigators. The application provides operational support for colony management and experimental workflows, sample and data tracking through transaction-based data entry forms, and date-driven work reports. Flexible query forms allow researchers to retrieve database records based on user-defined criteria. Recent advances in handheld computers with integrated barcode readers, middleware technologies, web browsers, and wireless networks add to the utility of JCMS by allowing real-time access to the database from any networked computer.
McIlroy, Simon Jon; Kirkegaard, Rasmus Hansen; McIlroy, Bianca; Nierychlo, Marta; Kristensen, Jannie Munk; Karst, Søren Michael; Albertsen, Mads; Nielsen, Per Halkjær
2017-01-01
Wastewater is increasingly viewed as a resource, with anaerobic digester technology being routinely implemented for biogas production. Characterising the microbial communities involved in wastewater treatment facilities and their anaerobic digesters is considered key to their optimal design and operation. Amplicon sequencing of the 16S rRNA gene allows high-throughput monitoring of these systems. The MiDAS field guide is a public resource providing amplicon sequencing protocols and an ecosystem-specific taxonomic database optimized for use with wastewater treatment facility samples. The curated taxonomy endeavours to provide a genus-level classification for abundant phylotypes, and the online field guide links this identity to published information regarding their ecology, function and distribution. This article describes the expansion of the database resources to cover the organisms of anaerobic digester systems fed primary sludge and surplus activated sludge. The updated database includes descriptions of the abundant genus-level taxa in influent wastewater, activated sludge and anaerobic digesters. Abundance information is also included to allow assessment of the role of emigration in the ecology of each phylotype. MiDAS is intended as a collaborative resource for the progression of research into the ecology of wastewater treatment, by providing a public repository for knowledge that is accessible to all interested in these biotechnologically important systems. http://www.midasfieldguide.org. © The Author(s) 2017. Published by Oxford University Press.
Silva, Cristina; Fresco, Paula; Monteiro, Joaquim; Rama, Ana Cristina Ribeiro
2013-08-01
Evidence-Based Practice requires health care decisions to be based on the best available evidence. The "Information Mastery" model proposes that clinicians should use sources of information whose relevance and validity have previously been evaluated, provided at the point of care. Drug databases (DB) allow easy and fast access to information and have the benefit of more frequent content updates. Relevant information, in the context of drug therapy, is that which supports the safe and effective use of medicines. Accordingly, the European Guideline on the Summary of Product Characteristics (EG-SmPC) was used as a standard to evaluate the inclusion of relevant information content in DB. Objective: to develop and test a method to evaluate the relevancy of DB contents, by assessing the inclusion of information items deemed relevant for effective and safe drug use. Method: hierarchical organisation and selection of the principles defined in the EG-SmPC; definition of criteria to assess inclusion of selected information items; creation of a categorisation and quantification system that allows score calculation; calculation of relative differences (RD) of scores for comparison with an "ideal" database, defined as the one that achieves the best quantification possible for each of the information items; pilot test on a sample of 9 drug databases, using 10 drugs frequently associated in the literature with morbidity and mortality and also widely consumed in Portugal. Main outcome measure: calculation of individual and global scores for clinically relevant information items of drug monographs in databases, using the categorisation and quantification system created. A--Method development: selection of sections, subsections, relevant information items and corresponding requisites; a system to categorise and quantify their inclusion; score and RD calculation procedure. B--Pilot test: scores were calculated for the 9 databases; globally, all databases evaluated differed significantly from the "ideal" database; some DB performed better, but performance was inconsistent at the subsection level within the same DB. The method developed allows quantification of the inclusion of relevant information items in DB and comparison with an "ideal" database. It is necessary to consult diverse DB in order to find all the relevant information needed to support clinical drug use.
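The score-and-RD comparison against an "ideal" database can be made concrete with a toy calculation. In this sketch the item list, the 0-2 inclusion scale, and the two databases are invented; only the shape of the calculation follows the abstract.

```python
# Quantify inclusion of relevant information items per database, then express
# the relative difference (RD) from an "ideal" database that scores the
# maximum on every item (RD = 0 means it matches the ideal).
items = ["indications", "dosing", "contraindications", "interactions"]
max_per_item = 2  # assumed scale: 0 = absent, 1 = partial, 2 = complete

def relative_difference(item_scores):
    ideal = max_per_item * len(item_scores)
    return (sum(item_scores) - ideal) / ideal

db_a = [2, 2, 1, 1]  # invented inclusion scores for one database
db_b = [2, 1, 1, 0]
print(f"DB A: RD = {relative_difference(db_a):+.2f}")  # -0.25
print(f"DB B: RD = {relative_difference(db_b):+.2f}")  # -0.50
```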
A Molecular Framework for Understanding DCIS
2016-10-01
Pathologic and Clinical Annotation Database: A clinical annotation database titled the Breast Oncology Database has been established to ... complement the procured SPORE sample characteristics and annotated pathology data. This Breast Oncology Database is an offsite clinical annotation ... database [and] adheres to CSMC Enterprise Information Services (EIS) research database security standards. The Breast Oncology Database consists of: ... Baseline ...
Volcanoes of the World: Reconfiguring a scientific database to meet new goals and expectations
NASA Astrophysics Data System (ADS)
Venzke, Edward; Andrews, Ben; Cottrell, Elizabeth
2015-04-01
The Smithsonian Global Volcanism Program's (GVP) database of Holocene volcanoes and eruptions, Volcanoes of the World (VOTW), originated in 1971, and was largely populated with content from the IAVCEI Catalog of Active Volcanoes and some independent datasets. Volcanic activity reported by Smithsonian's Bulletin of the Global Volcanism Network and USGS/SI Weekly Activity Reports (and their predecessors), published research, and other varied sources has expanded the database significantly over the years. Three editions of the VOTW were published in book form, creating a catalog with new ways to display data that included regional directories, a gazetteer, and a 10,000-year chronology of eruptions. The widespread dissemination of the data in electronic media since the first GVP website in 1995 has created new challenges and opportunities for this unique collection of information. To better meet current and future goals and expectations, we have recently transitioned VOTW into a SQL Server database. This process included significant schema changes to the previous relational database, data auditing, and content review. We replaced a disparate, confusing, and changeable volcano numbering system with unique and permanent volcano numbers. We reconfigured structures for recording eruption data to allow greater flexibility in describing the complexity of observed activity, adding in the ability to distinguish episodes within eruptions (in time and space) and events (including dates) rather than characteristics that take place during an episode. We have added a reference link field in multiple tables to enable attribution of sources at finer levels of detail. We now store and connect synonyms and feature names in a more consistent manner, which will allow for morphological features to be given unique numbers and linked to specific eruptions or samples; if the designated overall volcano name is also a morphological feature, it is then also listed and described as that feature. One especially significant audit involved re-evaluating the categories of evidence used to include a volcano in the Holocene list, and reviewing in detail the entries in low-certainty categories. Concurrently, we developed a new data entry system that may in the future allow trusted users outside of Smithsonian to input data into VOTW. A redesigned website now provides new search tools and data download options. We are collaborating with organizations that manage volcano and eruption databases, physical sample databases, and geochemical databases to allow real-time connections and complex queries. VOTW serves the volcanological community by providing a clear and consistent core database of distinctly identified volcanoes and eruptions to advance goals in research, civil defense, and public outreach.
Distributed cyberinfrastructure tools for automated data processing of structural monitoring data
NASA Astrophysics Data System (ADS)
Zhang, Yilan; Kurata, Masahiro; Lynch, Jerome P.; van der Linden, Gwendolyn; Sederat, Hassan; Prakash, Atul
2012-04-01
The emergence of cost-effective sensing technologies has now enabled the use of dense arrays of sensors to monitor the behavior and condition of large-scale bridges. The continuous operation of dense networks of sensors presents a number of new challenges, including how to manage the massive amounts of data that can be created by the system. This paper reports on the progress of the creation of cyberinfrastructure tools which hierarchically control networks of wireless sensors deployed in a long-span bridge. The internet-enabled cyberinfrastructure is centrally managed by a powerful database which controls the flow of data in the entire monitoring system architecture. A client-server model built upon the database provides both data providers and system end-users with secured access to various levels of information about a bridge. In the system, information on bridge behavior (e.g., acceleration, strain, displacement) and environmental condition (e.g., wind speed, wind direction, temperature, humidity) is uploaded to the database from sensor networks installed in the bridge. Then, data interrogation services interface with the database via client APIs to autonomously process data. The current research effort focuses on an assessment of the scalability and long-term robustness of the proposed cyberinfrastructure framework, which has been implemented along with a permanent wireless monitoring system on the New Carquinez (Alfred Zampa Memorial) Suspension Bridge in Vallejo, CA. Many data interrogation tools are under development using sensor data and bridge metadata (e.g., geometric details, material properties). Sample data interrogation clients include those for the detection of faulty sensors and for automated modal parameter extraction.
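One of the sample interrogation clients mentioned, faulty-sensor detection, can be sketched as a simple screen over recent channel statistics. The variance thresholds and channel names below are invented for illustration; the deployed system's actual checks are not specified in this abstract.

```python
# Flag accelerometer channels whose recent variance collapses (stuck sensor)
# or explodes (excessively noisy channel).
import numpy as np

def flag_faulty(channels: dict, low_var=1e-8, high_var=1e2):
    flags = {}
    for name, samples in channels.items():
        v = float(np.var(samples))
        if v < low_var:
            flags[name] = "possibly stuck"
        elif v > high_var:
            flags[name] = "excessively noisy"
    return flags

rng = np.random.default_rng(0)
channels = {
    "deck_accel_01": rng.normal(0, 0.01, 1000),   # healthy
    "deck_accel_02": np.zeros(1000),              # stuck at zero
    "tower_accel_03": rng.normal(0, 50.0, 1000),  # noisy
}
print(flag_faulty(channels))
```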
Temporal Wind Pairs for Space Launch Vehicle Capability Assessment and Risk Mitigation
NASA Technical Reports Server (NTRS)
Decker, Ryan K.; Barbre, Robert E., Jr.
2015-01-01
Space launch vehicles incorporate upper-level wind assessments to determine wind effects on the vehicle and for a commit to launch decision. These assessments make use of wind profiles measured hours prior to launch and may not represent the actual wind the vehicle will fly through. Uncertainty in the winds over the time period between the assessment and launch introduces uncertainty in assessment of vehicle controllability and structural integrity that must be accounted for to ensure launch safety. Temporal wind pairs are used in engineering development of allowances to mitigate uncertainty. Five sets of temporal wind pairs at various times (0.75, 1.5, 2, 3 and 4-hrs) at the United States Air Force Eastern Range and Western Range, as well as the National Aeronautics and Space Administration's Wallops Flight Facility are developed for use in upper-level wind assessments on vehicle performance. Historical databases are compiled from balloon-based and vertically pointing Doppler radar wind profiler systems. Various automated and manual quality control procedures are used to remove unacceptable profiles. Statistical analyses on the resultant wind pairs from each site are performed to determine if the observed extreme wind changes in the sample pairs are representative of extreme temporal wind change. Wind change samples in the Eastern Range and Western Range databases characterize extreme wind change. However, the small sample sizes in the Wallops Flight Facility databases yield low confidence that the sample population characterizes extreme wind change that could occur.
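Constructing the temporal pairs amounts to matching archived profiles whose observation times differ by the target interval, within some tolerance. A minimal sketch with invented timestamps, wind speeds, and tolerance follows; real profiles are altitude-resolved rather than the single values used here.

```python
# Build temporal wind pairs: match each profile with one observed a target
# interval later (within a tolerance), then report the wind change.
from datetime import datetime, timedelta

profiles = [  # (observation time, wind speed in m/s at one altitude)
    (datetime(2014, 1, 1, 0, 0), 32.0),
    (datetime(2014, 1, 1, 2, 1), 38.5),
    (datetime(2014, 1, 1, 4, 0), 35.0),
]

def temporal_pairs(obs, offset=timedelta(hours=2), tol=timedelta(minutes=15)):
    pairs = []
    for t1, v1 in obs:
        for t2, v2 in obs:
            if abs((t2 - t1) - offset) <= tol:
                pairs.append((t1, v2 - v1))  # wind change over ~offset
    return pairs

for start, change in temporal_pairs(profiles):
    print(start, f"{change:+.1f} m/s change over ~2 h")
```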
Temporal Wind Pairs for Space Launch Vehicle Capability Assessment and Risk Mitigation
NASA Technical Reports Server (NTRS)
Decker, Ryan K.; Barbre, Robert E., Jr.
2014-01-01
Space launch vehicles incorporate upper-level wind assessments to determine wind effects on the vehicle and for a commit to launch decision. These assessments make use of wind profiles measured hours prior to launch and may not represent the actual wind the vehicle will fly through. Uncertainty in the winds over the time period between the assessment and launch introduces uncertainty in assessment of vehicle controllability and structural integrity that must be accounted for to ensure launch safety. Temporal wind pairs are used in engineering development of allowances to mitigate uncertainty. Five sets of temporal wind pairs at various times (0.75, 1.5, 2, 3 and 4-hrs) at the United States Air Force Eastern Range and Western Range, as well as the National Aeronautics and Space Administration's Wallops Flight Facility are developed for use in upper-level wind assessments on vehicle performance. Historical databases are compiled from balloon-based and vertically pointing Doppler radar wind profiler systems. Various automated and manual quality control procedures are used to remove unacceptable profiles. Statistical analyses on the resultant wind pairs from each site are performed to determine if the observed extreme wind changes in the sample pairs are representative of extreme temporal wind change. Wind change samples in the Eastern Range and Western Range databases characterize extreme wind change. However, the small sample sizes in the Wallops Flight Facility databases yield low confidence that the sample population characterizes extreme wind change that could occur.
Matlashov, Andrei Nikolaevich; Urbaitis, Algis V.; Savukov, Igor Mykhaylovich; Espy, Michelle A.; Volegov, Petr Lvovich; Kraus, Jr., Robert Henry
2013-03-05
Method comprising obtaining an NMR measurement from a sample, wherein an ultra-low-field NMR system probes the sample and produces the NMR measurement and wherein a sampling temperature, prepolarizing field, and measurement field are known; detecting the NMR measurement by means of inductive coils; analyzing the NMR measurement to obtain at least one measurement feature, wherein the measurement feature comprises T1, T2, T1ρ, or the frequency dependence thereof; and searching for the at least one measurement feature within a database comprising NMR reference data for at least one material to determine if the sample comprises a material of interest.
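The final claim step, matching measured relaxation features against reference data, can be sketched as a tolerance search. The reference values, tolerance, and material names below are invented; a real screener would also condition on the known prepolarizing and measurement fields and the sampling temperature, as the claim states.

```python
# Match measured relaxation features (T1, T2) against a reference database
# using a simple per-feature tolerance.
reference = {
    "water":    {"T1": 2.50, "T2": 2.20},  # seconds; illustrative only
    "ethanol":  {"T1": 1.90, "T2": 1.60},
    "material_of_interest": {"T1": 0.45, "T2": 0.12},
}

def identify(measured, tol=0.10):
    """Return reference materials whose T1 and T2 both match within tol (s)."""
    return [
        name for name, ref in reference.items()
        if abs(ref["T1"] - measured["T1"]) <= tol
        and abs(ref["T2"] - measured["T2"]) <= tol
    ]

print(identify({"T1": 0.50, "T2": 0.15}))  # ['material_of_interest']
```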
Competitive-Cooperative Automated Reasoning from Distributed and Multiple Source of Data
NASA Astrophysics Data System (ADS)
Fard, Amin Milani
Knowledge extraction from distributed database systems has been investigated during the past decade in order to analyze billions of information records. In this work, a competitive deduction approach in a heterogeneous data grid environment is proposed using classic data mining and statistical methods. By applying a game theory concept in a multi-agent model, we tried to design a policy for hierarchical knowledge discovery and inference fusion. To demonstrate the system's operation, a sample multi-expert system has also been developed.
Xiao, Di; You, Yuanhai; Bi, Zhenwang; Wang, Haibin; Zhang, Yongchan; Hu, Bin; Song, Yanyan; Zhang, Huifang; Kou, Zengqiang; Yan, Xiaomei; Zhang, Menghan; Jin, Lianmei; Jiang, Xihong; Su, Peng; Bi, Zhenqiang; Luo, Fengji; Zhang, Jianzhong
2013-03-01
There was a dramatic increase in scarlet fever cases in China from March to July 2011. Group A Streptococcus (GAS) is the only pathogen known to cause scarlet fever. Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) coupled to the Biotyper system was used for GAS identification in 2011. A local reference database (LRD) was constructed, evaluated and used to identify GAS isolates. The 75 GAS strains used to evaluate the LRD were all identified correctly. Of the 157 suspected β-hemolytic strains isolated from 298 throat swab samples, 127 (100%) and 120 (94.5%) of the isolates were identified as GAS by the MALDI-TOF MS system and the conventional bacitracin sensitivity test method, respectively. All 202 (100%) isolates were identified at the species level by searching the LRD, while 182 (90.1%) were identified by searching the original reference database (ORD). There were statistically significant differences, with a high degree of credibility at the species level, between the LRD and ORD (χ² = 6.052, P < 0.05). The test turnaround time was shortened by 36-48 h, and the cost per sample is one-tenth that of conventional methods. Establishing a domestic database is the most effective way to improve identification efficiency using a MALDI-TOF MS system. MALDI-TOF MS is a viable alternative to conventional methods and may aid in the diagnosis and surveillance of GAS. Copyright © 2013 Elsevier B.V. All rights reserved.
MISSE in the Materials and Processes Technical Information System (MAPTIS )
NASA Technical Reports Server (NTRS)
Burns, DeWitt; Finckenor, Miria; Henrie, Ben
2013-01-01
Materials International Space Station Experiment (MISSE) data is now being collected and distributed through the Materials and Processes Technical Information System (MAPTIS) at Marshall Space Flight Center in Huntsville, Alabama. MISSE data has been instrumental in many programs and continues to be an important source of data for the space community. To facilitate greater access to the MISSE data, the International Space Station (ISS) program office and MAPTIS are working to gather this data into a central location. The MISSE database contains information about materials, samples, and flights, along with pictures, PDFs, Excel files, Word documents, and other file types. Major capabilities of the system are: access control, browsing, searching, reports, and record comparison. The search capability will search within any searchable files, so even if the desired metadata has not been associated, data can still be retrieved. Other functionality will continue to be added to the MISSE database as the Athena Platform is expanded.
Srinivasan, R; Sugumar, V Raji
2015-10-04
For the first time, we have a comprehensive database on the usage of AYUSH (acronym for Ayurveda, Yoga and naturopathy, Unani, Siddha, and Homeopathy) in India at the household level. This article aims at exploring the spread of the traditional medical systems in India and the perceptions of people on the access and effectiveness of these medical systems using this database. The article uses the unit-level data purchased from the National Sample Survey Organization, New Delhi. The household is the basic unit of the survey, and the data are the collective opinion of the household. This survey shows that less than 30% of Indian households use the traditional medical systems. There is also a regional pattern in the usage of particular types of traditional medicine, reflecting the regional aspects of the development of such medical systems. Strong faith in AYUSH is the main reason for its usage; lack of need for AYUSH and lack of awareness about AYUSH are the main reasons for not using it. With regard to the source of medicines in the traditional medical systems, the home is the main source in the Indian medical systems and the private sector is the main source in Homeopathy. This shows that there is a need for creating awareness and improving access to traditional medical systems in India. By and large, the users of AYUSH are also convinced of the effectiveness of these traditional medicines. © The Author(s) 2015.
Variability in Standard Outcomes of Posterior Lumbar Fusion Determined by National Databases.
Joseph, Jacob R; Smith, Brandon W; Park, Paul
2017-01-01
National databases are used with increasing frequency in spine surgery literature to evaluate patient outcomes. The differences between individual databases in relationship to outcomes of lumbar fusion are not known. We evaluated the variability in standard outcomes of posterior lumbar fusion between the University HealthSystem Consortium (UHC) database and the Healthcare Cost and Utilization Project National Inpatient Sample (NIS). NIS and UHC databases were queried for all posterior lumbar fusions (International Classification of Diseases, Ninth Revision code 81.07) performed in 2012. Patient demographics, comorbidities (including obesity), length of stay (LOS), in-hospital mortality, and complications such as urinary tract infection, deep venous thrombosis, pulmonary embolism, myocardial infarction, durotomy, and surgical site infection were collected using specific International Classification of Diseases, Ninth Revision codes. Analysis included 21,470 patients from the NIS database and 14,898 patients from the UHC database. Demographic data were not significantly different between databases. Obesity was more prevalent in UHC (P = 0.001). Mean LOS was 3.8 days in NIS and 4.55 in UHC (P < 0.0001). Complications were significantly higher in UHC, including urinary tract infection, deep venous thrombosis, pulmonary embolism, myocardial infarction, surgical site infection, and durotomy. In-hospital mortality was similar between databases. NIS and UHC databases had similar demographic patient populations undergoing posterior lumbar fusion. However, the UHC database reported significantly higher complication rate and longer LOS. This difference may reflect academic institutions treating higher-risk patients; however, a definitive reason for the variability between databases is unknown. The inability to precisely determine the basis of the variability between databases highlights the limitations of using administrative databases for spinal outcome analysis. Copyright © 2016 Elsevier Inc. All rights reserved.
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Sayers, Eric W
2010-01-01
GenBank is a comprehensive database that contains publicly available nucleotide sequences for more than 300,000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bi-monthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI homepage: www.ncbi.nlm.nih.gov.
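As a usage illustration, GenBank records can be retrieved programmatically through the Entrez system mentioned above, for example with the Biopython wrapper around NCBI's E-utilities. This sketch assumes the biopython package is installed and network access is available; accession U49845 is a common public example record.

```python
# Fetch one GenBank flat-file record via NCBI Entrez using Biopython.
from Bio import Entrez

Entrez.email = "you@example.org"  # NCBI asks for a contact address

handle = Entrez.efetch(db="nucleotide", id="U49845",
                       rettype="gb", retmode="text")
record_text = handle.read()
handle.close()

print(record_text[:200])  # header of the GenBank record
```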
CMO: Cruise Metadata Organizer for JAMSTEC Research Cruises
NASA Astrophysics Data System (ADS)
Fukuda, K.; Saito, H.; Hanafusa, Y.; Vanroosebeke, A.; Kitayama, T.
2011-12-01
JAMSTEC's Data Research Center for Marine-Earth Sciences manages and distributes a wide variety of observational data and samples obtained from JAMSTEC research vessels and deep sea submersibles. Generally, metadata are essential to identify how data and samples were obtained. In JAMSTEC, cruise metadata include cruise information such as cruise ID, name of vessel and research theme, and diving information such as dive number, name of submersible and position of diving point. They are submitted by the chief scientists of research cruises in Microsoft Excel spreadsheet format, and registered into a data management database to confirm receipt of observational data files, cruise summaries, and cruise reports. The cruise metadata are also published via the "JAMSTEC Data Site for Research Cruises" within two months after the end of a cruise. Furthermore, these metadata are distributed with observational data, images and samples via several data and sample distribution websites after a publication moratorium period. However, there are two operational issues in the metadata publishing process. One is duplicated effort and asynchronous metadata across multiple distribution websites, due to manual metadata entry into individual websites by administrators. The other is that data types and representations of metadata differ between websites. To solve those problems, we have developed a cruise metadata organizer (CMO) which allows cruise metadata to be connected from the data management database to several distribution websites. CMO is comprised of three components: an Extensible Markup Language (XML) database, Enterprise Application Integration (EAI) software, and a web-based interface. The XML database is used because of its flexibility for any change of metadata. Daily differential uptake of metadata from the data management database to the XML database is automatically processed via the EAI software. Some metadata are entered into the XML database using the web-based interface by a metadata editor in CMO as needed. Then daily differential uptake of metadata from the XML database to databases in several distribution websites is automatically processed using a converter defined by the EAI software. Currently, CMO is available for three distribution websites: "Deep Sea Floor Rock Sample Database GANSEKI", "Marine Biological Sample Database", and "JAMSTEC E-library of Deep-sea Images". CMO is planned to provide "JAMSTEC Data Site for Research Cruises" with metadata in the future.
Kiener, Joos
2013-12-11
Research in organic chemistry generates samples of novel chemicals together with their properties and other related data. The involved scientists must be able to store this data and search it by chemical structure. There are commercial solutions for common needs like chemical registration systems or electronic lab notebooks. However, for specific requirements of in-house databases and processes no such solutions exist. Another issue is that commercial solutions have the risk of vendor lock-in and may require an expensive license of a proprietary relational database management system. To speed up and simplify the development for applications that require chemical structure search capabilities, I have developed Molecule Database Framework. The framework abstracts the storing and searching of chemical structures into method calls. Therefore software developers do not require extensive knowledge about chemistry and the underlying database cartridge. This decreases application development time. Molecule Database Framework is written in Java and I created it by integrating existing free and open-source tools and frameworks. The core functionality includes:
- Support for multi-component compounds (mixtures)
- Import and export of SD-files
- Optional security (authorization)
For chemical structure searching Molecule Database Framework leverages the capabilities of the Bingo Cartridge for PostgreSQL and provides type-safe searching, caching, transactions and optional method level security. Molecule Database Framework supports multi-component chemical compounds (mixtures). Furthermore the design of entity classes and the reasoning behind it are explained. By means of a simple web application I describe how the framework could be used. I then benchmarked this example application to create some basic performance expectations for chemical structure searches and import and export of SD-files. By using a simple web application it was shown that Molecule Database Framework successfully abstracts chemical structure searches and SD-File import and export to simple method calls. The framework offers good search performance on a standard laptop without any database tuning. This is also due to the fact that chemical structure searches are paged and cached. Molecule Database Framework is available for download on the project's web page on bitbucket: https://bitbucket.org/kienerj/moleculedatabaseframework.
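To illustrate what "abstracting structure search into method calls" means for an application developer, here is a deliberately simplified sketch. It is written in Python for consistency with the other examples in this document and is purely hypothetical: the class, the method names, and the string-containment "search" are placeholders, not the framework's Java API and not real substructure matching.

```python
# Hypothetical repository hiding storage and structure search behind methods.
class MoleculeRepository:
    def __init__(self):
        self._store = {}  # id -> SMILES string

    def register(self, mol_id, smiles):
        self._store[mol_id] = smiles

    def substructure_search(self, fragment):
        # Real cartridges (e.g., Bingo) perform graph matching; naive string
        # containment stands in for it here purely as a placeholder.
        return [mid for mid, smi in self._store.items() if fragment in smi]

repo = MoleculeRepository()
repo.register("CMPD-1", "CCO")      # ethanol
repo.register("CMPD-2", "CC(=O)O")  # acetic acid
print(repo.substructure_search("C(=O)O"))  # ['CMPD-2']
```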
Wattal, C; Oberoi, J K; Goel, N; Raveendran, R; Khanna, S
2017-05-01
The study evaluates the utility of matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF MS) Vitek MS for identification of microorganisms in the routine clinical microbiology laboratory. From May 2013 to April 2014, microbial isolates recovered from various clinical samples were identified by Vitek MS. When Vitek MS failed to identify an isolate, it was identified using the Vitek 2 system (bioMerieux, France) and serotyping wherever applicable, or otherwise by nucleic acid-based methods. All moulds were identified by lactophenol blue mounts, and mycobacterial isolates were identified by molecular identification systems, either AccuProbe (bioMerieux, France) or GenoType Mycobacterium CM (Hain Lifescience, Germany). Of the 12,003 isolates, Vitek MS gave a good overall identification at the genus and/or species level for up to 97.7% of bacterial isolates, 92.8% of yeasts and 80% of filamentous fungi. Of the 26 mycobacteria tested, only 42.3% could be identified using the Saramis RUO (Research Use Only) database. Vitek MS could not identify 34 of the 35 yeast isolates identified as C. haemulonii by Vitek 2; subsequently, 17 of these isolates were identified as Candida auris (not present in the Vitek MS database) by 18S rRNA sequencing. Using these strains, an in-house superspectrum of C. auris was created in the Vitek MS database. Use of MALDI-TOF MS allows rapid identification of aerobic bacteria and yeasts in clinical practice. However, improved sample extraction protocols and database upgrades that include locally representative strains are required, especially for moulds.
Feline mitochondrial DNA sampling for forensic analysis: when enough is enough!
Grahn, Robert A; Alhaddad, Hasan; Alves, Paulo C; Randi, Ettore; Waly, Nashwa E; Lyons, Leslie A
2015-05-01
Pet hair has demonstrated value in resolving legal issues. Cat hair is chronically shed, and it is difficult to leave a home with cats without some level of secondary transfer. The power of cat hair as an evidentiary resource may be underused because representative genetic databases are not available for exclusionary purposes. Mitochondrial control region databases are highly valuable for hair analyses and have been developed for the cat. In a representative worldwide data set, 83% of domestic cat mitotypes belong to one of twelve major types. Of the remaining 17%, 7.5% are unique within the published 1394-sample database. The current research evaluates the sample size necessary to establish a representative population for forensic comparison of the mitochondrial control region for the domestic cat. For most worldwide populations, randomly sampling 50 unrelated local individuals will achieve saturation at the 95% level. Saturation at the 99% level is achieved by randomly sampling 60-170 cats, depending on the number of mitotypes present in the population at large. Likely owing to the recent domestication of the cat and minimal localized population substructure, fewer cats are needed to reach practical saturation of a mitochondrial DNA control region database than are needed for humans or dogs. Coupled with the available worldwide feline control region database of nearly 1400 cats, minimal local sampling will be required to establish an appropriate comparative representative database and achieve significant exclusionary power. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
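To make the notion of database "saturation" concrete, here is a toy Monte Carlo sketch: it samples n cats from a hypothetical mitotype frequency spectrum (the frequencies are invented, not the paper's data) and reports the expected fraction of the population's frequency mass covered by the observed mitotypes.

```python
# Toy Monte Carlo of mitotype "saturation" (illustrative frequencies, not the
# paper's data): sample n cats and measure what fraction of the population's
# mitotype frequency mass the observed types cover.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical spectrum: 12 common mitotypes plus a tail of 100 rare ones.
freqs = np.array([0.20, 0.15, 0.12, 0.09, 0.07, 0.06, 0.05, 0.03, 0.02, 0.02,
                  0.01, 0.01] + [0.17 / 100] * 100)
freqs /= freqs.sum()

def saturation(n, trials=2000):
    """Mean fraction of frequency mass covered by mitotypes seen in n samples."""
    cover = 0.0
    for _ in range(trials):
        seen = np.unique(rng.choice(len(freqs), size=n, p=freqs))
        cover += freqs[seen].sum()
    return cover / trials

for n in (25, 50, 100, 170):
    print(f"n={n:4d}  expected coverage={saturation(n):.3f}")
```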
NASA Technical Reports Server (NTRS)
Mohr, Karen I.; Molinari, John; Thorncroft, Chris
2009-01-01
The characteristics of convective system populations in West Africa and the western Pacific tropical cyclone basin were analyzed to investigate whether interannual variability in convective activity in tropical continental and oceanic environments is driven by variations in the number of events during the wet season or by conditions favoring larger and/or more intense convective systems. Convective systems were defined from Tropical Rainfall Measuring Mission (TRMM) data as clusters of pixels with an 85-GHz polarization-corrected brightness temperature below 255 K and an area of at least 64 square kilometers. The study database consisted of convective systems in West Africa from May to September 1998-2007 and in the western Pacific from May to November 1998-2007. Annual cumulative frequency distributions of system minimum brightness temperature and system area were constructed for both regions. For both regions, there were no statistically significant differences between the annual curves for system minimum brightness temperature. There were two groups of system area curves, split by the TRMM altitude boost in 2001; within each group, there was no statistically significant interannual variability. Subsetting the database revealed some sensitivity of distribution shape to the size of the sampling area, the length of the sample period, and the climate zone. From a regional perspective, the stability of the cumulative frequency distributions implied that the probability that a convective system would attain a particular size or intensity does not change interannually. Variability in the number of convective events appeared to be more important in determining whether a year is wetter or drier than normal.
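The cluster definition above maps directly onto connected-component labeling. The sketch below is an illustrative Python version using scipy.ndimage, with a synthetic brightness-temperature grid and an assumed per-pixel area (the real TMI pixel geometry differs).

```python
# Sketch of the convective-system definition described above: contiguous pixels
# with 85-GHz PCT below 255 K and a cluster area of at least 64 km^2. The
# brightness-temperature grid and the per-pixel area are made-up inputs.
import numpy as np
from scipy import ndimage

def find_systems(pct, pixel_area_km2=16.0, tb_max=255.0, min_area_km2=64.0):
    """Label cold clusters and keep those meeting the minimum area."""
    cold = pct < tb_max
    labels, n = ndimage.label(cold)  # 4-connected components by default
    systems = []
    for lab in range(1, n + 1):
        mask = labels == lab
        area = mask.sum() * pixel_area_km2
        if area >= min_area_km2:
            systems.append({"area_km2": float(area),
                            "min_tb_K": float(pct[mask].min())})
    return systems

pct = 260.0 + 10.0 * np.random.default_rng(1).standard_normal((50, 50))
for s in find_systems(pct):
    print(s)
```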
NASA Astrophysics Data System (ADS)
Penttilä, Antti; Martikainen, Julia; Gritsevich, Maria; Muinonen, Karri
2018-02-01
Meteorite samples are measured with the University of Helsinki integrating-sphere UV-vis-NIR spectrometer. The resulting spectra of 30 meteorites are compared with selected spectra from the NASA Planetary Data System meteorite spectra database. The spectral measurements are transformed with principal component analysis, and it is shown that different meteorite types can be distinguished from the transformed data. The motivation is to improve the link between asteroid spectral observations and meteorite spectral measurements.
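As an illustration of the transformation step, this minimal sketch projects synthetic stand-in spectra for two hypothetical meteorite types onto their first two principal components with scikit-learn; real reflectance spectra would replace the fabricated arrays.

```python
# Minimal PCA sketch: project spectra onto the first two principal components
# so that class separation can be inspected. Spectra are synthetic stand-ins.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
wavelengths = np.linspace(250, 2500, 200)  # nm, UV-vis-NIR range
# Two fake "types": a flat spectrum and a red-sloped spectrum, plus noise.
type_a = 0.10 + 0.005 * rng.standard_normal((15, 200))
type_b = 0.08 + 0.05 * wavelengths / 2500 + 0.005 * rng.standard_normal((15, 200))
spectra = np.vstack([type_a, type_b])

scores = PCA(n_components=2).fit_transform(spectra)
print("type A centroid in PC space:", scores[:15].mean(axis=0))
print("type B centroid in PC space:", scores[15:].mean(axis=0))
```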
CRN5EXP: Expert system for statistical quality control
NASA Technical Reports Server (NTRS)
Hentea, Mariana
1991-01-01
The purpose of the Expert System CRN5EXP is to assist in checking the quality of the coils at two very important mills in a steel plant: Hot Rolling and Cold Rolling. The system interprets the statistical quality control charts, and diagnoses and predicts the quality of the steel. Measurements of process control variables are recorded in a database, and sample statistics such as the mean and the range are computed and plotted on a control chart. The chart is analyzed for patterns using the C Language Integrated Production System (CLIPS) and a forward-chaining technique to reach a conclusion about the causes of defects and to recommend management measures for improving the quality control techniques. The Expert System combines the certainty factors associated with the process control variables to predict the quality of the steel. The paper presents the approach used to extract data from the database, the rationale for combining certainty factors, and the architecture and use of the Expert System. The interpretation of control chart patterns requires human expert knowledge, which lends itself to encoding as expert-system rules.
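The sample statistics mentioned (subgroup mean and range) feed standard X-bar/R control limits. The sketch below computes them for fabricated coil measurements, using the usual textbook A2/D3/D4 constants for subgroups of size 5.

```python
# Sketch of the sample statistics behind the control charts: subgroup means and
# ranges with standard X-bar/R limits. A2/D3/D4 are the usual textbook
# constants for subgroups of size 5; the measurements are fabricated.
import numpy as np

gen = np.random.default_rng(3)
subgroups = gen.normal(loc=10.0, scale=0.2, size=(25, 5))  # 25 subgroups of 5 coils

xbar = subgroups.mean(axis=1)
ranges = subgroups.max(axis=1) - subgroups.min(axis=1)
xbarbar, rbar = xbar.mean(), ranges.mean()

A2, D3, D4 = 0.577, 0.0, 2.114  # constants for subgroup size n = 5
ucl, lcl = xbarbar + A2 * rbar, xbarbar - A2 * rbar
print(f"X-bar chart: CL={xbarbar:.3f}  UCL={ucl:.3f}  LCL={lcl:.3f}")
print(f"R chart:     CL={rbar:.3f}  UCL={D4*rbar:.3f}  LCL={D3*rbar:.3f}")
print("subgroups out of control:", np.flatnonzero((xbar > ucl) | (xbar < lcl)))
```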
Autonomous Systems and Robotics: 2000-2004
NASA Technical Reports Server (NTRS)
2004-01-01
This custom bibliography from the NASA Scientific and Technical Information Program lists a sampling of records found in the NASA Aeronautics and Space Database. The scope of this topic includes technologies to monitor, maintain, and where possible, repair complex space systems. This area of focus is one of the enabling technologies as defined by NASA's Report of the President's Commission on Implementation of United States Space Exploration Policy, published in June 2004.
Geospatial database for regional environmental assessment of central Colorado.
Church, Stan E.; San Juan, Carma A.; Fey, David L.; Schmidt, Travis S.; Klein, Terry L.; DeWitt, Ed H.; Wanty, Richard B.; Verplanck, Philip L.; Mitchell, Katharine A.; Adams, Monique G.; Choate, LaDonna M.; Todorov, Todor I.; Rockwell, Barnaby W.; McEachron, Luke; Anthony, Michael W.
2012-01-01
In conjunction with the future planning needs of the U.S. Department of Agriculture, Forest Service, the U.S. Geological Survey conducted a detailed environmental assessment of the effects of historical mining on Forest Service lands in central Colorado. Stream sediment, macroinvertebrate, and various filtered and unfiltered water quality samples were collected during low-flow over a four-year period from 2004–2007. This report summarizes the sampling strategy, data collection, and analyses performed on these samples. The data are presented in Geographic Information System, Microsoft Excel, and comma-delimited formats. Reports on data interpretation are being prepared separately.
Searching mixed DNA profiles directly against profile databases.
Bright, Jo-Anne; Taylor, Duncan; Curran, James; Buckleton, John
2014-03-01
DNA databases have revolutionised forensic science. They are a powerful investigative tool as they have the potential to identify persons of interest in criminal investigations. Routinely, a DNA profile generated from a crime sample could only be searched for in a database of individuals if the stain came from a single contributor (single source) or if a contributor could be unambiguously determined from a mixed DNA profile. This meant that a significant number of samples were unsuitable for database searching. The advent of continuous methods for the interpretation of DNA profiles offers an advanced way to draw inferential power from the considerable investment made in DNA databases. Using these methods, each profile on the database may be considered a possible contributor to a mixture and a likelihood ratio (LR) can be formed. Those profiles which produce a sufficiently large LR can serve as an investigative lead. In this paper, empirical studies are described to determine what constitutes a large LR. We investigate the effect on a database search of complex mixed DNA profiles with contributors in equal proportions, considering dropout, as well as the effect of an incorrect assignment of the number of contributors to a profile. In addition, as a demonstration of the method, we give the results for two crime samples that were previously unsuitable for database comparison. We show that effective management of the selection of samples for searching and the interpretation of the output can be highly informative. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
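The continuous LR model itself is beyond a short example, but the screening idea (score every database profile against the mixture and keep the high scorers) can be caricatured. The sketch below uses a deliberately naive allele-inclusion score with invented loci and profiles, purely to show the shape of the search loop, not the statistical model described in the paper.

```python
# Deliberately simplified screen (NOT the continuous LR model of the paper):
# rank database profiles by the fraction of their alleles contained in the
# mixture, as a stand-in for "profiles producing a sufficiently large LR".
MIXTURE = {                       # locus -> alleles observed in the mixture
    "D3S1358": {14, 15, 16, 17},
    "vWA":     {16, 17, 18},
    "FGA":     {20, 22, 23, 24},
}

DATABASE = {                      # invented single-source reference profiles
    "person_A": {"D3S1358": {15, 16}, "vWA": {16, 18}, "FGA": {22, 24}},
    "person_B": {"D3S1358": {12, 13}, "vWA": {14, 15}, "FGA": {19, 21}},
}

def inclusion_score(profile, mixture):
    """Fraction of the profile's alleles that appear in the mixture."""
    hits = total = 0
    for locus, alleles in profile.items():
        total += len(alleles)
        hits += len(alleles & mixture.get(locus, set()))
    return hits / total

for name, profile in sorted(DATABASE.items()):
    print(name, f"{inclusion_score(profile, MIXTURE):.2f}")
```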
Choosing an Optimal Database for Protein Identification from Tandem Mass Spectrometry Data.
Kumar, Dhirendra; Yadav, Amit Kumar; Dash, Debasis
2017-01-01
Database searching is the preferred method for protein identification from the mass-to-charge (m/z) spectra acquired from protein samples on mass spectrometers. The search database is one of the major factors influencing which proteins are discovered in the sample and, thus, the biological conclusions derived. In most cases the choice of search database is arbitrary. Here we describe common search databases used in proteomic studies and their impact on the final list of identified proteins. We also elaborate upon factors, such as the composition and size of the search database, that can influence the protein identification process. In conclusion, we suggest that the choice of database depends on the type of inferences to be derived from the proteomics data; however, making the additional effort to build a compact and concise database for a targeted question should generally be rewarded with more confident protein identifications.
Database resources of the National Center for Biotechnology Information.
2016-01-04
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank(®) nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (PubMed Central (PMC), Bookshelf and PubReader), health (ClinVar, dbGaP, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen), genomes (BioProject, Assembly, Genome, BioSample, dbSNP, dbVar, Epigenomics, the Map Viewer, Nucleotide, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser and the Trace Archive), genes (Gene, Gene Expression Omnibus (GEO), HomoloGene, PopSet and UniGene), proteins (Protein, the Conserved Domain Database (CDD), COBALT, Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB) and Protein Clusters) and chemicals (Biosystems and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for most of these databases. Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized datasets. All of these resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov. Published by Oxford University Press on behalf of Nucleic Acids Research 2015. This work is written by (a) US Government employee(s) and is in the public domain in the US.
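Since the abstract points to the Entrez system for search and retrieval, here is a small example of querying it programmatically through the public E-utilities HTTP interface (a real, documented NCBI API). The search term is arbitrary, and heavy use calls for an API key and rate limiting per NCBI's usage policy.

```python
# Query NCBI's Entrez system via the documented E-utilities esearch endpoint.
import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
params = {"db": "pubmed", "term": "microbial rhodopsin",
          "retmax": 5, "retmode": "json"}

result = requests.get(ESEARCH, params=params, timeout=30).json()["esearchresult"]
print("total hits:", result["count"])
print("first PMIDs:", result["idlist"])
```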
Database resources of the National Center for Biotechnology Information.
2015-01-01
The National Center for Biotechnology Information (NCBI) provides a large suite of online resources for biological information and data, including the GenBank(®) nucleic acid sequence database and the PubMed database of citations and abstracts for published life science journals. Additional NCBI resources focus on literature (Bookshelf, PubMed Central (PMC) and PubReader); medical genetics (ClinVar, dbMHC, the Genetic Testing Registry, HIV-1/Human Protein Interaction Database and MedGen); genes and genomics (BioProject, BioSample, dbSNP, dbVar, Epigenomics, Gene, Gene Expression Omnibus (GEO), Genome, HomoloGene, the Map Viewer, Nucleotide, PopSet, Probe, RefSeq, Sequence Read Archive, the Taxonomy Browser, Trace Archive and UniGene); and proteins and chemicals (Biosystems, COBALT, the Conserved Domain Database (CDD), the Conserved Domain Architecture Retrieval Tool (CDART), the Molecular Modeling Database (MMDB), Protein Clusters, Protein and the PubChem suite of small molecule databases). The Entrez system provides search and retrieval operations for many of these databases. Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of these resources can be accessed through the NCBI home page at http://www.ncbi.nlm.nih.gov. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.
NASA Astrophysics Data System (ADS)
Kale, Mandar; Mukhopadhyay, Sudipta; Dash, Jatindra K.; Garg, Mandeep; Khandelwal, Niranjan
2016-03-01
Interstitial lung disease (ILD) is a complicated group of pulmonary disorders. High-resolution computed tomography (HRCT) is considered the best imaging technique for analysis of different pulmonary disorders. HRCT findings can be categorised into several patterns, viz. consolidation, emphysema, ground glass opacity, nodular, normal, etc., based on their texture-like appearance. Clinicians often find it difficult to diagnose these patterns because of their complex nature. In such scenarios a computer-aided diagnosis system could help clinicians to identify the patterns. Several approaches have been proposed for the classification of ILD patterns, including the computation of textural features and the training/testing of classifiers such as artificial neural networks (ANN) and support vector machines (SVM). In this paper, wavelet features are calculated from two different ILD databases, the publicly available MedGIFT ILD database and a private ILD database, followed by performance evaluation of ANN and SVM classifiers in terms of average accuracy. It is found that the average classification accuracy of the SVM is greater than that of the ANN when trained and tested on the same database. The investigation continued by testing the variation in classifier accuracy when training and testing are performed on alternate databases, and when the classifiers are trained and tested on a database formed by merging samples of the same class from the two individual databases. The average classification accuracy drops when two independent databases are used for training and testing, respectively, and improves significantly when the classifiers are trained and tested on the merged database. This implies that classification accuracy depends on the training data. It is observed that the SVM outperforms the ANN when the same database is used for training and testing.
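A hedged sketch of the feature pipeline the abstract describes: 2-D wavelet decomposition of image patches, subband energies as features, and an SVM evaluated by cross-validated accuracy. The patches are synthetic stand-ins, and PyWavelets plus scikit-learn are assumed available.

```python
# Sketch: wavelet subband energies as texture features, then an SVM.
import numpy as np
import pywt
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def wavelet_energies(patch, wavelet="db2", level=2):
    """Energy of each wavelet subband of a 2-D image patch."""
    coeffs = pywt.wavedec2(patch, wavelet, level=level)
    feats = [np.mean(coeffs[0] ** 2)]            # approximation subband
    for cH, cV, cD in coeffs[1:]:                # detail subbands per level
        feats += [np.mean(cH ** 2), np.mean(cV ** 2), np.mean(cD ** 2)]
    return feats

rng = np.random.default_rng(4)
smooth = rng.normal(0.0, 0.1, (100, 32, 32))                   # low-texture stand-in
textured = smooth + 0.5 * rng.standard_normal((100, 32, 32))   # high-texture stand-in
X = np.array([wavelet_energies(p) for p in np.vstack([smooth, textured])])
y = np.array([0] * 100 + [1] * 100)

print("cross-validated SVM accuracy:", cross_val_score(SVC(), X, y, cv=5).mean())
```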
Moyle, Phillip R.; Wallis, John C.; Bliss, James D.; Bolm, Karen D.
2004-01-01
The U.S. Geological Survey (USGS) compiled a database of aggregate sites and geotechnical sample data for six counties - Ada, Boise, Canyon, Elmore, Gem, and Owyhee - in southwest Idaho as part of a series of studies in support of the Bureau of Land Management (BLM) planning process. Emphasis is placed on sand and gravel sites in deposits of the Boise River, Snake River, and other fluvial systems and in Neogene lacustrine deposits. Data were collected primarily from unpublished Idaho Transportation Department (ITD) records and BLM site descriptions, published Army Corps of Engineers (ACE) records, and USGS sampling data. The results of this study provide important information needed by land-use planners and resource managers, particularly in the BLM, to anticipate and plan for demand and development of sand and gravel and other mineral material resources on public lands in response to urban growth in southwestern Idaho.
PlantDB – a versatile database for managing plant research
Exner, Vivien; Hirsch-Hoffmann, Matthias; Gruissem, Wilhelm; Hennig, Lars
2008-01-01
Background Research in plant science laboratories often involves usage of many different species, cultivars, ecotypes, mutants, alleles or transgenic lines. This creates a great challenge in keeping track of the identity of experimental plants and stored samples or seeds. Results Here, we describe PlantDB – a Microsoft® Office Access database – with a user-friendly front-end for managing information relevant to experimental plants. PlantDB can hold information about plants of different species, cultivars or genetic composition. Introduction of a concise identifier system allows easy generation of pedigree trees. In addition, all information about any experimental plant – from growth conditions and dates, through extracted samples such as RNA, to files containing images of the plants – can be linked unequivocally. Conclusion We have been using PlantDB for several years in our laboratory and have found that it greatly facilitates access to relevant information. PMID:18182106
NASA Astrophysics Data System (ADS)
van Rensburg, L.; Claassens, S.; Bezuidenhout, J. J.; Jansen van Rensburg, P. J.
2009-03-01
The much-publicised problem of major asbestos pollution and related health issues in South Africa has prompted calls for action to remedy the situation. The aim of this project was to establish a prioritisation index providing a scientifically based sequence in which polluted asbestos mines in Southern Africa ought to be rehabilitated. It was reasoned that a computerised database capable of calculating such a Rehabilitation Prioritisation Index (RPI) would be a fruitful departure from the previously used subjective selection, which was prone to human bias. The database was developed in Microsoft Access, and both quantitative and qualitative data were used for the calculation of the RPI value. The logical database structure consists of a number of mines, each consisting of a number of dumps, for which a number of samples have been analysed to determine asbestos fibre content. For this system to remain accurate as well as relevant, the data in the database should be revalidated and updated on a regular basis.
Final Results of Shuttle MMOD Impact Database
NASA Technical Reports Server (NTRS)
Hyde, J. L.; Christiansen, E. L.; Lear, D. M.
2015-01-01
The Shuttle Hypervelocity Impact Database documents damage features on each Orbiter thought to be from micrometeoroids (MM) or orbital debris (OD). Data is divided into tables for crew module windows, payload bay door radiators and thermal protection systems along with other miscellaneous regions. The combined number of records in the database is nearly 3000. Each database record provides impact feature dimensions, location on the vehicle and relevant mission information. Additional detail on the type and size of particle that produced the damage site is provided when sampling data and definitive spectroscopic analysis results are available. Guidelines are described which were used in determining whether impact damage is from micrometeoroid or orbital debris impact based on the findings from scanning electron microscopy chemical analysis. Relationships assumed when converting from observed feature sizes in different shuttle materials to particle sizes will be presented. A small number of significant impacts on the windows, radiators and wing leading edge will be highlighted and discussed in detail, including the hypervelocity impact testing performed to estimate particle sizes that produced the damage.
MicRhoDE: a curated database for the analysis of microbial rhodopsin diversity and evolution
Boeuf, Dominique; Audic, Stéphane; Brillet-Guéguen, Loraine; Caron, Christophe; Jeanthon, Christian
2015-01-01
Microbial rhodopsins are a diverse group of photoactive transmembrane proteins found in all three domains of life and in viruses. Today, microbial rhodopsin research is a flourishing research field in which new understandings of rhodopsin diversity, function and evolution are contributing to broader microbiological and molecular knowledge. Here, we describe MicRhoDE, a comprehensive, high-quality and freely accessible database that facilitates analysis of the diversity and evolution of microbial rhodopsins. Rhodopsin sequences isolated from a vast array of marine and terrestrial environments were manually collected and curated. To each rhodopsin sequence are associated related metadata, including predicted spectral tuning of the protein, putative activity and function, taxonomy for sequences that can be linked to a 16S rRNA gene, sampling date and location, and supporting literature. The database currently covers 7857 aligned sequences from more than 450 environmental samples or organisms. Based on a robust phylogenetic analysis, we introduce an operational classification system with multiple phylogenetic levels ranging from superclusters to species-level operational taxonomic units. An integrated pipeline for online sequence alignment and phylogenetic tree construction is also provided. With a user-friendly interface and integrated online bioinformatics tools, this unique resource should be highly valuable for upcoming studies of the biogeography, diversity, distribution and evolution of microbial rhodopsins. Database URL: http://micrhode.sb-roscoff.fr. PMID:26286928
NASA Technical Reports Server (NTRS)
Schrader, Christian M.; Rickman, Doug; Stoeser, Douglas; Wentworth, Susan; McKay, Dave S.; Botha, Pieter; Butcher, Alan R.; Horsch, Hanna E.; Benedictus, Aukje; Gottlieb, Paul
2008-01-01
This slide presentation reviews work to analyze lunar highland regolith samples from Apollo 16 core sample 64001/2 and simulants of lunar regolith, and to build a comparative database. The work is part of a larger effort to compile an internally consistent database on lunar regolith (Apollo samples) and lunar regolith simulants, in support of a future lunar outpost. The aim is to characterize existing lunar regolith and simulants in terms of particle type, particle size distribution, particle shape distribution, bulk density, and other compositional characteristics, and to evaluate the regolith simulants on the same properties in comparison to the Apollo sample lunar regolith.
The Rules of the Game: Properties of a Database of Expository Language Samples
ERIC Educational Resources Information Center
Heilmann, John; Malone, Thomas O.
2014-01-01
Purpose: The authors created a database of expository oral language samples with the aims of describing the nature of students' expository discourse and providing benchmark data for typically developing preteen and teenage students. Method: Using a favorite game or sport protocol, language samples were collected from 235 typically developing…
Tanaka, Yuichiro; Takahashi, Hajime; Kitazawa, Nao; Kimura, Bon
2010-01-01
A rapid system using terminal restriction fragment length polymorphism (T-RFLP) analysis targeting 16S rDNA is described for microbial population analysis in edible fish samples. The defined terminal restriction fragment database was constructed by collecting 102 strains of bacteria representing 53 genera that are associated with fish. Digestion of these 102 strains with two restriction enzymes, HhaI and MspI, formed 54 pattern groups with discrimination to the genus level. This T-RFLP system produced results comparable to those from a culture-based method in six natural fish samples with a qualitative correspondence of 71.4 to 92.3%. Using the T-RFLP system allowed an estimation of the microbial population within 7 h. Rapid assay of the microbial population is advantageous for food manufacturers and testing laboratories; moreover, the strategy presented here allows adaptation to specific testing applications.
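To illustrate the lookup that such a terminal-restriction-fragment database enables, the toy sketch below matches an observed (HhaI, MspI) fragment-length pair against a small reference table at a +/-2 bp tolerance. The genera and lengths are invented for illustration, not taken from the paper's database.

```python
# Toy version of the matching step: compare observed terminal fragment lengths
# from the two digests against a reference table of fragment pairs.
REFERENCE = {                 # genus -> (HhaI T-RF bp, MspI T-RF bp), hypothetical
    "Pseudomonas": (205, 490),
    "Vibrio":      (368, 492),
    "Shewanella":  (566, 148),
}

def match(hhai_bp, mspi_bp, tol=2):
    """Return genera whose reference fragment pair matches within tolerance."""
    return [genus for genus, (h, m) in REFERENCE.items()
            if abs(h - hhai_bp) <= tol and abs(m - mspi_bp) <= tol]

print(match(204, 491))   # -> ['Pseudomonas']
print(match(300, 300))   # -> [] (no genus-level match)
```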
The Master Lens Database and The Orphan Lenses Project
NASA Astrophysics Data System (ADS)
Moustakas, Leonidas
2012-10-01
Strong gravitational lenses are uniquely suited for the study of dark matter structure and substructure within massive halos of many scales, act as gravitational telescopes for distant faint objects, and can give powerful and competitive cosmological constraints. While hundreds of strong lenses are known to date, spanning five orders of magnitude in mass scale, thousands will be identified this decade. To fully exploit the power of these objects presently, and in the near future, we are creating the Master Lens Database. This is a clearinghouse of all known strong lens systems, with a sophisticated and modern database of uniformly measured and derived observational and lens-model derived quantities, using archival Hubble data across several instruments. This Database enables new science that can be done with a comprehensive sample of strong lenses. The operational goal of this proposal is to develop the process and the code to semi-automatically stage Hubble data of each system, create appropriate masks of the lensing objects and lensing features, and derive gravitational lens models, to provide a uniform and fairly comprehensive information set that is ingested into the Database. The scientific goal for this team is to use the properties of the ensemble of lenses to make a new study of the internal structure of lensing galaxies, and to identify new objects that show evidence of strong substructure lensing, for follow-up study. All data, scripts, masks, model setup files, and derived parameters, will be public, and free. The Database will be accessible online and through a sophisticated smartphone application, which will also be free.
Collisional excitation of molecules in dense interstellar clouds
NASA Technical Reports Server (NTRS)
Green, S.
1985-01-01
State transitions which permit the identification of the molecular species in dense interstellar clouds are reviewed, along with the techniques used to calculate the transition energies, the database on known molecular transitions and the accuracy of the values. The transition energies cannot be measured directly and therefore must be modeled analytically. Scattering theory is used to determine the intermolecular forces on the basis of quantum mechanics. The nuclear motions can also be modeled with classical mechanics. Sample rate constants are provided for molecular systems known to inhabit dense interstellar clouds. The values serve as a database for interpreting microwave and RF astrophysical data on the transitions undergone by interstellar molecules.
Average probability that a "cold hit" in a DNA database search results in an erroneous attribution.
Song, Yun S; Patil, Anand; Murphy, Erin E; Slatkin, Montgomery
2009-01-01
We consider a hypothetical series of cases in which the DNA profile of a crime-scene sample is found to match a known profile in a DNA database (i.e., a "cold hit"), resulting in the identification of a suspect based only on genetic evidence. We show that the average probability that there is another person in the population whose profile matches the crime-scene sample but who is not in the database is approximately 2(N - d)p(A), where N is the number of individuals in the population, d is the number of profiles in the database, and p(A) is the average match probability (AMP) for the population. The AMP is estimated by computing the average of the probabilities that two individuals in the population have the same profile. We show further that if a priori each individual in the population is equally likely to have left the crime-scene sample, then the average probability that the database search attributes the crime-scene sample to a wrong person is (N - d)p(A).
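A worked numeric instance of the formulas above, with purely hypothetical values for the population size N, the database size d and the average match probability p(A):

```python
# Worked example of the quantities defined above (illustrative numbers only):
# P(another matching person outside the database) ~ 2 * (N - d) * pA, and the
# average wrongful-attribution probability ~ (N - d) * pA under the stated prior.
N = 10_000_000      # population size (hypothetical)
d = 500_000         # profiles in the database (hypothetical)
pA = 1e-9           # average match probability (hypothetical)

p_other_match = 2 * (N - d) * pA
p_wrong_attribution = (N - d) * pA
print(f"P(unseen matching person) ~ {p_other_match:.2e}")
print(f"P(wrongful attribution)   ~ {p_wrong_attribution:.2e}")
```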
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xu, Tengfang; Piette, Mary Ann
2004-08-05
The original scope of work was to obtain and analyze existing and emerging data in four states: California, Florida, New York, and Wisconsin. The goal of this data collection was to deliver a baseline database, or recommendations for such a database, that could contain window and daylighting features and energy performance characteristics of kindergarten through 12th grade (K-12) school buildings (or of classrooms when available). In particular, data analyses were performed on the California Commercial End-Use Survey (CEUS) databases to understand school energy use, features of window glazing, and availability of daylighting in California K-12 schools. The outcomes from this baseline task can be used to assist in establishing a database of school energy performance, assessing applications of existing technologies relevant to window and daylighting design, and identifying future R&D needs, in line with the overall project goals as outlined in the proposal. Through the review and analysis of the data, it is clear that many compounding factors impact energy use in K-12 school buildings in the U.S., and that there are various challenges in understanding the impact of window glazing and skylight design features on K-12 classroom energy use. First, the energy data in the existing CEUS databases provide, at most, aggregated electricity and/or gas usage for building establishments that include other school facilities in addition to classroom spaces. Although the percentage of classroom floor area in schools is often available from the databases, there is no additional information that can be used to quantitatively segregate the energy use intensity (EUI) for classroom spaces; to quantify classroom EUI, sub-metered energy usage for classrooms must be obtained. Second, magnitudes of energy use for electric lighting are not attainable from the existing databases, nor are the lighting levels contributed by artificial lighting or daylight, so it is impossible to reasonably estimate lighting energy consumption for classroom areas in the sample of schools studied in this project. Third, many other compounding factors may influence overall classroom energy use, e.g., ventilation, insulation, system efficiency, occupancy, control, schedules, and weather. Fourth, although school EUI was examined grouped by various factors, such as climate zones and window and daylighting design features, from the California databases, no statistically significant associations could be identified from the sampled California K-12 schools in the current California CEUS; there are opportunities to expand such analyses by developing more powerful CEUS databases in the future. Finally, a list of parameters is recommended for future database development and for use in future investigations of K-12 classroom energy use, window and skylight design, and possible relations between them. Key parameters include: (1) energy end-use data for lighting systems, classrooms, and schools; (2) building design and operation, including window and daylighting features; and (3) other key parameters and information needed to investigate overall energy use, building and system design, operation, and services provided.
NASA Astrophysics Data System (ADS)
Kuzma, H. A.; Boyle, K.; Pullman, S.; Reagan, M. T.; Moridis, G. J.; Blasingame, T. A.; Rector, J. W.; Nikolaou, M.
2010-12-01
A Self Teaching Expert System (SeTES) is being developed for the analysis, design and prediction of gas production from shales. An Expert System is a computer program designed to answer questions or clarify uncertainties that its designers did not necessarily envision which would otherwise have to be addressed by consultation with one or more human experts. Modern developments in computer learning, data mining, database management, web integration and cheap computing power are bringing the promise of expert systems to fruition. SeTES is a partial successor to Prospector, a system to aid in the identification and evaluation of mineral deposits developed by Stanford University and the USGS in the late 1970s, and one of the most famous early expert systems. Instead of the text dialogue used in early systems, the web user interface of SeTES helps a non-expert user to articulate, clarify and reason about a problem by navigating through a series of interactive wizards. The wizards identify potential solutions to queries by retrieving and combining together relevant records from a database. Inferences, decisions and predictions are made from incomplete and noisy inputs using a series of probabilistic models (Bayesian Networks) which incorporate records from the database, physical laws and empirical knowledge in the form of prior probability distributions. The database is mainly populated with empirical measurements, however an automatic algorithm supplements sparse data with synthetic data obtained through physical modeling. This constitutes the mechanism for how SeTES self-teaches. SeTES’ predictive power is expected to grow as users contribute more data into the system. Samples are appropriately weighted to favor high quality empirical data over low quality or synthetic data. Finally, a set of data visualization tools digests the output measurements into graphical outputs.
Dark, Paul; Wilson, Claire; Blackwood, Bronagh; McAuley, Danny F; Perkins, Gavin D; McMullan, Ronan; Gates, Simon; Warhurst, Geoffrey
2012-01-01
Background There is growing interest in the potential utility of molecular diagnostics in improving the detection of life-threatening infection (sepsis). LightCycler® SeptiFast is a multipathogen probe-based real-time PCR system targeting DNA sequences of bacteria and fungi present in blood samples within a few hours. We report here the protocol of the first systematic review of published clinical diagnostic accuracy studies of this technology when compared with blood culture in the setting of suspected sepsis. Methods/design Data sources: the Cochrane Database of Systematic Reviews, the Database of Abstracts of Reviews of Effects (DARE), the Health Technology Assessment Database (HTA), the NHS Economic Evaluation Database (NHSEED), The Cochrane Library, MEDLINE, EMBASE, ISI Web of Science, BIOSIS Previews, MEDION and the Aggressive Research Intelligence Facility Database (ARIF). Study selection: diagnostic accuracy studies that compare the real-time PCR technology with standard culture results performed on a patient's blood sample during the management of sepsis. Data extraction: three reviewers, working independently, will determine the level of evidence, methodological quality and a standard data set relating to demographics and diagnostic accuracy metrics for each study. Statistical analysis/data synthesis: heterogeneity of studies will be investigated using a coupled forest plot of sensitivity and specificity and a scatter plot in Receiver Operating Characteristic (ROC) space. The bivariate model method will be used to estimate summary sensitivity and specificity. The authors will investigate reporting biases using funnel plots based on effective sample size and regression tests of asymmetry. Subgroup analyses are planned for adults, children and infection setting (hospital vs community) if sufficient data are uncovered. Dissemination Recommendations will be made to the Department of Health (as part of an open-access HTA report) as to whether the real-time PCR technology has sufficient clinical diagnostic accuracy potential to move forward to efficacy testing during the provision of routine clinical care. Registration PROSPERO-NIHR Prospective Register of Systematic Reviews (CRD42011001289).
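As a simplified illustration of the accuracy metrics the review will extract (not the bivariate random-effects synthesis itself), the sketch below computes per-study sensitivity and specificity from invented 2x2 counts:

```python
# Simplified illustration: per-study sensitivity and specificity from 2x2
# counts (TP, FP, FN, TN). Counts are invented; the protocol's actual synthesis
# uses a bivariate random-effects model, not shown here.
STUDIES = {                 # study -> (TP, FP, FN, TN), hypothetical counts
    "study_1": (45, 10, 5, 140),
    "study_2": (30, 12, 10, 98),
    "study_3": (60, 20, 8, 212),
}

for name, (tp, fp, fn, tn) in STUDIES.items():
    sens = tp / (tp + fn)   # true-positive rate against blood culture
    spec = tn / (tn + fp)   # true-negative rate against blood culture
    print(f"{name}: sensitivity={sens:.2f}  specificity={spec:.2f}")
```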
Letourneau, Nicole L; Tryphonopoulos, Panagiota D; Novick, Jason; Hart, J Martha; Giesbrecht, Gerald; Oxford, Monica L
Many nurses rely on the American Nursing Child Assessment Satellite Training (NCAST) Parent-Child Interaction (PCI) Teaching and Feeding Scales to identify and target interventions for families affected by severe/chronic stressors (e.g. postpartum depression (PPD), intimate partner violence (IPV), low-income). However, the NCAST Database that provides normative data for comparisons may not apply to Canadian families. The purpose of this study was to compare NCAST PCI scores in Canadian and American samples and to assess the reliability of the NCAST PCI Scales in Canadian samples. This secondary analysis employed independent samples t-tests (p < 0.005) to compare PCI between the American NCAST Database and Canadian high-risk (families with PPD, exposure to IPV or low-income) and community samples. Cronbach's alphas were calculated for the Canadian and American samples. In both American and Canadian samples, belonging to a high-risk population reduced parents' abilities to engage in sensitive and responsive caregiving (i.e. healthy serve and return relationships) as measured by the PCI Scales. NCAST Database mothers were more effective at executing caregiving responsibilities during PCI compared to the Canadian community sample, while infants belonging to the Canadian community sample provided clearer cues to caregivers during PCI compared to those of the NCAST Database. Internal consistency coefficients for the Canadian samples were generally acceptable. The NCAST Database can be reliably used for assessing PCI in normative and high-risk Canadian families. Canadian nurses can be assured that the PCI Scales adequately identify risks and can help target interventions to promote optimal parent-child relationships and ultimately child development. Crown Copyright © 2018. Published by Elsevier Inc. All rights reserved.
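The comparison method here is a set of independent-samples t-tests at alpha = 0.005. A minimal sketch with simulated stand-in scores looks like this (the Welch unequal-variance variant is an implementation choice, not stated in the abstract):

```python
# Sketch of the comparison method: an independent-samples t-test between a
# normative sample and a local sample at the stated alpha. Scores are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
ncast_norms = rng.normal(50.0, 6.0, 2000)  # stand-in for database caregiver scores
canadian = rng.normal(48.5, 6.0, 300)      # stand-in for Canadian community sample

t, p = stats.ttest_ind(ncast_norms, canadian, equal_var=False)  # Welch's t-test
print(f"t = {t:.2f}, p = {p:.4f}, significant at 0.005: {p < 0.005}")
```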
The MIND PALACE: A Multi-Spectral Imaging and Spectroscopy Database for Planetary Science
NASA Astrophysics Data System (ADS)
Eshelman, E.; Doloboff, I.; Hara, E. K.; Uckert, K.; Sapers, H. M.; Abbey, W.; Beegle, L. W.; Bhartia, R.
2017-12-01
The Multi-Instrument Database (MIND) is the web-based home to a well-characterized set of analytical data collected by a suite of deep-UV fluorescence/Raman instruments built at the Jet Propulsion Laboratory (JPL). Samples derive from a growing body of planetary surface analogs, mineral and microbial standards, meteorites, spacecraft materials, and other astrobiologically relevant materials. In addition to deep-UV spectroscopy, datasets stored in MIND are obtained from a variety of analytical techniques obtained over multiple spatial and spectral scales including electron microscopy, optical microscopy, infrared spectroscopy, X-ray fluorescence, and direct fluorescence imaging. Multivariate statistical analysis techniques, primarily Principal Component Analysis (PCA), are used to guide interpretation of these large multi-analytical spectral datasets. Spatial co-referencing of integrated spectral/visual maps is performed using QGIS (geographic information system software). Georeferencing techniques transform individual instrument data maps into a layered co-registered data cube for analysis across spectral and spatial scales. The body of data in MIND is intended to serve as a permanent, reliable, and expanding database of deep-UV spectroscopy datasets generated by this unique suite of JPL-based instruments on samples of broad planetary science interest.
Patterns, biases and prospects in the distribution and diversity of Neotropical snakes
Sawaya, Ricardo J.; Zizka, Alexander; Laffan, Shawn; Faurby, Søren; Pyron, R. Alexander; Bérnils, Renato S.; Jansen, Martin; Passos, Paulo; Prudente, Ana L. C.; Cisneros‐Heredia, Diego F.; Braz, Henrique B.; Nogueira, Cristiano de C.; Antonelli, Alexandre; Meiri, Shai
2017-01-01
Motivation: We generated a novel database of Neotropical snakes (one of the world's richest herpetofaunas) combining the most comprehensive, manually compiled distribution dataset with publicly available data. We assess, for the first time, the diversity patterns for all Neotropical snakes as well as sampling density and sampling biases. Main types of variables contained: We compiled three databases of species occurrences: a dataset downloaded from the Global Biodiversity Information Facility (GBIF), a verified dataset built through taxonomic work and specialized literature, and a combined dataset comprising a cleaned version of the GBIF dataset merged with the verified dataset. Spatial location and grain: Neotropics, Behrmann projection equivalent to 1° × 1°. Time period: Specimens housed in museums during the last 150 years. Major taxa studied: Squamata: Serpentes. Software format: Geographical information system (GIS). Results: The combined dataset provides the most comprehensive distribution database for Neotropical snakes to date. It contains 147,515 records for 886 species across 12 families, representing 74% of all species of snakes, spanning 27 countries in the Americas. Species richness and phylogenetic diversity show overall similar patterns. Amazonia is the least sampled Neotropical region, whereas most well-sampled sites are located near large universities and scientific collections. We provide a list and updated maps of the geographical distribution of all snake species surveyed. Main conclusions: The biodiversity metrics of Neotropical snakes reflect patterns previously documented for other vertebrates, suggesting that similar factors may determine the diversity of both ectothermic and endothermic animals. We suggest conservation strategies for high-diversity areas and that sampling efforts be directed towards Amazonia and poorly known species. PMID:29398972
dbMDEGA: a database for meta-analysis of differentially expressed genes in autism spectrum disorder.
Zhang, Shuyun; Deng, Libin; Jia, Qiyue; Huang, Shaoting; Gu, Junwang; Zhou, Fankun; Gao, Meng; Sun, Xinyi; Feng, Chang; Fan, Guangqin
2017-11-16
Autism spectrum disorders (ASD) are hereditary, heterogeneous and biologically complex neurodevelopmental disorders. Individual studies on gene expression in ASD cannot provide clear consensus conclusions; therefore, a systematic review to synthesize the current findings from brain tissues and a search tool to share the meta-analysis results are urgently needed. Here, we conducted a meta-analysis of brain gene expression profiles in the currently reported human ASD expression datasets (84 frozen male cortex samples, 17 female cortex samples, 32 cerebellum samples and 4 formalin-fixed samples) and knock-out mouse ASD model expression datasets (80 collective brain samples). We then used the R language to develop an interactive, shared and updatable database (dbMDEGA) displaying the results of the meta-analysis of differentially expressed genes (DEGs) in the brain from ASD studies. This database, dbMDEGA ( https://dbmdega.shinyapps.io/dbMDEGA/ ), is a publicly available web portal for manual annotation and visualization of DEGs in the brain from ASD studies. The database uniquely presents meta-analysis values and homologous forest plots of DEGs in brain tissues. Gene entries are annotated with meta-values, statistical values and forest plots of DEGs in brain samples. The database aims to provide searchable meta-analysis results based on the currently reported brain gene expression datasets of ASD to help detect candidate genes underlying this disorder. This new analytical tool may provide valuable assistance in the discovery of DEGs and the elucidation of the molecular pathogenicity of ASD. This database model may be replicated to study other disorders.
HormoneBase, a population-level database of steroid hormone levels across vertebrates
Vitousek, Maren N.; Johnson, Michele A.; Donald, Jeremy W.; Francis, Clinton D.; Fuxjager, Matthew J.; Goymann, Wolfgang; Hau, Michaela; Husak, Jerry F.; Kircher, Bonnie K.; Knapp, Rosemary; Martin, Lynn B.; Miller, Eliot T.; Schoenle, Laura A.; Uehling, Jennifer J.; Williams, Tony D.
2018-01-01
Hormones are central regulators of organismal function and flexibility that mediate a diversity of phenotypic traits from early development through senescence. Yet despite these important roles, basic questions about how and why hormone systems vary within and across species remain unanswered. Here we describe HormoneBase, a database of circulating steroid hormone levels and their variation across vertebrates. This database aims to provide all available data on the mean, variation, and range of plasma glucocorticoids (both baseline and stress-induced) and androgens in free-living and un-manipulated adult vertebrates. HormoneBase (www.HormoneBase.org) currently includes >6,580 entries from 476 species, reported in 648 publications from 1967 to 2015, and unpublished datasets. Entries are associated with data on the species and population, sex, year and month of study, geographic coordinates, life history stage, method and latency of hormone sampling, and analysis technique. This novel resource could be used for analyses of the function and evolution of hormone systems, and the relationships between hormonal variation and a variety of processes including phenotypic variation, fitness, and species distributions. PMID:29786693
Washington, Donna L; Sun, Su; Canning, Mark
2010-01-01
Most veteran research is conducted in Department of Veterans Affairs (VA) healthcare settings, although most veterans obtain healthcare outside the VA. Our objective was to determine the adequacy and relative contributions of Veterans Health Administration (VHA), Veterans Benefits Administration (VBA), and Department of Defense (DOD) administrative databases for representing the U.S. veteran population, using as an example the creation of a sampling frame for the National Survey of Women Veterans. In 2008, we merged the VHA, VBA, and DOD databases. We identified the number of unique records both overall and from each database. The combined databases yielded 925,946 unique records, representing 51% of the 1,802,000 U.S. women veteran population. The DOD database included 30% of the population (with 8% overlap with other databases). The VHA enrollment database contributed an additional 20% unique women veterans (with 6% overlap with VBA databases). VBA databases contributed an additional 2% unique women veterans (beyond 10% overlap with other databases). Use of VBA and DOD databases substantially expands access to the population of veterans beyond those in VHA databases, regardless of VA use. Adoption of these additional databases would enhance the value and generalizability of a wide range of studies of both male and female veterans.
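The core of the approach is a union of identifier sets with unique-contribution and overlap counts. A minimal sketch with fabricated identifiers:

```python
# Sketch of the merge-and-count logic: union three ID sets and report each
# source's unique contribution and the overlap. IDs are fabricated placeholders.
vha = {"v01", "v02", "v03", "v04"}   # stand-in VHA enrollment IDs
vba = {"v03", "v05"}                 # stand-in VBA IDs
dod = {"v01", "v05", "v06", "v07"}   # stand-in DOD IDs

combined = vha | vba | dod
print("unique records:", len(combined))
print("DOD-only records:", len(dod - vha - vba))
print("VHA records beyond DOD:", len(vha - dod))
print("VBA records beyond VHA and DOD:", len(vba - vha - dod))
print("records in more than one source:",
      len((vha & vba) | (vha & dod) | (vba & dod)))
```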
The Movable Type Method Applied to Protein-Ligand Binding.
Zheng, Zheng; Ucisik, Melek N; Merz, Kenneth M
2013-12-10
Accurately computing the free energy for biological processes like protein folding or protein-ligand association remains a challenging problem. Both describing the complex intermolecular forces involved and sampling the requisite configuration space make understanding these processes innately difficult. Herein, we address the sampling problem using a novel methodology we term "movable type". Conceptually it can be understood by analogy with the evolution of printing and, hence, the name movable type. For example, a common approach to the study of protein-ligand complexation involves taking a database of intact drug-like molecules and exhaustively docking them into a binding pocket. This is reminiscent of early woodblock printing where each page had to be laboriously created prior to printing a book. However, printing evolved to an approach where a database of symbols (letters, numerals, etc.) was created and then assembled using a movable type system, which allowed for the creation of all possible combinations of symbols on a given page, thereby, revolutionizing the dissemination of knowledge. Our movable type (MT) method involves the identification of all atom pairs seen in protein-ligand complexes and then creating two databases: one with their associated pairwise distant dependent energies and another associated with the probability of how these pairs can combine in terms of bonds, angles, dihedrals and non-bonded interactions. Combining these two databases coupled with the principles of statistical mechanics allows us to accurately estimate binding free energies as well as the pose of a ligand in a receptor. This method, by its mathematical construction, samples all of configuration space of a selected region (the protein active site here) in one shot without resorting to brute force sampling schemes involving Monte Carlo, genetic algorithms or molecular dynamics simulations making the methodology extremely efficient. Importantly, this method explores the free energy surface eliminating the need to estimate the enthalpy and entropy components individually. Finally, low free energy structures can be obtained via a free energy minimization procedure yielding all low free energy poses on a given free energy surface. Besides revolutionizing the protein-ligand docking and scoring problem this approach can be utilized in a wide range of applications in computational biology which involve the computation of free energies for systems with extensive phase spaces including protein folding, protein-protein docking and protein design.
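As a cartoon of the statistical-mechanics step (not the paper's actual databases or energy functions), the sketch below folds an invented pairwise distance-dependent energy table and pair-occurrence probabilities into a Boltzmann-weighted free-energy contribution, F = -RT ln Σ_i p_i exp(-E_i/RT):

```python
# Highly simplified cartoon of combining a pairwise distance-dependent energy
# table with pair-occurrence probabilities into a Boltzmann-weighted free
# energy. Energies and probabilities are invented, not from the paper.
import math

RT = 0.593  # kcal/mol at ~298 K

# Distance bins for one hypothetical atom-pair type: (probability, energy kcal/mol)
pair_table = [(0.10, -0.2), (0.30, -1.1), (0.40, -0.7), (0.20, 0.3)]

Z = sum(p * math.exp(-e / RT) for p, e in pair_table)  # partition-function-like sum
F = -RT * math.log(Z)
print(f"pair free-energy contribution: {F:.2f} kcal/mol")
```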
Designing testing service at baristand industri Medan’s liquid waste laboratory
NASA Astrophysics Data System (ADS)
Kusumawaty, Dewi; Napitupulu, Humala L.; Sembiring, Meilita T.
2018-03-01
Baristand Industri Medan is a technical implementation unit under the Industrial Research and Development Agency of the Ministry of Industry. One of the services most often used at Baristand Industri Medan is the liquid-waste testing service. The company's service standard for testing is nine working days. In 2015, 89.66% of liquid-waste testing jobs did not meet this standard because of an accumulation of samples. The purpose of this research is to design an online service for scheduling incoming liquid-waste samples. The method used is information system design, consisting of model design, output design, input design, database design and technology design. The resulting design of the online liquid-waste testing information system consists of three pages: one for the customer, one for the sample recipient and one for the laboratory. Simulation with scheduled samples shows that the nine-working-day service standard can be met.
Evaluation of a New Ensemble Learning Framework for Mass Classification in Mammograms.
Rahmani Seryasat, Omid; Haddadnia, Javad
2018-06-01
Mammography is the most common screening method for diagnosis of breast cancer. In this study, a computer-aided system for diagnosing the benignity or malignancy of masses in mammogram images was implemented. In the computer-aided diagnosis system, we first reduce the noise in the mammograms using an effective noise-removal technique. After noise removal, the mass in the region of interest is segmented using a deformable model. After mass segmentation, a number of features are extracted from it, including features of the mass shape and border, tissue properties, and the fractal dimension. From this large set of features, a proper subset must be chosen; in this study, we make use of a new method based on a genetic algorithm for selecting a proper set of features. After determining the proper features, a classifier is trained. To classify the samples, a new architecture for combining classifiers is proposed, in which easy and difficult samples are identified and trained using different classifiers. Finally, the proposed mass diagnosis system was tested on the mini-MIAS (Mammographic Image Analysis Society) and DDSM (Digital Database for Screening Mammography) databases. The obtained results indicate that the proposed system can compete with state-of-the-art methods in terms of accuracy. Copyright © 2017 Elsevier Inc. All rights reserved.
United States Army Medical Materiel Development Activity: 1997 Annual Report.
1997-01-01
The report describes the activity's business planning and execution information management systems: the Project Management Division Database (PMDD), the Product Management Database System (PMDS), and the Special Users Database System. Links to other systems, including the Financial Management System (FMS), were investigated; new Product Managers and Project Managers were added into PMDS and PMDD, and a separate division, Support, was created.
Antarctic Porifera database from the Spanish benthic expeditions
Rios, Pilar; Cristobo, Javier
2014-01-01
Abstract The information about the sponges in this dataset is derived from the samples collected during five Spanish Antarctic expeditions: Bentart 94, Bentart 95, Gebrap 96, Ciemar 99/00 and Bentart 2003. Samples were collected in the Antarctic Peninsula and Bellingshausen Sea at depths ranging from 4 to 2044 m using various sampling gears. The Antarctic Porifera database from the Spanish benthic expeditions is unique as it provides information for an under-explored region of the Southern Ocean (Bellingshausen Sea). It fills an information gap on Antarctic deep-sea sponges, for which there were previously very few data. This phylum is an important part of the Antarctic biota and plays a key role in the structure of the Antarctic marine benthic community due to its considerable diversity and predominance in different areas. It is often a dominant component of Southern Ocean benthic communities. The quality of the data was controlled very thoroughly with GPS systems onboard the R/V Hesperides and by checking the data against the World Porifera Database (which is part of the World Register of Marine Species, WoRMS). The data are therefore fit for completing checklists, inclusion in biodiversity pattern analysis and niche modelling. The authors can be contacted if any additional information is needed before carrying out detailed biodiversity or biogeographic studies. The dataset currently contains 767 occurrence data items that have been checked for systematic reliability. This database is not yet complete and the collection is growing. Specimens are stored in the author’s collection at the Spanish Institute of Oceanography (IEO) in the city of Gijón (Spain). The data are available in GBIF. PMID:24843257
Tucker, Valerie C; Hopwood, Andrew J; Sprecher, Cynthia J; McLaren, Robert S; Rabbach, Dawn R; Ensenberger, Martin G; Thompson, Jonelle M; Storts, Douglas R
2011-11-01
In response to the ENFSI and EDNAP groups' call for new STR multiplexes for Europe, Promega® developed a suite of four new DNA profiling kits. This paper describes the developmental validation study performed on the PowerPlex® ESI 16 (European Standard Investigator 16) and the PowerPlex® ESI 17 Systems. The PowerPlex® ESI 16 System combines the 11 loci compatible with the UK National DNA Database®, contained within the AmpFlSTR® SGM Plus® PCR Amplification Kit, with five additional loci: D2S441, D10S1248, D22S1045, D1S1656 and D12S391. The multiplex was designed to reduce the amplicon size of the loci found in the AmpFlSTR® SGM Plus® kit. This design facilitates increased robustness and amplification success for the loci used in the national DNA databases created in many countries, when analyzing degraded DNA samples. The PowerPlex® ESI 17 System amplifies the same loci as the PowerPlex® ESI 16 System, but with the addition of a primer pair for the SE33 locus. Tests were designed to address the developmental validation guidelines issued by the Scientific Working Group on DNA Analysis Methods (SWGDAM), and those of the DNA Advisory Board (DAB). Samples processed include DNA mixtures, PCR reactions spiked with inhibitors, a sensitivity series, and 306 United Kingdom donor samples to determine concordance with data generated with the AmpFlSTR® SGM Plus® kit. Allele frequencies from 242 white Caucasian samples collected in the United Kingdom are also presented. The PowerPlex® ESI 16 and ESI 17 Systems are robust and sensitive tools, suitable for the analysis of forensic DNA samples. Full profiles were routinely observed with 62.5 pg of a fully heterozygous single source DNA template. This high level of sensitivity was found to impact on mixture analyses, where 54-86% of unique minor contributor alleles were routinely observed in a 1:19 mixture ratio. Improved sensitivity combined with the robustness afforded by smaller amplicons has substantially improved the quantity of data obtained from degraded samples, and the improved chemistry confers exceptional tolerance to high levels of laboratory prepared inhibitors. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
Improving imbalanced scientific text classification using sampling strategies and dictionaries.
Borrajo, L; Romero, R; Iglesias, E L; Redondo Marey, C M
2011-09-15
Many real applications suffer from the imbalanced class distribution problem, where one of the classes is represented by a very small number of cases compared to the others. Among the affected systems are those for the retrieval and classification of scientific documentation. Sampling strategies such as oversampling and subsampling are popular in tackling the problem of class imbalance. In this work, we study their effects on three types of classifiers (kNN, SVM, and Naive Bayes) when applied to searches of the PubMed scientific database. Another purpose of this paper is to study the use of dictionaries in the classification of biomedical texts. Experiments are conducted with three different dictionaries (BioCreative, NLPBA, and an ad hoc subset of the UniProt database named Protein) using the aforementioned classifiers and sampling strategies. The best results were obtained with the NLPBA and Protein dictionaries and the SVM classifier using the subsampling balancing technique. These results were compared with those obtained by other authors using the TREC Genomics 2005 public corpus. Copyright 2011 The Author(s). Published by Journal of Integrative Bioinformatics.
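As an illustration of the subsampling strategy, the sketch below randomly undersamples the majority class before training an SVM on TF-IDF features; the toy corpus and parameters are invented, not the paper's data.

```python
# A hedged sketch of subsampling (random undersampling of the majority class)
# before training an SVM; illustrative only.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

docs = ["protein kinase binding assay", "kinase inhibitor expression study",
        "stock market news", "football match report", "weather forecast",
        "election results coverage", "travel blog post", "pasta recipe"]
labels = np.array([1, 1, 0, 0, 0, 0, 0, 0])   # class 1 (relevant) is rare

rng = np.random.default_rng(0)
minority = np.flatnonzero(labels == 1)
majority = np.flatnonzero(labels == 0)
keep = rng.choice(majority, size=len(minority), replace=False)  # undersample
idx = np.concatenate([minority, keep])

vec = TfidfVectorizer()
X = vec.fit_transform([docs[i] for i in idx])   # balanced training set
clf = LinearSVC().fit(X, labels[idx])
print(clf.predict(vec.transform(["kinase binding assay"])))
```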
Performance assessment of EMR systems based on post-relational database.
Yu, Hai-Yan; Li, Jing-Song; Zhang, Xiao-Guang; Tian, Yu; Suzuki, Muneou; Araki, Kenji
2012-08-01
Post-relational databases provide high performance and are currently widely used in American hospitals. As few hospital information systems (HIS) in either China or Japan are based on post-relational databases, here we introduce a new-generation electronic medical records (EMR) system called Hygeia, which was developed with the post-relational database Caché and the latest platform Ensemble. Utilizing the benefits of a post-relational database, Hygeia is equipped with an "integration" feature that allows all system users to access data with a fast response time, anywhere and at any time. Performance tests of databases in EMR systems were implemented in both China and Japan. First, a comparison test was conducted between a post-relational database, Caché, and a relational database, Oracle, embedded in the EMR systems of a medium-sized first-class hospital in China. Second, a user terminal test was done on the EMR system Izanami, which is based on the same Caché database and operates efficiently at the Miyazaki University Hospital in Japan. The results showed that the post-relational database Caché works faster than the relational database Oracle and performed reliably in the real-time EMR system.
NASA Astrophysics Data System (ADS)
Roberts, N.; Cunningham, H.; Snell, A.; Newman, J.; Tikoff, B.; Chatzaras, V.; Walker, J. D.; Williams, R. T.
2017-12-01
There is currently no repository where a geologist can survey microstructural datasets collected from a specific field area or deformation experiment. New development of the StraboSpot digital data system provides such a repository, as well as visualization and analysis tools. StraboSpot is a graph database that allows field geologists to share primary data and develop new types of scientific questions. The database can be accessed through: 1) a field-based mobile application that runs on iOS and Android mobile devices; and 2) a desktop system. We are expanding StraboSpot to include the handling of a variety of microstructural data types. Presented here are the detailed vocabulary and logic used for the input of microstructural data, and how this system operates with the anticipated workflow of users. Microstructural data include observations and interpretations from photomicrographs, scanning electron microscope images, electron backscatter diffraction, and transmission electron microscopy data. The workflow for importing microstructural data into StraboSpot is organized into the following tabs: Images; Mineralogy & Composition; Sedimentary; Igneous; Metamorphic; Fault Rocks; Grain Size & Configuration; Crystallographic Preferred Orientation; Reactions; Geochronology; Relationships; and Interpretations. Both the sample and the thin sections are themselves spots. For the sample spot, the user can specify whether a sample is experimental or natural; natural samples are inherently linked to their field context. For the thin-section (sub-sample) spot, the user can select between different options for sample preparation, geometry, and methods. A universal framework for thin-section orientation is given, which allows users to overlay different microscope images of the same area and preserves georeferenced orientation. We provide an example dataset of field and microstructural data from the Mt Edgar dome, a granitic complex in the Paleoarchean East Pilbara craton, Australia. StraboSpot provides a single place for georeferenced geologic data at every spatial scale, in which data are interconnected. Incorporating microstructural data into an open-access platform will give field and experimental geologists a library of microstructural data across a range of tectonic and experimental contexts.
Officer Career Development: Longitudinal Sample--Fiscal Year 1982
1991-10-01
Those who wish to access the database to conduct additional analyses, link it to or combine it with other databases, enlarge the database for the conduct of trend analyses, etc., will find this data dictionary an ...
Does Pneumatic Tube System Transport Contribute to Hemolysis in ED Blood Samples?
Phelan, Michael P.; Reineks, Edmunds Z.; Hustey, Fredric M.; Berriochoa, Jacob P.; Podolsky, Seth R.; Meldon, Stephen; Schold, Jesse D.; Chamberlin, Janelle; Procop, Gary W.
2016-01-01
Introduction: Our goal was to determine if the hemolysis among blood samples obtained in an emergency department and then sent to the laboratory in a pneumatic tube system was different from those in samples that were hand-carried. Methods: The hemolysis index is measured on all samples submitted for potassium analysis. We queried our hospital laboratory database system (SunQuest®) for potassium results for specimens obtained between January 2014 and July 2014. From facility maintenance records, we identified periods of system downtime, during which specimens were hand-carried to the laboratory. Results: During the study period, 15,851 blood specimens were transported via our pneumatic tube system and 92 samples were hand delivered. The proportions of hemolyzed specimens in the two groups were not significantly different (13.6% vs. 13.1% [p=0.90]). Results were consistent when the criterion was limited to gross (3.3% vs. 3.3% [p=0.99]) or mild (10.3% vs. 9.8% [p=0.88]) hemolysis. The hemolysis rate showed minimal variation during the study period (12.6%–14.6%). Conclusion: We found no statistical difference in the percentages of hemolyzed specimens transported by a pneumatic tube system or hand delivered to the laboratory. Certain features of pneumatic tube systems might contribute to hemolysis (e.g., speed, distance, packing material). Since each system is unique in design, we encourage medical facilities to consider whether their method of transport might contribute to hemolysis in samples obtained in the emergency department. PMID:27625719
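As a back-of-envelope check of the reported comparison, the sketch below runs a standard two-proportion z-test with counts approximated from the abstract's percentages (13.6% of 15,851 vs. 13.1% of 92); it reproduces p ≈ 0.90.

```python
# Two-proportion z-test; counts are approximated from the reported percentages.
from math import erf, sqrt

n1, n2 = 15851, 92
p1, p2 = 0.136, 0.131
x1, x2 = round(p1 * n1), round(p2 * n2)          # approximate hemolyzed counts
p_pool = (x1 + x2) / (n1 + n2)                   # pooled proportion
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided normal tail
print(f"z = {z:.2f}, p = {p_value:.2f}")
```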
Ridge 2000 Data Management System
NASA Astrophysics Data System (ADS)
Goodwillie, A. M.; Carbotte, S. M.; Arko, R. A.; Haxby, W. F.; Ryan, W. B.; Chayes, D. N.; Lehnert, K. A.; Shank, T. M.
2005-12-01
Hosted at Lamont by the marine geoscience Data Management group, mgDMS, the NSF-funded Ridge 2000 electronic database, http://www.marine-geo.org/ridge2000/, is a key component of the Ridge 2000 multi-disciplinary program. The database covers each of the three Ridge 2000 Integrated Study Sites: Endeavour Segment, Lau Basin, and 8-11N Segment. It promotes the sharing of information with the broader community, facilitates integration of the suite of information collected at each study site, and enables comparisons between sites. The Ridge 2000 data system provides easy web access to a relational database that is built around a catalogue of cruise metadata. Any web browser can be used to perform a versatile text-based search which returns basic cruise and submersible dive information, sample and data inventories, navigation, and other relevant metadata such as shipboard personnel and links to NSF program awards. In addition, non-proprietary data files, images, and derived products which are hosted locally or in national repositories, as well as science and technical reports, can be freely downloaded. On the Ridge 2000 database page, our Data Link allows users to search the database using a broad range of parameters including data type, cruise ID, chief scientist, and geographical location. The first Ridge 2000 field programs sailed in 2004 and, in addition to numerous data sets collected prior to the Ridge 2000 program, the database currently contains information on fifteen Ridge 2000-funded cruises and almost sixty Alvin dives. Track lines can be viewed using a recently implemented Web Map Service button labelled Map View. The Ridge 2000 database is fully integrated with databases hosted by the mgDMS group for MARGINS and the Antarctic multibeam and seismic reflection data initiatives. Links are provided to partner databases including PetDB, SIOExplorer, and the ODP Janus system. Improved inter-operability with existing and new partner repositories continues to be strengthened. One major effort involves the gradual unification of metadata across these partner databases. Standardised electronic metadata forms that can be filled in at sea are available from our web site. Interactive map-based exploration and visualisation of the Ridge 2000 database is provided by GeoMapApp, a freely-available Java™ application being developed within the mgDMS group. GeoMapApp includes high-resolution bathymetric grids for the 8-11N EPR segment and allows customised maps and grids for any of the Ridge 2000 ISS to be created. Vent and instrument locations can be plotted and saved as images, and Alvin dive photos are also available.
Forensic parameters of the X-STR Decaplex system in Mexican populations.
Mariscal Ramos, C; Martínez-Cortes, G; Ramos-González, B; Rangel-Villalobos, H
2018-03-01
We studied the X-STR decaplex system in 529 female DNA samples from Mexican populations in five geographic regions. Allele frequencies and forensic parameters were estimated for each region and for the pooled Mexican population. The genotype distribution by locus was in agreement with Hardy-Weinberg expectations in each Mexican population sample. Similarly, linkage equilibrium was demonstrated between pairs of loci. Pairwise comparisons and genetic distances between Mexican, Ibero-American, and one African population were estimated and graphically represented. Interestingly, non-significant interpopulation differentiation was detected (Fst = 0.0021; p = .74389), which supports the use of a global Mexican database for the forensic interpretation of X-STR genotypes. Copyright © 2017 Elsevier B.V. All rights reserved.
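For illustration, a Hardy-Weinberg chi-square check for a single biallelic marker is sketched below with mock genotype counts; the decaplex loci are multi-allelic X-STRs, for which exact tests are generally preferred and male hemizygosity must be handled separately.

```python
# Minimal Hardy-Weinberg chi-square check for one biallelic marker (illustrative).
from math import erfc, sqrt

n_AA, n_Aa, n_aa = 120, 240, 140          # mock genotype counts (females)
n = n_AA + n_Aa + n_aa
p = (2 * n_AA + n_Aa) / (2 * n)           # allele frequency of A
exp = [n * p * p, 2 * n * p * (1 - p), n * (1 - p) ** 2]
obs = [n_AA, n_Aa, n_aa]
chi2 = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
p_value = erfc(sqrt(chi2 / 2))            # chi-square tail with 1 df
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```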
Monitoring service for the Gran Telescopio Canarias control system
NASA Astrophysics Data System (ADS)
Huertas, Manuel; Molgo, Jordi; Macías, Rosa; Ramos, Francisco
2016-07-01
The Monitoring Service collects, persists, and propagates the telescope and instrument telemetry for the Gran Telescopio CANARIAS (GTC), an optical-infrared 10-meter segmented-mirror telescope at the ORM observatory in the Canary Islands (Spain). A new version of the Monitoring Service has been developed in order to improve performance, provide high availability, and guarantee fault tolerance and scalability to cope with a high volume of data. The architecture is based on a distributed in-memory data store with a Producer/Consumer design pattern. The producer generates the data samples. The consumers either persist the samples to a database for further analysis or propagate them to the consoles in the control room to monitor the state of the whole system.
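A minimal sketch of the producer/consumer telemetry flow described above, using a bounded in-memory queue; the monitor names and the persist step are illustrative, not the GTC API.

```python
# Producer/consumer sketch of a telemetry pipeline; names are placeholders.
import queue
import threading
import time

samples = queue.Queue(maxsize=1000)   # stands in for the in-memory data store

def producer():
    for i in range(5):
        samples.put({"monitor": "m1.azimuth", "value": i, "t": time.time()})
    samples.put(None)                 # sentinel: no more samples

def consumer():
    while (s := samples.get()) is not None:
        print("persisting", s)        # e.g., write to the telemetry database

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start(); t1.join(); t2.join()
```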
Static versus dynamic sampling for data mining
DOE Office of Scientific and Technical Information (OSTI.GOV)
John, G.H.; Langley, P.
1996-12-31
As data warehouses grow to the point where one hundred gigabytes is considered small, the computational efficiency of data-mining algorithms on large databases becomes increasingly important. Using a sample from the database can speed up the data-mining process, but this is only acceptable if it does not reduce the quality of the mined knowledge. To this end, we introduce the "Probably Close Enough" criterion to describe the desired properties of a sample. Sampling usually refers to the use of static statistical tests to decide whether a sample is sufficiently similar to the large database, in the absence of any knowledge of the tools the data miner intends to use. We discuss dynamic sampling methods, which take into account the mining tool being used and can thus give better samples. We describe dynamic schemes that observe a mining tool's performance on training samples of increasing size and use these results to determine when a sample is sufficiently large. We evaluate these sampling methods on data from the UCI repository and conclude that dynamic sampling is preferable.
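A hedged sketch of the dynamic (progressive) sampling idea follows: grow the training sample until the mining tool's cross-validated accuracy stops improving by more than a tolerance. The doubling schedule, tolerance, and decision-tree learner are placeholder choices, not the paper's exact scheme.

```python
# Progressive sampling: stop growing the sample when accuracy gains level off.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(50000, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

def dynamic_sample_size(X, y, start=500, tol=0.005):
    prev, n = 0.0, start
    while n <= len(X):
        idx = rng.choice(len(X), size=n, replace=False)
        acc = cross_val_score(DecisionTreeClassifier(), X[idx], y[idx], cv=3).mean()
        if acc - prev < tol:          # "probably close enough": no real gain
            return n, acc
        prev, n = acc, n * 2          # double the sample and retry
    return len(X), prev

print(dynamic_sample_size(X, y))
```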
NASA Astrophysics Data System (ADS)
Geyer, Adelina; Marti, Joan
2015-04-01
Collapse calderas are one of the most important volcanic structures, not only because of their hazard implications, but also because of their high geothermal energy potential and their association with mineral deposits of economic interest. In 2008 we presented a new general worldwide Collapse Caldera DataBase (CCDB), in order to provide a useful and accessible tool for studying and understanding caldera collapse processes. The principal aim of the CCDB is to update the current field-based knowledge on calderas, merging together the existing databases and complementing them with new examples found in the bibliography, and leaving it open for the incorporation of new data from future studies. Currently, the database includes over 450 documented calderas around the world, aiming to be representative enough to promote further studies and analyses. We have performed a comprehensive compilation of published field studies of collapse calderas including more than 500 references, and their information has been summarized in a database linked to a Geographical Information System (GIS) application. Thus, it is possible to visualize the selected calderas on a world map and to filter them according to different features recorded in the database (e.g. age, structure). The information recorded in the CCDB can be grouped into seven main information classes: caldera features, properties of the caldera-forming deposits, magmatic system, geodynamic setting, pre-caldera volcanism, caldera-forming eruption sequence, and post-caldera activity. Additionally, we have added two extra classes. The first records the references consulted for each caldera. The second allows users to introduce comments on the caldera sample, such as possible controversies concerning the caldera origin. During the last seven years, the database has been available online at http://www.gvb-csic.es/CCDB.htm upon prior registration. This year, the CCDB webpage will be updated and improved so that the database content can be queried online. This research was partially funded by research fellowship RYC-2012-11024.
Development of expert systems for analyzing electronic documents
NASA Astrophysics Data System (ADS)
Abeer Yassin, Al-Azzawi; Shidlovskiy, S.; Jamal, A. A.
2018-05-01
The paper analyses a Database Management System (DBMS). Expert systems, databases, and database technology have become essential components of everyday life in modern society. As databases are widely used in every organization with a computer system, data resource control and data management are very important [1]. A DBMS is the most significant tool developed to serve multiple users in a database environment; it consists of programs that enable users to create and maintain a database. This paper focuses on the development of a database management system for the General Directorate for Education of Diyala in Iraq (GDED) using CLIPS, Java NetBeans, and Alfresco, together with system components previously developed at Tomsk State University at the Faculty of Innovative Technology.
Discovering Physical Samples Through Identifiers, Metadata, and Brokering
NASA Astrophysics Data System (ADS)
Arctur, D. K.; Hills, D. J.; Jenkyns, R.
2015-12-01
Physical samples, particularly in the geosciences, are key to understanding the Earth system, its history, and its evolution. Our record of the Earth as captured by physical samples is difficult to explain and mine for understanding, due to incomplete, disconnected, and evolving metadata content. This is further complicated by differing ways of classifying, cataloguing, publishing, and searching the metadata, especially when specimens do not fit neatly into a single domain—for example, fossils cross disciplinary boundaries (mineral and biological). Sometimes even the fundamental classification systems evolve, such as the geological time scale, triggering daunting processes to update existing specimen databases. Increasingly, we need to consider ways of leveraging permanent, unique identifiers, as well as advancements in metadata publishing that link digital records with physical samples in a robust, adaptive way. An NSF EarthCube Research Coordination Network (RCN) called the Internet of Samples (iSamples) is now working to bridge the metadata schemas for biological and geological domains. We are leveraging the International Geo Sample Number (IGSN) that provides a versatile system of registering physical samples, and working to harmonize this with the DataCite schema for Digital Object Identifiers (DOI). A brokering approach for linking disparate catalogues and classification systems could help scale discovery and access to the many large collections now being managed (sometimes millions of specimens per collection). This presentation is about our community building efforts, research directions, and insights to date.
Baedecker, P.A.; Grossman, J.N.
1995-01-01
A PC-based system has been developed for the analysis of gamma-ray spectra and for the complete reduction of data from INAA experiments, including software to average the results from multiple lines and multiple countings and to produce a final report of analysis. Graphics algorithms may be called for the analysis of complex spectral features, to compare the data from alternate photopeaks, and to evaluate detector performance during a given counting cycle. A database of results for control samples can be used to prepare quality-control charts to evaluate long-term precision and to search for systematic variations in data on reference samples as a function of time. The entire software library can be accessed through a user-friendly menu interface with internal help.
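As a generic illustration of the quality-control charting mentioned above, the sketch below derives Shewhart-style control limits from repeated control-sample results; the data are mock values, not from the authors' database.

```python
# Shewhart-style QC chart limits from repeated control-sample results (mock data).
import numpy as np

results = np.array([10.2, 9.8, 10.1, 10.4, 9.9, 10.0, 10.3, 9.7])  # ppm, mock
mean, sd = results.mean(), results.std(ddof=1)
print(f"centre = {mean:.2f}, UCL = {mean + 3*sd:.2f}, LCL = {mean - 3*sd:.2f}")
flags = np.abs(results - mean) > 2 * sd     # warning: outside the 2-sigma band
print("warning points:", np.flatnonzero(flags))
```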
Environment/Health/Safety (EHS): Databases
Hazard Documents Database; Biosafety Authorization System; CATS (Corrective Action Tracking System) (for findings 12/2005 to present); Chemical Management System; Electrical Safety; Ergonomics Database (for new ...); ... Learned / Best Practices; REMS - Radiation Exposure Monitoring System; SJHA Database - Subcontractor Job ...
MetaBar - a tool for consistent contextual data acquisition and standards compliant submission.
Hankeln, Wolfgang; Buttigieg, Pier Luigi; Fink, Dennis; Kottmann, Renzo; Yilmaz, Pelin; Glöckner, Frank Oliver
2010-06-30
Environmental sequence datasets are increasing at an exponential rate; however, the vast majority of them lack appropriate descriptors like sampling location, time, and depth/altitude, generally referred to as metadata or contextual data. The consistent capture and structured submission of these data is crucial for integrated data analysis and ecosystem modeling. The application MetaBar has been developed to support consistent contextual data acquisition. MetaBar is a spreadsheet- and web-based software tool designed to assist users in the consistent acquisition, electronic storage, and submission of contextual data associated with their samples. A preconfigured Microsoft Excel spreadsheet is used to initiate structured contextual data storage in the field or laboratory. Each sample is given a unique identifier, and at any stage the sheets can be uploaded to the MetaBar database server. To label samples, identifiers can be printed as barcodes. An intuitive web interface provides quick access to the contextual data in the MetaBar database as well as user and project management capabilities. Export functions facilitate contextual and sequence data submission to the International Nucleotide Sequence Database Collaboration (INSDC), comprising the DNA Data Bank of Japan (DDBJ), the European Molecular Biology Laboratory database (EMBL), and GenBank. MetaBar requests and stores contextual data in compliance with the Genomic Standards Consortium specifications. The MetaBar open-source code base for local installation is available under the GNU General Public License version 3 (GNU GPL3). The MetaBar software supports the typical workflow from data acquisition and field sampling to contextual-data-enriched sequence submission to an INSDC database. The integration with the megx.net marine Ecological Genomics database and portal facilitates georeferenced data integration and metadata-based comparisons of sampling sites, as well as interactive data visualization. The ample export functionalities and the INSDC submission support enable exchange of data across disciplines and safeguard contextual data.
Anderson, Beth M.; Stevens, Michael C.; Glahn, David C.; Assaf, Michal; Pearlson, Godfrey D.
2013-01-01
We present a modular, high-performance, open-source database system that incorporates popular neuroimaging database features with novel peer-to-peer sharing and a simple installation. An increasing number of imaging centers have created a massive amount of neuroimaging data since fMRI became popular more than 20 years ago, with much of that data unshared. The Neuroinformatics Database (NiDB) provides a stable platform to store and manipulate neuroimaging data and addresses several of the impediments to data sharing presented by the INCF Task Force on Neuroimaging Datasharing, including 1) motivation to share data, 2) technical issues, and 3) standards development. NiDB solves these problems by 1) minimizing PHI use and providing a cost-effective, simple, locally stored platform, 2) storing and associating all data (including genome) with a subject and creating a peer-to-peer sharing model, and 3) defining a simple, normalized data storage structure that is used in NiDB. NiDB not only simplifies the local storage and analysis of neuroimaging data, but also enables simple sharing of raw data and analysis methods, which may encourage further sharing. PMID:23912507
Automatic lung nodule graph cuts segmentation with deep learning false positive reduction
NASA Astrophysics Data System (ADS)
Sun, Wenqing; Huang, Xia; Tseng, Tzu-Liang Bill; Qian, Wei
2017-03-01
To automatically detect lung nodules from CT images, we designed a two-stage computer-aided detection (CAD) system. The first stage is graph-cuts segmentation to identify and segment the nodule candidates, and the second stage is a convolutional neural network for false-positive reduction. The dataset contains 595 CT cases randomly selected from the Lung Image Database Consortium and Image Database Resource Initiative (LIDC/IDRI), and the 305 pulmonary nodules for which all four experienced radiologists reached diagnostic consensus were our detection targets. Considering each slice as an individual sample, 2844 nodules were included in our database. The graph-cuts segmentation was conducted in a two-dimensional manner; 2733 lung nodule ROIs were successfully identified and segmented. With false-positive reduction by a seven-layer convolutional neural network, 2535 nodules remained detected while the false-positive rate dropped to 31.6%. The average F-measure of segmented lung nodule tissue was 0.8501.
Advancing the large-scale CCS database for metabolomics and lipidomics at the machine-learning era.
Zhou, Zhiwei; Tu, Jia; Zhu, Zheng-Jiang
2018-02-01
Metabolomics and lipidomics aim to comprehensively measure the dynamic changes of all metabolites and lipids present in biological systems. The use of ion mobility-mass spectrometry (IM-MS) for metabolomics and lipidomics has facilitated the separation and identification of metabolites and lipids in complex biological samples. The collision cross-section (CCS) value derived from IM-MS is a valuable physicochemical property for the unambiguous identification of metabolites and lipids. However, CCS values obtained from experimental measurement and computational modeling are of limited availability, which significantly restricts the application of IM-MS. In this review, we discuss the recently developed machine-learning based prediction approach, which can efficiently generate precise CCS databases on a large scale. We also highlight the applications of CCS databases to support metabolomics and lipidomics. Copyright © 2017 Elsevier Ltd. All rights reserved.
Advances in Satellite Microwave Precipitation Retrieval Algorithms Over Land
NASA Astrophysics Data System (ADS)
Wang, N. Y.; You, Y.; Ferraro, R. R.
2015-12-01
Precipitation plays a key role in the earth's climate system, particularly in the aspect of its water and energy balance. Satellite microwave (MW) observations of precipitation provide a viable means to achieve global measurement of precipitation with sufficient sampling density and accuracy. However, accurate precipitation information over land from satellite MW is a challenging problem. The Goddard Profiling Algorithm (GPROF) for the Global Precipitation Measurement (GPM) mission is built around the Bayesian formulation (Evans et al., 1995; Kummerow et al., 1996). GPROF uses the likelihood function and the prior probability distribution function to calculate the expected value of the precipitation rate, given the observed brightness temperatures. It is particularly convenient to draw samples for the prior PDF from a predefined database of observations or models. The GPROF algorithm does not search all database entries but only the subset thought to correspond to the actual observation. The GPM GPROF V1 database focuses on stratification by surface emissivity class, land surface temperature, and total precipitable water. However, there is much uncertainty as to what is the optimal information needed to subset the database for different conditions. To this end, we conduct a database stratification study using National Mosaic and Multi-Sensor Quantitative Precipitation Estimation data, Special Sensor Microwave Imager/Sounder (SSMIS) and Advanced Technology Microwave Sounder (ATMS) observations, and reanalysis data from the Modern-Era Retrospective Analysis for Research and Applications (MERRA). Our database study (You et al., 2015) shows that environmental factors such as surface elevation, relative humidity, storm vertical structure and height, and ice thickness can help in stratifying a single large database into smaller and more homogeneous subsets, in which the surface conditions and precipitation vertical profiles are similar. It is found that the probability of detection (POD) increases by about 8% and 12% when using stratified databases for rainfall and snowfall detection, respectively. In addition, by considering the relative humidity in the lower troposphere and the vertical velocity at 700 hPa in the precipitation detection process, the POD for snowfall detection is further increased by 20.4%, from 56.0% to 76.4%.
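The Bayesian expected-value retrieval described above has a standard form (written here in our notation; GPROF's exact weighting details differ): the retrieved rate is the posterior mean of the rain rate R given the observed brightness-temperature vector Tb,

```latex
\hat{R} \;=\; E\!\left[R \mid \mathbf{T}_b\right]
\;=\; \frac{\int R \, p(\mathbf{T}_b \mid R)\, p(R)\, \mathrm{d}R}
           {\int p(\mathbf{T}_b \mid R)\, p(R)\, \mathrm{d}R}.
```

In practice the integrals are replaced by sums over the a priori database (or a stratified subset of it), with each entry weighted by a distance between its simulated brightness temperatures and the observation.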
A prescription fraud detection model.
Aral, Karca Duru; Güvenir, Halil Altay; Sabuncuoğlu, Ihsan; Akar, Ahmet Ruchan
2012-04-01
Prescription fraud is a major problem that causes substantial monetary loss in health care systems. We aimed to develop a model for detecting cases of prescription fraud and test it on real-world data from a large multi-center medical prescription database. Conventionally, prescription fraud detection is conducted on random samples by human experts. However, the samples might be misleading and manual detection is costly. We propose a novel distance-based data-mining approach for assessing the fraudulent risk of prescriptions with respect to cross-features. Final tests were conducted on an adult cardiac surgery database. The results obtained from experiments reveal that the proposed model works considerably well, with a true positive rate of 77.4% and a false positive rate of 6% for fraudulent medical prescriptions. The proposed model has potential advantages, including on-line risk prediction for prescription fraud, off-line analysis of high-risk prescriptions by human experts, and self-learning ability through regular updates of the integrative data sets. We conclude that incorporating such a system in health authorities, social security agencies, and insurance companies would improve the efficiency of internal review to ensure compliance with the law, and radically decrease human-expert auditing costs. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
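The paper's distance-based risk model is not specified in the abstract; as a generic stand-in, the sketch below scores each prescription by its mean distance to its nearest neighbours, flagging isolated points as high risk. The features and counts are invented.

```python
# Generic kNN-distance outlier score as a stand-in for a distance-based risk model.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(7)
X = rng.normal(size=(1000, 5))          # encoded prescription features (mock)
X[:10] += 6                             # a few anomalous prescriptions

nn = NearestNeighbors(n_neighbors=6).fit(X)
dist, _ = nn.kneighbors(X)              # column 0 is the point itself
risk = dist[:, 1:].mean(axis=1)         # mean distance to the 5 nearest others
print("top-5 risk indices:", np.argsort(risk)[-5:])
```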
The Marshall Islands Data Management Program
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stoker, A.C.; Conrado, C.L.
1995-09-01
This report is a resource document of the methods and procedures currently used in the Data Management Program of the Marshall Islands Dose Assessment and Radioecology Project. Since 1973, over 60,000 environmental samples have been collected. Our program includes relational database design, programming, and maintenance; sample and information management; sample tracking; quality control; and data entry, evaluation, and reduction. The usefulness of scientific databases depends on careful planning in order to fulfill the requirements of any large research program. Compilation of scientific results requires consolidation of information from several databases and incorporation of new information as it is generated. The success in combining and organizing all radionuclide analyses, sample information, and statistical results into a readily accessible form is critical to our project.
A Relational Database System for Student Use.
ERIC Educational Resources Information Center
Fertuck, Len
1982-01-01
Describes an APL implementation of a relational database system suitable for use in a teaching environment in which database development and database administration are studied, and discusses the functions of the user and the database administrator. An appendix illustrating system operation and an eight-item reference list are attached. (Author/JL)
A laboratory information management system for the analysis of tritium (3H) in environmental waters.
Belachew, Dagnachew Legesse; Terzer-Wassmuth, Stefan; Wassenaar, Leonard I; Klaus, Philipp M; Copia, Lorenzo; Araguás, Luis J Araguás; Aggarwal, Pradeep
2018-07-01
Accurate and precise measurements of low levels of tritium (3H) in environmental waters are difficult to attain due to complex steps of sample preparation, electrolytic enrichment, liquid scintillation decay counting, and extensive data processing. We present a Microsoft Access™ relational database application, TRIMS (Tritium Information Management System), to assist with sample and data processing for tritium analysis by managing the processes from sample registration and analysis to reporting and archiving. A complete uncertainty propagation algorithm ensures tritium results are reported with robust uncertainty metrics. TRIMS will help to increase laboratory productivity and improve the accuracy and precision of 3H assays. The software supports several enrichment protocols and LSC counter types. TRIMS is available for download at no cost from the IAEA at www.iaea.org/water. Copyright © 2018 Elsevier Ltd. All rights reserved.
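As a generic illustration of the uncertainty propagation mentioned above, the sketch below propagates relative uncertainties in quadrature for an activity of the form A = net rate / (efficiency × enrichment); the numbers are invented and this is textbook propagation, not TRIMS's actual algorithm.

```python
# Quadrature propagation for a ratio-type result (illustrative values only).
from math import sqrt

net_rate, u_rate = 12.4, 0.4        # net count rate and its uncertainty
eff, u_eff = 0.25, 0.01             # counting efficiency
enr, u_enr = 18.0, 0.9              # electrolytic enrichment factor

A = net_rate / (eff * enr)
rel_u = sqrt((u_rate / net_rate) ** 2 + (u_eff / eff) ** 2 + (u_enr / enr) ** 2)
print(f"A = {A:.3f} ± {A * rel_u:.3f} (arbitrary units)")
```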
Adaptive compensation of aberrations in ultrafast 3D microscopy using a deformable mirror
NASA Astrophysics Data System (ADS)
Sherman, Leah R.; Albert, O.; Schmidt, Christoph F.; Vdovin, Gleb V.; Mourou, Gerard A.; Norris, Theodore B.
2000-05-01
3D imaging using a multiphoton scanning confocal microscope is ultimately limited by aberrations of the system. We describe a system to adaptively compensate the aberrations with a deformable mirror. We have increased the transverse scanning range of the microscope by a factor of three with compensation of off-axis aberrations. We have also significantly increased the longitudinal scanning depth with compensation of spherical aberrations arising from penetration into the sample. Our correction is based on a genetic algorithm that uses the second-harmonic or two-photon fluorescence signal excited in the sample by femtosecond pulses as the enhancement parameter. This allows us to globally optimize the wavefront without a wavefront measurement. To improve the speed of the optimization we use Zernike polynomials as the basis for correction. Corrections can be stored in a database for look-up with future samples.
James Webb Space Telescope XML Database: From the Beginning to Today
NASA Technical Reports Server (NTRS)
Gal-Edd, Jonathan; Fatig, Curtis C.
2005-01-01
The James Webb Space Telescope (JWST) Project has been defining, developing, and exercising the use of a common eXtensible Markup Language (XML) for the command and telemetry (C&T) database structure. JWST is the first large NASA space mission to use XML for databases. The JWST project started developing the concepts for the C&T database in 2002. The database will need to last at least 20 years, since it will be used beginning with flight software development, continuing through Observatory integration and test (I&T), and through operations. Also, a database tool kit has been provided to the 18 flight software development laboratories located in the United States, Europe, and Canada that allows the local users to create their own databases. Recently the JWST Project has been working with the Jet Propulsion Laboratory (JPL) and Object Management Group (OMG) XML Telemetry and Command Exchange (XTCE) personnel to provide all the information needed by JWST and JPL for exchanging database information using an XML standard structure. The lack of standardization requires custom ingest scripts for each ground system segment, increasing the cost of the total system. Providing a non-proprietary standard for the telemetry and command database definition will allow dissimilar systems to communicate without the need for expensive mission-specific database tools and testing of the systems after the database translation. The ground system components that would benefit from a standardized database are the telemetry and command systems, archives, simulators, and trending tools. JWST has successfully exchanged the XML database with the Eclipse, EPOCH, and ASIST ground systems, the Portable Spacecraft Simulator (PSS), a front-end system, and the Integrated Trending and Plotting System (ITPS). This paper will discuss how JWST decided to use XML, the barriers to a new concept, experiences utilizing the XML structure, exchanging databases with other users, and issues that have been experienced in creating databases for the C&T system.
ERIC Educational Resources Information Center
Dalrymple, Prudence W.; Roderer, Nancy K.
1994-01-01
Highlights the changes that have occurred from 1987-93 in database access systems. Topics addressed include types of databases, including CD-ROMs; enduser interface; database selection; database access management, including library instruction and use of primary literature; economic issues; database users; the search process; and improving…
An Introduction to Database Structure and Database Machines.
ERIC Educational Resources Information Center
Detweiler, Karen
1984-01-01
Enumerates principal management objectives of database management systems (data independence, quality, security, multiuser access, central control) and criteria for comparison (response time, size, flexibility, other features). Conventional database management systems, relational databases, and database machines used for backend processing are…
Federal Register 2010, 2011, 2012, 2013, 2014
2013-05-16
... Excluded Parties List System (EPLS) databases into the System for Award Management (SAM) database. ... combined the functional capabilities of the CCR, ORCA, and EPLS procurement systems into the SAM database ... identification number and the type of organization from the System for Award Management database. ...
The EXOSAT database and archive
NASA Technical Reports Server (NTRS)
Reynolds, A. P.; Parmar, A. N.
1992-01-01
The EXOSAT database provides on-line access to the results and data products (spectra, images, and lightcurves) from the EXOSAT mission as well as access to data and logs from a number of other missions (such as EINSTEIN, COS-B, ROSAT, and IRAS). In addition, a number of familiar optical, infrared, and X-ray catalogs, including the Hubble Space Telescope (HST) guide star catalog, are available. The complete database is located at the EXOSAT observatory at ESTEC in the Netherlands and is accessible remotely via a captive account. The database management system was specifically developed to efficiently access the database and to allow the user to perform statistical studies on large samples of astronomical objects as well as to retrieve scientific and bibliographic information on single sources. The system was designed to be mission independent and includes timing, image processing, and spectral analysis packages as well as software to allow the easy transfer of analysis results and products to the user's own institute. The archive at ESTEC comprises a subset of the EXOSAT observations, stored on magnetic tape. Observations of particular interest were copied in compressed format to an optical jukebox, allowing users to retrieve and analyze selected raw data entirely from their terminals. Such analysis may be necessary if the user's needs are not accommodated by the products contained in the database (in terms of time resolution, spectral range, and the finesse of the background subtraction, for instance). Long-term archiving of the full final observation data is taking place at ESRIN in Italy as part of the ESIS program, again using optical media, and ESRIN have now assumed responsibility for distributing the data to the community. Tests showed that raw observational data (typically several tens of megabytes for a single target) can be transferred via the existing networks in reasonable time.
NASA Astrophysics Data System (ADS)
Barette, Florian; Poppe, Sam; Smets, Benoît; Benbakkar, Mhammed; Kervyn, Matthieu
2017-10-01
We present an integrated, spatially-explicit database of existing geochemical major-element analyses available from (post-)colonial scientific reports, PhD theses, and international publications for the Virunga Volcanic Province, located in the western branch of the East African Rift System. This volcanic province is characterised by alkaline volcanism, including silica-undersaturated, alkaline, and potassic lavas. The database contains a total of 908 geochemical analyses of eruptive rocks for the entire volcanic province, with location information for most samples. A preliminary analysis of the overall consistency of the database, using statistical techniques on sets of geochemical analyses with contrasted analytical methods or dates, demonstrates that the database is consistent. We applied a principal component analysis and cluster analysis to whole-rock major-element compositions included in the database to study the spatial variation of the chemical composition of eruptive products in the Virunga Volcanic Province. These statistical analyses identify spatially distributed clusters of eruptive products. The known geochemical contrasts are highlighted by the spatial analysis, such as the unique geochemical signature of Nyiragongo lavas compared to other Virunga lavas, the geochemical heterogeneity of the Bulengo area, and the trachyte flows of Karisimbi volcano. Most importantly, we identified separate clusters of eruptive products which originate from primitive magmatic sources. These lavas of primitive composition are preferentially located along NE-SW inherited rift structures, often at a distance from the central Virunga volcanoes. Our results illustrate the relevance of a spatial analysis of integrated geochemical data for a volcanic province, as a complement to classical petrological investigations. This approach indeed helps to characterise geochemical variations within a complex of magmatic systems and to identify specific petrologic and geochemical investigations that should be tackled within a study area.
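The sketch below illustrates the kind of PCA-plus-clustering workflow described, applied to made-up major-element data (wt% oxides); it is not the authors' exact procedure.

```python
# Illustrative PCA + k-means clustering of mock whole-rock major-element data.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
oxides = ["SiO2", "TiO2", "Al2O3", "FeO", "MgO", "CaO", "Na2O", "K2O"]
X = rng.normal(loc=[45, 2, 14, 11, 8, 10, 3, 3], scale=2.0, size=(300, 8))

Z = StandardScaler().fit_transform(X)       # standardise each oxide
pca = PCA(n_components=2)
scores = pca.fit_transform(Z)               # project samples onto PC1/PC2
clusters = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(scores)
print("cluster sizes:", np.bincount(clusters))
print("PC1 loadings:", dict(zip(oxides, np.round(pca.components_[0], 2))))
```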
The Italian Twin Project: from the personal identification number to a national twin registry.
Stazi, Maria Antonietta; Cotichini, Rodolfo; Patriarca, Valeria; Brescianini, Sonia; Fagnani, Corrado; D'Ippolito, Cristina; Cannoni, Stefania; Ristori, Giovanni; Salvetti, Marco
2002-10-01
The unique opportunity offered by the "fiscal code", an alphanumeric identifier with demographic information on every person residing in Italy, introduced in 1976 by the Ministry of Finance, allowed a database of all potential Italian twins to be created. To date, this database contains the name, surname, date and place of birth, and home address of about 1,300,000 "possible twins". Even though we estimated an excess of 40% of pseudo-twins, this is still the world's largest twin population ever collected. The database of possible twins is currently used in population-based studies on multiple sclerosis, Alzheimer's disease, celiac disease, and type 1 diabetes. A system is currently being developed for linking the database with data from mortality and cancer registries. In 2001, the Italian Government, through the Ministry of Health, financed a broad national research program on twin studies, including the establishment of a national twin registry. Among all the possible twins, a sample of 500,000 individuals is going to be contacted, and we expect to enrol around 120,000 real twin pairs in a formal Twin Registry. Depending on available financial resources, a subsample of the enrolled population will be asked to donate DNA. A biological bank of twin samples will then be implemented, guaranteeing information on future etiological questions regarding genetic and modifiable factors for physical impairment and disability, cancers, cardiovascular diseases, and other age-related chronic illnesses.
Access and use of the GUDMAP database of genitourinary development.
Davies, Jamie A; Little, Melissa H; Aronow, Bruce; Armstrong, Jane; Brennan, Jane; Lloyd-MacGilp, Sue; Armit, Chris; Harding, Simon; Piu, Xinjun; Roochun, Yogmatee; Haggarty, Bernard; Houghton, Derek; Davidson, Duncan; Baldock, Richard
2012-01-01
The Genitourinary Development Molecular Atlas Project (GUDMAP) aims to document gene expression across time and space in the developing urogenital system of the mouse, and to provide access to a variety of relevant practical and educational resources. Data come from microarray gene expression profiling (from laser-dissected and FACS-sorted samples) and in situ hybridization at both low (whole-mount) and high (section) resolutions. Data are annotated to a published, high-resolution anatomical ontology and can be accessed using a variety of search interfaces. Here, we explain how to run typical queries on the database, by gene or anatomical location, how to view data, how to perform complex queries, and how to submit data.
The Hong Kong/AAO/Strasbourg Hα (HASH) Planetary Nebula Database
NASA Astrophysics Data System (ADS)
Bojičić, Ivan S.; Parker, Quentin A.; Frew, David J.
2017-10-01
The Hong Kong/AAO/Strasbourg Hα (HASH) planetary nebula database is an online research platform providing free and easy access to the largest and most comprehensive catalogue of known Galactic PNe, and a repository of observational data (imaging and spectroscopy) for these and related astronomical objects. The main motivation for creating this system is to resolve some long-standing problems in the field, e.g., problems with mimics and dubious or mistaken identifications, errors in observational data, and the consolidation of widely scattered datasets. This facility allows researchers quick and easy access to archived and new observational data, and supports the creation and sharing of non-redundant PN samples and catalogues.
Pediatric post-marketing safety systems in North America: assessment of the current status.
McMahon, Ann W; Wharton, Gerold T; Bonnel, Renan; DeCelle, Mary; Swank, Kimberley; Testoni, Daniela; Cope, Judith U; Smith, Phillip Brian; Wu, Eileen; Murphy, Mary Dianne
2015-08-01
It is critical to have pediatric post-marketing safety systems that contain enough clinical and epidemiological detail to draw regulatory, public health, and clinical conclusions. The pediatric safety surveillance workshop (PSSW), coordinated by the Food and Drug Administration (FDA), identified these pediatric systems as of 2010. This manuscript aims to update the information from the PSSW and look critically at the systems currently in use. We reviewed North American pediatric post-marketing safety systems such as databases, networks, and research consortiums found in peer-reviewed journals and other online sources. We detail clinical examples from three systems that FDA used to assess pediatric medical product safety. Of the 59 systems reviewed for pediatric content, only nine were pediatric-focused and met the inclusion criteria. Brief descriptions are provided for these nine. The strengths and weaknesses of three systems (two of the nine pediatric-focused and one including both children and adults) are illustrated with clinical examples. Systems reviewed in this manuscript have strengths such as clinical detail, a large enough sample size to capture rare adverse events, and/or a patient denominator internal to the database. Few systems include all of these attributes. Pediatric drug safety would be better informed by utilizing multiple systems to take advantage of their individual characteristics. Copyright © 2015 John Wiley & Sons, Ltd.
Ground sample data for the Conterminous U.S. Land Cover Characteristics Database
Robert Burgan; Colin Hardy; Donald Ohlen; Gene Fosnight; Robert Treder
1999-01-01
Ground sample data were collected for a land cover database and raster map that portray 159 vegetation classes at 1 km² resolution for the conterminous United States. Locations for 3,500 1 km² ground sample plots were selected randomly across the United States. The number of plots representing each vegetation class was weighted by the proportionate coverage of each...
A BRDF-BPDF database for the analysis of Earth target reflectances
NASA Astrophysics Data System (ADS)
Breon, Francois-Marie; Maignan, Fabienne
2017-01-01
Land surface reflectance is not isotropic. It varies with the observation geometry, which is defined by the sun and view zenith angles and the relative azimuth. In addition, the reflectance is linearly polarized. The reflectance anisotropy is quantified by the bidirectional reflectance distribution function (BRDF), while its polarization properties are defined by the bidirectional polarization distribution function (BPDF). The POLDER radiometer that flew onboard the PARASOL microsatellite remains the only space instrument that measured numerous samples of the BRDF and BPDF of Earth targets. Here, we describe a database of representative BRDFs and BPDFs derived from the POLDER measurements. From the huge number of data acquired by the spaceborne instrument over a period of 7 years, we selected a set of targets with high-quality observations. The selection aimed for a large number of observations, free of significant cloud or aerosol contamination, acquired in diverse observation geometries with a focus on the backscatter direction that shows the specific hot-spot signature. The targets are sorted according to the 16-class International Geosphere-Biosphere Programme (IGBP) land cover classification system, and the target selection aims at spatial representativeness within each class. The database thus provides a set of high-quality BRDF and BPDF samples that can be used to assess the typical variability of natural surface reflectances or to evaluate models. It is available freely from the PANGAEA website (doi:10.1594/PANGAEA.864090). In addition to the database, we provide a visualization and analysis tool based on the Interactive Data Language (IDL). It allows an interactive analysis of the measurements and a comparison against various BRDF and BPDF analytical models. The present paper describes the input data, the selection principles, the database format, and the analysis tool.
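For reference, the BRDF invoked above has the standard radiometric definition (general convention, not specific to this database): the ratio of the reflected radiance in the viewing direction to the incident irradiance from the solar direction,

```latex
f_r(\theta_s, \theta_v, \varphi) \;=\; \frac{\mathrm{d}L_r(\theta_v, \varphi)}{\mathrm{d}E_i(\theta_s)} \quad [\mathrm{sr}^{-1}],
```

where θ_s and θ_v are the sun and view zenith angles and φ is the relative azimuth.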
Heterogeneous distributed databases: A case study
NASA Technical Reports Server (NTRS)
Stewart, Tracy R.; Mukkamala, Ravi
1991-01-01
Alternatives are reviewed for accessing distributed heterogeneous databases, and a recommended solution is proposed. The current study is limited to the Automated Information Systems Center at the Naval Sea Combat Systems Engineering Station in Norfolk, VA. This center maintains two databases located on Digital Equipment Corporation VAX computers running under the VMS operating system. The first database, ICMS, resides on a VAX 11/780 and has been implemented using VAX DBMS, a CODASYL-based system. The second database, CSA, resides on a VAX 6460 and has been implemented using the ORACLE relational database management system (RDBMS). Both databases are used for configuration management within the U.S. Navy. Different customer bases are supported by each database. ICMS tracks U.S. Navy ships and major systems (anti-submarine, sonar, etc.). Even though the major systems on ships and submarines have totally different functions, some of the equipment within the major systems is common to both ships and submarines.
Database crime to crime match rate calculation.
Buckleton, John; Bright, Jo-Anne; Walsh, Simon J
2009-06-01
Guidance exists on how to count matches between samples in a crime sample database, but we are unable to locate a definition of how to estimate a match rate. We propose a method that does not proceed from the match-counting definition but rests on a strong logical foundation.
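The abstract does not give the estimator, but the distinction it draws can be made concrete: a match count only becomes a rate once it is divided by the number of pairwise comparisons performed. A toy sketch under that naive definition (profiles reduced to tuples, a "match" meaning exact equality; none of this is the authors' method):

```python
from itertools import combinations

def pairwise_match_rate(profiles):
    """Crude match rate: matching pairs / total pairs compared."""
    pairs = list(combinations(profiles, 2))
    matches = sum(1 for a, b in pairs if a == b)
    return matches, len(pairs), matches / len(pairs)

# Toy crime-sample 'database' of simplified profiles.
db = [("A", 12), ("B", 9), ("A", 12), ("C", 7), ("A", 12)]
m, n, rate = pairwise_match_rate(db)
print(f"{m} matches in {n} pairwise comparisons -> rate {rate:.3f}")
```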
Oostdik, Kathryn; Lenz, Kristy; Nye, Jeffrey; Schelling, Kristin; Yet, Donald; Bruski, Scott; Strong, Joshua; Buchanan, Clint; Sutton, Joel; Linner, Jessica; Frazier, Nicole; Young, Hays; Matthies, Learden; Sage, Amber; Hahn, Jeff; Wells, Regina; Williams, Natasha; Price, Monica; Koehler, Jody; Staples, Melisa; Swango, Katie L; Hill, Carolyn; Oyerly, Karen; Duke, Wendy; Katzilierakis, Lesley; Ensenberger, Martin G; Bourdeau, Jeanne M; Sprecher, Cynthia J; Krenke, Benjamin; Storts, Douglas R
2014-09-01
The original CODIS database based on 13 core STR loci has been overwhelmingly successful for matching suspects with evidence. Yet there remain situations that argue for inclusion of more loci and increased discrimination. The PowerPlex® Fusion System allows simultaneous amplification of the following loci: Amelogenin, D3S1358, D1S1656, D2S441, D10S1248, D13S317, Penta E, D16S539, D18S51, D2S1338, CSF1PO, Penta D, TH01, vWA, D21S11, D7S820, D5S818, TPOX, DYS391, D8S1179, D12S391, D19S433, FGA, and D22S1045. The comprehensive list of loci amplified by the system generates a profile compatible with databases based on either the expanded CODIS or European Standard Set (ESS) requirements. Developmental validation testing followed SWGDAM guidelines and demonstrated the quality and robustness of the PowerPlex® Fusion System across a number of variables. Consistent and high-quality results were compiled using data from 12 separate forensic and research laboratories. The results verify that the PowerPlex® Fusion System is a robust and reliable STR-typing multiplex suitable for human identification. Copyright © 2014 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
Tschmelak, Jens; Proll, Guenther; Riedt, Johannes; Kaiser, Joachim; Kraemmer, Peter; Bárzaga, Luis; Wilkinson, James S; Hua, Ping; Hole, J Patrick; Nudd, Richard; Jackson, Michael; Abuknesha, Ram; Barceló, Damià; Rodriguez-Mozaz, Sara; de Alda, Maria J López; Sacher, Frank; Stien, Jan; Slobodník, Jaroslav; Oswald, Peter; Kozmenko, Helena; Korenková, Eva; Tóthová, Lívia; Krascsenits, Zoltan; Gauglitz, Guenter
2005-02-15
A novel analytical system, AWACSS (automated water analyser computer-supported system), based on immunochemical technology has been developed that can measure several organic pollutants at the low nanogram-per-litre level in a single few-minutes analysis without any prior sample pre-concentration or pre-treatment steps. With the actual needs of water-sector managers related to the implementation of the Drinking Water Directive (DWD) (98/83/EC, 1998) and Water Framework Directive (WFD) (2000/60/EC, 2000) in mind, drinking, ground, surface, and waste waters were the major media used for the evaluation of the system performance. The instrument was equipped with remote control and surveillance facilities. The system's software allows for internet-based networking between the measurement and control stations, global management, trend analysis, and early-warning applications. The experience of water laboratories was utilised in the design of the instrument's hardware and software in order to make the system rugged and user-friendly. Several market surveys were conducted during the project to assess the applicability of the final system. A web-based AWACSS database was created for automated evaluation and storage of the obtained data in a format compatible with major databases of environmental organic pollutants in Europe. This first article gives the reader an overview of the aims and scope of the AWACSS project as well as details about the basic technology, immunoassays, software, and networking developed and utilised within the research project. The second article reports on the system performance, first real-sample measurements, and an international collaborative trial (inter-laboratory tests) comparing the biosensor with conventional analytical methods.
Lee, Howard; Chapiro, Julius; Schernthaner, Rüdiger; Duran, Rafael; Wang, Zhijun; Gorodetski, Boris; Geschwind, Jean-François; Lin, MingDe
2015-04-01
The objective of this study was to demonstrate that an intra-arterial liver therapy clinical research database system is a more workflow-efficient and robust tool for clinical research than a spreadsheet storage system. The database system could be used to generate clinical research study populations easily with custom search and retrieval criteria. A questionnaire was designed and distributed to 21 board-certified radiologists to assess current data storage problems and clinician reception to a database management system. Based on the questionnaire findings, a customized database and user interface system were created to perform automatic calculations of clinical scores, including staging systems such as the Child-Pugh and Barcelona Clinic Liver Cancer, and to facilitate data input and output. Questionnaire participants were favorable to a database system. The interface retrieved study-relevant data accurately and effectively. The database effectively produced easy-to-read study-specific patient populations with custom-defined inclusion/exclusion criteria. The database management system is workflow-efficient and robust in retrieving, storing, and analyzing data. Copyright © 2015 AUR. Published by Elsevier Inc. All rights reserved.
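The paper does not publish its scoring code; purely as an illustration of the kind of automatic score calculation it describes, here is a sketch of the standard Child-Pugh computation (five factors scored 1 to 3 points each; class A = 5-6 points, B = 7-9, C = 10-15):

```python
def child_pugh(bilirubin_mg_dl, albumin_g_dl, inr, ascites, encephalopathy):
    """Standard Child-Pugh score; ascites/encephalopathy grades are
    'none', 'mild', or 'severe'. Returns (score, class)."""
    score = 0
    score += 1 if bilirubin_mg_dl < 2 else 2 if bilirubin_mg_dl <= 3 else 3
    score += 1 if albumin_g_dl > 3.5 else 2 if albumin_g_dl >= 2.8 else 3
    score += 1 if inr < 1.7 else 2 if inr <= 2.3 else 3
    grade = {"none": 1, "mild": 2, "severe": 3}
    score += grade[ascites] + grade[encephalopathy]
    cls = "A" if score <= 6 else "B" if score <= 9 else "C"
    return score, cls

print(child_pugh(1.5, 3.0, 1.9, "mild", "none"))  # -> (8, 'B')
```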
Implementation of a data management software system for SSME test history data
NASA Technical Reports Server (NTRS)
Abernethy, Kenneth
1986-01-01
The implementation of a software system for managing Space Shuttle Main Engine (SSME) test/flight historical data is presented. The software system uses the database management system RIM7 for primary data storage and routine data management, but includes several FORTRAN programs, described here, which provide customized access to the RIM7 database. The consolidation, modification, and transfer of data from the database THIST to the RIM7 database THISRM are discussed. The RIM7 utility modules for generating some standard reports from THISRM and performing some routine updating and maintenance are briefly described. The FORTRAN accessing programs described include programs for initial loading of large data sets into the database, capturing data from files for database inclusion, and producing specialized statistical reports which cannot be provided by the RIM7 report generator utility. An expert system tutorial, constructed using the expert system shell product INSIGHT2, is described. Finally, a potential expert system, which would analyze data in the database, is outlined. This system could also use INSIGHT2 and would take advantage of RIM7's compatibility with the microcomputer database system RBase 5000.
Development and Operation of a Database Machine for Online Access and Update of a Large Database.
ERIC Educational Resources Information Center
Rush, James E.
1980-01-01
Reviews the development of a fault tolerant database processor system which replaced OCLC's conventional file system. A general introduction to database management systems and the operating environment is followed by a description of the hardware selection, software processes, and system characteristics. (SW)
75 FR 18255 - Passenger Facility Charge Database System for Air Carrier Reporting
Federal Register 2010, 2011, 2012, 2013, 2014
2010-04-09
... Facility Charge Database System for Air Carrier Reporting AGENCY: Federal Aviation Administration (FAA... the Passenger Facility Charge (PFC) database system to report PFC quarterly report information. In... developed a national PFC database system in order to more easily track the PFC program on a nationwide basis...
An Improved Database System for Program Assessment
ERIC Educational Resources Information Center
Haga, Wayne; Morris, Gerard; Morrell, Joseph S.
2011-01-01
This research paper presents a database management system for tracking course assessment data and reporting related outcomes for program assessment. It improves on a database system previously presented by the authors and in use for two years. The database system presented is specific to assessment for ABET (Accreditation Board for Engineering and…
Honoré, Paul; Granjeaud, Samuel; Tagett, Rebecca; Deraco, Stéphane; Beaudoing, Emmanuel; Rougemont, Jacques; Debono, Stéphane; Hingamp, Pascal
2006-09-20
High throughput gene expression profiling (GEP) is becoming a routine technique in life science laboratories. With experimental designs that repeatedly span thousands of genes and hundreds of samples, relying on a dedicated database infrastructure is no longer an option. GEP technology is a fast-moving target, with new approaches constantly broadening the field diversity. This technology heterogeneity, compounded by the informatics complexity of GEP databases, means that software developments have so far focused on mainstream techniques, leaving less typical yet established techniques such as Nylon microarrays at best partially supported. MAF (MicroArray Facility) is the laboratory database system we have developed for managing the design, production and hybridization of spotted microarrays. Although it can support the widely used glass microarrays and oligo-chips, MAF was designed with the specific idiosyncrasies of Nylon-based microarrays in mind. Notably, single-channel radioactive probes, microarray stripping and reuse, vector control hybridizations and spike-in controls are all natively supported by the software suite. MicroArray Facility is MIAME supportive and dynamically provides feedback on missing annotations to help users estimate effective MIAME compliance. Genomic data such as clone identifiers and gene symbols are also directly annotated by MAF software using standard public resources. The MAGE-ML data format is implemented for full data export. Journalized database operations (audit tracking), data anonymization, material traceability and user/project-level confidentiality policies are also managed by MAF. MicroArray Facility is a complete data management system for microarray producers and end-users. Particular care has been devoted to adequately model Nylon-based microarrays. The MAF system, developed and implemented in both private and academic environments, has proved a robust solution for shared facilities and industry service providers alike.
Benson, Dennis A; Karsch-Mizrachi, Ilene; Lipman, David J; Ostell, James; Sayers, Eric W
2011-01-01
GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 380,000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system that integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage: www.ncbi.nlm.nih.gov.
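As a small illustration of the Entrez retrieval route described above, the following Biopython sketch fetches a single GenBank nucleotide record; the accession number and e-mail address are placeholders:

```python
from Bio import Entrez, SeqIO

Entrez.email = "you@example.org"  # NCBI asks for a contact address

# Fetch one nucleotide record in GenBank flat-file format
# (the accession number here is purely illustrative).
handle = Entrez.efetch(db="nucleotide", id="NM_000546",
                       rettype="gb", retmode="text")
record = SeqIO.read(handle, "genbank")
handle.close()

print(record.id, record.description)
print(f"{len(record.seq)} bp, {len(record.features)} annotated features")
```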
76 FR 11465 - Privacy Act of 1974; System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2011-03-02
... separate systems of records: ``FHFA-OIG Audit Files Database,'' ``FHFA-OIG Investigative & Evaluative Files Database,'' ``FHFA-OIG Investigative & Evaluative MIS Database,'' and ``FHFA-OIG Hotline Database.'' These... Audit Files Database. FHFA-OIG-2: FHFA-OIG Investigative & Evaluative Files Database. FHFA-OIG-3: FHFA...
DESIGNING ENVIRONMENTAL MONITORING DATABASES FOR STATISTICAL ASSESSMENT
Databases designed for statistical analyses have characteristics that distinguish them from databases intended for general use. EMAP uses a probabilistic sampling design to collect data to produce statistical assessments of environmental conditions. In addition to supporting the ...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Robinson Khosah
2007-07-31
Advanced Technology Systems, Inc. (ATS) was contracted by the U.S. Department of Energy's National Energy Technology Laboratory (DOE-NETL) to develop a state-of-the-art, scalable and robust web-accessible database application to manage the extensive data sets resulting from the DOE-NETL-sponsored ambient air monitoring programs in the upper Ohio River valley region. The data management system was designed to include a web-based user interface that will allow easy access to the data by the scientific community, policy- and decision-makers, and other interested stakeholders, while providing detailed information on sampling, analytical and quality control parameters. In addition, the system will provide graphical analytical tools for displaying, analyzing and interpreting the air quality data. The system will also provide multiple report generation capabilities and easy-to-understand visualization formats that can be utilized by the media and public outreach/educational institutions. The project was conducted in two phases. Phase One included the following tasks: (1) data inventory/benchmarking, including the establishment of an external stakeholder group; (2) development of a data management system; (3) population of the database; (4) development of a web-based data retrieval system; and (5) establishment of an internal quality assurance/quality control system on data management. Phase Two involved the development of a platform for on-line data analysis. Phase Two included the following tasks: (1) development of a sponsor and stakeholder/user website with extensive online analytical tools; (2) development of a public website; (3) incorporation of an extensive online help system into each website; and (4) incorporation of a graphical representation (mapping) system into each website. The project is now technically completed.
NASA Astrophysics Data System (ADS)
Bushel, Pierre R.; Bennett, Lee; Hamadeh, Hisham; Green, James; Ableson, Alan; Misener, Steve; Paules, Richard; Afshari, Cynthia
2002-06-01
We present an analysis of pattern recognition procedures used to predict the classes of samples exposed to pharmacologic agents by comparing gene expression patterns from samples treated with two classes of compounds. Rat liver mRNA samples collected following 24 hours of exposure to phenobarbital or peroxisome proliferators were analyzed using a 1700 rat cDNA microarray platform. Sets of genes that were consistently differentially expressed in the rat liver samples following treatment were stored in the MicroArray Project System (MAPS) database. MAPS identified 238 genes in common that possessed a low probability (P < 0.01) of being randomly detected as differentially expressed at the 95% confidence level. Hierarchical cluster analysis on the 238 genes revealed specific gene expression profiles that separated samples based on exposure to a particular class of compound.
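The clustering step is standard agglomerative hierarchical clustering on expression profiles. A self-contained sketch with synthetic data standing in for the treated-sample expression matrix (SciPy's average-linkage implementation; all numbers are made up):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
# Toy expression matrix: 8 samples x 20 genes, two treatment classes
# with shifted means to mimic class-specific expression profiles.
class_a = rng.normal(0.0, 1.0, size=(4, 20))
class_b = rng.normal(2.0, 1.0, size=(4, 20))
expr = np.vstack([class_a, class_b])

# Average-linkage hierarchical clustering on Euclidean distances.
dist = pdist(expr, metric="euclidean")
tree = linkage(dist, method="average")
labels = fcluster(tree, t=2, criterion="maxclust")
print(labels)  # samples from the same class should share a cluster label
```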
Active in-database processing to support ambient assisted living systems.
de Morais, Wagner O; Lundström, Jens; Wickström, Nicholas
2014-08-12
As an alternative to the existing software architectures that underpin the development of smart homes and ambient assisted living (AAL) systems, this work presents a database-centric architecture that takes advantage of active databases and in-database processing. Current platforms supporting AAL systems use database management systems (DBMSs) exclusively for data storage. Active databases employ database triggers to detect and react to events taking place inside or outside of the database. DBMSs can be extended with stored procedures and functions that enable in-database processing. This means that the data processing is integrated and performed within the DBMS. The feasibility and flexibility of the proposed approach were demonstrated with the implementation of three distinct AAL services. The active database was used to detect bed-exits and to discover common room transitions and deviations during the night. In-database machine learning methods were used to model early night behaviors. Consequently, active in-database processing avoids transferring sensitive data outside the database, and this improves performance, security and privacy. Furthermore, centralizing the computation into the DBMS facilitates code reuse, adaptation and maintenance. These are important system properties that take into account the evolving heterogeneity of users, their needs and the devices that are characteristic of smart homes and AAL systems. Therefore, DBMSs can provide capabilities to address requirements for scalability, security, privacy, dependability and personalization in applications of smart environments in healthcare.
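The active-database mechanism the authors rely on (triggers firing inside the DBMS, so sensitive events never leave the database) can be sketched even in SQLite from Python; the table, trigger, and event names below are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bed_sensor (ts TEXT, occupied INTEGER);
CREATE TABLE alerts (ts TEXT, event TEXT);

-- The trigger runs inside the database: a 0 reading (bed vacated)
-- becomes a 'bed-exit' alert with no application code involved.
CREATE TRIGGER bed_exit AFTER INSERT ON bed_sensor
WHEN NEW.occupied = 0
BEGIN
    INSERT INTO alerts VALUES (NEW.ts, 'bed-exit');
END;
""")

conn.execute("INSERT INTO bed_sensor VALUES ('2014-08-12T02:14', 1)")
conn.execute("INSERT INTO bed_sensor VALUES ('2014-08-12T03:02', 0)")
print(conn.execute("SELECT * FROM alerts").fetchall())
# -> [('2014-08-12T03:02', 'bed-exit')]
```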
Improved Information Retrieval Performance on SQL Database Using Data Adapter
NASA Astrophysics Data System (ADS)
Husni, M.; Djanali, S.; Ciptaningtyas, H. T.; Wicaksana, I. G. N. A.
2018-02-01
The NoSQL databases, short for Not Only SQL, are increasingly being used as the number of big data applications grows. Most systems still use relational databases (RDBs), but as data volumes increase each year, systems handle big data with NoSQL databases to analyze and access data more quickly. NoSQL emerged as a result of the exponential growth of the internet and the development of web applications. The query syntax in a NoSQL database differs from that of a SQL database, normally requiring code changes in the application. A data adapter allows applications to keep their SQL query syntax unchanged: it provides methods that can synchronize SQL databases with NoSQL databases, along with an interface through which applications can run SQL queries. Hence, this research applied a data adapter system to synchronize data between a MySQL database and Apache HBase using a direct-access query approach, in which the system allows the application to accept queries while the synchronization process is in progress. Tests of the data adapter showed that it can synchronize between the SQL database, MySQL, and the NoSQL database, Apache HBase. The system's memory usage ranged from 40% to 60%, and its processor usage from 10% to 90%. In these tests, the NoSQL database also outperformed the SQL database.
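The adapter pattern described can be sketched compactly. This is not the authors' implementation: the stand-ins below use SQLite for the relational side and a plain dict for the NoSQL side, and the class and method names are invented:

```python
import sqlite3

class DataAdapter:
    """Toy adapter: applications keep issuing SQL; writes are mirrored
    into a NoSQL-style key-value store (a dict standing in for HBase)."""

    def __init__(self):
        self.sql = sqlite3.connect(":memory:")
        self.sql.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        self.nosql = {}  # rowkey -> column dict

    def execute(self, query, params=()):
        cur = self.sql.execute(query, params)
        self.sql.commit()
        if query.lstrip().upper().startswith("INSERT"):
            # Synchronize: replay the new row into the NoSQL store.
            row_id, name = params
            self.nosql[f"users:{row_id}"] = {"name": name}
        return cur

adapter = DataAdapter()
adapter.execute("INSERT INTO users VALUES (?, ?)", (1, "alice"))
print(adapter.execute("SELECT * FROM users").fetchall())  # SQL side
print(adapter.nosql)                                      # mirrored copy
```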
An Examination of Selected Software Testing Tools: 1992
1992-12-01
[Front-matter and figure residue removed. Recoverable excerpts: the test management and problem-reporting tools were examined using the sample test database provided by each supplier; Metrics Manager produces a database full report and is supported by an industry database that allows users to track the impact of new methods, organizational structures, and technologies.]
Kab, Sofiane; Moisan, Frédéric; Preux, Pierre-Marie; Marin, Benoît; Elbaz, Alexis
2017-08-01
There are no estimates of the nationwide incidence of motor neuron disease (MND) in France. We used the French health insurance information system to identify incident MND cases (2012-2014), and compared incidence figures to those from three external sources. We identified incident MND cases (2012-2014) based on three data sources (riluzole claims, hospitalisation records, long-term chronic disease benefits), and computed MND incidence by age, gender, and geographic region. We used French mortality statistics, Limousin ALS registry data, and previous European studies based on administrative databases to perform external comparisons. We identified 6553 incident MND cases. After standardisation to the United States 2010 population, the age/gender-standardised incidence was 2.72/100,000 person-years (males, 3.37; females, 2.17; male:female ratio = 1.53, 95% CI = 1.46-1.61). There was no major spatial difference in MND distribution. Our data were in agreement with the French death database (standardised mortality ratio = 1.01, 95% CI = 0.96-1.06) and the Limousin ALS registry (standardised incidence ratio = 0.92, 95% CI = 0.72-1.15). Incidence estimates were in the same range as those from previous studies. We report French nationwide incidence estimates of MND. Administrative databases including hospital discharge data and riluzole claims offer an interesting approach to identifying large population-based samples of patients with MND for epidemiologic studies and surveillance.
Databases, Repositories, and Other Data Resources in Structural Biology.
Zheng, Heping; Porebski, Przemyslaw J; Grabowski, Marek; Cooper, David R; Minor, Wladek
2017-01-01
Structural biology, like many other areas of modern science, produces an enormous amount of primary, derived, and "meta" data with a high demand on data storage and manipulations. Primary data come from various steps of sample preparation, diffraction experiments, and functional studies. These data are not only used to obtain tangible results, like macromolecular structural models, but also to enrich and guide our analysis and interpretation of various biomedical problems. Herein we define several categories of data resources, (a) Archives, (b) Repositories, (c) Databases, and (d) Advanced Information Systems, that can accommodate primary, derived, or reference data. Data resources may be used either as web portals or internally by structural biology software. To be useful, each resource must be maintained, curated, as well as integrated with other resources. Ideally, the system of interconnected resources should evolve toward comprehensive "hubs", or Advanced Information Systems. Such systems, encompassing the PDB and UniProt, are indispensable not only for structural biology, but for many related fields of science. The categories of data resources described herein are applicable well beyond our usual scientific endeavors.
NASA Astrophysics Data System (ADS)
Friedman, Roy; Kermarrec, Anne-Marie; Miranda, Hugo; Rodrigues, Luís
Gossip-based networking has emerged as a viable approach to disseminate information reliably and efficiently in large-scale systems. Initially introduced for database replication [222], the approach now extends much further. For example, it has been applied to data aggregation [415], peer sampling [416] and publish/subscribe systems [845]. Gossip-based protocols rely on a periodic peer-wise exchange of information in wired systems. By changing the way each peer is selected for the gossip communication, and which data are exchanged and processed [451], gossip systems can be used to perform different distributed tasks, such as, among others: overlay maintenance, distributed computation, and information dissemination (a collection of papers on gossip can be found in [451]). In a wired setting, the peer sampling service, allowing for a random or specific peer selection, is often provided as an independent service, able to operate independently from other gossip-based services [416].
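A toy simulation makes the mechanism concrete: each round, every informed node pushes the update to a few peers returned by a uniform peer-sampling service. All parameters below are illustrative:

```python
import random

def gossip_rounds(n_nodes=100, fanout=3, seed=42):
    """Push gossip: each informed node forwards the rumor to `fanout`
    peers drawn by a uniform random peer-sampling service."""
    random.seed(seed)
    informed = {0}  # node 0 starts with the update
    rounds = 0
    while len(informed) < n_nodes:
        rounds += 1
        for node in list(informed):
            peers = random.sample(range(n_nodes), fanout)
            informed.update(peers)
    return rounds

# Coverage typically grows exponentially, so full dissemination
# takes on the order of log(n_nodes) rounds.
print(f"all nodes informed after {gossip_rounds()} rounds")
```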
A Dynamic Time Warping Approach to Real-Time Activity Recognition for Food Preparation
NASA Astrophysics Data System (ADS)
Pham, Cuong; Plötz, Thomas; Olivier, Patrick
We present a dynamic time warping based activity recognition system for the analysis of low-level food preparation activities. Accelerometers embedded into kitchen utensils provide continuous sensor data streams while people are using them for cooking. The recognition framework analyzes frames of contiguous sensor readings in real-time with low latency. It thereby adapts to the idiosyncrasies of utensil use by automatically maintaining a template database. We demonstrate the effectiveness of the classification approach by a number of real-world practical experiments on a publicly available dataset. The adaptive system shows superior performance compared to a static recognizer. Furthermore, we demonstrate the generalization capabilities of the system by gradually reducing the amount of training samples. The system achieves excellent classification results even if only a small number of training samples is available, which is especially relevant for real-world scenarios.
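For readers unfamiliar with the core primitive, here is a minimal dynamic time warping distance in Python (the classic quadratic dynamic program, not the authors' optimized real-time variant):

```python
import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) dynamic time warping on 1-D sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

# A time-shifted copy of a movement-like signal stays close under DTW,
# which is why DTW suits accelerometer frames of utensil use.
t = np.linspace(0, 2 * np.pi, 50)
print(dtw_distance(np.sin(t), np.sin(t + 0.5)))  # small
print(dtw_distance(np.sin(t), np.cos(3 * t)))    # larger
```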
Health information and communication system for emergency management in a developing country, Iran.
Seyedin, Seyed Hesam; Jamali, Hamid R
2011-08-01
Disasters are fortunately rare occurrences. However, accurate and timely information and communication are vital to adequately prepare individual health organizations for such events. The current article investigates the health-related communication and information systems for emergency management in Iran. A mixed qualitative and quantitative methodology was used in this study. A sample of 230 health service managers was surveyed using a questionnaire, and 65 semi-structured interviews were also conducted with public health and therapeutic affairs managers who were responsible for emergency management. A range of problems was identified, including fragmentation of information, lack of local databases, lack of a clear information strategy and lack of a formal system for logging disaster-related information at the regional or local level. Recommendations were made for improving the national emergency management information and communication system. The findings have implications for health organizations in developing and developed countries, especially in the Middle East. Creating disaster-related information databases, creating protocols and standards, setting an information strategy, training staff and hosting a center for the information system in the Ministry of Health to centrally manage and share the data could improve the current information system.
Integration of NASA/GSFC and USGS Rock Magnetic Databases.
NASA Astrophysics Data System (ADS)
Nazarova, K. A.; Glen, J. M.
2004-05-01
A global Magnetic Petrology Database (MPDB) was developed and continues to be updated at NASA/Goddard Space Flight Center. The purpose of this database is to provide the geomagnetic community with a comprehensive and user-friendly method of accessing magnetic petrology data via the Internet for a more realistic interpretation of satellite (as well as aeromagnetic and ground) lithospheric magnetic anomalies. The MPDB contains data on rocks from localities around the world (about 19,000 samples), including the Ukrainian and Baltic Shields, Kamchatka, Iceland, the Ural Mountains, etc. The MPDB is designed, managed and presented on the web as a research-oriented database. Several database applications have been specifically developed for data manipulation and analysis of the MPDB. The geophysics unit at the USGS in Menlo Park has over 17,000 rock-property data, largely from sites within the western U.S. This database contains rock-density and rock-magnetic parameters collected for use in gravity and magnetic field modeling, and paleomagnetic studies. Most of these data were taken from surface outcrops, and together they span a broad range of rock types. Measurements were made either in-situ at the outcrop, or in the laboratory on hand samples and paleomagnetic cores acquired in the field. The USGS and NASA/GSFC data will be integrated as part of an effort to provide public access to a single, uniformly maintained database. Due to the large number of data and the very large area sampled, the database can yield rock-property statistics on a broad range of rock types; it is thus applicable to study areas beyond the geographic scope of the database. The intent of this effort is to provide incentive for others to further contribute to the database, and a tool with which the geophysical community can entertain studies formerly precluded.
Vegetation database for land-cover mapping, Clark and Lincoln Counties, Nevada
Charlet, David A.; Damar, Nancy A.; Leary, Patrick J.
2014-01-01
Floristic and other vegetation data were collected at 3,175 sample sites to support land-cover mapping projects in Clark and Lincoln Counties, Nevada, from 2007 to 2013. Data were collected at sample sites that were selected to fulfill mapping priorities by one of two different plot sampling approaches. Samples were described at the stand level and classified into the National Vegetation Classification hierarchy at the alliance level and above. The vegetation database is presented in geospatial and tabular formats.
SVM-Based Synthetic Fingerprint Discrimination Algorithm and Quantitative Optimization Strategy
Chen, Suhang; Chang, Sheng; Huang, Qijun; He, Jin; Wang, Hao; Huang, Qiangui
2014-01-01
Synthetic fingerprints are a potential threat to automatic fingerprint identification systems (AFISs). In this paper, we propose an algorithm to discriminate synthetic fingerprints from real ones. First, four typical characteristic factors—the ridge distance features, global gray features, frequency feature and Harris Corner feature—are extracted. Then, a support vector machine (SVM) is used to distinguish synthetic fingerprints from real fingerprints. The experiments demonstrate that this method can achieve a recognition accuracy rate of over 98% for two discrete synthetic fingerprint databases as well as a mixed database. Furthermore, a performance factor that can evaluate the SVM's accuracy and efficiency is presented, and a quantitative optimization strategy is established for the first time. After the optimization of our synthetic fingerprint discrimination task, the polynomial kernel with a training sample proportion of 5% is the optimized value when the minimum accuracy requirement is 95%. The radial basis function (RBF) kernel with a training sample proportion of 15% is a more suitable choice when the minimum accuracy requirement is 98%. PMID:25347063
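The optimization the authors describe trades kernel choice against training-set size. A scikit-learn sketch of that comparison on synthetic four-feature data (the features and proportions merely echo the setup; none of this is the paper's code or data):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Stand-in for the four fingerprint feature factors: a synthetic
# 4-feature binary problem (real vs. synthetic prints).
X, y = make_classification(n_samples=2000, n_features=4, n_informative=4,
                           n_redundant=0, random_state=0)

# Mirror the reported trade-off: kernel choice vs. training proportion.
for kernel, train_frac in [("poly", 0.05), ("rbf", 0.15)]:
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=train_frac, random_state=0)
    clf = SVC(kernel=kernel, degree=3, gamma="scale").fit(X_tr, y_tr)
    print(f"{kernel:4s} @ {train_frac:.0%} training: "
          f"accuracy {clf.score(X_te, y_te):.3f}")
```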
MPA Portable: A Stand-Alone Software Package for Analyzing Metaproteome Samples on the Go.
Muth, Thilo; Kohrs, Fabian; Heyer, Robert; Benndorf, Dirk; Rapp, Erdmann; Reichl, Udo; Martens, Lennart; Renard, Bernhard Y
2018-01-02
Metaproteomics, the mass spectrometry-based analysis of proteins from multispecies samples, faces severe challenges concerning data analysis and results interpretation. To overcome these shortcomings, we here introduce the MetaProteomeAnalyzer (MPA) Portable software. In contrast to the original server-based MPA application, this newly developed tool no longer requires computational expertise for installation and is now independent of any relational database system. In addition, MPA Portable now supports state-of-the-art database search engines and a convenient command line interface for high-performance data processing tasks. While search engine results can easily be combined to increase the protein identification yield, an additional two-step workflow is implemented to provide sufficient analysis resolution for further postprocessing steps, such as protein grouping as well as taxonomic and functional annotation. Our new application has been developed with a focus on intuitive usability, adherence to data standards, and adaptation to Web-based workflow platforms. The open source software package can be found at https://github.com/compomics/meta-proteome-analyzer.
Saokaew, Surasak; Sugimoto, Takashi; Kamae, Isao; Pratoomsoot, Chayanin; Chaiyakunapruk, Nathorn
2015-01-01
Health technology assessment (HTA) has been continuously used for value-based healthcare decisions over the last decade. Healthcare databases represent an important source of information for HTA, which has seen a surge in use in Western countries. Although HTA agencies have been established in the Asia-Pacific region, application and understanding of healthcare databases for HTA is rather limited. Thus, we reviewed existing databases to assess their potential for HTA in Thailand, where HTA has been used officially, and Japan, where HTA is going to be officially introduced. Existing healthcare databases in Thailand and Japan were compiled and reviewed. Databases' characteristics, e.g. name of database, host, scope/objective, time/sample size, design, data collection method, population/sample, and variables, were described. Databases were assessed for their potential HTA use in terms of safety/efficacy/effectiveness, social/ethical, organization/professional, economic, and epidemiological domains. The request route for each database was also provided. Forty databases, 20 from Thailand and 20 from Japan, were included. These comprised national censuses, surveys, registries, administrative data, and claims databases. All databases were potentially usable for epidemiological studies. In addition, data on mortality, morbidity, disability, adverse events, quality of life, service/technology utilization, length of stay, and economics were also found in some databases. However, access to patient-level data was limited, since information about the databases was not available from public sources. Our findings show that existing databases provide valuable information for HTA research, with limitations on accessibility. Mutual dialogue on healthcare database development and usage for HTA in the Asia-Pacific region is needed.
Pearson, Daniel K.; Bumgarner, Johnathan R.; Houston, Natalie A.; Stanton, Gregory P.; Teeple, Andrew; Thomas, Jonathan V.
2012-01-01
The U.S. Geological Survey, in cooperation with Middle Pecos Groundwater Conservation District, Pecos County, City of Fort Stockton, Brewster County, and Pecos County Water Control and Improvement District No. 1, compiled groundwater, surface-water, water-quality, geophysical, and geologic data for site locations in the Pecos County region, Texas, and developed a geodatabase to facilitate use of this information. Data were compiled for an approximately 4,700 square mile area of the Pecos County region, Texas. The geodatabase contains data from 8,242 sampling locations; it was designed to organize and store field-collected geochemical and geophysical data, as well as digital database resources from the U.S. Geological Survey, Middle Pecos Groundwater Conservation District, Texas Water Development Board, Texas Commission on Environmental Quality, and numerous other State and local databases. The geodatabase combines these disparate database resources into a simple data model. Site locations are geospatially enabled and stored in a geodatabase feature class for cartographic visualization and spatial analysis within a Geographic Information System. The sampling locations are related to hydrogeologic information through the use of geodatabase relationship classes. The geodatabase relationship classes provide the ability to perform complex spatial and data-driven queries to explore data stored in the geodatabase.
78 FR 2363 - Notification of Deletion of a System of Records; Automated Trust Funds Database
Federal Register 2010, 2011, 2012, 2013, 2014
2013-01-11
... Database AGENCY: Animal and Plant Health Inspection Service, USDA. ACTION: Notice of deletion of a system... establishing the Automated Trust Funds (ATF) database system of records. The Federal Information Security... Integrity Act of 1982, Public Law 97-255, provided authority for the system. The ATF database has been...
NASA Astrophysics Data System (ADS)
Poppe, Sam; Barette, Florian; Smets, Benoît; Benbakkar, Mhammed; Kervyn, Matthieu
2016-04-01
The Virunga Volcanic Province (VVP) is situated within the western branch of the East African Rift. The geochemistry and petrology of its volcanic products have been studied extensively, but in a fragmented manner. They represent a unique collection of silica-undersaturated, ultra-alkaline and ultra-potassic compositions, displaying marked geochemical variations over the area occupied by the VVP. We present a novel spatially explicit database of existing whole-rock geochemical analyses of the VVP volcanics, compiled from international publications, (post-)colonial scientific reports and PhD theses. In the database, a total of 703 geochemical analyses of whole-rock samples collected from the 1950s until recently have been characterised with a geographical location, eruption source location, analytical results and uncertainty estimates for each of these categories. Comparative box plots and Kruskal-Wallis H tests on subsets of analyses with contrasting ages or analytical methods suggest that the overall database accuracy is consistent. We demonstrate how statistical techniques such as Principal Component Analysis (PCA) and subsequent cluster analysis allow the identification of clusters of samples with similar major-element compositions. The spatial patterns represented by the contrasting clusters show that both historically active volcanoes represent compositional clusters that can be identified by their contrasting silica and alkali contents. Furthermore, two sample clusters are interpreted to represent the most primitive, deep magma source within the VVP, different from the shallow magma reservoirs that feed the eight dominant large volcanoes. The samples from these two clusters systematically originate from locations which 1. are distal compared to the eight large volcanoes and 2. mostly coincide with the surface expressions of rift faults or NE-SW-oriented inherited Precambrian structures which were reactivated during rifting. The lava from the Mugogo eruption of 1957 belongs to these primitive clusters and is the only one known to have erupted outside the current rift valley in historical times. We thus infer a spatially distributed hazard of vent opening in addition to the susceptibility associated with the main Virunga edifices. This study suggests that the statistical analysis of such a geochemical database may help to understand complex volcanic plumbing systems and the spatial distribution of volcanic hazards in active and poorly known volcanic areas such as the Virunga Volcanic Province.
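As a sketch of the PCA-plus-clustering workflow applied to major-element analyses (synthetic numbers standing in for oxide concentrations; not the study's data):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Toy stand-in for major-element whole-rock analyses (columns might be
# SiO2, TiO2, Al2O3, MgO, K2O): two synthetic compositional groups.
group1 = rng.normal([45, 2.5, 14, 8, 3], 1.0, size=(40, 5))  # primitive-like
group2 = rng.normal([52, 1.5, 16, 4, 5], 1.0, size=(40, 5))  # evolved-like
X = StandardScaler().fit_transform(np.vstack([group1, group2]))

# Reduce to two principal components, then cluster in the reduced space.
scores = PCA(n_components=2).fit_transform(X)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scores)
print(labels)  # the two synthetic compositional clusters separate cleanly
```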
The MAR databases: development and implementation of databases specific for marine metagenomics
Klemetsen, Terje; Raknes, Inge A; Fu, Juan; Agafonov, Alexander; Balasundaram, Sudhagar V; Tartari, Giacomo; Robertsen, Espen
2018-01-01
Abstract We introduce the marine databases MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database of completely sequenced marine prokaryotic genomes, which represents a marine prokaryote reference genome database, MarDB includes all incompletely sequenced marine prokaryotic genomes regardless of their level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields, including attributes for sampling, sequencing, assembly and annotation in addition to organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets visitors browse, filter and search the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/. PMID:29106641
Towards the Interoperability of Web, Database, and Mass Storage Technologies for Petabyte Archives
NASA Technical Reports Server (NTRS)
Moore, Reagan; Marciano, Richard; Wan, Michael; Sherwin, Tom; Frost, Richard
1996-01-01
At the San Diego Supercomputer Center, a massive data analysis system (MDAS) is being developed to support data-intensive applications that manipulate terabyte-sized data sets. The objective is to support scientific application access to data whether it is located at a Web site, stored as an object in a database, and/or stored in an archival storage system. We are developing a suite of demonstration programs which illustrate how Web, database (DBMS), and archival storage (mass storage) technologies can be integrated. An application presentation interface is being designed that integrates data access to all of these sources. We have developed a data movement interface between the Illustra object-relational database and the NSL UniTree archival storage system running in a production mode at the San Diego Supercomputer Center. With this interface, an Illustra client can transparently access data on UniTree under the control of the Illustra DBMS server. The current implementation is based on the creation of a new DBMS storage manager class, and a set of library functions that allow the manipulation and migration of data stored as Illustra 'large objects'. We have extended this interface to allow a Web client application to control data movement between its local disk, the Web server, the DBMS Illustra server, and the UniTree mass storage environment. This paper describes some of the current approaches to successfully integrating these technologies. This framework is measured against a representative sample of environmental data extracted from the San Diego Bay Environmental Data Repository. Practical lessons are drawn and critical research areas are highlighted.
New taxonomy and old collections: integrating DNA barcoding into the collection curation process.
Puillandre, N; Bouchet, P; Boisselier-Dubayle, M-C; Brisset, J; Buge, B; Castelin, M; Chagnoux, S; Christophe, T; Corbari, L; Lambourdière, J; Lozouet, P; Marani, G; Rivasseau, A; Silva, N; Terryn, Y; Tillier, S; Utge, J; Samadi, S
2012-05-01
Because they house large biodiversity collections and are also research centres with sequencing facilities, natural history museums are well placed to develop DNA barcoding best practices. The main difficulty is generally the vouchering system: it must ensure that all data produced remain attached to the corresponding specimen, from the field to publication in articles and online databases. The Museum National d'Histoire Naturelle in Paris is one of the leading laboratories in the Marine Barcode of Life (MarBOL) project, which was used as a pilot programme to include barcode collections for marine molluscs and crustaceans. The system is based on two relational databases. The first one classically records the data (locality and identification) attached to the specimens. In the second one, tissue-clippings, DNA extractions (both preserved in 2D barcode tubes) and PCR data (including primers) are linked to the corresponding specimen. All the steps of the process [sampling event, specimen identification, molecular processing, data submission to Barcode Of Life Database (BOLD) and GenBank] are thus linked together. Furthermore, we have developed several web-based tools to automatically upload data into the system, control the quality of the sequences produced and facilitate the submission to online databases. This work is the result of a joint effort from several teams in the Museum National d'Histoire Naturelle (MNHN), but also from a collaborative network of taxonomists and molecular systematists outside the museum, resulting in the vouchering so far of ∼41,000 sequences and the production of ∼11,000 COI sequences. © 2012 Blackwell Publishing Ltd.
Lavine, Barry K; White, Collin G; Allen, Matthew D; Weakley, Andrew
2017-03-01
Multilayered automotive paint fragments, which are among the most complex materials encountered in the forensic science laboratory, provide crucial links in criminal investigations and prosecutions. To determine the origin of these paint fragments, forensic automotive paint examiners have turned to the paint data query (PDQ) database, which allows the forensic examiner to compare the layer sequence and the color, texture, and composition of the sample to paint systems of the original equipment manufacturer (OEM). However, modern automotive paints have a thin color coat, and on a microscopic fragment this layer is often too thin to yield accurate chemical and topcoat color information. A search engine has been developed for the infrared (IR) spectral libraries of the PDQ database in an effort to improve discrimination capability and permit quantification of discrimination power for OEM automotive paint comparisons. The similarity of IR spectra of the corresponding layers of various records for original finishes in the PDQ database often results in poor discrimination when using commercial library search algorithms. A pattern recognition approach employing pre-filters and a cross-correlation library search algorithm that performs both a forward and a backward search has been used to significantly improve the discrimination of IR spectra in the PDQ database and thus improve the accuracy of the search. This improvement permits inter-comparison of OEM automotive paint layer systems using the IR spectra alone. Such information can serve to quantify the discrimination power of the original automotive paint encountered in casework and to further efforts to succinctly communicate trace evidence to the courts.
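A cross-correlation library search can be illustrated in a few lines: score a query spectrum against each library entry by normalized correlation and rank the hits. The spectra below are synthetic Gaussians, not PDQ data, and the scoring omits the paper's pre-filters and forward/backward refinements:

```python
import numpy as np

def correlation_score(query, reference):
    """Normalized cross-correlation at zero lag between two spectra
    sampled on the same wavenumber grid (1.0 = identical shape)."""
    q = (query - query.mean()) / query.std()
    r = (reference - reference.mean()) / reference.std()
    return float(np.dot(q, r) / len(q))

# Toy spectral library: Gaussian 'absorption bands' on a common grid.
wn = np.linspace(600, 1800, 400)
band = lambda center, width: np.exp(-((wn - center) / width) ** 2)
library = {"paint_A": band(1100, 30) + band(1500, 20),
           "paint_B": band(900, 25) + band(1600, 40)}

# Noisy query resembling paint_A; rank library entries by score.
query = (band(1100, 30) + band(1500, 20)
         + np.random.default_rng(2).normal(0, 0.02, wn.size))
hits = sorted(library, key=lambda k: correlation_score(query, library[k]),
              reverse=True)
print([(k, round(correlation_score(query, library[k]), 3)) for k in hits])
```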
[The Brazilian Hospital Information System and the acute myocardial infarction hospital care].
Escosteguy, Claudia Caminha; Portela, Margareth Crisóstomo; Medronho, Roberto de Andrade; de Vasconcellos, Maurício Teixeira Leite
2002-08-01
To analyze the applicability of the Brazilian Unified Health System's national hospital database for evaluating the quality of acute myocardial infarction (AMI) hospital care, 1,936 hospital admission forms with AMI as the primary diagnosis in the municipality of Rio de Janeiro, Brazil, in 1997 were evaluated. Data were collected from the national hospital database. A stratified random sample of 391 medical records was also evaluated. AMI diagnosis agreement followed criteria from the literature. Variable accuracy analysis was performed using the kappa agreement index. The quality of the AMI diagnosis registered in hospital admission forms was satisfactory according to the gold standard of the literature. In general, the accuracy of the demographic (sex, age group), process (medical procedures and interventions), and outcome (hospital death) variables was satisfactory. The accuracy of the demographic and outcome variables was higher than that of the process variables. Underreporting of secondary diagnoses in the forms was high and was the main limiting factor. Given the study findings and the widespread availability of the national hospital database, its use as an instrument for evaluating the quality of AMI medical care is pertinent.
[The future of clinical laboratory database management system].
Kambe, M; Imidy, D; Matsubara, A; Sugimoto, Y
1999-09-01
To assess the present status of clinical laboratory database management systems, the difference between the Clinical Laboratory Information System and the Clinical Laboratory System is explained in this study. Although three kinds of database management systems (DBMS) are described, namely the relational, tree and network models, the relational model was found to be the best DBMS for the clinical laboratory database, based on our experience in developing several clinical laboratory expert systems. As a future clinical laboratory database management system, an IC card system connected to an automatic chemical analyzer is proposed for personal health data management, and a microscope/video system is proposed for dynamic data management of leukocytes or bacteria.
Machado, Helena; Silva, Susana
2014-01-01
The creation and expansion of forensic DNA databases might involve potential threats to the protection of a range of human rights. At the same time, such databases have social benefits. Based on data collected through an online questionnaire applied to 628 individuals in Portugal, this paper aims to analyze citizens' willingness to voluntarily donate a sample for profiling and inclusion in the National Forensic DNA Database and the views underpinning such a decision. Nearly one-quarter of the respondents would answer 'no', and this negative response increased significantly with age and education. The overriding willingness to accept the inclusion of the individual genetic profile indicates an acknowledgement of the investigative potential of forensic DNA technologies and a relegation of civil liberties and human rights to the background, owing to the perceived benefits of protecting both society and the individual from crime. This rationale is mostly expressed by the idea that all citizens should contribute to the expansion of the National Forensic DNA Database, for reasons that range from the more abstract assumption that donating a sample for profiling would help fight crime to the more concrete suggestion that everyone (criminals and non-criminals) should be in the database. The concerns about the risks of accepting the donation of a sample for genetic profiling and inclusion in the National Forensic DNA Database mostly relate to lack of control and insufficient or unclear regulations concerning the safeguarding of individuals' data and the supervision of access to and uses of genetic data. By providing an empirically grounded understanding of attitudes regarding willingness to voluntarily donate a sample for profiling and inclusion in a National Forensic DNA Database, this study also considers the citizens' perceived benefits and risks of operating forensic DNA databases. These collective views might be useful for the formation of international common ethical standards for the development and governance of DNA databases in a framework in which the citizens' perspectives are taken into consideration. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Hudson, Lawrence N; Newbold, Tim; Contu, Sara; Hill, Samantha L L; Lysenko, Igor; De Palma, Adriana; Phillips, Helen R P; Senior, Rebecca A; Bennett, Dominic J; Booth, Hollie; Choimes, Argyrios; Correia, David L P; Day, Julie; Echeverría-Londoño, Susy; Garon, Morgan; Harrison, Michelle L K; Ingram, Daniel J; Jung, Martin; Kemp, Victoria; Kirkpatrick, Lucinda; Martin, Callum D; Pan, Yuan; White, Hannah J; Aben, Job; Abrahamczyk, Stefan; Adum, Gilbert B; Aguilar-Barquero, Virginia; Aizen, Marcelo A; Ancrenaz, Marc; Arbeláez-Cortés, Enrique; Armbrecht, Inge; Azhar, Badrul; Azpiroz, Adrián B; Baeten, Lander; Báldi, András; Banks, John E; Barlow, Jos; Batáry, Péter; Bates, Adam J; Bayne, Erin M; Beja, Pedro; Berg, Åke; Berry, Nicholas J; Bicknell, Jake E; Bihn, Jochen H; Böhning-Gaese, Katrin; Boekhout, Teun; Boutin, Céline; Bouyer, Jérémy; Brearley, Francis Q; Brito, Isabel; Brunet, Jörg; Buczkowski, Grzegorz; Buscardo, Erika; Cabra-García, Jimmy; Calviño-Cancela, María; Cameron, Sydney A; Cancello, Eliana M; Carrijo, Tiago F; Carvalho, Anelena L; Castro, Helena; Castro-Luna, Alejandro A; Cerda, Rolando; Cerezo, Alexis; Chauvat, Matthieu; Clarke, Frank M; Cleary, Daniel F R; Connop, Stuart P; D'Aniello, Biagio; da Silva, Pedro Giovâni; Darvill, Ben; Dauber, Jens; Dejean, Alain; Diekötter, Tim; Dominguez-Haydar, Yamileth; Dormann, Carsten F; Dumont, Bertrand; Dures, Simon G; Dynesius, Mats; Edenius, Lars; Elek, Zoltán; Entling, Martin H; Farwig, Nina; Fayle, Tom M; Felicioli, Antonio; Felton, Annika M; Ficetola, Gentile F; Filgueiras, Bruno K C; Fonte, Steven J; Fraser, Lauchlan H; Fukuda, Daisuke; Furlani, Dario; Ganzhorn, Jörg U; Garden, Jenni G; Gheler-Costa, Carla; Giordani, Paolo; Giordano, Simonetta; Gottschalk, Marco S; Goulson, Dave; Gove, Aaron D; Grogan, James; Hanley, Mick E; Hanson, Thor; Hashim, Nor R; Hawes, Joseph E; Hébert, Christian; Helden, Alvin J; Henden, John-André; Hernández, Lionel; Herzog, Felix; Higuera-Diaz, Diego; Hilje, Branko; Horgan, Finbarr G; Horváth, Roland; Hylander, Kristoffer; Isaacs-Cubides, Paola; Ishitani, Masahiro; Jacobs, Carmen T; Jaramillo, Víctor J; Jauker, Birgit; Jonsell, Mats; Jung, Thomas S; Kapoor, Vena; Kati, Vassiliki; Katovai, Eric; Kessler, Michael; Knop, Eva; Kolb, Annette; Kőrösi, Ádám; Lachat, Thibault; Lantschner, Victoria; Le Féon, Violette; LeBuhn, Gretchen; Légaré, Jean-Philippe; Letcher, Susan G; Littlewood, Nick A; López-Quintero, Carlos A; Louhaichi, Mounir; Lövei, Gabor L; Lucas-Borja, Manuel Esteban; Luja, Victor H; Maeto, Kaoru; Magura, Tibor; Mallari, Neil Aldrin; Marin-Spiotta, Erika; Marshall, E J P; Martínez, Eliana; Mayfield, Margaret M; Mikusinski, Grzegorz; Milder, Jeffrey C; Miller, James R; Morales, Carolina L; Muchane, Mary N; Muchane, Muchai; Naidoo, Robin; Nakamura, Akihiro; Naoe, Shoji; Nates-Parra, Guiomar; Navarrete Gutierrez, Dario A; Neuschulz, Eike L; Noreika, Norbertas; Norfolk, Olivia; Noriega, Jorge Ari; Nöske, Nicole M; O'Dea, Niall; Oduro, William; Ofori-Boateng, Caleb; Oke, Chris O; Osgathorpe, Lynne M; Paritsis, Juan; Parra-H, Alejandro; Pelegrin, Nicolás; Peres, Carlos A; Persson, Anna S; Petanidou, Theodora; Phalan, Ben; Philips, T Keith; Poveda, Katja; Power, Eileen F; Presley, Steven J; Proença, Vânia; Quaranta, Marino; Quintero, Carolina; Redpath-Downing, Nicola A; Reid, J Leighton; Reis, Yana T; Ribeiro, Danilo B; Richardson, Barbara A; Richardson, Michael J; Robles, Carolina A; Römbke, Jörg; Romero-Duque, Luz Piedad; Rosselli, Loreta; Rossiter, Stephen J; Roulston, T'ai H; Rousseau, Laurent; 
Sadler, Jonathan P; Sáfián, Szabolcs; Saldaña-Vázquez, Romeo A; Samnegård, Ulrika; Schüepp, Christof; Schweiger, Oliver; Sedlock, Jodi L; Shahabuddin, Ghazala; Sheil, Douglas; Silva, Fernando A B; Slade, Eleanor M; Smith-Pardo, Allan H; Sodhi, Navjot S; Somarriba, Eduardo J; Sosa, Ramón A; Stout, Jane C; Struebig, Matthew J; Sung, Yik-Hei; Threlfall, Caragh G; Tonietto, Rebecca; Tóthmérész, Béla; Tscharntke, Teja; Turner, Edgar C; Tylianakis, Jason M; Vanbergen, Adam J; Vassilev, Kiril; Verboven, Hans A F; Vergara, Carlos H; Vergara, Pablo M; Verhulst, Jort; Walker, Tony R; Wang, Yanping; Watling, James I; Wells, Konstans; Williams, Christopher D; Willig, Michael R; Woinarski, John C Z; Wolf, Jan H D; Woodcock, Ben A; Yu, Douglas W; Zaitsev, Andrey S; Collen, Ben; Ewers, Rob M; Mace, Georgina M; Purves, Drew W; Scharlemann, Jörn P W; Purvis, Andy
2014-01-01
Biodiversity continues to decline in the face of increasing anthropogenic pressures such as habitat destruction, exploitation, pollution and introduction of alien species. Existing global databases of species’ threat status or population time series are dominated by charismatic species. The collation of datasets with broad taxonomic and biogeographic extents, and that support computation of a range of biodiversity indicators, is necessary to enable better understanding of historical declines and to project – and avert – future declines. We describe and assess a new database of more than 1.6 million samples from 78 countries representing over 28,000 species, collated from existing spatial comparisons of local-scale biodiversity exposed to different intensities and types of anthropogenic pressures, from terrestrial sites around the world. The database contains measurements taken in 208 (of 814) ecoregions, 13 (of 14) biomes, 25 (of 35) biodiversity hotspots and 16 (of 17) megadiverse countries. The database contains more than 1% of the total number of all species described, and more than 1% of the described species within many taxonomic groups – including flowering plants, gymnosperms, birds, mammals, reptiles, amphibians, beetles, lepidopterans and hymenopterans. The dataset, which is still being added to, is therefore already considerably larger and more representative than those used by previous quantitative models of biodiversity trends and responses. The database is being assembled as part of the PREDICTS project (Projecting Responses of Ecological Diversity In Changing Terrestrial Systems – http://www.predicts.org.uk). We make site-level summary data available alongside this article. The full database will be publicly available in 2015. PMID:25558364
Pedersen, Mona K; Nielsen, Gunnar L; Uhrenfeldt, Lisbeth; Rasmussen, Ole S; Lundbye-Christensen, Søren
2017-08-01
To describe the construction of the Older Person at Risk Assessment (OPRA) database, the ability to link this database with existing data sources obtained from Danish nationwide population-based registries and to discuss its research potential for the analyses of risk factors associated with 30-day hospital readmission. We reviewed Danish nationwide registries to obtain information on demographic and social determinants as well as information on health and health care use in a population of hospitalised older people. The sample included all people aged 65+ years discharged from Danish public hospitals in the period from 1 January 2007 to 30 September 2010. We used personal identifiers to link and integrate the data from all events of interest with the outcome measures in the OPRA database. The database contained records of the patients, admissions and variables of interest. The cohort included 1,267,752 admissions for 479,854 unique people. The rate of 30-day all-cause acute readmission was 18.9% (n = 239,077) and the overall 30-day mortality was 5.0% (n = 63,116). The OPRA database provides the possibility of linking data on health and life events in a population of people moving into retirement and ageing. Construction of the database makes it possible to outline individual life and health trajectories over time, transcending organisational boundaries within health care systems. The OPRA database is multi-component and multi-disciplinary in orientation and has been prepared to be used in a wide range of subgroup analyses, including different outcome measures and statistical methods.
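The core of such a design is deterministic linkage on a personal identifier. The following Python/pandas fragment is a minimal sketch of that idea, flagging 30-day readmissions; the column names and toy rows are assumptions for illustration, not the OPRA schema.

```python
# Minimal sketch: join admission events to person-level registry attributes on
# a personal identifier, then flag 30-day readmissions. Column names (cpr_id,
# admit_date, discharge_date) are invented, not the OPRA schema.
import pandas as pd

admissions = pd.DataFrame({
    "cpr_id": ["A1", "A1", "B2"],
    "admit_date": pd.to_datetime(["2007-03-01", "2007-03-20", "2008-06-10"]),
    "discharge_date": pd.to_datetime(["2007-03-05", "2007-03-25", "2008-06-15"]),
})
demographics = pd.DataFrame({
    "cpr_id": ["A1", "B2"],
    "birth_year": [1938, 1941],
    "sex": ["F", "M"],
})

linked = admissions.sort_values(["cpr_id", "admit_date"]).merge(demographics, on="cpr_id")
# Compare each admission with the same person's previous discharge.
prev_discharge = linked.groupby("cpr_id")["discharge_date"].shift()
linked["readmit_30d"] = (linked["admit_date"] - prev_discharge).dt.days <= 30
print(linked)
```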
Rattner, B.A.; Pearson, J.L.; Golden, N.H.; Cohen, J.B.; Erwin, R.M.; Ottinger, M.A.
2000-01-01
In order to examine the condition of biota in Atlantic coast estuaries, a "Contaminant Exposure and Effects--Terrestrial Vertebrates" database (CEE-TV) has been compiled through computerized search of published literature, review of existing databases, and solicitation of unpublished reports from conservation agencies, private groups, and universities. Summary information has been entered into the database, including species, collection date (1965-present), site coordinates, estuary name, hydrologic unit catalogue code, sample matrix, contaminant concentrations, biomarker and bioindicator responses, and reference source, utilizing a 98-field character and numeric format. Currently, the CEE-TV database contains 3699 georeferenced records representing 190 vertebrate species and >145,000 individuals residing in estuaries from Maine through Florida. This relational database can be directly queried or imported into a Geographic Information System to examine spatial patterns, identify data gaps and areas of concern, generate hypotheses, and focus ecotoxicological field assessments. Information on birds made up the vast majority (83%) of the database, with only a modicum of data on amphibians. Of the >75,000 chemical compounds in commerce, only 118 commonly measured environmental contaminants were quantified in tissues of terrestrial vertebrates. There were no CEE-TV data records in 15 of the 67 estuaries located along the Atlantic coast and Florida Gulf coast. The CEE-TV database has a number of potential applications, including focusing biomonitoring efforts to generate critically needed ecotoxicological data in the numerous "gaps" along the coast, reducing uncertainty about contaminant risk, identifying areas for mitigation, restoration or special management, and ranking ecological conditions of estuaries.
NASA Technical Reports Server (NTRS)
Murray, ShaTerea R.
2004-01-01
This summer I had the opportunity to work in the Environmental Management Office (EMO) under the Chemical Sampling and Analysis Team, or CS&AT. This team's mission is to support Glenn Research Center (GRC) and EMO by providing chemical sampling and analysis services and expert consulting. Services include sampling and chemical analysis of water, soil, fuels, oils, paint, insulation materials, etc. One of this team's major projects is the Drinking Water Project, which covers Glenn's water coolers and ten percent of its sinks every two years. For the past two summers an intern had been putting together a database for this team to record the tests they had performed. She had successfully created a database but hadn't worked out all the quirks. So this summer William Wilder (an intern from Cleveland State University) and I worked together to perfect her database. We began by finding out exactly what every member of the team thought about the database and what, if anything, they would change. After collecting this data we both took some courses in Microsoft Access in order to fix the problems. Next we looked at exactly how the database worked from the outside inward. Then we began trying to change the database, but we quickly found out that this would be virtually impossible.
CNS sites cooperate to detect duplicate subjects with a clinical trial subject registry.
Shiovitz, Thomas M; Wilcox, Charles S; Gevorgyan, Lilit; Shawkat, Adnan
2013-02-01
To report the results of the first 1,132 subjects in a pilot project where local central nervous system trial sites collaborated in the use of a subject database to identify potential duplicate subjects. Central nervous system sites in Los Angeles and Orange County, California, were contacted by the lead author to seek participation in the project. CTSdatabase, a central nervous system-focused trial subject registry, was utilized to track potential subjects at pre-screen. Subjects signed an institutional review board-approved authorization prior to participation, and site staff entered their identifiers by accessing a website. Sites were prompted to communicate with each other or with the database administrator when a match occurred between a newly entered subject and a subject already in the database. Between October 30, 2011, and August 31, 2012, 1,132 subjects were entered at nine central nervous system sites. Subjects continue to be entered, and more sites are anticipated to begin participation by the time of publication. Initially, there were concerns at a few sites over patient acceptance, financial implications, and/or legal and privacy issues, but these were eventually overcome. Patient acceptance was estimated to be above 95 percent. Duplicate Subjects (those that matched several key identifiers with subjects at different sites) made up 7.78 percent of the sample and Certain Duplicates (matching identifiers with a greater than 1 in 10 million likelihood of occurring by chance in the general population) accounted for 3.45 percent of pre-screens entered into the database. Many of these certain duplicates were not consented for studies because of the information provided by the registry. The use of a clinical trial subject registry and cooperation between central nervous system trial sites can reduce the number of duplicate and professional subjects entering clinical trials. To be fully effective, a trial subject database could be integrated into protocols across pharmaceutical companies, thereby mandating site participation and increasing the likelihood that duplicate subjects will be removed before they enter (and negatively affect) clinical trials.
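As a hedged sketch of the matching idea (the abstract does not disclose CTSdatabase's actual algorithm or identifier fields), the fragment below groups records on a tuple of key identifiers and flags groups that span more than one site; all field names and values are invented.

```python
# Sketch: flag subjects whose key identifiers match across different sites.
# Field names (dob, last4_ssn, initials, site) are illustrative assumptions.
from collections import defaultdict

def find_cross_site_duplicates(records, keys=("dob", "last4_ssn", "initials")):
    """Group records on the identifier tuple; keep groups spanning >1 site."""
    index = defaultdict(list)
    for rec in records:
        index[tuple(rec[k] for k in keys)].append(rec)
    return [grp for grp in index.values()
            if len(grp) > 1 and len({r["site"] for r in grp}) > 1]

records = [
    {"site": "LA-01", "dob": "1970-02-11", "last4_ssn": "1234", "initials": "JQD"},
    {"site": "OC-03", "dob": "1970-02-11", "last4_ssn": "1234", "initials": "JQD"},
    {"site": "LA-02", "dob": "1985-07-30", "last4_ssn": "9876", "initials": "AMS"},
]
print(find_cross_site_duplicates(records))  # first two records match across sites
```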
Román Colón, Yomayra A.; Ruppert, Leslie F.
2015-01-01
The U.S. Geological Survey (USGS) has compiled a database consisting of three worksheets of central Appalachian basin natural gas analyses and isotopic compositions from published and unpublished sources of 1,282 gas samples from Kentucky, Maryland, New York, Ohio, Pennsylvania, Tennessee, Virginia, and West Virginia. The database includes field and reservoir names, well and State identification number, selected geologic reservoir properties, and the composition of natural gases (methane; ethane; propane; iso-butane [i-butane]; normal butane [n-butane]; iso-pentane [i-pentane]; normal pentane [n-pentane]; cyclohexane; and hexanes). In the first worksheet, location and American Petroleum Institute (API) numbers from public or published sources are provided for 1,231 of the 1,282 gas samples. A second worksheet of 186 gas samples was compiled from published sources and augmented with public location information and contains carbon, hydrogen, and nitrogen isotopic measurements of natural gas. The third worksheet is a key for all abbreviations in the database. The database can be used to better constrain the stratigraphic distribution, composition, and origin of natural gas in the central Appalachian basin.
The Data Base and Decision Making in Public Schools.
ERIC Educational Resources Information Center
Hedges, William D.
1984-01-01
Describes generic types of databases--file management systems, relational database management systems, and network/hierarchical database management systems--with their respective strengths and weaknesses; discusses factors to be considered in determining whether a database is desirable; and provides evaluative criteria for use in choosing…
23 CFR 972.204 - Management systems requirements.
Code of Federal Regulations, 2012 CFR
2012-04-01
... to operate and maintain the management systems and their associated databases; and (5) A process for... systems will use databases with a geographical reference system that can be used to geolocate all database...
23 CFR 972.204 - Management systems requirements.
Code of Federal Regulations, 2011 CFR
2011-04-01
... to operate and maintain the management systems and their associated databases; and (5) A process for... systems will use databases with a geographical reference system that can be used to geolocate all database...
23 CFR 972.204 - Management systems requirements.
Code of Federal Regulations, 2010 CFR
2010-04-01
... to operate and maintain the management systems and their associated databases; and (5) A process for... systems will use databases with a geographical reference system that can be used to geolocate all database...
23 CFR 972.204 - Management systems requirements.
Code of Federal Regulations, 2013 CFR
2013-04-01
... to operate and maintain the management systems and their associated databases; and (5) A process for... systems will use databases with a geographical reference system that can be used to geolocate all database...
NASA Astrophysics Data System (ADS)
Jia, Jia; Cheng, Shuiyuan; Yao, Sen; Xu, Tiebing; Zhang, Tingting; Ma, Yuetao; Wang, Hongliang; Duan, Wenjiao
2018-06-01
As one of the highest energy-consumption and pollution industries, the iron and steel industry is regarded as a most important source of particulate matter emissions. In this study, chemical components of size-segregated particulate matter (PM) emitted from different manufacturing units in the iron and steel industry were sampled with a comprehensive sampling system. Results showed that the average particle mass concentration was highest in the sintering process, followed by the puddling, steelmaking and rolling processes. PM samples were divided into eight size fractions for testing the chemical components: SO42- and NH4+ distributed more into fine particles, while most of the Ca2+ was concentrated in coarse particles; the size distribution of mineral elements depended on the raw materials applied. Moreover, a local database of PM chemical source profiles for the iron and steel industry was built and applied in CMAQ modeling to simulate SO42- and NO3- concentrations; results showed that the accuracy of the model simulation improved with local chemical source profiles compared to the SPECIATE database. The results gained from this study are expected to be helpful in understanding the components of PM in the iron and steel industry and to contribute to source apportionment research.
Godown, Justin; Thurm, Cary; Dodd, Debra A; Soslow, Jonathan H; Feingold, Brian; Smith, Andrew H; Mettler, Bret A; Thompson, Bryn; Hall, Matt
2017-12-01
Large clinical, research, and administrative databases are increasingly utilized to facilitate pediatric heart transplant (HTx) research. Linking databases has proven to be a robust strategy across multiple disciplines to expand the possible analyses that can be performed while leveraging the strengths of each dataset. We describe a unique linkage of the Scientific Registry of Transplant Recipients (SRTR) database and the Pediatric Health Information System (PHIS) administrative database to provide a platform to assess resource utilization in pediatric HTx. All pediatric patients (1999-2016) who underwent HTx at a hospital enrolled in the PHIS database were identified. A linkage was performed between the SRTR and PHIS databases in a stepwise approach using indirect identifiers. To determine the feasibility of using these linked data to assess resource utilization, total and post-HTx hospital costs were assessed. A total of 3188 unique transplants were identified as being present in both databases and amenable to linkage. Linkage of SRTR and PHIS data was successful in 3057 (95.9%) patients, of whom 2896 (90.8%) had complete cost data. Median total and post-HTx hospital costs were $518,906 (IQR $324,199-$889,738) and $334,490 (IQR $235,506-$498,803), respectively, with significant differences based on patient demographics and clinical characteristics at HTx. Linkage of the SRTR and PHIS databases is feasible and provides an invaluable tool to assess resource utilization. Our analysis provides contemporary cost data for pediatric HTx from the largest US sample reported to date. It also provides a platform for expanded analyses in the pediatric HTx population. Copyright © 2017 Elsevier Inc. All rights reserved.
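A stepwise linkage on indirect identifiers can be sketched as follows: match on the strictest key set first, then retry the remaining records with looser key sets. The key sets and field names below are assumptions for illustration, not the authors' actual identifier hierarchy.

```python
# Sketch of stepwise record linkage without a shared patient identifier.
import pandas as pd

def stepwise_link(left, right, key_sets, left_id="srtr_row"):
    """Link on the strictest key set first; fall back for unmatched records."""
    linked, remaining = [], left
    for keys in key_sets:
        if remaining.empty:
            break
        merged = remaining.merge(right, on=list(keys), how="inner")
        linked.append(merged)
        remaining = remaining[~remaining[left_id].isin(merged[left_id])]
    return pd.concat(linked, ignore_index=True), remaining

srtr = pd.DataFrame({"srtr_row": [1, 2], "hospital": ["H1", "H2"],
                     "tx_date": ["2010-05-01", "2011-09-12"],
                     "dob": ["2004-01-02", "2006-03-04"]})
phis = pd.DataFrame({"hospital": ["H1", "H2"],
                     "tx_date": ["2010-05-01", "2011-09-12"],
                     "dob": ["2004-01-02", "2006-03-04"],
                     "cost": [480000, 530000]})

linked, unlinked = stepwise_link(
    srtr, phis, [("hospital", "tx_date", "dob"), ("hospital", "tx_date")])
print(len(linked), "linked;", len(unlinked), "unlinked")
```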
Pafilis, Evangelos; Buttigieg, Pier Luigi; Ferrell, Barbra; Pereira, Emiliano; Schnetzer, Julia; Arvanitidis, Christos; Jensen, Lars Juhl
2016-01-01
The microbial and molecular ecology research communities have made substantial progress on developing standards for annotating samples with environment metadata. However, manual annotation of samples is a highly labor-intensive process and requires familiarity with the terminologies used. We have therefore developed an interactive annotation tool, EXTRACT, which helps curators identify and extract standard-compliant terms for annotation of metagenomic records and other samples. Behind its web-based user interface, the system combines published methods for named entity recognition of environment, organism, tissue and disease terms. The evaluators in the BioCreative V Interactive Annotation Task found the system to be intuitive, useful, well documented and sufficiently accurate to be helpful in spotting relevant text passages and extracting organism and environment terms. Comparison of fully manual and text-mining-assisted curation revealed that EXTRACT speeds up annotation by 15-25% and helps curators to detect terms that would otherwise have been missed. Database URL: https://extract.hcmr.gr/. © The Author(s) 2016. Published by Oxford University Press.
Tsybovskii, I S; Veremeichik, V M; Kotova, S A; Kritskaya, S V; Evmenenko, S A; Udina, I G
2017-02-01
For the Republic of Belarus, the development of a forensic reference database on the basis of 18 autosomal microsatellites (STR) is described, using a population dataset (N = 1040), a "familial" genotypic dataset (N = 2550) obtained from paternity-testing casework, and a dataset of genotypes from a criminal registration database (N = 8756). The population samples studied consist of 80% ethnic Belarusians and 20% individuals of other nationality or of mixed origin (by questionnaire data). Genotypes of 12,346 inhabitants of the Republic of Belarus from 118 regional samples typed at 18 autosomal microsatellites are included in the sample: 16 tetranucleotide STR (D2S1338, TPOX, D3S1358, CSF1PO, D5S818, D8S1179, D7S820, THO1, vWA, D13S317, D16S539, D18S51, D19S433, D21S11, F13B, and FGA) and two pentanucleotide STR (Penta D and Penta E). The samples studied are in Hardy–Weinberg equilibrium according to the distribution of genotypes at the 18 STR. Significant differences were not detected between discrete populations or between samples from various historical ethnographic regions of the Republic of Belarus (Western and Eastern Polesie, Podneprovye, Ponemanye, Poozerye, and Center), which indicates the absence of prominent genetic differentiation. Statistically significant differences between the studied genotypic datasets also were not detected, which made it possible to combine the datasets and consider the total sample as a unified forensic reference database for the 18 "criminalistic" STR loci. Differences between the reference database of the Republic of Belarus and Russians and Ukrainians in the distribution of autosomal STR alleles also were not detected, consistent with the close genetic relationship of the three Eastern Slavic nations mediated by common origin and intense mutual migrations. Significant differences at separate STR loci between the reference database of the Republic of Belarus and populations of Southern and Western Slavs were observed. The necessity of using an original reference database to support forensic expertise practice in the Republic of Belarus was demonstrated.
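As a worked illustration of the Hardy–Weinberg check reported above, the sketch below runs a chi-square test for a single biallelic locus with invented counts; real STR loci are multi-allelic, so forensic practice typically relies on exact or permutation tests.

```python
# Illustrative Hardy-Weinberg equilibrium check for one biallelic locus.
# Genotype counts are invented; this only shows the principle of the test.
from scipy.stats import chi2

n_AA, n_Aa, n_aa = 298, 489, 213
n = n_AA + n_Aa + n_aa
p = (2 * n_AA + n_Aa) / (2 * n)          # estimated frequency of allele A
q = 1 - p

observed = [n_AA, n_Aa, n_aa]
expected = [p * p * n, 2 * p * q * n, q * q * n]
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = chi2.sf(chi_sq, df=1)          # 3 classes - 1 - 1 estimated parameter
print(f"chi2 = {chi_sq:.3f}, p = {p_value:.3f}")
```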
Selection and Management of DNA Markers for Use in Genomic Evaluation
USDA-ARS?s Scientific Manuscript database
A database was constructed to store genotypes for 50,972 single-nucleotide polymorphisms (SNP) from the Illumina BovineSNP50 BeadChip for over 30,000 animals. The database allows storage of multiple samples per animal and stores all SNP genotypes for a sample in a single row. An indicator specifies ...
A Database for Tracking Toxicogenomic Samples and Procedures with Genomic, Proteomic and Metabonomic Components
Wenjun Bao1, Jennifer Fostel2, Michael D. Waters2, B. Alex Merrick2, Drew Ekman3, Mitchell Kostich4, Judith Schmid1, David Dix1
Office of Research and Developmen...
Fatty acid, cholesterol, vitamin, and mineral content of cooked beef cuts from a national study
USDA-ARS?s Scientific Manuscript database
The U.S. Department of Agriculture (USDA) provides foundational nutrient data for U.S. and international databases. For currency of retail beef data in USDA’s database, a nationwide comprehensive study obtained samples by primal categories using a statistically based sampling plan, resulting in 72 ...
The rules of the game: properties of a database of expository language samples.
Heilmann, John; Malone, Thomas O
2014-10-01
The authors created a database of expository oral language samples with the aims of describing the nature of students' expository discourse and providing benchmark data for typically developing preteen and teenage students. Using a favorite game or sport protocol, language samples were collected from 235 typically developing students in Grades 5, 6, 7, and 9. Twelve language measures were summarized from this database and analyses were completed to test for differences across ages and topics. To determine whether distinct dimensions of oral language could be captured with language measures from these expository samples, a factor analysis was completed. Modest differences were observed in language measures across ages and topics. The language measures were effectively classified into four distinct dimensions: syntactic complexity, expository content, discourse difficulties, and lexical diversity. Analysis of expository data provides a functional and curriculum-based assessment that has the potential to allow clinicians to document multiple dimensions of children's expressive language skills. Further development and testing of the database will establish the feasibility of using it to compare individual students' expository discourse skills to those of their typically developing peers.
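The dimensionality analysis can be sketched in a few lines; the random matrix below merely stands in for the study's 235 students by 12 language measures, and the four-factor choice mirrors the solution reported above.

```python
# Sketch: reduce 12 language measures to 4 latent factors, as in the study.
# The data are random placeholders, not the actual language-sample measures.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(235, 12))        # 235 students x 12 language measures

fa = FactorAnalysis(n_components=4, random_state=0)
scores = fa.fit_transform(X)          # per-student factor scores (235 x 4)
loadings = fa.components_             # loadings of each measure per factor (4 x 12)
print(scores.shape, loadings.shape)
```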
Vlek, Anneloes; Kolecka, Anna; Khayhan, Kantarawee; Theelen, Bart; Groenewald, Marizeth; Boel, Edwin
2014-01-01
An interlaboratory study using matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) to determine the identification of clinically important yeasts (n = 35) was performed at 11 clinical centers, one company, and one reference center using the Bruker Daltonics MALDI Biotyper system. The optimal cutoff for the MALDI-TOF MS score was investigated using receiver operating characteristic (ROC) curve analyses. The percentages of correct identifications were compared for different sample preparation methods and different databases. Logistic regression analysis was performed to analyze the association between the number of spectra in the database and the percentage of strains that were correctly identified. A total of 5,460 MALDI-TOF MS results were obtained. Using all results, the area under the ROC curve was 0.95 (95% confidence interval [CI], 0.94 to 0.96). With a sensitivity of 0.84 and a specificity of 0.97, a cutoff value of 1.7 was considered optimal. The overall percentage of correct identifications (formic acid-ethanol extraction method, score ≥ 1.7) was 61.5% when the commercial Bruker Daltonics database (BDAL) was used, and it increased to 86.8% by using an extended BDAL supplemented with a Centraalbureau voor Schimmelcultures (CBS)-KNAW Fungal Biodiversity Centre in-house database (BDAL+CBS in-house). A greater number of main spectra (MSP) in the database was associated with a higher percentage of correct identifications (odds ratio [OR], 1.10; 95% CI, 1.05 to 1.15; P < 0.01). The results from the direct transfer method ranged from 0% to 82.9% correct identifications, with the results of the top four centers ranging from 71.4% to 82.9% correct identifications. This study supports the use of a cutoff value of 1.7 for the identification of yeasts using MALDI-TOF MS. The inclusion of enough isolates of the same species in the database can enhance the proportion of correctly identified strains. Further optimization of the preparation methods, especially of the direct transfer method, may contribute to improved diagnosis of yeast-related infections. PMID:24920782
47 CFR 52.101 - General definitions.
Code of Federal Regulations, 2013 CFR
2013-10-01
... Center (“NASC”). The entity that provides user support for the Service Management System database and administers the Service Management System database on a day-to-day basis. (b) Responsible Organization (“Resp... regional databases in the toll free network. (d) Service Management System Database (“SMS Database”). The...
47 CFR 52.101 - General definitions.
Code of Federal Regulations, 2011 CFR
2011-10-01
... Center (“NASC”). The entity that provides user support for the Service Management System database and administers the Service Management System database on a day-to-day basis. (b) Responsible Organization (“Resp... regional databases in the toll free network. (d) Service Management System Database (“SMS Database”). The...
47 CFR 52.101 - General definitions.
Code of Federal Regulations, 2014 CFR
2014-10-01
... Center (“NASC”). The entity that provides user support for the Service Management System database and administers the Service Management System database on a day-to-day basis. (b) Responsible Organization (“Resp... regional databases in the toll free network. (d) Service Management System Database (“SMS Database”). The...
47 CFR 52.101 - General definitions.
Code of Federal Regulations, 2010 CFR
2010-10-01
... Center (“NASC”). The entity that provides user support for the Service Management System database and administers the Service Management System database on a day-to-day basis. (b) Responsible Organization (“Resp... regional databases in the toll free network. (d) Service Management System Database (“SMS Database”). The...
47 CFR 52.101 - General definitions.
Code of Federal Regulations, 2012 CFR
2012-10-01
... Center (“NASC”). The entity that provides user support for the Service Management System database and administers the Service Management System database on a day-to-day basis. (b) Responsible Organization (“Resp... regional databases in the toll free network. (d) Service Management System Database (“SMS Database”). The...
Wilson, Claire; Blackwood, Bronagh; McAuley, Danny F; Perkins, Gavin D; McMullan, Ronan; Gates, Simon; Warhurst, Geoffrey
2012-01-01
Background: There is growing interest in the potential utility of molecular diagnostics in improving the detection of life-threatening infection (sepsis). LightCycler® SeptiFast is a multipathogen probe-based real-time PCR system that targets DNA sequences of bacteria and fungi present in blood samples, with results available within a few hours. We report here the protocol of the first systematic review of published clinical diagnostic accuracy studies of this technology when compared with blood culture in the setting of suspected sepsis. Methods/design: Data sources: the Cochrane Database of Systematic Reviews, the Database of Abstracts of Reviews of Effects (DARE), the Health Technology Assessment Database (HTA), the NHS Economic Evaluation Database (NHSEED), The Cochrane Library, MEDLINE, EMBASE, ISI Web of Science, BIOSIS Previews, MEDION and the Aggressive Research Intelligence Facility Database (ARIF). Study selection: diagnostic accuracy studies that compare the real-time PCR technology with standard culture results performed on a patient's blood sample during the management of sepsis. Data extraction: three reviewers, working independently, will determine the level of evidence, methodological quality and a standard data set relating to demographics and diagnostic accuracy metrics for each study. Statistical analysis/data synthesis: heterogeneity of studies will be investigated using a coupled forest plot of sensitivity and specificity and a scatter plot in Receiver Operator Characteristic (ROC) space. The bivariate model method will be used to estimate summary sensitivity and specificity. The authors will investigate reporting biases using funnel plots based on effective sample size and regression tests of asymmetry. Subgroup analyses are planned for adults, children and infection setting (hospital vs community) if sufficient data are uncovered. Dissemination: Recommendations will be made to the Department of Health (as part of an open-access HTA report) as to whether the real-time PCR technology has sufficient clinical diagnostic accuracy potential to move forward to efficacy testing during the provision of routine clinical care. Registration: PROSPERO—NIHR Prospective Register of Systematic Reviews (CRD42011001289). PMID:22240646
Computer Science Research in Europe.
1984-08-29
Among the topics receiving the most attention are multi-databases and their structure, and the dependencies between databases and multi-databases. At the University of Newcastle, UK, a multi-database system for distributed data management has been completed. At INRIA, a project called SIRIUS, established in 1977, is working on the communications requirements of distributed database systems and protocols for distributed data management.
Palm-Vein Classification Based on Principal Orientation Features
Zhou, Yujia; Liu, Yaqin; Feng, Qianjin; Yang, Feng; Huang, Jing; Nie, Yixiao
2014-01-01
Personal recognition using palm-vein patterns has emerged as a promising alternative for human recognition because of its uniqueness, stability, live body identification, flexibility, and difficulty to cheat. With the expanding application of palm-vein pattern recognition, the corresponding growth of the database has resulted in a long response time. To shorten the response time of identification, this paper proposes a simple and useful classification for palm-vein identification based on principal direction features. In the registration process, the Gaussian-Radon transform is adopted to extract the orientation matrix and then compute the principal direction of a palm-vein image based on the orientation matrix. The database can be classified into six bins based on the value of the principal direction. In the identification process, the principal direction of the test sample is first extracted to ascertain the corresponding bin. One-by-one matching with the training samples is then performed in the bin. To improve recognition efficiency while maintaining better recognition accuracy, two neighborhood bins of the corresponding bin are continuously searched to identify the input palm-vein image. Evaluation experiments are conducted on three different databases, namely, PolyU, CASIA, and the database of this study. Experimental results show that the searching range of one test sample in PolyU, CASIA and our database by the proposed method for palm-vein identification can be reduced to 14.29%, 14.50%, and 14.28%, with retrieval accuracy of 96.67%, 96.00%, and 97.71%, respectively. With 10,000 training samples in the database, the execution time of the identification process by the traditional method is 18.56 s, while that by the proposed approach is 3.16 s. The experimental results confirm that the proposed approach is more efficient than the traditional method, especially for a large database. PMID:25383715
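The bin-and-search idea can be sketched as follows, with the Gaussian-Radon orientation extraction replaced by precomputed angles: enrolled images are indexed into six bins over 0-180 degrees by principal direction, and a probe is matched only against its own bin and its two neighbours.

```python
# Toy sketch of orientation binning for palm-vein identification. Real feature
# extraction is omitted; angles are assumed to be precomputed per image.
N_BINS = 6

def bin_of(angle_deg):
    return int(angle_deg % 180 // (180 / N_BINS))

def candidate_bins(angle_deg):
    b = bin_of(angle_deg)
    return {(b - 1) % N_BINS, b, (b + 1) % N_BINS}

gallery = {b: [] for b in range(N_BINS)}
for subject, angle in [("s1", 15.0), ("s2", 95.0), ("s3", 170.0)]:
    gallery[bin_of(angle)].append(subject)

probe_angle = 12.0
to_match = [s for b in candidate_bins(probe_angle) for s in gallery[b]]
print(to_match)   # only subjects in bins 5, 0 and 1 are compared one-by-one
```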
NASA Technical Reports Server (NTRS)
Wang, Yi; Pant, Kapil; Brenner, Martin J.; Ouellette, Jeffrey A.
2018-01-01
This paper presents a data analysis and modeling framework to tailor and develop a linear parameter-varying (LPV) aeroservoelastic (ASE) model database for flexible aircraft in a broad 2D flight parameter space. The Kriging surrogate model is constructed using ASE models at a fraction of grid points within the original model database, and then the ASE model at any flight condition can be obtained simply through surrogate model interpolation. The greedy sampling algorithm is developed to select the next sample point that carries the worst relative error between the surrogate model prediction and the benchmark model in the frequency domain among all input-output channels. The process is iterated to incrementally improve surrogate model accuracy till a pre-determined tolerance or iteration budget is met. The methodology is applied to the ASE model database of a flexible aircraft currently being tested at NASA/AFRC for flutter suppression and gust load alleviation. Our studies indicate that the proposed method can reduce the number of models in the original database by 67%. Even so, the ASE models obtained through Kriging interpolation match the models in the original database constructed directly from the physics-based tool, with the worst relative error far below 1%. The interpolated ASE model exhibits continuously-varying gains along a set of prescribed flight conditions. More importantly, the selected grid points are distributed non-uniformly in the parameter space, a) capturing the distinctly different dynamic behavior and its dependence on flight parameters, and b) reiterating the need and utility of adaptive space sampling techniques for ASE model database compaction. The present framework is directly extendible to high-dimensional flight parameter space, and can be used to guide ASE model development, model order reduction, robust control synthesis and novel vehicle design of flexible aircraft.
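A minimal sketch of the greedy refinement loop, assuming a scalar toy response in place of the frequency-domain relative error between the Kriging prediction and the benchmark ASE models; the grid, seed points, tolerance and budget below are invented.

```python
# Greedy Kriging sampling sketch: fit a surrogate on sampled grid points,
# add the point with the worst error, repeat until a tolerance is met.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def truth(x):                       # scalar stand-in for the physics-based model
    return np.sin(3 * x[:, 0]) * np.cos(2 * x[:, 1])

grid = np.array([[a, m] for a in np.linspace(0, 1, 20)
                        for m in np.linspace(0, 1, 20)])   # e.g. altitude x Mach
sampled = [0, 199, 399]                                    # seed design points

for _ in range(25):                 # iteration budget
    gp = GaussianProcessRegressor().fit(grid[sampled], truth(grid[sampled]))
    err = np.abs(gp.predict(grid) - truth(grid))
    worst = int(np.argmax(err))     # next sample: worst surrogate error
    if err[worst] < 1e-2:           # pre-determined tolerance
        break
    sampled.append(worst)

print(f"kept {len(sampled)} of {len(grid)} grid points")
```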
Database systems for knowledge-based discovery.
Jagarlapudi, Sarma A R P; Kishan, K V Radha
2009-01-01
Several database systems have been developed to provide valuable information, in a structured format, to users from the bench chemist to the biologist, and from the medical practitioner to the pharmaceutical scientist. The advent of information technology and computational power enhanced the ability to access large volumes of data in the form of a database, where one can do compilation, searching, archiving, analysis, and finally knowledge derivation. Although data are of variable types, the tools used for database creation, searching and retrieval are similar. GVK BIO has been developing databases from publicly available scientific literature in specific areas like medicinal chemistry, clinical research, and mechanism-based toxicity, so that the structured databases containing vast data could be used in several areas of research. These databases were classified as reference centric or compound centric depending on the way the database systems were designed. Integration of these databases with knowledge derivation tools would enhance the value of these systems toward better drug design and discovery.
The Network Configuration of an Object Relational Database Management System
NASA Technical Reports Server (NTRS)
Diaz, Philip; Harris, W. C.
2000-01-01
The networking and implementation of the Oracle Database Management System (ODBMS) requires developers to have knowledge of the UNIX operating system as well as all the features of the Oracle Server. The server is an object relational database management system (DBMS). By using distributed processing, processes are split up between the database server and client application programs. The DBMS handles all the responsibilities of the server. The workstations running the database application concentrate on the interpretation and display of data.
Contaminant screening of wastewater with HPLC-IM-qTOF-MS and LC+LC-IM-qTOF-MS using a CCS database.
Stephan, Susanne; Hippler, Joerg; Köhler, Timo; Deeb, Ahmad A; Schmidt, Torsten C; Schmitz, Oliver J
2016-09-01
Non-target analysis has become an important tool in the field of water analysis, since a broad variety of pollutants from different sources are released to the water cycle. For the identification of compounds in such complex samples, liquid chromatography coupled to high-resolution mass spectrometry is often used. The introduction of ion mobility spectrometry provides an additional separation dimension and allows determination of collision cross sections (CCS) of the analytes as a further physicochemical constant supporting identification. A CCS database with more than 500 standard substances, including drug-like compounds and pesticides, was used for CCS database searching in this work. A non-target analysis of a wastewater sample was initially performed with high performance liquid chromatography (HPLC) coupled to an ion mobility-quadrupole-time of flight mass spectrometer (IM-qTOF-MS). A database search including exact mass (±5 ppm) and CCS (±1%) delivered 22 different compounds. Furthermore, the same sample was analyzed with a two-dimensional LC method, called LC+LC, developed in our group for coupling to IM-qTOF-MS. This four-dimensional separation platform revealed 53 different compounds, identified via exact mass and CCS, in the examined wastewater sample. It is demonstrated that the CCS database can also help to distinguish between isobaric structures, exemplified for cyclophosphamide and ifosfamide. Graphical Abstract: Scheme of sample analysis and database screening.
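The tolerance search itself is straightforward to sketch: a feature matches a database entry when the exact mass agrees within ±5 ppm and the CCS within ±1%. The m/z and CCS values below are illustrative placeholders, not entries from the authors' database.

```python
# Sketch of mass + CCS tolerance matching against a small compound list.
# Entry values are invented placeholders for illustration only.
def matches(feature, entry, ppm_tol=5.0, ccs_tol_pct=1.0):
    ppm_err = abs(feature["mz"] - entry["mz"]) / entry["mz"] * 1e6
    ccs_err = abs(feature["ccs"] - entry["ccs"]) / entry["ccs"] * 100
    return ppm_err <= ppm_tol and ccs_err <= ccs_tol_pct

database = [
    {"name": "cyclophosphamide", "mz": 261.0332, "ccs": 156.1},
    {"name": "ifosfamide",       "mz": 261.0332, "ccs": 160.9},  # isobaric pair
]
feature = {"mz": 261.0335, "ccs": 161.2}
print([e["name"] for e in database if matches(feature, e)])  # ['ifosfamide']
```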
NASA Technical Reports Server (NTRS)
Moroh, Marsha
1988-01-01
A methodology was developed for building interfaces from resident database management systems to the DAVID system, a heterogeneous distributed database management system under development at NASA. The feasibility of that methodology was demonstrated by constructing the software necessary to perform the interface task. The interface terminology developed in the course of this research is presented. The work performed and the results are summarized.
Sánchez-de-Madariaga, Ricardo; Muñoz, Adolfo; Lozano-Rubí, Raimundo; Serrano-Balazote, Pablo; Castro, Antonio L; Moreno, Oscar; Pascual, Mario
2017-08-18
The objective of this research is to compare relational and non-relational (NoSQL) database systems for storing, recovering, querying and persisting standardized medical information in the form of ISO/EN 13606 normalized Electronic Health Record XML extracts, both in isolation and under concurrency. NoSQL database systems have recently attracted much attention, but few studies in the literature address their direct comparison with relational databases when applied to build the persistence layer of a standardized medical information system. One relational and two NoSQL databases (one document-based and one native XML database) of three different sizes have been created in order to evaluate and compare the response times (algorithmic complexity) of six queries of growing complexity, which were performed on them. Similar appropriate results available in the literature have also been considered. Relational and non-relational NoSQL database systems both show almost linear query execution times. However, they show very different linear slopes, the former being much steeper than the latter two. Document-based NoSQL databases perform better under concurrency than in isolation, and also better than relational databases under concurrency. Non-relational NoSQL databases seem to be more appropriate than standard relational SQL databases when the database size is extremely high (secondary use, research applications). Document-based NoSQL databases perform in general better than native XML NoSQL databases. Visualization and editing of EHR extracts are also document-based tasks more appropriate to NoSQL database systems. However, the appropriate database solution depends very much on each particular situation and specific problem.
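The kind of timing measurement behind these comparisons can be sketched with SQLite standing in for the relational backend (the study used full relational and NoSQL servers): populate a table of extracts at a chosen size, then time a representative query.

```python
# Sketch of a response-time measurement against a relational stand-in backend.
# Table layout and query are illustrative, not the study's actual schema.
import sqlite3, time

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE extracts (id INTEGER PRIMARY KEY, patient TEXT, xml TEXT)")
con.executemany("INSERT INTO extracts (patient, xml) VALUES (?, ?)",
                [(f"p{i % 1000}", "<record/>") for i in range(100_000)])

t0 = time.perf_counter()
rows = con.execute("SELECT COUNT(*) FROM extracts WHERE patient = 'p42'").fetchone()
print(rows, f"{(time.perf_counter() - t0) * 1e3:.2f} ms")
```

Repeating the measurement at several table sizes gives the size-versus-time slopes the abstract describes.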
NASA Astrophysics Data System (ADS)
Suckow, A. O.
2013-12-01
Measurements need post-processing to obtain results that are comparable between laboratories. Raw data may need to be corrected for blank, memory, drift (change of reference values with time) and linearity (dependence of reference on signal height), and normalized to international reference materials. Post-processing parameters need to be stored for traceability of results. State-of-the-art stable isotope correction schemes are available based on MS Excel (Geldern and Barth, 2012; Gröning, 2011) or MS Access (Coplen, 1998). These are specialized to stable isotope measurements only, often only to the post-processing of a single run. Embedding of the algorithms into a multipurpose database system was missing. This is necessary to combine results of different tracers (3H, 3He, 2H, 18O, CFCs, SF6...) or geochronological tools (sediment dating, e.g., with 210Pb or 137Cs), to relate them to attribute data (submitter, batch, project, geographical origin, depth in core, well information etc.) and to feed further interpretation tools (e.g. lumped-parameter modelling). Database sub-systems to the LabData laboratory management system (Suckow and Dumke, 2001) are presented for stable isotopes and for gas chromatographic CFC and SF6 measurements. The sub-system for stable isotopes allows the following post-processing: 1. automated import from measurement software (Isodat, Picarro, LGR); 2. correction for sample-to-sample memory, linearity and drift, and renormalization of the raw data. The sub-system for gas chromatography covers: 1. storage of all raw data; 2. storage of peak integration parameters; 3. correction for blank, efficiency and linearity. The user interface allows interactive and graphical control of the post-processing and all corrections, with export to and plotting in MS Excel, and is a valuable tool for quality control. The sub-databases are integrated into LabData, a multi-user client-server architecture using MS SQL Server as back-end and an MS Access front-end, installed in four laboratories to date. Attribute data storage (unique ID for each subsample, origin, project context etc.) and laboratory management features are included. Export routines to Excel (depth profiles, time series, all possible tracer-versus-tracer plots...) and modelling capabilities are add-ons. The source code is publicly available under the GNU General Public License (GNU-GPL). References: Coplen, T.B., 1998. A manual for a laboratory information management system (LIMS) for light stable isotopes. Version 7.0. USGS open file report 98-284. Geldern, R.v., Barth, J.A.C., 2012. Optimization of instrument setup and post-run corrections for oxygen and hydrogen stable isotope measurements of water by isotope ratio infrared spectroscopy (IRIS). Limnology and Oceanography: Methods 10, 1024-1036. Gröning, M., 2011. Improved water δ2H and δ18O calibration and calculation of measurement uncertainty using a simple software tool. Rapid Communications in Mass Spectrometry 25, 2711-2720. Suckow, A., Dumke, I., 2001. A database system for geochemical, isotope hydrological and geochronological laboratories. Radiocarbon 43, 325-337.
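One of the corrections named above, drift, can be sketched as linear interpolation of the standards' offset across a run; the reference and measured values below are invented, and LabData's actual correction algorithms are not reproduced.

```python
# Sketch of linear drift correction: interpolate the offset of bracketing
# standards across the run and subtract it from sample measurements.
import numpy as np

run_positions = np.array([0, 5, 10, 15, 20])        # positions of standards in run
std_true = -10.0                                    # accepted delta of the standard
std_measured = np.array([-9.8, -9.9, -10.1, -10.2, -10.3])
offset = std_measured - std_true                    # instrument drift over the run

sample_positions = np.array([2, 7, 12, 18])
sample_measured = np.array([-25.1, -24.9, -25.4, -25.6])
drift_at_samples = np.interp(sample_positions, run_positions, offset)
print(sample_measured - drift_at_samples)           # drift-corrected values
```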
Generalized Database Management System Support for Numeric Database Environments.
ERIC Educational Resources Information Center
Dominick, Wayne D.; Weathers, Peggy G.
1982-01-01
This overview of potential for utilizing database management systems (DBMS) within numeric database environments highlights: (1) major features, functions, and characteristics of DBMS; (2) applicability to numeric database environment needs and user needs; (3) current applications of DBMS technology; and (4) research-oriented and…
A Summary of the Naval Postgraduate School Research Program
1989-08-30
Contents include: Fundamental Theory for Automatically Combining Changes to Software Systems; Database-System Approach to Software Engineering Environments (SEEs); Multilevel Database Security; Temporal Database Management and Real-Time Database Computers; and The Multi-lingual, Multi-Model, Multi-Backend Database.
EarthChem and SESAR: Data Resources and Interoperability for EarthScope Cyberinfrastructure
NASA Astrophysics Data System (ADS)
Lehnert, K. A.; Walker, D.; Block, K.; Vinay, S.; Ash, J.
2008-12-01
Data management within the EarthScope Cyberinfrastructure needs to pursue two goals in order to advance and maximize the broad scientific application and impact of the large volumes of observational data acquired by EarthScope facilities: (a) to provide access to all data acquired by EarthScope facilities, and to promote their use by broad audiences, and (b) to facilitate discovery of, access to, and integration of multi-disciplinary data sets that complement EarthScope data in support of EarthScope science. EarthChem and SESAR, the System for Earth Sample Registration, are two projects within the Geoinformatics for Geochemistry program that offer resources for EarthScope CI. EarthChem operates a data portal that currently provides access to >13 million analytical values for >600,000 samples, more than half of which are from North America, including data from the USGS and all data from the NAVDAT database, a web-accessible repository for age, chemical and isotopic data from Mesozoic and younger igneous rocks in western North America. The new EarthChem GEOCHRON database will house data collected in association with GeoEarthScope, storing and serving geochronological data submitted by participating facilities. The EarthChem Deep Lithosphere Dataset is a compilation of petrological data for mantle xenoliths, initiated in collaboration with GeoFrame to complement geophysical endeavors within EarthScope science. The EarthChem Geochemical Resource Library provides a home for geochemical and petrological data products and data sets. Parts of the digital data in EarthScope CI refer to physical samples such as drill cores, igneous rocks, or water and gas samples, collected, for example, by SAFOD or by EarthScope science projects and acquired through lab-based analysis. Management of sample-based data requires the use of global unique identifiers for samples, so that distributed data for individual samples generated in different labs and published in different papers can be unambiguously linked and integrated. SESAR operates a registry for Earth samples that assigns and administers the International GeoSample Numbers (IGSN) as a global unique identifier for samples. Registration of EarthScope samples with SESAR and use of the IGSN will ensure their unique identification in publications and data systems, thus facilitating interoperability among sample-based data relevant to EarthScope CI and globally. It will also make these samples visible to global audiences via the SESAR Global Sample Catalog.
Database Systems. Course Three. Information Systems Curriculum.
ERIC Educational Resources Information Center
O'Neil, Sharon Lund; Everett, Donna R.
This course is the third of seven in the Information Systems curriculum. The purpose of the course is to familiarize students with database management concepts and standard database management software. Databases and their roles, advantages, and limitations are explained. An overview of the course sets forth the condition and performance standard…
Database Management Systems: New Homes for Migrating Bibliographic Records.
ERIC Educational Resources Information Center
Brooks, Terrence A.; Bierbaum, Esther G.
1987-01-01
Assesses bibliographic databases as part of visionary text systems such as hypertext and scholars' workstations. Downloading is discussed in terms of the capability to search records and to maintain unique bibliographic descriptions, and relational database management systems, file managers, and text databases are reviewed as possible hosts for…
NASA Astrophysics Data System (ADS)
Song, Xiaoning; Feng, Zhen-Hua; Hu, Guosheng; Yang, Xibei; Yang, Jingyu; Qi, Yunsong
2015-09-01
This paper proposes a progressive sparse representation-based classification algorithm using local discrete cosine transform (DCT) evaluation to perform face recognition. Specifically, the sum of the contributions of all training samples of each subject is first taken as the contribution of this subject, then the redundant subject with the smallest contribution to the test sample is iteratively eliminated. Second, the progressive method aims at representing the test sample as a linear combination of all the remaining training samples, by which the representation capability of each training sample is exploited to determine the optimal "nearest neighbors" for the test sample. Third, the transformed DCT evaluation is constructed to measure the similarity between the test sample and each local training sample using cosine distance metrics in the DCT domain. The final goal of the proposed method is to determine an optimal weighted sum of nearest neighbors that are obtained under the local correlative degree evaluation, which is approximately equal to the test sample, and we can use this weighted linear combination to perform robust classification. Experimental results conducted on the ORL database of faces (created by the Olivetti Research Laboratory in Cambridge), the FERET face database (managed by the Defense Advanced Research Projects Agency and the National Institute of Standards and Technology), AR face database (created by Aleix Martinez and Robert Benavente in the Computer Vision Center at U.A.B), and USPS handwritten digit database (gathered at the Center of Excellence in Document Analysis and Recognition at SUNY Buffalo) demonstrate the effectiveness of the proposed method.
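The DCT-domain similarity step can be sketched as follows, with random vectors standing in for flattened face images; the progressive elimination of subjects and the weighted nearest-neighbour combination are omitted.

```python
# Sketch: cosine similarity between two samples in the DCT domain, keeping
# only the leading coefficients. Vectors are random stand-ins for face images.
import numpy as np
from scipy.fftpack import dct

def dct_cosine_similarity(x, y, n_coeffs=32):
    """Cosine similarity of the first n_coeffs 1-D DCT coefficients."""
    xd = dct(x, norm="ortho")[:n_coeffs]
    yd = dct(y, norm="ortho")[:n_coeffs]
    return float(xd @ yd / (np.linalg.norm(xd) * np.linalg.norm(yd)))

rng = np.random.default_rng(0)
test = rng.normal(size=1024)                    # flattened test image, stand-in
train = test + rng.normal(scale=0.1, size=1024) # a similar training image
print(f"{dct_cosine_similarity(test, train):.3f}")  # close to 1 for similar pairs
```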
Bagger, Frederik Otzen; Sasivarevic, Damir; Sohi, Sina Hadi; Laursen, Linea Gøricke; Pundhir, Sachin; Sønderby, Casper Kaae; Winther, Ole; Rapin, Nicolas; Porse, Bo T.
2016-01-01
Research on human and murine haematopoiesis has resulted in a vast number of gene-expression data sets that can potentially answer questions regarding normal and aberrant blood formation. To researchers and clinicians with limited bioinformatics experience, these data have remained available, yet largely inaccessible. Current databases provide information about gene-expression but fail to answer key questions regarding co-regulation, genetic programs or effect on patient survival. To address these shortcomings, we present BloodSpot (www.bloodspot.eu), which includes and greatly extends our previously released database HemaExplorer, a database of gene expression profiles from FACS sorted healthy and malignant haematopoietic cells. A revised interactive interface simultaneously provides a plot of gene expression along with a Kaplan–Meier analysis and a hierarchical tree depicting the relationship between different cell types in the database. The database now includes 23 high-quality curated data sets relevant to normal and malignant blood formation and, in addition, we have assembled and built a unique integrated data set, BloodPool. Bloodpool contains more than 2000 samples assembled from six independent studies on acute myeloid leukemia. Furthermore, we have devised a robust sample integration procedure that allows for sensitive comparison of user-supplied patient samples in a well-defined haematopoietic cellular space. PMID:26507857
Crystallization kinetics of Mg–Cu–Yb–Ca–Ag metallic glasses
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tsarkov, Andrey A., E-mail: tsarkov@misis.ru; WPI Advanced Institute for Materials Research, Tohoku University, Katahira 2-1-1, Aoba-Ku, Sendai 980-8577; Zanaeva, Erzhena N.
The paper presents research into Mg–Cu–Yb-based metallic glassy alloys. Metallic glasses were prepared by induction melting followed by injection onto a spinning copper wheel. The effect of alloying with Ag and Ca on the glass forming ability and the crystallization kinetics of Mg–Cu–Yb system based alloys was studied. Differential scanning calorimetry and X-ray diffractometry were used to investigate the kinetics of crystallization and the phase composition of the samples. An indicator of glass forming ability, the effective activation energy of crystallization, and the enthalpy of mixing were calculated. An increase of the Ca and Ag content has a positive effect on the glass forming ability, the effective activation energy of crystallization, and the enthalpy of mixing. The highest indicators of glass forming ability and thermal stability were found for alloys that contain both alloying elements. The Ag addition suppresses precipitation of the Mg2Cu phase during crystallization. A dual-phase glassy-nanocrystalline Mg structure was obtained in Mg65Cu25Yb10 and Mg59.5Cu22.9Yb11Ag6.6 alloys after annealing. Bulk samples with a composite glassy-crystalline structure were obtained in Mg59.5Cu22.9Yb11Ag6.6 and Mg64Cu21Yb9.5Ag5.5 alloys. A thermodynamic database for the Mg–Cu–Yb–Ca–Ag system was created to compare the crystallization behavior of the alloys with polythermal sections of the Mg–Cu–Yb–Ca–Ag phase diagram. Highlights: • New alloy compositions based on the Mg–Cu–Yb system were developed and investigated. • Increasing the Ag and Ca content improves the glass forming ability. • Bulk samples with a composite glassy-crystalline structure were obtained. • A thermodynamic database for the Mg–Cu–Yb–Ca–Ag system was created.
23 CFR 970.204 - Management systems requirements.
Code of Federal Regulations, 2010 CFR
2010-04-01
... the management systems and their associated databases; and (5) A process for data collection, processing, analysis and updating for each management system. (d) All management systems will use databases with a geographical reference system that can be used to geolocate all database information. (e...
23 CFR 970.204 - Management systems requirements.
Code of Federal Regulations, 2012 CFR
2012-04-01
... the management systems and their associated databases; and (5) A process for data collection, processing, analysis and updating for each management system. (d) All management systems will use databases with a geographical reference system that can be used to geolocate all database information. (e...
23 CFR 970.204 - Management systems requirements.
Code of Federal Regulations, 2011 CFR
2011-04-01
... the management systems and their associated databases; and (5) A process for data collection, processing, analysis and updating for each management system. (d) All management systems will use databases with a geographical reference system that can be used to geolocate all database information. (e...
23 CFR 970.204 - Management systems requirements.
Code of Federal Regulations, 2013 CFR
2013-04-01
... the management systems and their associated databases; and (5) A process for data collection, processing, analysis and updating for each management system. (d) All management systems will use databases with a geographical reference system that can be used to geolocate all database information. (e...
Saokaew, Surasak; Sugimoto, Takashi; Kamae, Isao; Pratoomsoot, Chayanin; Chaiyakunapruk, Nathorn
2015-01-01
Background: Health technology assessment (HTA) has been used continuously for value-based healthcare decisions over the last decade. Healthcare databases represent an important source of information for HTA, and their use has surged in Western countries. Although HTA agencies have been established in the Asia-Pacific region, application and understanding of healthcare databases for HTA remain rather limited. We therefore reviewed existing databases to assess their potential for HTA in Thailand, where HTA is used officially, and Japan, where HTA is soon to be officially introduced. Method: Existing healthcare databases in Thailand and Japan were compiled and reviewed. Database characteristics, e.g. name, host, scope/objective, time period/sample size, design, data collection method, population/sample, and variables, were described. Each database was assessed for its potential HTA use in the safety/efficacy/effectiveness, social/ethical, organization/professional, economic, and epidemiological domains. The request route for each database was also provided. Results: Forty databases, 20 from Thailand and 20 from Japan, were included. These comprised national censuses, surveys, registries, administrative data, and claims databases. All databases could potentially be used for epidemiological studies. In addition, data on mortality, morbidity, disability, adverse events, quality of life, service/technology utilization, length of stay, and economics were also found in some databases. However, access to patient-level data was limited, since information about the databases was not available in public sources. Conclusion: Our findings show that existing databases provide valuable information for HTA research, with limitations on accessibility. Mutual dialogue on healthcare database development and usage for HTA within the Asia-Pacific region is needed. PMID:26560127
Tack, Lois C; Thomas, Michelle; Reich, Karl
2007-03-01
Forensic labs globally face the same problem: a growing need to process a greater number and wider variety of samples for DNA analysis. The same forensic lab can be tasked all at once with processing mixed casework samples from crime scenes, convicted-offender samples for database entry, and tissue from tsunami victims for identification. Besides flexibility in the robotic system chosen for forensic automation, there is a need, for each sample type, to develop new methodology that is not only faster but also more reliable than past procedures. FTA is a chemical treatment of paper, unique to Whatman Bioscience, used for the stabilization and storage of biological samples. Here, the authors describe optimization of the Whatman FTA Purification Kit protocol for use with the AmpFlSTR Identifiler PCR Amplification Kit.
Sample Selection for Training Cascade Detectors.
Vállez, Noelia; Deniz, Oscar; Bueno, Gloria
2015-01-01
Automatic detection systems usually require large and representative training datasets in order to obtain good detection and false positive rates. Typically the positive set has few samples, while the negative set must represent anything except the object of interest; the negative set therefore contains orders of magnitude more images than the positive set. However, imbalanced training databases lead to biased classifiers. In this paper, we focus our attention on a negative sample selection method to properly balance the training data for cascade detectors. The method is based on selecting the most informative false positive samples generated in one stage to feed the next stage. The results show that the proposed cascade detector with sample selection obtains, on average, better partial AUC and smaller standard deviation than the other cascade detectors compared.
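As a rough illustration of the selection idea, the sketch below assumes an sklearn-style stage classifier exposing a `decision_function` score; the zero threshold and the top-up with random negatives are assumptions, not the paper's exact procedure.

```python
import numpy as np

def select_informative_negatives(stage_clf, negatives, n_select, seed=0):
    """Keep the false positives the current stage scores highest, topping up
    with random negatives when too few survive, so the next cascade stage
    trains on the hardest (most informative) negatives."""
    rng = np.random.default_rng(seed)
    scores = stage_clf.decision_function(negatives)   # assumed sklearn-style API
    fp = np.flatnonzero(scores > 0)                   # negatives the stage accepts
    if len(fp) >= n_select:
        keep = fp[np.argsort(scores[fp])[::-1][:n_select]]  # hardest first
    else:
        rest = np.setdiff1d(np.arange(len(negatives)), fp)
        extra = rng.choice(rest, size=n_select - len(fp), replace=False)
        keep = np.concatenate([fp, extra])
    return negatives[keep]
```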
NASA Astrophysics Data System (ADS)
Hayrapetyan, David B.; Hovhannisyan, Levon; Mantashyan, Paytsar A.
2013-04-01
The analysis of complex spectra is a pressing problem for modern science. This work is devoted to the creation of a software package that analyzes spectra in different formats, possesses a dynamic knowledge database and a self-learning mechanism, and performs automated analysis of spectral composition based on the knowledge database by applying certain algorithms. Hyper-spherical random search algorithms, gradient algorithms and genetic search algorithms are used as the search systems in the package. Raman and IR spectra of diamond-like carbon (DLC) samples were analyzed with the developed program. After processing the data, the program immediately displays all the calculated parameters of the DLC.
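The search component can be illustrated with a toy greedy random search that fits a spectrum as a sum of Gaussian peaks. This is only a stand-in for the hyper-spherical random search, gradient, and genetic algorithms the package actually combines; every parameter choice here is invented.

```python
import numpy as np

def random_search_fit(x, spectrum, n_peaks=2, iters=5000, step=0.1, seed=0):
    """Toy random-search fit of a spectrum as a sum of Gaussian peaks.
    Parameters per peak: (amplitude, center, width). Perturbs the current
    parameter vector and keeps improvements (greedy accept)."""
    rng = np.random.default_rng(seed)

    def model(p):
        out = np.zeros_like(x, dtype=float)
        for amp, mu, sig in p.reshape(-1, 3):
            out += amp * np.exp(-0.5 * ((x - mu) / max(sig, 1e-9)) ** 2)
        return out

    p = rng.uniform(0.1, 1.0, size=3 * n_peaks)
    err = np.sum((model(p) - spectrum) ** 2)
    for _ in range(iters):
        cand = p + step * rng.standard_normal(p.size)
        cand_err = np.sum((model(cand) - spectrum) ** 2)
        if cand_err < err:                 # keep only improvements
            p, err = cand, cand_err
    return p.reshape(-1, 3), err
```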
Collecting, archiving and processing DNA from wildlife samples using FTA® databasing paper
Smith, LM; Burgoyne, LA
2004-01-01
Background: Methods involving the analysis of nucleic acids have become widespread in the fields of traditional biology and ecology; however, the storage and transport of samples collected in the field to the laboratory in such a manner as to allow purification of intact nucleic acids can prove problematic. Results: FTA® databasing paper is widely used in human forensic analysis for the storage of biological samples and for purification of nucleic acids. The possible uses of FTA® databasing paper in the purification of DNA from samples of wildlife origin were examined, with particular reference to problems expected due to the nature of such samples. The processing of blood and tissue samples, the possibility of excess DNA in blood samples due to nucleated erythrocytes, and the analysis of degraded samples were all examined, as was the question of long-term storage of blood samples on FTA® paper. Examples of the end use of the purified DNA are given for all protocols, and the rationale behind the processing procedures is explained to allow the end user to adjust the protocols as required. Conclusions: FTA® paper is eminently suitable for collection of, and purification of nucleic acids from, biological samples from a wide range of wildlife species. This technology makes the collection and storage of such samples much simpler. PMID:15072582
Applications of Database Machines in Library Systems.
ERIC Educational Resources Information Center
Salmon, Stephen R.
1984-01-01
Characteristics and advantages of database machines are summarized and their applications to library functions are described. The ability to attach multiple hosts to the same database, and flexibility in choosing operating and database management systems for different functions without loss of access to a common database, are noted. (EJS)
Thornber, Carl R.; Sherrod, David R.; Siems, David F.; Heliker, Christina C.; Meeker, Gregory P.; Oscarson, Robert L.; Kauahikaua, James P.
2002-01-01
This report presents major-element geochemical data for glasses and whole-rock aliquots among 523 lava samples collected near the vent on Kilauea's east rift zone between September 1994 and October 2001. Information on sample collection, analysis techniques and analytical standard reproducibility is presented as a PDF file, which also includes a detailed explanation of the categories of sample information presented in the database spreadsheet. The sample database is downloadable as a separate Microsoft Excel file.
The MAR databases: development and implementation of databases specific for marine metagenomics.
Klemetsen, Terje; Raknes, Inge A; Fu, Juan; Agafonov, Alexander; Balasundaram, Sudhagar V; Tartari, Giacomo; Robertsen, Espen; Willassen, Nils P
2018-01-04
We introduce the marine databases MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database of completely sequenced marine prokaryotic genomes, serving as a marine prokaryote reference genome database, MarDB includes all incompletely sequenced marine prokaryotic genomes regardless of their level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields, including attributes for sampling, sequencing, assembly and annotation in addition to organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets visitors browse, filter and search the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
23 CFR 971.204 - Management systems requirements.
Code of Federal Regulations, 2011 CFR
2011-04-01
... maintain the management systems and their associated databases; and (5) A process for data collection, processing, analysis, and updating for each management system. (c) All management systems will use databases with a common or coordinated reference system, that can be used to geolocate all database information...
23 CFR 971.204 - Management systems requirements.
Code of Federal Regulations, 2010 CFR
2010-04-01
... maintain the management systems and their associated databases; and (5) A process for data collection, processing, analysis, and updating for each management system. (c) All management systems will use databases with a common or coordinated reference system, that can be used to geolocate all database information...
23 CFR 971.204 - Management systems requirements.
Code of Federal Regulations, 2012 CFR
2012-04-01
... maintain the management systems and their associated databases; and (5) A process for data collection, processing, analysis, and updating for each management system. (c) All management systems will use databases with a common or coordinated reference system, that can be used to geolocate all database information...
23 CFR 971.204 - Management systems requirements.
Code of Federal Regulations, 2013 CFR
2013-04-01
... maintain the management systems and their associated databases; and (5) A process for data collection, processing, analysis, and updating for each management system. (c) All management systems will use databases with a common or coordinated reference system, that can be used to geolocate all database information...
Geer, Lewis Y; Marchler-Bauer, Aron; Geer, Renata C; Han, Lianyi; He, Jane; He, Siqian; Liu, Chunlei; Shi, Wenyao; Bryant, Stephen H
2010-01-01
The NCBI BioSystems database, found at http://www.ncbi.nlm.nih.gov/biosystems/, centralizes and cross-links existing biological systems databases, increasing their utility and target audience by integrating their pathways and systems into NCBI resources. This integration allows users of NCBI's Entrez databases to quickly categorize proteins, genes and small molecules by metabolic pathway, disease state or other BioSystem type, without requiring time-consuming inference of biological relationships from the literature or multiple experimental datasets.
Microcomputer Database Management Systems for Bibliographic Data.
ERIC Educational Resources Information Center
Pollard, Richard
1986-01-01
Discusses criteria for evaluating microcomputer database management systems (DBMS) used for storage and retrieval of bibliographic data. Two popular types of microcomputer DBMS--file management systems and relational database management systems--are evaluated with respect to these criteria. (Author/MBR)
Accelerating root system phenotyping of seedlings through a computer-assisted processing pipeline.
Dupuy, Lionel X; Wright, Gladys; Thompson, Jacqueline A; Taylor, Anna; Dekeyser, Sebastien; White, Christopher P; Thomas, William T B; Nightingale, Mark; Hammond, John P; Graham, Neil S; Thomas, Catherine L; Broadley, Martin R; White, Philip J
2017-01-01
There are numerous systems and techniques to measure the growth of plant roots. However, phenotyping large numbers of plant roots for breeding and genetic analyses remains challenging. One major difficulty is achieving high throughput and resolution at a reasonable cost per plant sample. Here we describe a cost-effective root phenotyping pipeline, on which we perform time and accuracy benchmarking to identify bottlenecks in such pipelines and strategies for their acceleration. Our root phenotyping pipeline was assembled with custom software and low-cost material and equipment. Results show that sample preparation and handling of samples during screening are the most time-consuming tasks in root phenotyping. Algorithms can be used to speed up the extraction of root traits from image data, but when applied to large numbers of images, there is a trade-off between processing time and the number of errors contained in the database. Scaling up root phenotyping to large numbers of genotypes will require not only automation of sample preparation and handling, but also efficient error-detection algorithms for more reliable replacement of manual interventions.
Performance evaluation of no-reference image quality metrics for face biometric images
NASA Astrophysics Data System (ADS)
Liu, Xinwei; Pedersen, Marius; Charrier, Christophe; Bours, Patrick
2018-03-01
The accuracy of face recognition systems is significantly affected by the quality of face sample images. Recently established standardization efforts have proposed several important aspects for the assessment of face sample quality. There are many existing no-reference image quality metrics (IQMs) that can assess natural image quality by taking into account image-based quality attributes similar to those introduced in the standardization. However, whether such metrics can assess face sample quality is rarely considered. We evaluate the performance of 13 selected no-reference IQMs on face biometrics. The experimental results show that several of them can assess face sample quality in line with system performance. We also analyze the strengths and weaknesses of the different IQMs, as well as why some of them fail to assess face sample quality. Retraining an original IQM on a face database can improve the performance of such a metric. In addition, the contribution of this paper can be used for the evaluation of IQMs on other biometric modalities and for the development of multimodal biometric IQMs.
Large Aperture Systems: 2000-2004
NASA Technical Reports Server (NTRS)
2004-01-01
This custom bibliography from the NASA Scientific and Technical Information Program lists a sampling of records found in the NASA Aeronautics and Space Database. The scope of this topic includes technologies for next generation astronomical telescopes and detectors. This area of focus is one of the enabling technologies as defined by NASA's Report of the President's Commission on Implementation of United States Space Exploration Policy, published in June 2004.
High Bandwidth Communications: 2000-2004
NASA Technical Reports Server (NTRS)
2004-01-01
This custom bibliography from the NASA Scientific and Technical Information Program lists a sampling of records found in the NASA Aeronautics and Space Database. The scope of this topic includes optical and high-frequency microwave systems to enhance data transmission rates. This area of focus is one of the enabling technologies as defined by NASA's Report of the President's Commission on Implementation of United States Space Exploration Policy, published in June 2004.
An Analysis of U.S. Army Health Hazard Assessments During the Acquisition of Military Materiel
2010-06-03
protective equipment (PPE) (Milz, Conrad, & Soule, 2003). Engineering controls can eliminate hazards through system design, substitution of hazardous... Engineering control measures can serve to minimize hazards where they cannot be eliminated, with preference for... during the materiel acquisitions process, and (c) will evaluate a sample of the database for accuracy by comparing the data entries to original reports
NASA Astrophysics Data System (ADS)
Nakagawa, Y.; Kawahara, S.; Araki, F.; Matsuoka, D.; Ishikawa, Y.; Fujita, M.; Sugimoto, S.; Okada, Y.; Kawazoe, S.; Watanabe, S.; Ishii, M.; Mizuta, R.; Murata, A.; Kawase, H.
2017-12-01
Analyses of large ensemble data are quite useful for producing probabilistic projections of climate change. Ensemble data of "+2K future climate simulations" are currently produced by the Japanese national project "Social Implementation Program on Climate Change Adaptation Technology (SI-CAT)" as part of the database for Policy Decision making for Future climate change (d4PDF; Mizuta et al. 2016) produced by the Program for Risk Information on Climate Change. Those data consist of global warming simulations and regional downscaling simulations. Considering that those data volumes are too large (a few petabytes) to download to a user's local computer, a user-friendly system is required to search and download the data that satisfy users' requests. Under SI-CAT, we are developing a database system for near-future climate change projections that provides functions for users to find the data they need. The system mainly consists of a relational database, a data download function, and a user interface. The relational database, using PostgreSQL, is the key component among them. Temporally and spatially compressed data are registered in the relational database. As a first step, we have developed the relational database for precipitation, temperature, and typhoon track data according to requests by SI-CAT members. The data download function, using the Open-source Project for a Network Data Access Protocol (OPeNDAP), provides a function to download temporally and spatially extracted data based on search results obtained from the relational database. We have also developed a web-based user interface for the relational database and the data download function. A prototype of the system is currently in operational testing on our local server. The system will be released on the Data Integration and Analysis System Program (DIAS) in fiscal year 2017. The techniques behind this database system might also be quite useful for simulation and observational data in other research fields. We report the current status of development and some case studies of the system.
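A sketch of how a client might query such a relational index of compressed summaries before fetching full data through the download service; the table `precip_summary` and its columns are hypothetical, since the abstract does not describe the schema.

```python
import psycopg2  # PostgreSQL client assumed available

def find_precipitation_runs(conn_info, bbox, t0, t1, threshold_mm):
    """Search a hypothetical 'precip_summary' table of spatially and
    temporally aggregated ensemble statistics, returning the dataset URLs
    a client would then fetch in full via the download service."""
    query = """
        SELECT run_id, dataset_url, max_precip_mm
        FROM precip_summary
        WHERE lon BETWEEN %s AND %s
          AND lat BETWEEN %s AND %s
          AND valid_time BETWEEN %s AND %s
          AND max_precip_mm >= %s
        ORDER BY max_precip_mm DESC;
    """
    with psycopg2.connect(**conn_info) as conn, conn.cursor() as cur:
        cur.execute(query, (*bbox, t0, t1, threshold_mm))
        return cur.fetchall()
```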
The Golosiiv on-line plate archive database, management and maintenance
NASA Astrophysics Data System (ADS)
Pakuliak, L.; Sergeeva, T.
2007-08-01
We intend to create an online version of the database of the MAO NASU plate archive as VO-compatible structures, in accordance with principles developed by the International Virtual Observatory Alliance, in order to make it available to the world astronomical community. The online version of the log-book database is built with MySQL and PHP. The data management system provides a user interface that supports detailed, traditional form-based radial searches of plates, auxiliary samplings, and listings of each collection, and permits browsing of the detailed descriptions of the collections. The administrative tool allows the database administrator to correct data, to add new data sets, and to control the integrity and consistency of the database as a whole. The VO-compatible database is being constructed according to the demands and principles of international data archives, and it must be strongly generalized in order to allow data mining through standard interfaces and to best fit the demands of the WFPDB Group for databases of plate catalogues. Ongoing enhancements of the database toward the WFPDB bring the problem of data verification to the forefront, as it demands a high degree of data reliability. The process of data verification is practically endless and inseparable from data management, owing to the diverse nature of data errors, which implies a variety of strategies for their identification and correction. The current status of the MAO NASU glass archive forces activity in both directions simultaneously: enhancement of the log-book database with new sets of observational data, as well as creation of the generalized database and cross-identification between them. The VO-compatible version of the database is being supplied with digitized data from plates scanned with a MicroTek ScanMaker 9800 XL TMA. Scanning is not exhaustive but is conducted selectively within the framework of special projects.
Systematic plan of building Web geographic information system based on ActiveX control
NASA Astrophysics Data System (ADS)
Zhang, Xia; Li, Deren; Zhu, Xinyan; Chen, Nengcheng
2003-03-01
A systematic plan for building a Web Geographic Information System (WebGIS) using ActiveX technology is proposed in this paper. In the proposed plan, ActiveX control technology is adopted for building the client-side application, and two different schemas are introduced to implement communication between the controls in the user's browser and the middle application server. One is based on the Distributed Component Object Model (DCOM), the other on sockets. In the former schema, the middle service application is developed as a DCOM object that communicates with the ActiveX control through Object Remote Procedure Call (ORPC) and accesses data in the GIS data server through Open Database Connectivity (ODBC). In the latter, the middle service application is developed in Java; it communicates with the ActiveX control through TCP/IP sockets and accesses data in the GIS data server through Java Database Connectivity (JDBC). The first schema is usually developed in C/C++, and it is difficult to develop and deploy. The second is relatively easy to develop, but its data transfer performance depends on Web bandwidth. A sample application was developed using the latter schema, and its performance proved somewhat better than that of some other WebGIS applications.
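The socket-based schema can be sketched in a few lines. The JSON, newline-delimited request format below is an invented placeholder, and the paper's actual client is an ActiveX control talking to a Java middle tier, so this Python fragment only illustrates the message flow between client and middle server.

```python
import json
import socket

def request_features(host, port, layer, bbox):
    """Send one JSON query over TCP to a (hypothetical) middle-tier server and
    read back one line of JSON with the rows it fetched from the GIS data
    server; protocol details are illustrative only."""
    payload = json.dumps({"layer": layer, "bbox": bbox}).encode("utf-8")
    with socket.create_connection((host, port)) as sock:
        sock.sendall(payload + b"\n")          # newline-delimited request
        data = sock.makefile("rb").readline()  # one-line JSON response
    return json.loads(data)
```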
An automated system for terrain database construction
NASA Technical Reports Server (NTRS)
Johnson, L. F.; Fretz, R. K.; Logan, T. L.; Bryant, N. A.
1987-01-01
An automated Terrain Database Preparation System (TDPS) for the construction and editing of terrain databases used in computerized wargaming simulation exercises has been developed. The TDPS system operates under the TAE executive, and it integrates VICAR/IBIS image processing and Geographic Information System software with CAD/CAM data capture and editing capabilities. The terrain database includes such features as roads, rivers, vegetation, and terrain roughness.
Dwyer, Johanna T.; Picciano, Mary Frances; Betz, Joseph M.; Fisher, Kenneth D.; Saldanha, Leila G.; Yetley, Elizabeth A.; Coates, Paul M.; Radimer, Kathy; Bindewald, Bernadette; Sharpless, Katherine E.; Holden, Joanne; Andrews, Karen; Zhao, Cuiwei; Harnly, James; Wolf, Wayne R.; Perry, Charles R.
2013-01-01
Several activities of the Office of Dietary Supplements (ODS) at the National Institutes of Health involve enhancement of dietary supplement databases. These include an initiative with the US Department of Agriculture to develop an analytically substantiated dietary supplement ingredient database (DSID) and collaboration with the National Center for Health Statistics to enhance the dietary supplement label database in the National Health and Nutrition Examination Survey (NHANES). The many challenges in developing an analytically supported DSID include categorizing product types in the database, identifying nutrients and other components of public health interest in these products, and prioritizing which will be entered in the database first. Additional tasks include developing methods and reference materials for quantifying the constituents, finding qualified laboratories to measure the constituents, developing appropriate sample handling procedures, and finally developing representative sampling plans. Developing the NHANES dietary supplement label database has other challenges, such as collecting information on dietary supplement use from NHANES respondents, constantly updating and refining the information obtained, developing default values that can be used if the respondent cannot supply the exact supplement or strength that was consumed, and developing a publicly available label database. Federal partners and the research community are assisting in making an analytically supported dietary supplement database a reality. PMID:25309034
Smith, Steven M.; Neilson, Ryan T.; Giles, Stuart A.
2015-01-01
Government-sponsored, national-scale, soil and sediment geochemical databases are used to estimate regional and local background concentrations for environmental issues, identify possible anthropogenic contamination, estimate mineral endowment, explore for new mineral deposits, evaluate nutrient levels for agriculture, and establish concentration relationships with human or animal health. Because of these different uses, it is difficult for any single database to accommodate all the needs of each client. Smith et al. (2013, p. 168) reviewed six national-scale soil and sediment geochemical databases for the United States (U.S.) and, for each, evaluated "its appropriateness as a national-scale geochemical database and its usefulness for national-scale geochemical mapping." Each of the evaluated databases has strengths and weaknesses that were listed in that review. Two of these U.S. national-scale geochemical databases are similar in their sample media and collection protocols but have different strengths—primarily sampling density and analytical consistency. This project was implemented to determine whether those databases could be merged to produce a combined dataset that could be used for mineral resource assessments. The utility of the merged database was tested to see whether mapped distributions could identify metalliferous black shales at a national scale.
77 FR 24925 - Privacy Act of 1974; System of Records
Federal Register 2010, 2011, 2012, 2013, 2014
2012-04-26
... CES Personnel Information System database of NIFA. This database is updated annually from data provided by 1862 and 1890 land-grant universities. This database is maintained by the Agricultural Research... reviewer. NIFA maintains a database of potential reviewers. Information in the database is used to match...
A computer case definition for sudden cardiac death.
Chung, Cecilia P; Murray, Katherine T; Stein, C Michael; Hall, Kathi; Ray, Wayne A
2010-06-01
To facilitate studies of medications and sudden cardiac death, we developed and validated a computer case definition for these deaths. The study of community-dwelling Tennessee Medicaid enrollees 30-74 years of age utilized a linked database with Medicaid inpatient/outpatient files, state death certificate files, and a state 'all-payers' hospital discharge file. The computerized case definition was developed from a retrospective cohort study of sudden cardiac deaths occurring between 1990 and 1993. Medical records for 926 potential cases had been adjudicated for this study to determine if they met the clinical definition for sudden cardiac death occurring in the community and were likely to be due to ventricular tachyarrhythmias. The computerized case definition included deaths with (1) no evidence of a terminal hospital admission/nursing home stay in any of the data sources; (2) an underlying cause of death code consistent with sudden cardiac death; and (3) no terminal procedures inconsistent with unresuscitated cardiac arrest. This definition was validated in an independent sample of 174 adjudicated deaths occurring between 1994 and 2005. The positive predictive value of the computer case definition was 86.0% in the development sample and 86.8% in the validation sample. The positive predictive value did not vary materially for deaths coded according to the ICD-9 (1994-1998, positive predictive value = 85.1%) or ICD-10 (1999-2005, 87.4%) systems. A computerized Medicaid database, linked with death certificate files and a state hospital discharge database, can be used for a computer case definition of sudden cardiac death. Copyright (c) 2009 John Wiley & Sons, Ltd.
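The three-part definition translates naturally into a filter over linked records. The sketch below is hypothetical: the field names and the code/procedure lists are placeholders, not the study's actual code sets.

```python
# Placeholder code lists; the study's real ICD code sets are not reproduced here.
SUDDEN_CARDIAC_CODES = {"I46.1", "I49.0", "R96"}
EXCLUDED_PROCEDURES = {"heart_transplant", "prolonged_resuscitation"}

def is_sudden_cardiac_death(record):
    """Apply the three criteria of the computer case definition to one linked
    mortality record (a dict with illustrative field names)."""
    no_terminal_stay = not (record.get("terminal_hospital_admission")
                            or record.get("terminal_nursing_home_stay"))
    cause_consistent = record.get("underlying_cause") in SUDDEN_CARDIAC_CODES
    no_excluded_proc = not (set(record.get("terminal_procedures", []))
                            & EXCLUDED_PROCEDURES)
    return no_terminal_stay and cause_consistent and no_excluded_proc
```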
Template protection and its implementation in 3D face recognition systems
NASA Astrophysics Data System (ADS)
Zhou, Xuebing
2007-04-01
As biometric recognition systems are widely applied in various application areas, security and privacy risks have recently attracted the attention of the biometric community. Template protection techniques prevent stored reference data from revealing private biometric information and enhance the security of biometric systems against attacks such as identity theft and cross matching. This paper concentrates on a template protection algorithm that merges methods from cryptography, error correction coding and biometrics. The key component of the algorithm is to convert biometric templates into binary vectors. It is shown that the binary vectors should be robust, uniformly distributed, statistically independent and collision-free so that authentication performance can be optimized and information leakage can be avoided. Depending on the statistical character of the biometric template, different approaches for transforming biometric templates into compact binary vectors are presented. The proposed methods are integrated into a 3D face recognition system and tested on the 3D facial images of the FRGC database. It is shown that the resulting binary vectors provide an authentication performance similar to that of the original 3D face templates. A high security level is achieved with reasonable false acceptance and false rejection rates, based on an efficient statistical analysis. The algorithm estimates the statistical character of biometric templates from a number of biometric samples in the enrollment database. For the FRGC 3D face database, our tests show only a small difference in robustness and discriminative power between the classification results obtained under the assumption of uniformly distributed templates and those obtained under the assumption of Gaussian distributed templates.
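One common way to obtain near-uniformly distributed bits is median thresholding over the enrollment set; the sketch below shows only that baseline, not the paper's full transform (which additionally addresses robustness and bit independence). All names are illustrative.

```python
import numpy as np

def binarize_templates(templates):
    """Threshold each feature at its median across the enrollment set
    (templates: (n_subjects, d) real-valued array), which pushes every bit
    toward a balanced 0/1 distribution."""
    thresholds = np.median(templates, axis=0)   # per-feature threshold
    return (templates > thresholds).astype(np.uint8), thresholds

# Enrollment estimates the thresholds; verification reuses them on a probe
# template, after which binary vectors are compared via Hamming distance:
# bits, thr = binarize_templates(enrolled)
# probe_bits = (probe > thr).astype(np.uint8)
```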
The Danish Testicular Cancer database.
Daugaard, Gedske; Kier, Maria Gry Gundgaard; Bandak, Mikkel; Mortensen, Mette Saksø; Larsson, Heidi; Søgaard, Mette; Toft, Birgitte Groenkaer; Engvad, Birte; Agerbæk, Mads; Holm, Niels Vilstrup; Lauritsen, Jakob
2016-01-01
The nationwide Danish Testicular Cancer database consists of a retrospective research database (DaTeCa database) and a prospective clinical database (Danish Multidisciplinary Cancer Group [DMCG] DaTeCa database). The aim is to improve the quality of care for patients with testicular cancer (TC) in Denmark, for example by identifying risk factors for relapse and treatment-related toxicity, and by focusing on late effects. All Danish male patients with a histologically verified germ cell cancer diagnosis in the Danish Pathology Registry are included in the DaTeCa databases. Data collection has been performed from 1984 to 2007 and from 2013 onward, respectively. The retrospective DaTeCa database contains detailed information, with more than 300 variables related to histology, stage, treatment, relapses, pathology, tumor markers, kidney function, lung function, etc. A questionnaire related to late effects has been conducted, which includes questions regarding social relationships, life situation, general health status, family background, diseases, symptoms, use of medication, marital status, psychosocial issues, fertility, and sexuality. TC survivors alive in October 2014 were invited to fill in this questionnaire, which includes 160 validated questions. Collection of questionnaires is still ongoing. A biobank including blood/sputum samples for future genetic analyses has been established. Samples related to both the DaTeCa and the DMCG DaTeCa database are included. The prospective DMCG DaTeCa database includes variables regarding histology, stage, prognostic group, and treatment. The DMCG DaTeCa database has existed since 2013 and is a young clinical database. It is necessary to extend the data collection in the prospective database in order to answer quality-related questions. Data from the retrospective database will be added to the prospective data, resulting in a large and very comprehensive database for future studies on TC patients.
COREMIC: a web-tool to search for a niche associated CORE MICrobiome.
Rodrigues, Richard R; Rodgers, Nyle C; Wu, Xiaowei; Williams, Mark A
2018-01-01
Microbial diversity on earth is extraordinary, and soils alone harbor thousands of species per gram of soil. Understanding how this diversity is sorted and selected into habitat niches is a major focus of ecology and biotechnology, but remains only vaguely understood. A systems-biology approach was used to mine information from databases to show how it can be used to answer questions related to the core microbiome of habitat-microbe relationships. By making use of the burgeoning growth of information from databases, our tool "COREMIC" meets a great need in the search for understanding niche partitioning and habitat-function relationships. The work is unique, furthermore, because it provides a user-friendly, statistically robust web-tool (http://coremic2.appspot.com or http://core-mic.com), developed using Google App Engine, to help in the process of database mining to identify the "core microbiome" associated with a given habitat. A case study is presented using data from 31 switchgrass rhizosphere community habitats across a diverse set of soil and sampling environments. The methodology utilizes an outgroup of 28 non-switchgrass (other grasses and forbs) to identify a core switchgrass microbiome. Even across a diverse set of soils (five environments) and under conservative statistical criteria (presence in more than 90% of samples and FDR q-val <0.05% for Fisher's exact test), a core set of bacteria associated with switchgrass was observed. These included, among others, closely related taxa from Lysobacter spp., Mesorhizobium spp., and Chitinophagaceae. These bacteria have been shown to have functions related to the production of bacterial and fungal antibiotics and plant growth promotion. COREMIC can be used as a hypothesis-generating or confirmatory tool that shows great potential for identifying taxa that may be important to the functioning of a habitat (e.g. host plant). The case study, in conclusion, shows that COREMIC can identify key habitat-specific microbes across diverse samples, using currently available databases and a unique freely available software.
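The core filter described above (presence in more than 90% of habitat samples plus an FDR-corrected Fisher's exact test against the outgroup) can be sketched as follows; scipy and statsmodels are assumed available, and the input layout is illustrative rather than COREMIC's actual implementation.

```python
import numpy as np
from scipy.stats import fisher_exact
from statsmodels.stats.multitest import multipletests

def core_microbiome(in_group, out_group, min_presence=0.9, q_cut=0.05):
    """in_group/out_group: (samples, taxa) boolean presence/absence arrays.
    Returns indices of taxa present in > min_presence of habitat samples whose
    in- vs out-group presence differs significantly after FDR correction."""
    presence = in_group.mean(axis=0)
    pvals = np.array([
        fisher_exact([[in_group[:, j].sum(), (~in_group[:, j]).sum()],
                      [out_group[:, j].sum(), (~out_group[:, j]).sum()]])[1]
        for j in range(in_group.shape[1])
    ])
    reject, qvals, *_ = multipletests(pvals, alpha=q_cut, method="fdr_bh")
    return np.flatnonzero((presence > min_presence) & reject)
```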
NASA Astrophysics Data System (ADS)
Koppers, A. A.; Minnett, R. C.; Tauxe, L.; Constable, C.; Donadini, F.
2008-12-01
The Magnetics Information Consortium (MagIC) is commissioned to implement and maintain an online portal to a relational database populated by rock and paleomagnetic data. The goal of MagIC is to archive all measurements and derived properties for studies of paleomagnetic directions (inclination, declination) and intensities, and for rock magnetic experiments (hysteresis, remanence, susceptibility, anisotropy). Organizing data for presentation in peer-reviewed publications or for ingestion into databases is a time-consuming task, and to facilitate these activities, three tightly integrated tools have been developed: MagIC-PY, the MagIC Console Software, and the MagIC Online Database. A suite of Python scripts is available to help users port their data into the MagIC data format. They allow the user to add important metadata, perform basic interpretations, and average results at the specimen, sample and site levels. These scripts have been validated for use as Open Source software under the UNIX, Linux, PC and Macintosh© operating systems. We have also developed the MagIC Console Software program to assist in collating rock and paleomagnetic data for upload to the MagIC database. The program runs in Microsoft Excel© on both Macintosh© computers and PCs. It performs routine consistency checks on data entries, and assists users in preparing data for uploading into the online MagIC database. The MagIC website is hosted under EarthRef.org at http://earthref.org/MAGIC/ and has two search nodes, one for paleomagnetism and one for rock magnetism. Both nodes provide query building based on location, reference, methods applied, material type and geological age, as well as a visual FlashMap interface to browse and select locations. Users can also browse the database by data type (inclination, intensity, VGP, hysteresis, susceptibility) or by data compilation to view all contributions associated with previous databases, such as PINT, GMPDB or TAFI or other user-defined compilations. Query results are displayed in a digestible tabular format allowing the user to descend from locations to sites, samples, specimens and measurements. At each stage, the result set can be saved and, when supported by the data, can be visualized by plotting global location maps, equal area, XY, age, and depth plots, or typical Zijderveld, hysteresis, magnetization and remanence diagrams.
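Averaging directional results at the site level, one of the tasks the scripts handle, reduces to a unit-vector mean (the core of a Fisher mean direction). The helper below is a generic sketch of that computation, not the MagIC-PY code itself.

```python
import numpy as np

def mean_direction(decs, incs):
    """Average paleomagnetic directions by summing unit vectors.
    decs/incs are arrays in degrees; returns (dec, inc) of the mean."""
    d, i = np.radians(decs), np.radians(incs)
    x = (np.cos(i) * np.cos(d)).sum()
    y = (np.cos(i) * np.sin(d)).sum()
    z = np.sin(i).sum()
    r = np.sqrt(x**2 + y**2 + z**2)          # resultant vector length
    dec = np.degrees(np.arctan2(y, x)) % 360
    inc = np.degrees(np.arcsin(z / r))
    return dec, inc
```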
Functional integration of automated system databases by means of artificial intelligence
NASA Astrophysics Data System (ADS)
Dubovoi, Volodymyr M.; Nikitenko, Olena D.; Kalimoldayev, Maksat; Kotyra, Andrzej; Gromaszek, Konrad; Iskakova, Aigul
2017-08-01
The paper presents approaches for the functional integration of automated system databases by means of artificial intelligence. The peculiarities of exploiting databases in systems that use fuzzy implementations of functions were analyzed, and requirements for the normalization of such databases were defined. The question of data equivalence under conditions of uncertainty, and of collisions arising in the presence of functional integration of databases, is considered, and a model to reveal their possible occurrence is devised. The paper also presents a method for evaluating the normalization of integrated databases.
Software Engineering Laboratory (SEL) database organization and user's guide, revision 2
NASA Technical Reports Server (NTRS)
Morusiewicz, Linda; Bristow, John
1992-01-01
The organization of the Software Engineering Laboratory (SEL) database is presented. Included are definitions and detailed descriptions of the database tables and views, the SEL data, and system support data. The mapping from the SEL and system support data to the base table is described. In addition, techniques for accessing the database through the Database Access Manager for the SEL (DAMSEL) system and via the ORACLE structured query language (SQL) are discussed.
Software Engineering Laboratory (SEL) database organization and user's guide
NASA Technical Reports Server (NTRS)
So, Maria; Heller, Gerard; Steinberg, Sandra; Spiegel, Douglas
1989-01-01
The organization of the Software Engineering Laboratory (SEL) database is presented. Included are definitions and detailed descriptions of the database tables and views, the SEL data, and system support data. The mapping from the SEL and system support data to the base tables is described. In addition, techniques for accessing the database, through the Database Access Manager for the SEL (DAMSEL) system and via the ORACLE structured query language (SQL), are discussed.
William, David J; Rybicki, Nancy B; Lombana, Alfonso V; O'Brien, Tim M; Gomez, Richard B
2003-01-01
The use of airborne hyperspectral remote sensing imagery for automated mapping of submerged aquatic vegetation (SAV) in the tidal Potomac River was investigated for near-real-time resource assessment and monitoring. Airborne hyperspectral imagery and field spectrometer measurements were obtained in October 2000. A spectral library database containing selected ground-based and airborne sensor spectra was developed for use in image processing. The spectral library is used to automate the processing of hyperspectral imagery for potential real-time material identification and mapping. Field-based spectra were compared to the airborne imagery using the database to identify and map two species of SAV (Myriophyllum spicatum and Vallisneria americana). Overall accuracy of the vegetation maps derived from hyperspectral imagery was determined by comparison to a product that combined aerial photography and field-based sampling at the end of the SAV growing season. The algorithms and databases developed in this study will be useful with current and forthcoming space-based hyperspectral remote sensing systems.
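The abstract does not state which algorithm matches image spectra against the library; the spectral angle mapper below is one common choice, shown as a generic sketch with invented parameter values.

```python
import numpy as np

def spectral_angle(pixel, reference):
    """Angle (radians) between a pixel spectrum and a library spectrum;
    smaller angles indicate closer material matches."""
    cos = pixel @ reference / (np.linalg.norm(pixel) * np.linalg.norm(reference))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def classify(cube, library, names, max_angle=0.1):
    """Label each pixel of an (H, W, B) cube with the library entry of
    smallest spectral angle, or None when no angle is below max_angle."""
    h, w, _ = cube.shape
    out = np.empty((h, w), dtype=object)
    for r in range(h):
        for c in range(w):
            angles = [spectral_angle(cube[r, c], s) for s in library]
            j = int(np.argmin(angles))
            out[r, c] = names[j] if angles[j] <= max_angle else None
    return out
```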
Achieving Integration in Mixed Methods Designs—Principles and Practices
Fetters, Michael D; Curry, Leslie A; Creswell, John W
2013-01-01
Mixed methods research offers powerful tools for investigating complex processes and systems in health and health care. This article describes integration principles and practices at three levels in mixed methods research and provides illustrative examples. Integration at the study design level occurs through three basic mixed method designs—exploratory sequential, explanatory sequential, and convergent—and through four advanced frameworks—multistage, intervention, case study, and participatory. Integration at the methods level occurs through four approaches. In connecting, one database links to the other through sampling. With building, one database informs the data collection approach of the other. When merging, the two databases are brought together for analysis. With embedding, data collection and analysis link at multiple points. Integration at the interpretation and reporting level occurs through narrative, data transformation, and joint display. The fit of integration describes the extent the qualitative and quantitative findings cohere. Understanding these principles and practices of integration can help health services researchers leverage the strengths of mixed methods. PMID:24279835
Sivakumaran, Subathira; Huffman, Lee; Sivakumaran, Sivalingam
2018-01-01
A country-specific food composition database is useful for assessing nutrient intake reliably in national nutrition surveys, research studies and clinical practice. The New Zealand Food Composition Database (NZFCDB) programme seeks to maintain relevant and up-to-date food records that reflect the composition of foods commonly consumed in New Zealand, following Food and Agriculture Organization of the United Nations/International Network of Food Data Systems (FAO/INFOODS) guidelines. Food composition data (FCD) covering up to 87 core components for approximately 600 foods have been added to NZFCDB since 2010. These foods include those identified as providing key nutrients in the 2008/09 New Zealand Adult Nutrition Survey. Nutrient data are obtained by analysis of composite samples or are calculated from analytical data. Currently >2500 foods in 22 food groups are freely available in various NZFCDB output products on the website: www.foodcomposition.co.nz. NZFCDB is the main source of FCD for estimating nutrient intake in New Zealand nutrition surveys. Copyright © 2016 Elsevier Ltd. All rights reserved.
The customization of APACHE II for patients receiving orthotopic liver transplants
Moreno, Rui
2002-01-01
General outcome prediction models developed for use with large, multicenter databases of critically ill patients may not correctly estimate mortality if applied to a particular group of patients that was under-represented in the original database. The development of new diagnostic weights has been proposed as a method of adapting the general model – the Acute Physiology and Chronic Health Evaluation (APACHE) II in this case – to a new group of patients. Such customization must be empirically tested, because the original model cannot contain an appropriate set of predictive variables for the particular group. In this issue of Critical Care, Arabi and co-workers present the results of the validation of a modified model of the APACHE II system for patients receiving orthotopic liver transplants. The use of a highly heterogeneous database for which not all important variables were taken into account and of a sample too small to use the Hosmer–Lemeshow goodness-of-fit test appropriately makes their conclusions uncertain. PMID:12133174
National Databases for Neurosurgical Outcomes Research: Options, Strengths, and Limitations.
Karhade, Aditya V; Larsen, Alexandra M G; Cote, David J; Dubois, Heloise M; Smith, Timothy R
2017-08-05
Quality improvement, value-based care delivery, and personalized patient care depend on robust clinical, financial, and demographic data streams of neurosurgical outcomes. The neurosurgical literature lacks a comprehensive review of large national databases. To assess the strengths and limitations of various resources for outcomes research in neurosurgery. A review of the literature was conducted to identify surgical outcomes studies using national data sets. The databases were assessed for the availability of patient demographics and clinical variables, longitudinal follow-up of patients, strengths, and limitations. The number of unique patients contained within each data set ranged from thousands (Quality Outcomes Database [QOD]) to hundreds of millions (MarketScan). Databases with both clinical and financial data included PearlDiver, Premier Healthcare Database, Vizient Clinical Data Base and Resource Manager, and the National Inpatient Sample. Outcomes collected by databases included patient-reported outcomes (QOD); 30-day morbidity, readmissions, and reoperations (National Surgical Quality Improvement Program); and disease incidence and disease-specific survival (Surveillance, Epidemiology, and End Results-Medicare). The strengths of large databases included large numbers of rare pathologies and multi-institutional nationally representative sampling; the limitations of these databases included variable data veracity, variable data completeness, and missing disease-specific variables. The improvement of existing large national databases and the establishment of new registries will be crucial to the future of neurosurgical outcomes research. Copyright © 2017 by the Congress of Neurological Surgeons
MAGA, a new database of gas natural emissions: a collaborative web environment for collecting data.
NASA Astrophysics Data System (ADS)
Cardellini, Carlo; Chiodini, Giovanni; Frigeri, Alessandro; Bagnato, Emanuela; Frondini, Francesco; Aiuppa, Alessandro
2014-05-01
The data on volcanic and non-volcanic gas emissions available online are, as of today, incomplete and, most importantly, fragmentary. Hence, there is a need for common frameworks to aggregate the available data in order to characterize and quantify the phenomena at various scales. A new and detailed web database (MAGA: MApping GAs emissions) has been developed, and recently improved, to collect data on carbon degassing from volcanic and non-volcanic environments. The MAGA database allows researchers to insert data interactively and dynamically into a spatially referenced relational database management system, as well as to extract data. MAGA kicked off with the database setup and the ingestion of data from: i) a literature survey of publications on volcanic gas fluxes, including data on active crater degassing, diffuse soil degassing and fumaroles both from dormant closed-conduit volcanoes (e.g., Vulcano, Phlegrean Fields, Santorini, Nysiros, Teide, etc.) and open-vent volcanoes (e.g., Etna, Stromboli, etc.) in the Mediterranean area and the Azores, and ii) the revision and update of the Googas database on non-volcanic emissions of the Italian territory (Chiodini et al., 2008), in the framework of the Deep Earth Carbon Degassing (DECADE) research initiative of the Deep Carbon Observatory (DCO). For each geo-located gas emission site, the database holds images and descriptions of the site and of the emission type (e.g., diffuse emission, plume, fumarole, etc.), gas chemical-isotopic composition (when available), gas temperature and gas flux magnitude. Gas sampling, analysis and flux measurement methods are also reported, together with references and contacts to researchers expert on each site. In this phase, data can be accessed over the network through a web interface; data-driven web services, through which software clients can request data directly from the database, are planned to be implemented shortly. This way, Geographical Information Systems (GIS) and Virtual Globes (e.g., Google Earth) could easily access the database, and data could be exchanged with other databases. At the moment the database includes: i) more than 1000 flux data on volcanic plume degassing from the Etna and Stromboli volcanoes; ii) data from ~30 sites of diffuse soil degassing from the Neapolitan volcanoes, Azores, Canary Islands, Etna, Stromboli, and Vulcano Island, plus several data on fumarolic emissions (~7 sites) with CO2 fluxes; and iii) data from ~270 non-volcanic gas emission sites in Italy. We believe the MAGA database is an important starting point for developing a large-scale, expandable database aimed to excite, inspire, and encourage participation among researchers. In addition, the possibility of archiving location and qualitative information for gas emission sites not yet investigated could stimulate the scientific community toward future research and will provide an indication of the current uncertainty in global estimates of deep carbon fluxes.
Case-based fracture image retrieval.
Zhou, Xin; Stern, Richard; Müller, Henning
2012-05-01
Case-based fracture image retrieval can assist surgeons in decisions regarding new cases by supplying visually similar past cases. This tool may guide fracture fixation and management through comparison of long-term outcomes in similar cases. A fracture image database collected over 10 years at the orthopedic service of the University Hospitals of Geneva was used. This database contains 2,690 fracture cases associated with 43 classes (based on the AO/OTA classification). A case-based retrieval engine was developed and evaluated using retrieval precision as a performance metric. Only cases in the same class as the query case are considered relevant. The scale-invariant feature transform (SIFT) is used for image analysis. Performance evaluation was computed in terms of mean average precision (MAP) and early precision (P10, P30). Retrieval results produced with the GNU image finding tool (GIFT) were used as a baseline. Two sampling strategies were evaluated: one used dense sampling on a 40 × 40 grid, and the second used the standard SIFT detector. Based on dense grid sampling, three unsupervised feature selection strategies were introduced to further improve retrieval performance. With dense grid sampling, the image is divided into 1,600 (40 × 40) square blocks. The goal is to emphasize the salient regions (blocks) and ignore irrelevant regions. Regions are considered important when a high variance of the visual features is found. The first strategy is to calculate the variance of all descriptors over the global database. The second strategy is to calculate the variance of all descriptors for each case. A third strategy is to perform a thumbnail image clustering in a first step and then to calculate the variance for each cluster. Finally, a fusion between a SIFT-based system and GIFT is performed. A first comparison of the sampling strategies shows that dense grid sampling (MAP = 0.18) outperformed the SIFT detector-based sampling approach (MAP = 0.10). In a second step, the three unsupervised feature selection strategies were evaluated. A grid parameter search is applied to optimize parameters for feature selection and clustering. Results show that using half of the regions (700 or 800) obtains the best performance for all three strategies. Increasing the number of clusters in clustering can also improve retrieval performance. The SIFT descriptor variance in each case gave the best indication of saliency for the regions (MAP = 0.23), better than the other two strategies (MAP = 0.20 and 0.21). Combining GIFT (MAP = 0.23) and the best SIFT strategy (MAP = 0.23) produced significantly better results (MAP = 0.27) than each system alone. A case-based fracture retrieval engine was developed and is available for online demonstration. SIFT is used to extract local features, and three feature selection strategies were introduced and evaluated. A baseline using the GIFT system was used to evaluate the salient point-based approaches. Without supervised learning, SIFT-based systems with optimized parameters slightly outperformed the GIFT system. A fusion of the two approaches shows that the information contained in the two approaches is complementary. Supervised learning on the feature space is foreseen as the next step of this study.
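The per-case variance strategy can be sketched as follows. This toy Python example divides an image into a 40 × 40 block grid and keeps the half of the blocks whose descriptors vary most; the two-number block descriptor is a stand-in for the SIFT descriptors the study actually uses.

```python
import numpy as np

def block_descriptors(img, grid=40):
    # Divide the image into a grid x grid array of square blocks and compute a
    # toy two-number descriptor per block (the real system computes SIFT
    # descriptors on the dense grid).
    h, w = img.shape
    bh, bw = h // grid, w // grid
    desc = []
    for i in range(grid):
        for j in range(grid):
            block = img[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            desc.append([block.mean(), block.std()])
    return np.array(desc)                 # shape (1600, 2) for a 40 x 40 grid

def salient_blocks(descs, keep=800):
    # Per-case strategy: rank blocks by the variance of their descriptor values
    # and keep roughly half, matching the reported optimum of 700-800 regions.
    variance = descs.var(axis=1)
    return np.argsort(variance)[-keep:]

img = np.random.rand(400, 400)            # placeholder for a radiograph
kept = salient_blocks(block_descriptors(img))
print(len(kept), "salient blocks retained")
```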
Geer, Lewis Y.; Marchler-Bauer, Aron; Geer, Renata C.; Han, Lianyi; He, Jane; He, Siqian; Liu, Chunlei; Shi, Wenyao; Bryant, Stephen H.
2010-01-01
The NCBI BioSystems database, found at http://www.ncbi.nlm.nih.gov/biosystems/, centralizes and cross-links existing biological systems databases, increasing their utility and target audience by integrating their pathways and systems into NCBI resources. This integration allows users of NCBI’s Entrez databases to quickly categorize proteins, genes and small molecules by metabolic pathway, disease state or other BioSystem type, without requiring time-consuming inference of biological relationships from the literature or multiple experimental datasets. PMID:19854944
The database design of LAMOST based on MYSQL/LINUX
NASA Astrophysics Data System (ADS)
Li, Hui-Xian; Sang, Jian; Wang, Sha; Luo, A.-Li
2006-03-01
The Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) will be set up in the coming years. A fully automated software system for reducing and analyzing the spectra has to be developed along with the telescope. The database system is an important part of this software system. The requirements for the LAMOST database, the design of the LAMOST database system based on MySQL/Linux, and performance tests of this system are described in this paper.
Alternative Databases for Anthropology Searching.
ERIC Educational Resources Information Center
Brody, Fern; Lambert, Maureen
1984-01-01
Examines online search results of sample questions in several databases covering linguistics, cultural anthropology, and physical anthropology in order to determine if and where any overlap in results might occur, and which files have the greatest number of relevant hits. Search results by database are given for each subject area. (EJS)
Database extraction strategies for low-template evidence.
Bleka, Øyvind; Dørum, Guro; Haned, Hinda; Gill, Peter
2014-03-01
Often in forensic cases, the profile of at least one of the contributors to a DNA evidence sample is unknown and a database search is needed to discover possible perpetrators. In this article we consider two types of search strategies to extract suspects from a database using methods based on probability arguments. The performance of the proposed match scores is demonstrated by carrying out a study of each match score relative to the level of allele drop-out in the crime sample, simulating low-template DNA. The efficiency was measured by random man simulation and we compared the performance using the SGM Plus kit and the ESX 17 kit for the Norwegian population, demonstrating that the latter has greatly enhanced power to discover perpetrators of crime in large national DNA databases. The code for the database extraction strategies will be prepared for release in the R-package forensim. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Kamitsuji, Shigeo; Matsuda, Takashi; Nishimura, Koichi; Endo, Seiko; Wada, Chisa; Watanabe, Kenji; Hasegawa, Koichi; Hishigaki, Haretsugu; Masuda, Masatoshi; Kuwahara, Yusuke; Tsuritani, Katsuki; Sugiura, Kenkichi; Kubota, Tomoko; Miyoshi, Shinji; Okada, Kinya; Nakazono, Kazuyuki; Sugaya, Yuki; Yang, Woosung; Sawamoto, Taiji; Uchida, Wataru; Shinagawa, Akira; Fujiwara, Tsutomu; Yamada, Hisaharu; Suematsu, Koji; Tsutsui, Naohisa; Kamatani, Naoyuki; Liou, Shyh-Yuh
2015-06-01
Japan Pharmacogenomics Data Science Consortium (JPDSC) has assembled a database for conducting pharmacogenomics (PGx) studies in Japanese subjects. The database contains the genotypes of 2.5 million single-nucleotide polymorphisms (SNPs) and 5 human leukocyte antigen loci from 2994 Japanese healthy volunteers, as well as 121 kinds of clinical information, including self-reports, physiological data, hematological data and biochemical data. In this article, the reliability of our data was evaluated by principal component analysis (PCA) and association analysis for hematological and biochemical traits using genome-wide SNP data. PCA of the SNPs showed that all the samples were collected from the Japanese population and that the samples were separated into two major clusters by birthplace, Okinawa and other than Okinawa, as had been previously reported. Among 87 SNPs that have been reported to be associated with 18 hematological and biochemical traits in genome-wide association studies (GWAS), the associations of 56 SNPs were replicated using our database. Statistical power simulations showed that the sample size of the JPDSC control database is large enough to detect genetic markers having a relatively strong association even when the case sample size is small. The JPDSC database will be useful as control data for conducting PGx studies to explore genetic markers to improve the safety and efficacy of drugs either during clinical development or in post-marketing.
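The PCA sanity check described above can be sketched as follows; the genotype matrix here is random toy data standing in for the real ~2.5 million SNPs across 2994 subjects.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(300, 5000)).astype(float)  # subjects x SNPs, coded 0/1/2

# Standardize each SNP, then project subjects onto the first two principal components.
genotypes -= genotypes.mean(axis=0)
genotypes /= genotypes.std(axis=0) + 1e-12
pcs = PCA(n_components=2).fit_transform(genotypes)
print(pcs.shape)   # (300, 2); with real data, clusters (e.g., Okinawa vs. mainland) separate here
```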
An Introduction to Database Management Systems.
ERIC Educational Resources Information Center
Warden, William H., III; Warden, Bette M.
1984-01-01
Description of database management systems for microcomputers highlights system features and factors to consider in microcomputer system selection. A method for ranking database management systems is explained and applied to a defined need, i.e., software support for indexing a weekly newspaper. A glossary of terms and 32-item bibliography are…
Heterogeneous database integration in biomedicine.
Sujansky, W
2001-08-01
The rapid expansion of biomedical knowledge, reduction in computing costs, and spread of internet access have created an ocean of electronic data. The decentralized nature of our scientific community and healthcare system, however, has resulted in a patchwork of diverse, or heterogeneous, database implementations, making access to and aggregation of data across databases very difficult. The database heterogeneity problem applies equally to clinical data describing individual patients and biological data characterizing our genome. Specifically, databases are highly heterogeneous with respect to the data models they employ, the data schemas they specify, the query languages they support, and the terminologies they recognize. Heterogeneous database systems attempt to unify disparate databases by providing uniform conceptual schemas that resolve representational heterogeneities, and by providing querying capabilities that aggregate and integrate distributed data. Research in this area has applied a variety of database and knowledge-based techniques, including semantic data modeling, ontology definition, query translation, query optimization, and terminology mapping. Existing systems have addressed heterogeneous database integration in the realms of molecular biology, hospital information systems, and application portability.
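A minimal sketch of the mediator pattern the article surveys: per-source wrappers translate a uniform query into each source's native vocabulary and reshape results to a shared conceptual schema. Source names, fields, and mappings are invented for illustration.

```python
from typing import Callable

# Each wrapper translates a uniform query ({"gene": ...}) into its source's
# native vocabulary and returns records reshaped to the shared schema.
def clinical_wrapper(q):
    native = {"GENE_SYMBOL": q["gene"]}                  # terminology mapping
    rows = [{"GENE_SYMBOL": "TP53", "DX": "carcinoma"}]  # stub for a SQL call
    return [{"gene": r["GENE_SYMBOL"], "annotation": r["DX"]}
            for r in rows if r["GENE_SYMBOL"] == native["GENE_SYMBOL"]]

def genomic_wrapper(q):
    rows = [{"symbol": "TP53", "pathway": "apoptosis"}]  # stub for a flat-file scan
    return [{"gene": r["symbol"], "annotation": r["pathway"]}
            for r in rows if r["symbol"] == q["gene"]]

def mediator(query, wrappers: list[Callable]):
    # Aggregate distributed results under the single conceptual schema.
    out = []
    for w in wrappers:
        out.extend(w(query))
    return out

print(mediator({"gene": "TP53"}, [clinical_wrapper, genomic_wrapper]))
```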
Selecting Data-Base Management Software for Microcomputers in Libraries and Information Units.
ERIC Educational Resources Information Center
Pieska, K. A. O.
1986-01-01
Presents a model for the evaluation of database management systems software from the viewpoint of librarians and information specialists. The properties of data management systems, database management systems, and text retrieval systems are outlined and compared. (10 references) (CLB)
Databases for the Global Dynamics of Multiparameter Nonlinear Systems
2014-03-05
AFRL-OSR-VA-TR-2014-0078: Databases for the Global Dynamics of Multiparameter Nonlinear Systems. Konstantin Mischaikow, Rutgers, The State University of New Jersey, ASB III, Rutgers Plaza, New Brunswick, NJ 08807. ... dynamical systems. We refer to the output as a Database for Global Dynamics since it allows the user to query for information about the existence and ...
Software Application for Supporting the Education of Database Systems
ERIC Educational Resources Information Center
Vágner, Anikó
2015-01-01
The article introduces an application which supports the education of database systems, particularly the teaching of SQL and PL/SQL in an Oracle Database Management System environment. The application has two parts: one is the database schema and its content, and the other is a C# application. The schema administers and stores the tasks and the…
Federal Register 2010, 2011, 2012, 2013, 2014
2010-08-13
... information. Access to any such database system is limited to system administrators, individuals responsible... during the certification process. The above information will be contained in one or more databases (such as Lotus Notes) that reside on servers in EPA offices. The database(s) may be specific to one...
Lin, Hongli; Wang, Weisheng; Luo, Jiawei; Yang, Xuedong
2014-12-01
The aim of this study was to develop a personalized training system using the Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) database, because collecting, annotating, and marking a large number of appropriate computed tomography (CT) scans, and providing the capability of dynamically selecting suitable training cases based on the performance levels of trainees and the characteristics of cases, are critical for developing an efficient training system. A novel approach is proposed to develop a personalized radiology training system for the interpretation of lung nodules in CT scans using the Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) database, which provides a Content-Boosted Collaborative Filtering (CBCF) algorithm for predicting the difficulty level of each case for each trainee when selecting suitable cases to meet individual needs, and a diagnostic simulation tool to enable trainees to analyze and diagnose lung nodules with the help of an image processing tool and a nodule retrieval tool. Preliminary evaluation of the system shows that developing a personalized training system for interpretation of lung nodules is needed and useful to enhance the professional skills of trainees. The approach of developing personalized training systems using the LIDC/IDRI database is a feasible solution to the challenges of constructing specific training programs in terms of cost and training efficiency. Copyright © 2014 AUR. Published by Elsevier Inc. All rights reserved.
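The CBCF idea can be sketched under its usual two-stage formulation: a content model first fills in unobserved trainee-case difficulty ratings, then user-based collaborative filtering predicts the difficulty of a case for a trainee. All ratings and features below are toy values, not LIDC/IDRI data.

```python
import numpy as np

R = np.array([[3., 0., 4.],     # trainee x case difficulty ratings, 0 = unknown
              [2., 5., 0.],
              [3., 4., 4.]])

def content_fill(R, case_features):
    # Stand-in content predictor: unknown entries get a difficulty estimate
    # derived from case characteristics (here simply the feature value itself).
    filled = R.copy()
    for u in range(R.shape[0]):
        for i in range(R.shape[1]):
            if filled[u, i] == 0:
                filled[u, i] = case_features[i]
    return filled

def predict(filled, user, item):
    # Pearson-weighted average over the other users (plain user-based CF).
    sims, vals = [], []
    for v in range(filled.shape[0]):
        if v == user:
            continue
        c = np.corrcoef(filled[user], filled[v])[0, 1]
        sims.append(abs(c))
        vals.append(c * filled[v, item])
    return sum(vals) / (sum(sims) + 1e-12)

filled = content_fill(R, case_features=np.array([3.0, 4.0, 4.0]))
print(round(predict(filled, user=0, item=1), 2))   # predicted difficulty of case 1 for trainee 0
```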
Genomes OnLine Database (GOLD) v.6: data updates and feature enhancements
Mukherjee, Supratim; Stamatis, Dimitri; Bertsch, Jon; Ovchinnikova, Galina; Verezemska, Olena; Isbandi, Michelle; Thomas, Alex D.; Ali, Rida; Sharma, Kaushal; Kyrpides, Nikos C.; Reddy, T. B. K.
2017-01-01
The Genomes Online Database (GOLD) (https://gold.jgi.doe.gov) is a manually curated data management system that catalogs sequencing projects with associated metadata from around the world. In the current version of GOLD (v.6), all projects are organized based on a four level classification system in the form of a Study, Organism (for isolates) or Biosample (for environmental samples), Sequencing Project and Analysis Project. Currently, GOLD provides information for 26 117 Studies, 239 100 Organisms, 15 887 Biosamples, 97 212 Sequencing Projects and 78 579 Analysis Projects. These are integrated with over 312 metadata fields from which 58 are controlled vocabularies with 2067 terms. The web interface facilitates submission of a diverse range of Sequencing Projects (such as isolate genome, single-cell genome, metagenome, metatranscriptome) and complex Analysis Projects (such as genome from metagenome, or combined assembly from multiple Sequencing Projects). GOLD provides a seamless interface with the Integrated Microbial Genomes (IMG) system and supports and promotes the Genomic Standards Consortium (GSC) Minimum Information standards. This paper describes the data updates and additional features added during the last two years. PMID:27794040
TRENDS: The aeronautical post-test database management system
NASA Technical Reports Server (NTRS)
Bjorkman, W. S.; Bondi, M. J.
1990-01-01
TRENDS, an engineering-test database operating system developed by NASA to support rotorcraft flight tests, is described. Capabilities and characteristics of the system are presented, with examples of its use in recalling and analyzing rotorcraft flight-test data from a TRENDS database. The importance of system user-friendliness in gaining users' acceptance is stressed, as is the importance of integrating supporting narrative data with numerical data in engineering-test databases. Considerations relevant to the creation and maintenance of flight-test databases are discussed, and TRENDS' solutions to database management problems are described. Requirements, constraints, and other considerations which led to the system's configuration are discussed and some of the lessons learned during TRENDS' development are presented. Potential applications of TRENDS to a wide range of aeronautical and other engineering tests are identified.
Kinect Posture Reconstruction Based on a Local Mixture of Gaussian Process Models.
Liu, Zhiguang; Zhou, Liuyang; Leung, Howard; Shum, Hubert P H
2016-11-01
Depth sensor based 3D human motion estimation hardware such as Kinect has made interactive applications more popular recently. However, it is still challenging to accurately recognize postures from a single depth camera due to the inherently noisy data derived from depth images and self-occluding actions performed by the user. In this paper, we propose a new real-time probabilistic framework to enhance the accuracy of live captured postures that belong to one of the action classes in the database. We adopt the Gaussian Process model as a prior to leverage the position data obtained from Kinect and a marker-based motion capture system. We also incorporate a temporal consistency term into the optimization framework to constrain the velocity variations between successive frames. To ensure that the reconstructed posture resembles the accurate parts of the observed posture, we embed a set of joint reliability measurements into the optimization framework. A major drawback of Gaussian Processes is their cubic learning complexity when dealing with a large database, due to the inverse of a covariance matrix. To solve the problem, we propose a new method based on a local mixture of Gaussian Processes, in which Gaussian Processes are defined in local regions of the state space. Due to the significantly decreased sample size in each local Gaussian Process, the learning time is greatly reduced. At the same time, the prediction speed is enhanced, as the weighted mean prediction for a given sample is determined by the nearby local models only. Our system also allows incrementally updating a specific local Gaussian Process in real time, which enhances the likelihood of adapting to run-time postures that are different from those in the database. Experimental results demonstrate that our system can generate high quality postures even under severe self-occlusion situations, which is beneficial for real-time applications such as motion-based gaming and sport training.
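A minimal sketch of the local-mixture idea on toy 1-D data: the state space is partitioned, one small Gaussian Process is fit per region, and a prediction is a distance-weighted mean over the nearest local models only, avoiding the cubic cost of a single global GP. The clustering and weighting choices here are illustrative, not the paper's exact formulation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(600, 1))
y = np.sin(2 * X[:, 0]) + 0.05 * rng.standard_normal(600)

# Partition the state space and train one small GP per local region.
k = 6
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
centers = np.array([X[labels == c].mean(axis=0) for c in range(k)])
gps = [GaussianProcessRegressor().fit(X[labels == c], y[labels == c]) for c in range(k)]

def predict(x, n_near=2):
    # Weight only the nearest local models, by inverse distance to their centers.
    d = np.linalg.norm(centers - x, axis=1)
    near = np.argsort(d)[:n_near]
    w = 1.0 / (d[near] + 1e-9)
    preds = np.array([gps[c].predict(x.reshape(1, -1))[0] for c in near])
    return float(np.dot(w, preds) / w.sum())

print(predict(np.array([1.0])))   # should approximate sin(2.0)
```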
NASA Technical Reports Server (NTRS)
Bebout, Leslie; Keller, R.; Miller, S.; Jahnke, L.; DeVincenzi, D. (Technical Monitor)
2002-01-01
The Ames Exobiology Culture Collection Database (AECC-DB) has been developed as a collaboration between microbial ecologists and information technology specialists. It allows for extensive web-based archiving of information regarding field samples to document microbial co-habitation of specific ecosystem micro-environments. Documentation and archiving continue as pure cultures are isolated, metabolic properties determined, and DNA extracted and sequenced. In this way, metabolic properties and molecular sequences are clearly linked back to specific isolates and to the location of those microbes in the ecosystem of origin. Use of this database system is a significant advance over traditional record keeping, in which there is generally little or no information regarding the environments from which microorganisms were isolated, often only a broad ecosystem designation (e.g., hot spring). Within each such ecosystem, however, there are myriad microenvironments with very different properties, and determining exactly which microenvironment a given microbe comes from is critical for designing appropriate isolation media and interpreting physiological properties. We are currently using the database to aid in the isolation of a large number of cyanobacterial species and will present results by PIs and students demonstrating the utility of this new approach.
MPD: a pathogen genome and metagenome database
Zhang, Tingting; Miao, Jiaojiao; Han, Na; Qiang, Yujun; Zhang, Wen
2018-01-01
Abstract Advances in high-throughput sequencing have led to unprecedented growth in the amount of available genome sequencing data, especially for bacterial genomes, which has been accompanied by a challenge for the storage and management of such huge datasets. To facilitate bacterial research and related studies, we have developed the Mypathogen database (MPD), which provides users with the ability to search, download, store and share bacterial genomics data. The MPD is the first pathogen-focused database for microbial genomes and metagenomes, and currently covers pathogenic microbial genomes (6604 genera, 11 071 species, 41 906 strains) and metagenomic data from host, air, water and other sources (28 816 samples). The MPD also functions as a management system for statistical and storage data that can be used by different organizations, thereby facilitating data sharing among different organizations and research groups. A user-friendly local client tool is provided to maintain the steady transmission of big sequencing data. The MPD is a useful tool for analysis and management in genomic research, especially for clinical Centers for Disease Control and epidemiological studies, and is expected to contribute to advancing knowledge on pathogenic bacterial genomes and metagenomes. Database URL: http://data.mypathogen.org PMID:29917040
East-China Geochemistry Database (ECGD):A New Networking Database for North China Craton
NASA Astrophysics Data System (ADS)
Wang, X.; Ma, W.
2010-12-01
The North China Craton is one of the best natural laboratories for research on Earth dynamics questions [1]. Scientists have made much progress in research on this area and have obtained vast geochemistry data, which are essential for answering many fundamental questions about the age, composition, structure, and evolution of the East China area. But the geochemical data have long been accessible only through the scientific literature and theses, where they are widely dispersed, making it difficult for the broad geosciences community to find, access and efficiently use the full range of available data [2]. How can the existing geochemical data in the North China Craton area be effectively stored, managed, shared and reused? The East-China Geochemistry Database (ECGD) is a networked geochemical scientific database system designed on the basis of WebGIS and a relational database for the structured storage and retrieval of geochemical data and geological map information. It integrates the functions of data retrieval, spatial visualization and online analysis. ECGD focuses on three areas: 1. Storage and retrieval of geochemical data and geological map information. Based on the characteristics of geochemical data, including their composition and interrelationships, we designed a relational database, built on a geochemical relational data model, to store a variety of geological sample information such as sampling locality, age, sample characteristics, references, major elements, rare earth elements, trace elements and isotope systems. A web-based, user-friendly interface is provided for constructing queries. 2. Data viewing. ECGD is committed to online data visualization in different ways, especially viewing data on a digital map in a dynamic way. Because ECGD integrates WebGIS technology, query results can be mapped on a digital map that supports zooming, panning and point selection. Besides viewing and exporting query results in HTML, TXT or XLS formats, researchers can also generate classified thematic maps from query results according to different parameters. 3. Online data analysis. We designed a number of online geochemical analysis tools, including geochemical diagrams, CIPW norm computation, and so on, which allow researchers to analyze query results without downloading them. All of these analysis tools are easy to operate, requiring only one or two mouse clicks. In summary, ECGD provides a geochemical platform for researchers to learn where various data are, to view those data in a synthetic and dynamic way, and to analyze data of interest online. REFERENCES [1] S. Gao, R.L. Rudnick, and W.L. Xu, "Recycling deep cratonic lithosphere and generation of intraplate magmatism in the North China Craton," Earth and Planetary Science Letters, 270, 41-53, 2008. [2] K.A. Lehnert, U. Harms, and E. Ito, "Promises, Achievements, and Challenges of Networking Global Geoinformatics Resources - Experiences of GeosciNET and EarthChem," Geophysical Research Abstracts, Vol. 10, EGU2008-A-05242, 2008.
Tiered Human Integrated Sequence Search Databases for Shotgun Proteomics.
Deutsch, Eric W; Sun, Zhi; Campbell, David S; Binz, Pierre-Alain; Farrah, Terry; Shteynberg, David; Mendoza, Luis; Omenn, Gilbert S; Moritz, Robert L
2016-11-04
The results of analysis of shotgun proteomics mass spectrometry data can be greatly affected by the selection of the reference protein sequence database against which the spectra are matched. For many species there are multiple sources from which somewhat different sequence sets can be obtained. This can lead to confusion about which database is best in which circumstances, a problem especially acute in human sample analysis. All sequence databases are genome-based, with sequences for the predicted gene and their protein translation products compiled. Our goal is to create a set of primary sequence databases that comprise the union of sequences from many of the different available sources and make the result easily available to the community. We have compiled a set of four sequence databases of varying sizes, from a small database consisting of only the ∼20,000 primary isoforms plus contaminants to a very large database that includes almost all nonredundant protein sequences from several sources. This set of tiered, increasingly complete human protein sequence databases suitable for mass spectrometry proteomics sequence database searching is called the Tiered Human Integrated Search Proteome set. In order to evaluate the utility of these databases, we have analyzed two different data sets, one from the HeLa cell line and the other from normal human liver tissue, with each of the four tiers of database complexity. The result is that approximately 0.8%, 1.1%, and 1.5% additional peptides can be identified for Tiers 2, 3, and 4, respectively, as compared with the Tier 1 database, at substantially increasing computational cost. This increase in computational cost may be worth bearing if the identification of sequence variants or the discovery of sequences that are not present in the reviewed knowledge base entries is an important goal of the study. We find that it is useful to search a data set against a simpler database, and then check the uniqueness of the discovered peptides against a more complex database. We have set up an automated system that downloads all the source databases on the first of each month and automatically generates a new set of search databases and makes them available for download at http://www.peptideatlas.org/thisp/.
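The recommended workflow, searching against a simpler database and then checking peptide uniqueness against a more complex one, reduces to a per-peptide set comparison; the sequences and accessions below are toy stand-ins for FASTA-derived maps.

```python
# peptide -> parent protein accessions (toy stand-ins, not real Tier contents)
tier1 = {"PEPTIDEK": {"P04637-1"}}
tier4 = {"PEPTIDEK": {"P04637-1", "P04637-2", "VAR_001"}}   # larger database

for peptide, parents in tier1.items():
    extra = tier4.get(peptide, set()) - parents
    if extra:
        print(peptide, "also maps to", sorted(extra), "in the larger database")
    else:
        print(peptide, "remains unique at higher tiers")
```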
23 CFR 972.204 - Management systems requirements.
Code of Federal Regulations, 2014 CFR
2014-04-01
... Highways FEDERAL HIGHWAY ADMINISTRATION, DEPARTMENT OF TRANSPORTATION FEDERAL LANDS HIGHWAYS FISH AND... to operate and maintain the management systems and their associated databases; and (5) A process for... systems will use databases with a geographical reference system that can be used to geolocate all database...
Most of the existing arsenic dietary databases were developed from the analysis of total arsenic in water and dietary samples. These databases have been used to estimate arsenic exposure and in turn human health risk. However, these dietary databases are becoming obsolete as the ...
Smirani, Rawen; Truchetet, Marie-Elise; Poursac, Nicolas; Naveau, Adrien; Schaeverbeke, Thierry; Devillard, Raphaël
2018-06-01
Oropharyngeal features are frequent and often understated in the clinical treatment guidelines for systemic sclerosis, in spite of important consequences for comfort, esthetics, nutrition and daily life. The aim of this systematic review was to assess a correlation between the oropharyngeal manifestations of systemic sclerosis and patients' health-related quality of life. A systematic search was conducted using four databases (PubMed®, Cochrane Database®, Dentistry & Oral Sciences Source®, and SCOPUS®) up to January 2018, according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses. Grey literature and hand searching were also included. Study selection, risk-of-bias assessment (Newcastle-Ottawa scale) and data extraction were performed by two independent reviewers. The review protocol was registered in the PROSPERO database with the code CRD42018085994. From 375 screened studies, 6 cross-sectional studies were included in the systematic review. The total number of patients included per study ranged from 84 to 178. These studies reported a statistically significant association between oropharyngeal manifestations of systemic sclerosis (mainly assessed by maximal mouth opening and the Mouth Handicap in Systemic Sclerosis scale) and an impaired quality of life (measured by different scales). Studies were unequal concerning risk of bias, mostly because of low levels of evidence, different recruiting sources of samples, and different scales to assess quality of life. This systematic review demonstrates a correlation between oropharyngeal manifestations of systemic sclerosis and impaired quality of life, despite the low level of evidence of the included studies. Large-scale studies are needed to provide stronger evidence of this association. This article is protected by copyright. All rights reserved.
Komatsu, Setsuko; Wang, Xin; Yin, Xiaojian; Nanjo, Yohei; Ohyanagi, Hajime; Sakata, Katsumi
2017-06-23
The Soybean Proteome Database (SPD) stores data on soybean proteins obtained with gel-based and gel-free proteomic techniques. The database was constructed to provide information on proteins for functional analyses. The majority of the data is focused on soybean (Glycine max 'Enrei'). The growth and yield of soybean are strongly affected by environmental stresses such as flooding. The database was originally constructed using data on soybean proteins separated by two-dimensional polyacrylamide gel electrophoresis, which is a gel-based proteomic technique. Since 2015, the database has been expanded to incorporate data obtained by label-free mass spectrometry-based quantitative proteomics, which is a gel-free proteomic technique. Here, the portions of the database consisting of gel-free proteomic data are described. The gel-free proteomic database contains 39,212 proteins identified in 63 sample sets, such as temporal and organ-specific samples of soybean plants grown under flooding stress or non-stressed conditions. In addition, data on organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored. Furthermore, the database integrates multiple omics data such as genomics, transcriptomics, metabolomics, and proteomics. The SPD database is accessible at http://proteome.dc.affrc.go.jp/Soybean/. The Soybean Proteome Database stores data obtained from both gel-based and gel-free proteomic techniques. The gel-free proteomic database comprises 39,212 proteins identified in 63 sample sets, such as different organs of soybean plants grown under flooding stress or non-stressed conditions in a time-dependent manner. In addition, organellar proteins identified in mitochondria, nuclei, and endoplasmic reticulum are stored in the gel-free proteomics database. A total of 44,704 proteins, including 5490 proteins identified using a gel-based proteomic technique, are stored in the SPD. These account for approximately 80% of all proteins predicted from the genome sequence, though there are overlapping proteins. Based on the demonstrated application of data stored in the database for functional analyses, it is suggested that these data will be useful for analyses of biological mechanisms in soybean. Furthermore, coupled with recent advances in information and communication technology, the usefulness of this database will increase for analyses of biological mechanisms. Copyright © 2017 Elsevier B.V. All rights reserved.
XML: James Webb Space Telescope Database Issues, Lessons, and Status
NASA Technical Reports Server (NTRS)
Detter, Ryan; Mooney, Michael; Fatig, Curtis
2003-01-01
This paper presents the current concept of using the eXtensible Markup Language (XML) as the underlying structure for the James Webb Space Telescope (JWST) database. The purpose of using XML is to provide a JWST database independent of any portion of the ground system, yet still compatible with the various systems using a variety of different structures. The testing of the JWST Flight Software (FSW) started in 2002, yet the launch is scheduled for 2011, with a planned 5-year mission and a 5-year follow-on option. The initial database and ground system elements, including the commands, telemetry, and ground system tools, will be used for 19 years, plus post-mission activities. During the Integration and Test (I&T) phases of JWST development, 24 distinct, geographically dispersed laboratories will have local database tools with an XML database. Each of these laboratories' database tools will be used for exporting and importing data both locally and to a central database system, inputting data to the database certification process, and providing various reports. A centralized certified database repository will be maintained by the Space Telescope Science Institute (STScI) in Baltimore, Maryland, USA. One challenge for the database is to be flexible enough to allow for the upgrade, addition or changing of individual items without affecting the entire ground system. Using XML should also allow for altering the import and export formats needed by the various elements, tracking the verification/validation of each database item, allowing many organizations to provide database inputs, and merging the many existing database processes into one central database structure throughout the JWST program. Many National Aeronautics and Space Administration (NASA) projects have attempted to take advantage of open-source and commercial technology. Often this causes a greater reliance on the use of Commercial-Off-The-Shelf (COTS) software, which can be limiting. In our review of the database requirements and the COTS software available, only very expensive COTS software would meet 90% of the requirements. Even with the high projected initial cost of COTS, the development and support costs for custom code over the 19-year mission period were forecast to be higher than the total licensing costs. A group did look at reusing existing database tools and formats. If the JWST database were already in a mature state, reuse would make sense; but with the database still needing to handle the addition of different types of command and telemetry structures, define new spacecraft systems, and accept input from and export to systems which have not been defined yet, XML provided the flexibility desired. It remains to be determined whether the XML database will reduce the overall cost for the JWST mission.
Bolea, Juan; Pueyo, Esther; Orini, Michele; Bailón, Raquel
2016-01-01
The purpose of this study is to characterize and attenuate the influence of mean heart rate (HR) on nonlinear heart rate variability (HRV) indices (correlation dimension, sample entropy, and approximate entropy), a consequence of the HR being the intrinsic sampling rate of the HRV signal. This influence can notably alter nonlinear HRV indices and lead to biased information regarding autonomic nervous system (ANS) modulation. First, a simulation study was carried out to characterize the dependence of nonlinear HRV indices on HR assuming similar ANS modulation. Second, two HR-correction approaches were proposed: one based on regression formulas and another based on interpolating RR time series. Finally, standard and HR-corrected HRV indices were studied in a body position change database. The simulation study showed the HR dependence of nonlinear indices as a sampling rate effect, as well as the ability of the proposed HR corrections to attenuate the influence of mean HR. Analysis in the body position change database showed that correlation dimension was reduced around 21% in median values in standing with respect to supine position (p < 0.05), concomitant with a 28% increase in mean HR (p < 0.05). After HR correction, correlation dimension decreased around 18% in standing with respect to supine position, with the decrease still significant. Sample and approximate entropy showed similar trends. HR-corrected nonlinear HRV indices could represent an improvement in their applicability as markers of ANS modulation when mean HR changes.
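Both ingredients can be sketched compactly: a standard sample entropy estimator, and the interpolation-based HR correction, which resamples the unevenly sampled RR series onto a uniform time grid. The parameter choices (m = 2, r = 0.2 × SD, 4 Hz resampling) are common defaults, not necessarily the study's exact settings.

```python
import numpy as np

def sample_entropy(x, m=2, r_frac=0.2):
    # Standard SampEn: negative log of the ratio of (m+1)- to m-length
    # template matches within tolerance r (Chebyshev distance).
    x = np.asarray(x, dtype=float)
    r = r_frac * x.std()
    n = len(x)
    def matches(mm):
        emb = np.array([x[i:i + mm] for i in range(n - m)])  # same template count for both lengths
        d = np.max(np.abs(emb[:, None] - emb[None, :]), axis=2)
        return (np.sum(d <= r) - len(emb)) / 2.0             # exclude self-matches
    b, a = matches(m), matches(m + 1)
    return -np.log(a / b) if a > 0 and b > 0 else float("nan")

def resample_rr(rr_s, fs=4.0):
    # HR correction by interpolation: place each RR value at its beat time,
    # then resample onto a uniform grid so the effective sampling rate no
    # longer depends on mean HR.
    t = np.cumsum(rr_s)
    grid = np.arange(t[0], t[-1], 1.0 / fs)
    return np.interp(grid, t, rr_s)

rr = 0.8 + 0.05 * np.random.default_rng(0).standard_normal(300)  # ~75 bpm toy series
print(sample_entropy(rr), sample_entropy(resample_rr(rr)))
```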
The Earth Microbiome Project and Global Systems Biology
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gilbert, Jack A.; Jansson, Janet K.; Knight, Rob
Recently, we published the first large-scale analysis of data from the Earth Microbiome Project (1, 2), a truly multidisciplinary research program involving more than 500 scientists and 27,751 samples acquired from 43 countries. These samples represent myriad specimen types and span a wide range of biotic and abiotic factors, geographic locations, and physicochemical properties. The database (https://qiita.ucsd.edu/emp/) is still growing, with over 90,000 amplicon datasets, >500 metagenomic runs, and metabolomics datasets from a similar number of samples. Importantly, the techniques, data and analytical tools are all standardized and publicly accessible, providing a framework to support research at a scale of integration that just 7 years ago seemed impossible.
Management Practices and Tools: 2000-2004
NASA Technical Reports Server (NTRS)
2004-01-01
This custom bibliography from the NASA Scientific and Technical Information Program lists a sampling of records found in the NASA Aeronautics and Space Database. The scope of this topic is divided into four parts and covers the adoption of proven personnel and management reforms to implement the national space exploration vision, including the use of "system-of-systems" approach; policies of spiral, evolutionary development; reliance upon lead systems integrators; and independent technical and cost assessments. This area of focus is one of the enabling technologies as defined by NASA s Report of the President s Commission on Implementation of United States Space Exploration Policy, published in June 2004.
Archiving and Distributing Seismic Data at the Southern California Earthquake Data Center (SCEDC)
NASA Astrophysics Data System (ADS)
Appel, V. L.
2002-12-01
The Southern California Earthquake Data Center (SCEDC) archives and provides public access to earthquake parametric and waveform data gathered by the Southern California Seismic Network and, since January 1, 2001, the TriNet seismic network, southern California's earthquake monitoring network. The parametric data in the archive include earthquake locations, magnitudes, moment-tensor solutions and phase picks. The SCEDC waveform archive prior to TriNet consists primarily of short-period, 100-samples-per-second waveforms from the SCSN. The addition of the TriNet array added continuous recordings of 155 broadband stations (20 samples per second or less), and triggered seismograms from 200 accelerometers and 200 short-period instruments. Since the Data Center and TriNet use the same Oracle database system, new earthquake data are available to the seismological community in near real-time. Primary access to the database and waveforms is through the Seismogram Transfer Program (STP) interface. The interface enables users to search the database for earthquake information, phase picks, and continuous and triggered waveform data. Output is available in SAC, miniSEED, and other formats. Both the raw counts format (V0) and the gain-corrected format (V1) of COSMOS (Consortium of Organizations for Strong-Motion Observation Systems) are now supported by STP. EQQuest is an interface to prepackaged waveform data sets for select earthquakes in Southern California stored at the SCEDC. Waveform data for large-magnitude events have been prepared, and new data sets will be available for download in near real-time following major events. The parametric data from 1981 to present have been loaded into the Oracle 9.2.0.1 database system, and the waveforms for that time period have been converted to miniSEED format and are accessible through the STP interface. The DISC optical-disk system (the "jukebox") that currently serves as the mass storage for the SCEDC is in the process of being replaced with a series of inexpensive high-capacity (1.6 Tbyte) magnetic-disk RAIDs. These systems are built with PC-technology components, using 16 120-Gbyte IDE disks, hot-swappable disk trays, two RAID controllers, dual redundant power supplies and a Linux operating system. The system is configured over a private gigabit network that connects to the two Data Center servers and spans between the Seismological Lab and the USGS. To ensure data integrity, each RAID disk system constantly checks itself against its twin and verifies file integrity using 128-bit MD5 file checksums that are stored separately from the system. The final level of data protection is a Sony AIT-3 tape backup of the files. The primary advantage of the magnetic-disk approach is faster data access, because magnetic disk drives have almost no latency. This means that the SCEDC can provide better "on-demand" interactive delivery of the seismograms in the archive.
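The checksum safeguard described above amounts to keeping an MD5 manifest separate from the archive and re-verifying files against it. A minimal sketch follows, with hypothetical paths and a hypothetical .mseed file pattern.

```python
import hashlib
import json
import pathlib

def md5sum(path, chunk=1 << 20):
    # Stream the file in 1 MB chunks so large waveform files fit in memory.
    h = hashlib.md5()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def build_manifest(root, manifest_path):
    # Record a 128-bit MD5 digest for every waveform file under the archive root.
    manifest = {str(p): md5sum(p) for p in pathlib.Path(root).rglob("*.mseed")}
    pathlib.Path(manifest_path).write_text(json.dumps(manifest))

def verify(manifest_path):
    # Return every file that is missing or whose current digest no longer matches.
    manifest = json.loads(pathlib.Path(manifest_path).read_text())
    return [p for p, digest in manifest.items()
            if not pathlib.Path(p).exists() or md5sum(p) != digest]

# Usage against a hypothetical archive tree:
# build_manifest("/archive/waveforms", "/safe/manifest.json")
# print(verify("/safe/manifest.json"))   # -> list of corrupted or missing files
```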
Federal Register 2010, 2011, 2012, 2013, 2014
2011-05-18
... 1974; Department of Homeland Security United States Coast Guard-024 Auxiliary Database System of... Security/United States Coast Guard-024 Auxiliary Database (AUXDATA) System of Records.'' This system of... titled, ``DHS/USCG-024 Auxiliary Database (AUXDATA) System of Records.'' The AUXDATA system is the USCG's...
Nelson, Jennifer Clark; Marsh, Tracey; Lumley, Thomas; Larson, Eric B; Jackson, Lisa A; Jackson, Michael L
2013-08-01
Estimates of treatment effectiveness in epidemiologic studies using large observational health care databases may be biased owing to inaccurate or incomplete information on important confounders. Study methods that collect and incorporate more comprehensive confounder data on a validation cohort may reduce confounding bias. We applied two such methods, namely imputation and reweighting, to Group Health administrative data (full sample) supplemented by more detailed confounder data from the Adult Changes in Thought study (validation sample). We used influenza vaccination effectiveness (with an unexposed comparator group) as an example and evaluated each method's ability to reduce bias using the control time period before influenza circulation. Both methods reduced, but did not completely eliminate, the bias compared with traditional effectiveness estimates that do not use the validation sample confounders. Although these results support the use of validation sampling methods to improve the accuracy of comparative effectiveness findings from health care database studies, they also illustrate that the success of such methods depends on many factors, including the ability to measure important confounders in a representative and large enough validation sample, the comparability of the full sample and validation sample, and the accuracy with which the data can be imputed or reweighted using the additional validation sample information. Copyright © 2013 Elsevier Inc. All rights reserved.
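As a rough illustration of the reweighting method, the sketch below models selection into the validation sample from full-sample covariates and assigns validation subjects inverse-probability-of-selection weights. Variable names, data, and the selection model are invented; the study's actual implementation is more involved.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
full = pd.DataFrame({
    "age": rng.normal(75, 6, 5000),            # hypothetical full-sample covariates
    "chronic_score": rng.poisson(2, 5000),
})
# Simulate membership in the validation cohort depending on observed covariates.
p_val = 1 / (1 + np.exp(-(0.05 * (full.age - 75) - 0.2 * full.chronic_score)))
full["in_validation"] = rng.uniform(size=5000) < p_val

# Model selection into the validation sample from full-sample covariates ...
sel = LogisticRegression().fit(full[["age", "chronic_score"]], full.in_validation)
pi = sel.predict_proba(full[["age", "chronic_score"]])[:, 1]

# ... then give validation subjects inverse-probability weights, so analyses
# using their richer confounder data generalize back to the full cohort.
val = full[full.in_validation].copy()
val["weight"] = 1.0 / pi[full.in_validation.to_numpy()]
print(val.weight.describe())
```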
Kamali, Parisa; Zettervall, Sara L; Wu, Winona; Ibrahim, Ahmed M S; Medin, Caroline; Rakhorst, Hinne A; Schermerhorn, Marc L; Lee, Bernard T; Lin, Samuel J
2017-04-01
Research derived from large-volume databases plays an increasing role in the development of clinical guidelines and health policy. In breast cancer research, the Surveillance, Epidemiology and End Results, National Surgical Quality Improvement Program, and Nationwide Inpatient Sample databases are widely used. This study aims to compare the trends in immediate breast reconstruction and identify the drawbacks and benefits of each database. Patients with invasive breast cancer and ductal carcinoma in situ were identified from each database (2005-2012). Trends of immediate breast reconstruction over time were evaluated. Patient demographics and comorbidities were compared. Subgroup analysis of immediate breast reconstruction use per race was conducted. Within the three databases, 1.2 million patients were studied. Immediate breast reconstruction in invasive breast cancer patients increased significantly over time in all databases. A similar significant upward trend was seen in ductal carcinoma in situ patients. Significant differences in immediate breast reconstruction rates were seen among races, and the disparity differed among the three databases. Rates of comorbidities were similar among the three databases. There has been a significant increase in immediate breast reconstruction; however, the extent of the reporting of overall immediate breast reconstruction rates and of racial disparities differs significantly among databases. The Nationwide Inpatient Sample and the National Surgical Quality Improvement Program report similar findings, with the Surveillance, Epidemiology and End Results database reporting results significantly lower in several categories. These findings suggest that results from the Surveillance, Epidemiology and End Results database may not be generalizable to the entire U.S. population.
Intrusion Detection in Database Systems
NASA Astrophysics Data System (ADS)
Javidi, Mohammad M.; Sohrabi, Mina; Rafsanjani, Marjan Kuchaki
Data today represent a valuable asset for organizations and companies and must be protected. Ensuring the security and privacy of data assets is a crucial and very difficult problem in our modern networked world. Despite the necessity of protecting information stored in database systems (DBS), existing security models are insufficient to prevent misuse, especially insider abuse by legitimate users. One mechanism to safeguard the information in these databases is to use an intrusion detection system (IDS). The purpose of intrusion detection in database systems is to detect transactions that access data without permission. In this paper, several database intrusion detection approaches are evaluated.
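One family of approaches covered by such evaluations profiles normal access patterns and flags deviating transactions. The sketch below uses a hypothetical role-to-table profile; real systems learn such profiles from audit logs rather than hard-coding them.

```python
# Hypothetical role profiles: the tables each role normally accesses.
ROLE_PROFILE = {
    "clerk": {"orders", "customers"},
    "analyst": {"orders", "sales_summary"},
}

def is_anomalous(role, tables_accessed):
    # Flag any transaction touching tables outside the role's learned profile.
    allowed = ROLE_PROFILE.get(role, set())
    return not set(tables_accessed) <= allowed

print(is_anomalous("clerk", ["orders"]))              # False: normal access
print(is_anomalous("clerk", ["payroll", "orders"]))   # True: flagged for review
```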
Bagger, Frederik Otzen; Sasivarevic, Damir; Sohi, Sina Hadi; Laursen, Linea Gøricke; Pundhir, Sachin; Sønderby, Casper Kaae; Winther, Ole; Rapin, Nicolas; Porse, Bo T
2016-01-04
Research on human and murine haematopoiesis has resulted in a vast number of gene-expression data sets that can potentially answer questions regarding normal and aberrant blood formation. To researchers and clinicians with limited bioinformatics experience, these data have remained available, yet largely inaccessible. Current databases provide information about gene-expression but fail to answer key questions regarding co-regulation, genetic programs or effect on patient survival. To address these shortcomings, we present BloodSpot (www.bloodspot.eu), which includes and greatly extends our previously released database HemaExplorer, a database of gene expression profiles from FACS sorted healthy and malignant haematopoietic cells. A revised interactive interface simultaneously provides a plot of gene expression along with a Kaplan-Meier analysis and a hierarchical tree depicting the relationship between different cell types in the database. The database now includes 23 high-quality curated data sets relevant to normal and malignant blood formation and, in addition, we have assembled and built a unique integrated data set, BloodPool. BloodPool contains more than 2000 samples assembled from six independent studies on acute myeloid leukemia. Furthermore, we have devised a robust sample integration procedure that allows for sensitive comparison of user-supplied patient samples in a well-defined haematopoietic cellular space. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
School District Evaluation: Database Warehouse Support.
ERIC Educational Resources Information Center
Adcock, Eugene P.; Haseltine, Reginald
The Prince George's County (Maryland) school system has developed a database warehouse system as an evaluation data support tool for fulfilling the system's information demands. This paper describes the Research and Evaluation Assimilation Database (READ) warehouse support system and considers the requirements for data used in evaluation and how…
A Dynamic Approach to Make CDS/ISIS Databases Interoperable over the Internet Using the OAI Protocol
ERIC Educational Resources Information Center
Jayakanth, F.; Maly, K.; Zubair, M.; Aswath, L.
2006-01-01
Purpose: A dynamic approach to making legacy databases, like CDS/ISIS, interoperable with OAI-compliant digital libraries (DLs). Design/methodology/approach: There are many bibliographic databases that are being maintained using legacy database systems. CDS/ISIS is one such legacy database system. It was designed and developed specifically for…
Multi-Sensor Scene Synthesis and Analysis
1981-09-01
Contents include: Quad Trees for Image Representation and Processing; Databases: Definitions and Basic Concepts; Use of Databases in Hierarchical Scene Analysis; Use of Relational Tables; Multisensor Image Database Systems (MIDAS); Relational Database System for Pictures; Relational Pictorial Database.
Technical Aspects of Interfacing MUMPS to an External SQL Relational Database Management System
Kuzmak, Peter M.; Walters, Richard F.; Penrod, Gail
1988-01-01
This paper describes an interface connecting InterSystems MUMPS (M/VX) to an external relational DBMS, the SYBASE Database Management System. The interface enables MUMPS to operate in a relational environment and gives the MUMPS language full access to a complete set of SQL commands. MUMPS generates SQL statements as ASCII text and sends them to the RDBMS. The RDBMS executes the statements and returns ASCII results to MUMPS. The interface suggests that the language features of MUMPS make it an attractive tool for use in the relational database environment. The approach described in this paper separates MUMPS from the relational database. Positioning the relational database outside of MUMPS promotes data sharing and permits a number of different options to be used for working with the data. Other languages like C, FORTRAN, and COBOL can access the RDBMS database. Advanced tools provided by the relational database vendor can also be used. SYBASE is an advanced high-performance transaction-oriented relational database management system for the VAX/VMS and UNIX operating systems. SYBASE is designed using a distributed open-systems architecture, and is relatively easy to interface with MUMPS.
U.S. Geological Survey mineral databases; MRDS and MAS/MILS
McFaul, E.J.; Mason, G.T.; Ferguson, W.B.; Lipin, B.R.
2000-01-01
These two CD-ROMs contain the latest version of the Mineral Resources Data System (MRDS) database and the Minerals Availability System/Minerals Industry Location System (MAS/MILS) database for coverage of North America and the world outside North America. The records in the MRDS database each contain almost 200 data fields describing metallic and nonmetallic mineral resources, deposits, and commodities. The records in the MAS/MILS database each contain almost 100 data fields describing mines and mineral processing plants.
TWRS technical baseline database manager definition document
DOE Office of Scientific and Technical Information (OSTI.GOV)
Acree, C.D.
1997-08-13
This document serves as a guide for using the TWRS Technical Baseline Database Management Systems Engineering (SE) support tool in performing SE activities for the Tank Waste Remediation System (TWRS). This document will provide a consistent interpretation of the relationships between the TWRS Technical Baseline Database Management software and the present TWRS SE practices. The Database Manager currently utilized is the RDD-1000 System manufactured by the Ascent Logic Corporation. In other documents, the term RDD-1000 may be used interchangeably with TWRS Technical Baseline Database Manager.
NASA Astrophysics Data System (ADS)
Deshpande, Ruchi; Thuptimdang, Wanwara; DeMarco, John; Liu, Brent J.
2014-03-01
We have built a decision support system that provides recommendations for customizing radiation therapy treatment plans, based on patient models generated from a database of retrospective planning data. This database consists of relevant metadata and information derived from the following DICOM objects - CT images, RT Structure Set, RT Dose and RT Plan. The usefulness and accuracy of such patient models partly depends on the sample size of the learning data set. Our current goal is to increase this sample size by expanding our decision support system into a collaborative framework to include contributions from multiple collaborators. Potential collaborators are often reluctant to upload even anonymized patient files to repositories outside their local organizational network in order to avoid any conflicts with HIPAA Privacy and Security Rules. We have circumvented this problem by developing a tool that can parse DICOM files on the client's side and extract de-identified numeric and text data from DICOM RT headers for uploading to a centralized system. As a result, the DICOM files containing PHI remain local to the client side. This is a novel workflow that results in adding only relevant yet valuable data from DICOM files to the centralized decision support knowledge base in such a way that the DICOM files never leave the contributor's local workstation in a cloud-based environment. Such a workflow serves to encourage clinicians to contribute data for research endeavors by ensuring protection of electronic patient data.
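A minimal sketch of the client-side extraction step, assuming the pydicom library, an invented whitelist of non-PHI header fields, and a hypothetical upload endpoint:

    # Read a DICOM RT file locally, keep only a whitelist of non-PHI header
    # fields, and ship the de-identified values to a central service.
    import json
    import pydicom
    import requests

    SAFE_FIELDS = ["Modality", "RTPlanLabel", "RTPlanDate"]  # assumed whitelist

    def extract_deidentified(path: str) -> dict:
        # Header only; pixel data (and the file itself) never leave the client.
        ds = pydicom.dcmread(path, stop_before_pixels=True)
        return {name: str(getattr(ds, name))
                for name in SAFE_FIELDS if hasattr(ds, name)}

    record = extract_deidentified("rtplan.dcm")
    requests.post("https://registry.example.org/api/submit",  # hypothetical URL
                  data=json.dumps(record))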
Development of a vision-based pH reading system
NASA Astrophysics Data System (ADS)
Hur, Min Goo; Kong, Young Bae; Lee, Eun Je; Park, Jeong Hoon; Yang, Seung Dae; Moon, Ha Jung; Lee, Dong Hoon
2015-10-01
pH paper is generally used for pH interpretation in the QC (quality control) process of radiopharmaceuticals. pH paper is easy to handle and useful for small samples such as radioisotopes and radioisotope (RI)-labeled compounds for positron emission tomography (PET). However, pH-paper-based detection methods may suffer errors due to limitations of eyesight and inaccurate readings. In this paper, we report a new device for pH reading and its related software. The proposed pH reading system is developed with a vision algorithm based on an RGB library. The system is divided into two parts. The first is the reading device, which consists of a light source, a CCD camera and a data acquisition (DAQ) board. To improve sensitivity and accuracy, we utilize the three primary colors of the LED (light-emitting diode) in the reading device; three narrow-band colors give better color discrimination than a single broadband white LED. The second is a graphical user interface (GUI) program for the vision interface and report generation. The GUI program inserts the color codes of the pH paper into the database; then, in reading mode, the CCD camera captures the pH paper and compares its color with the RGB database image. The software captures and reports information on the samples, such as pH results, captured images, and library images, and saves them as Excel files.
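The reading-mode comparison can be sketched as a nearest-color search against the stored RGB library; the calibration values below are invented for illustration.

    # Match the mean RGB of the captured pH-paper region against a stored
    # RGB library by smallest colour distance.
    import numpy as np

    # Assumed calibration library: pH value -> mean (R, G, B) of reference paper
    RGB_LIBRARY = {
        4.0: (214, 92, 52),
        7.0: (120, 150, 70),
        10.0: (40, 80, 140),
    }

    def read_ph(captured_rgb):
        """Return the library pH whose reference colour is closest in RGB space."""
        sample = np.array(captured_rgb, dtype=float)
        distances = {ph: np.linalg.norm(sample - np.array(rgb, dtype=float))
                     for ph, rgb in RGB_LIBRARY.items()}
        return min(distances, key=distances.get)

    print(read_ph((118, 148, 75)))  # -> 7.0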
A linear geospatial streamflow modeling system for data sparse environments
Asante, Kwabena O.; Arlan, Guleid A.; Pervez, Md Shahriar; Rowland, James
2008-01-01
In many river basins around the world, inaccessibility of flow data is a major obstacle to water resource studies and operational monitoring. This paper describes a geospatial streamflow modeling system which is parameterized with global terrain, soils and land cover data and run operationally with satellite‐derived precipitation and evapotranspiration datasets. Simple linear methods transfer water through the subsurface, overland and river flow phases, and the resulting flows are expressed in terms of standard deviations from mean annual flow. In sample applications, the modeling system was used to simulate flow variations in the Congo, Niger, Nile, Zambezi, Orange and Lake Chad basins between 1998 and 2005, and the resulting flows were compared with mean monthly values from the open‐access Global River Discharge Database. While the uncalibrated model cannot predict the absolute magnitude of flow, it can quantify flow anomalies in terms of relative departures from mean flow. Most of the severe flood events identified in the flow anomalies were independently verified by the Dartmouth Flood Observatory (DFO) and the Emergency Disaster Database (EM‐DAT). Despite its limitations, the modeling system is valuable for rapid characterization of the relative magnitude of flood hazards and seasonal flow changes in data sparse settings.
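A one-function sketch of how such anomalies can be expressed, with illustrative flow values:

    # Convert simulated flows to standardized departures from the mean, so an
    # uncalibrated model can still rank the relative severity of anomalies.
    import numpy as np

    def flow_anomaly(simulated: np.ndarray) -> np.ndarray:
        """Standardized departure: (q - mean) / std of the simulated record."""
        return (simulated - simulated.mean()) / simulated.std()

    monthly_flow = np.array([120.0, 95.0, 310.0, 880.0, 450.0, 150.0])  # illustrative
    print(flow_anomaly(monthly_flow).round(2))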
Circum-Arctic petroleum systems identified using decision-tree chemometrics
Peters, K.E.; Ramos, L.S.; Zumberge, J.E.; Valin, Z.C.; Scotese, C.R.; Gautier, D.L.
2007-01-01
Source- and age-related biomarker and isotopic data were measured for more than 1000 crude oil samples from wells and seeps collected above approximately 55°N latitude. A unique, multitiered chemometric (multivariate statistical) decision tree was created that allowed automated classification of 31 genetically distinct circum-Arctic oil families based on a training set of 622 oil samples. The method, which we call decision-tree chemometrics, uses principal components analysis and multiple tiers of K-nearest neighbor and SIMCA (soft independent modeling of class analogy) models to classify and assign confidence limits for newly acquired oil samples and source rock extracts. Geochemical data for each oil sample were also used to infer the age, lithology, organic matter input, depositional environment, and identity of its source rock. These results demonstrate the value of large petroleum databases where all samples were analyzed using the same procedures and instrumentation. Copyright © 2007. The American Association of Petroleum Geologists. All rights reserved.
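A hedged sketch of one tier of such a decision tree, using scikit-learn's PCA and K-nearest neighbors on synthetic stand-in data (the actual training set comprised 622 oils in 31 families):

    # One tier: project biomarker/isotope ratios with PCA, then classify new
    # oils with K-nearest neighbors. Feature values and labels are synthetic.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(622, 40))      # 40 geochemical ratios (synthetic)
    y_train = rng.integers(0, 31, size=622)   # 31 oil families

    tier = make_pipeline(PCA(n_components=10), KNeighborsClassifier(n_neighbors=5))
    tier.fit(X_train, y_train)

    X_new = rng.normal(size=(3, 40))          # newly acquired oil samples
    print(tier.predict(X_new))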
Identifying pathways affected by cancer mutations.
Iengar, Prathima
2017-12-16
Mutations in 15 cancers, sourced from the COSMIC Whole Genomes database, and 297 human pathways, arranged into pathway groups based on the processes they orchestrate and sourced from the KEGG pathway database, have together been used to identify pathways affected by cancer mutations. Genes studied in ≥15 samples and mutated in ≥10 samples of a cancer are considered recurrently mutated, and pathways with recurrently mutated genes are considered affected in that cancer. Novel doughnut plots are presented that enable visualization of the extent to which pathways and genes in each pathway group are targeted in each cancer. The 'organismal systems' pathway group (including organism-level pathways, e.g., the nervous system) is the most targeted, more than even the well-recognized signal transduction, cell-cycle and apoptosis, and DNA repair pathway groups. The important yet poorly recognized role played by this group merits attention. Pathways affected in ≥7 cancers yielded insights into the processes affected. Copyright © 2017 Elsevier Inc. All rights reserved.
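The recurrence criterion stated above amounts to a simple filter; a sketch with illustrative counts:

    # A gene counts as recurrently mutated in a cancer if it was studied in
    # >= 15 samples and mutated in >= 10 of them. Column names are illustrative.
    import pandas as pd

    genes = pd.DataFrame({
        "gene": ["TP53", "KRAS", "GABRA1"],
        "samples_studied": [500, 480, 20],
        "samples_mutated": [210, 95, 4],
    })

    recurrent = genes[(genes["samples_studied"] >= 15)
                      & (genes["samples_mutated"] >= 10)]
    print(recurrent["gene"].tolist())  # -> ['TP53', 'KRAS']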
Person identification by using 3D palmprint data
NASA Astrophysics Data System (ADS)
Bai, Xuefei; Huang, Shujun; Gao, Nan; Zhang, Zonghua
2016-11-01
Person identification based on biometrics is attracting increasing attention in identity and information security. This paper presents a biometric system that identifies a person using 3D palmprint data, comprising a non-contact system that captures 3D palmprints quickly and a method that identifies them rapidly. In order to reduce the effect of slight shaking of the palm on data accuracy, a DLP (Digital Light Processing) projector is utilized to trigger a CCD camera, based on structured-light and triangulation measurement, so that 3D palmprint data can be gathered within 1 second. Using the obtained database and the PolyU 3D palmprint database, a feature extraction and matching method is presented based on MCI (Mean Curvature Image), Gabor filters and binary code lists. Experimental results show that the proposed method can identify a person within 240 ms in the case of 4000 samples. Compared with traditional 3D palmprint recognition methods, the proposed method has high accuracy, a low EER (Equal Error Rate), small storage requirements, and fast identification speed.
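The fast matching implied by the binary code list can be sketched as a Hamming-distance comparison, which is a cheap XOR and bit count over packed bits; code length and decision threshold below are assumptions.

    # Compare two packed binary feature codes by Hamming distance.
    import numpy as np

    def hamming_distance(code_a: np.ndarray, code_b: np.ndarray) -> int:
        """Both inputs are uint8 arrays of packed feature bits."""
        return int(np.unpackbits(code_a ^ code_b).sum())

    rng = np.random.default_rng(1)
    enrolled = rng.integers(0, 256, size=512, dtype=np.uint8)  # 4096-bit template
    probe = enrolled.copy()
    probe[:8] ^= 0xFF                                          # flip 64 bits

    MATCH_THRESHOLD = 1024  # assumed decision threshold
    print(hamming_distance(enrolled, probe) < MATCH_THRESHOLD)  # -> True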
Pragmatic precision oncology: the secondary uses of clinical tumor molecular profiling
Thota, Ramya; Staggs, David B; Johnson, Douglas B; Warner, Jeremy L
2016-01-01
Background: Precision oncology increasingly utilizes molecular profiling of tumors to determine treatment decisions with targeted therapeutics. Molecular profiling data are valuable in the treatment of individual patients as well as for multiple secondary uses. Objective: To automatically parse, categorize, and aggregate clinical molecular profile data generated during cancer care, and to use these data to address multiple secondary use cases. Methods: A system to parse, categorize and aggregate molecular profile data was created. A naïve Bayesian classifier categorized results according to clinical groups. The accuracy of these systems was validated against a published, expertly curated subset of molecular profiling data. Results: Following one year of operation, 819 samples have been accurately parsed and categorized to generate a data repository of 10,620 genetic variants. The database has been used for operational, clinical trial, and discovery science research. Conclusions: A real-time database of molecular profiling data is a pragmatic solution to several knowledge management problems in the practice and science of precision oncology. PMID:27026612
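A minimal sketch of the categorization step, assuming scikit-learn's naive Bayes on invented training snippets (not the authors' actual training data or labels):

    # Assign free-text molecular profile results to clinical groups with a
    # naive Bayes text classifier.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    texts = ["BRAF V600E mutation detected", "EGFR exon 19 deletion",
             "no clinically significant variants", "KRAS G12C identified"]
    groups = ["actionable", "actionable", "negative", "actionable"]

    clf = make_pipeline(CountVectorizer(), MultinomialNB())
    clf.fit(texts, groups)
    print(clf.predict(["EGFR L858R mutation detected"]))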
VizieR Online Data Catalog: Orbital parameters of Kuiper Belt objects (Volk+, 2017)
NASA Astrophysics Data System (ADS)
Volk, K.; Malhotra, R.
2017-11-01
Our starting point is the list of minor planets in the outer solar system cataloged in the database of the Minor Planet Center (http://www.minorplanetcenter.net/iau/lists/t_centaurs.html and http://www.minorplanetcenter.net/iau/lists/t_tnos.html) as of 2016 October 20. The complete listing of our sample, including best-fit orbital parameters and sky locations, is provided in Table1. (1 data file).
Extravehicular Activity Systems: 1994-2004
NASA Technical Reports Server (NTRS)
2004-01-01
This custom bibliography from the NASA Scientific and Technical Information Program lists a sampling of records found in the NASA Aeronautics and Space Database. The scope of this topic includes technologies for the space suit of the future, specifically for productive work on planetary surfaces. This area of focus is one of the enabling technologies as defined by NASA's Report of the President's Commission on Implementation of United States Space Exploration Policy, published in June 2004.
Life-Cycle Cost Database. Volume II. Appendices E, F, and G. Sample Data Development.
1983-01-01
Prepared by Bendix Field Engineering Corporation, Columbia, Maryland 21045. Contents (excerpt): a general section (introduction, objective, engineering survey) and a system description covering maintenance in a typical administrative-type building over a 25-year period. An on-site engineering survey was conducted by Bendix Field Engineering, and sample custodial task data include damp mop and buff, buff, routine vacuum, strip and refinish, heavy-duty vacuum, machine scrub and surface shampoo, pick-up, extraction clean, and repair by location.
Argonne Geothermal Geochemical Database v2.0
Harto, Christopher
2013-05-22
A database of geochemical data from potential geothermal sources aggregated from multiple sources as of March 2010. The database contains fields for the location, depth, temperature, pH, total dissolved solids concentration, chemical composition, and date of sampling. A separate tab contains data on non-condensible gas compositions. The database contains records for over 50,000 wells, although many entries are incomplete. Current versions of source documentation are listed in the dataset.
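A sketch of the per-well record structure the description implies; field names and types are assumptions, with optional fields reflecting the note that many entries are incomplete.

    # One geochemical record per well, with optional fields for missing data.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class GeothermalWellRecord:
        latitude: float
        longitude: float
        depth_m: Optional[float] = None
        temperature_c: Optional[float] = None
        ph: Optional[float] = None
        tds_mg_per_l: Optional[float] = None  # total dissolved solids
        chemistry: Optional[dict] = None      # ion -> concentration
        sampled_on: Optional[str] = None      # ISO date string

    well = GeothermalWellRecord(43.5, -112.0, depth_m=1850.0, temperature_c=142.0)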
The relational clinical database: a possible solution to the star wars in registry systems.
Michels, D K; Zamieroski, M
1990-12-01
In summary, having data from other service areas available in a relational clinical database could resolve many of the problems existing in today's registry systems. Uniting sophisticated information systems into a centralized database system could definitely be a corporate asset in managing the bottom line.
Teaching Database Management System Use in a Library School Curriculum.
ERIC Educational Resources Information Center
Cooper, Michael D.
1985-01-01
Describes a database management systems course taught to students at the School of Library and Information Studies, University of California, Berkeley; notes course structure, assignments, and course evaluation. Approaches to teaching the concepts of three types of database systems are discussed, and the systems used by students in the course are…
The liver tissue bank and clinical database in China.
Yang, Yuan; Liu, Yi-Min; Wei, Ming-Yue; Wu, Yi-Fei; Gao, Jun-Hui; Liu, Lei; Zhou, Wei-Ping; Wang, Hong-Yang; Wu, Meng-Chao
2010-12-01
To develop a standardized and well-rounded resource for hepatology research, the National Liver Tissue Bank (NLTB) Project began in 2008 in China to assemble well-characterized, optimally preserved liver tumor tissue and a clinical database. From December 2008 to June 2010, over 3000 individuals were enrolled as liver tumor donors to the NLTB, including 2317 cases of newly diagnosed hepatocellular carcinoma (HCC) and about 1000 cases of diagnosed benign or malignant liver tumors. The clinical database and sample store can be managed easily and correctly with the data management platform used. We believe that these high-quality samples, together with the detailed information database, will become a cornerstone of hepatology research, especially in studies exploring the diagnosis and new treatments for HCC and other liver diseases.
Ng, Kevin Kit Siong; Lee, Soon Leong; Tnah, Lee Hong; Nurul-Farhanah, Zakaria; Ng, Chin Hong; Lee, Chai Ting; Tani, Naoki; Diway, Bibian; Lai, Pei Sing; Khoo, Eyen
2016-07-01
Illegal logging and smuggling of Gonystylus bancanus (Thymelaeaceae) pose a serious threat to this fragile, valuable peat swamp timber species. Using G. bancanus as a case study, DNA markers were used to develop identification databases at the species, population and individual levels. The species-level database for Gonystylus comprised an rDNA marker (ITS2) and two cpDNA markers (trnH-psbA and trnL) based on a database of 20 Gonystylus species. When the markers were concatenated, taxonomic species recognition was achieved with a resolution of 90% (18 out of the 20 species). In addition, based on 17 natural populations of G. bancanus throughout West (Peninsular Malaysia) and East (Sabah and Sarawak) Malaysia, population and individual identification databases were developed using cpDNA and STR markers, respectively. A haplotype distribution map for Malaysia was generated using six cpDNA markers, resulting in 12 unique multilocus haplotypes from 24 informative intraspecific variable sites. These unique haplotypes suggest a clear genetic structuring of the West and East regions. A simulation procedure based on the composition of the samples was used to test whether a suspected sample conformed to a given regional origin. Overall, the observed type I and II errors of the databases showed good concordance with the predicted 5% threshold, which indicates that the databases were useful in revealing provenance and establishing conformity of samples from West and East Malaysia. Sixteen STRs were used to develop the DNA profiling databases for individual identification. Bayesian clustering analyses divided the 17 populations into two main genetic clusters, corresponding to the regions of West and East Malaysia. Population substructuring (K=2) was observed within each region. After removal of bias resulting from sampling effects and population subdivision, conservativeness tests showed that the West and East Malaysia databases were conservative. This suggests that both databases can be used independently for random match probability estimation within their respective regions. The reliability of the databases was further determined by independent self-assignment tests based on the likelihood of each individual's multilocus genotype occurring in each identified population, genetic cluster and region, with average percentages of correctly assigned individuals of 54.80%, 99.60% and 100%, respectively. Thus, after appropriate validation, the genetic identification databases developed for G. bancanus in this study could support forensic applications and help safeguard this valuable species into the future. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
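As an illustration of the random match probability such regional STR databases support: under independence across loci, RMP is the product of per-locus genotype frequencies (p² for homozygotes, 2pq for heterozygotes). The allele frequencies below are invented.

    # Random match probability under locus independence.
    def genotype_frequency(p: float, q: float, homozygous: bool) -> float:
        return p * p if homozygous else 2 * p * q

    # (allele-1 frequency, allele-2 frequency, homozygous?) at each STR locus
    profile = [
        (0.12, 0.12, True),
        (0.08, 0.21, False),
        (0.30, 0.05, False),
    ]

    rmp = 1.0
    for p, q, hom in profile:
        rmp *= genotype_frequency(p, q, hom)
    print(f"random match probability ~ {rmp:.2e}")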
Microcomputer-Based Genetics Office Database System
Cutts, James H.; Mitchell, Joyce A.
1985-01-01
A database management system (Genetics Office Automation System, GOAS) has been developed for the Medical Genetics Unit of the University of Missouri. The system, which records patients' visits to the Unit's genetic and prenatal clinics, has been implemented on an IBM PC/XT microcomputer. A description of the system, the reasons for implementation, its databases, and uses are presented.
Archetype relational mapping - a practical openEHR persistence solution.
Wang, Li; Min, Lingtong; Wang, Rui; Lu, Xudong; Duan, Huilong
2015-11-05
One of the primary obstacles to the widespread adoption of openEHR methodology is the lack of practical persistence solutions for future-proof electronic health record (EHR) systems as described by the openEHR specifications. This paper presents an archetype relational mapping (ARM) persistence solution for archetype-based EHR systems to support healthcare delivery in the clinical environment. First, the data requirements of the EHR systems are analysed and organized into archetype-friendly concepts. The Clinical Knowledge Manager (CKM) is queried for matching archetypes; when necessary, new archetypes are developed to reflect concepts that are not encompassed by existing archetypes. Next, a template is designed for each archetype to apply constraints related to the local EHR context. Finally, a set of rules is designed to map the archetypes to data tables and provide data persistence based on the relational database. A comparison study was conducted to investigate the differences among the conventional database of an EHR system from a tertiary Class A hospital in China, the generated ARM database, and the Node + Path database. Five data-retrieving tests were designed based on clinical workflow to retrieve exams and laboratory tests. Additionally, two patient-searching tests were designed to identify patients who satisfy certain criteria. The ARM database achieved better performance than the conventional database in three of the five data-retrieving tests, but was less efficient in the remaining two tests. The difference in query execution time between the ARM database and the conventional database was less than 130%. The ARM database was approximately 6-50 times more efficient than the conventional database in the patient-searching tests, while the Node + Path database required far more time than the other two databases to execute both the data-retrieving and the patient-searching tests. The ARM approach is capable of generating relational databases using archetypes and templates for archetype-based EHR systems, thus successfully adapting to changes in data requirements. ARM performance is similar to that of conventionally designed EHR systems, and it can be applied in a practical clinical environment. System components such as ARM can greatly facilitate the adoption of openEHR architecture within EHR systems.
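A much-simplified sketch of the mapping idea: walk an archetype's leaf nodes and emit one relational table per archetype. The archetype structure and the type mapping below are assumptions, not the paper's actual rule set.

    # Generate a CREATE TABLE statement from a simplified archetype definition.
    ARCHETYPE = {
        "name": "blood_pressure",
        "nodes": {"systolic": "quantity", "diastolic": "quantity",
                  "position": "text"},
    }

    SQL_TYPES = {"quantity": "NUMERIC", "text": "VARCHAR(255)",
                 "datetime": "TIMESTAMP"}

    def archetype_to_ddl(archetype: dict) -> str:
        cols = ",\n  ".join(f"{col} {SQL_TYPES[t]}"
                            for col, t in archetype["nodes"].items())
        return (f"CREATE TABLE {archetype['name']} (\n"
                f"  id BIGINT PRIMARY KEY,\n  ehr_id BIGINT NOT NULL,\n  {cols}\n);")

    print(archetype_to_ddl(ARCHETYPE))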
Adopting a corporate perspective on databases. Improving support for research and decision making.
Meistrell, M; Schlehuber, C
1996-03-01
The Veterans Health Administration (VHA) is at the forefront of designing and managing health care information systems that accommodate the needs of clinicians, researchers, and administrators at all levels. Rather than using one single-site, centralized corporate database, VHA has constructed several large databases with different configurations to meet the needs of users with different perspectives. The largest VHA database is the Decentralized Hospital Computer Program (DHCP), a multisite, distributed data system that uses decoupled hospital databases. The centralization of DHCP policy has promoted data coherence, whereas the decentralization of DHCP management has permitted system development to be done with maximum relevance to the users' local practices. A more recently developed VHA data system, the Event Driven Reporting system (EDR), uses multiple, highly coupled databases to provide workload data at facility, regional, and national levels. The EDR automatically posts a subset of DHCP data to local and national VHA management. The development of the EDR illustrates how adoption of a corporate perspective can offer significant database improvements at reasonable cost and with modest impact on the legacy system.
A plan for the North American Bat Monitoring Program (NABat)
Loeb, Susan C.; Rodhouse, Thomas J.; Ellison, Laura E.; Lausen, Cori L.; Reichard, Jonathan D.; Irvine, Kathryn M.; Ingersoll, Thomas E.; Coleman, Jeremy; Thogmartin, Wayne E.; Sauer, John R.; Francis, Charles M.; Bayless, Mylea L.; Stanley, Thomas R.; Johnson, Douglas H.
2015-01-01
The purpose of the North American Bat Monitoring Program (NABat) is to create a continent-wide program to monitor bats at local to rangewide scales that will provide reliable data to promote effective conservation decisionmaking and the long-term viability of bat populations across the continent. This is an international, multiagency program. Four approaches will be used to gather monitoring data to assess changes in bat distributions and abundances: winter hibernaculum counts, maternity colony counts, mobile acoustic surveys along road transects, and acoustic surveys at stationary points. These monitoring approaches are described along with methods for identifying species recorded by acoustic detectors. Other chapters describe the sampling design, the database management system (Bat Population Database), and statistical approaches that can be used to analyze data collected through this program.
Differentially Private Frequent Sequence Mining via Sampling-based Candidate Pruning
Xu, Shengzhi; Cheng, Xiang; Li, Zhengyi; Xiong, Li
2016-01-01
In this paper, we study the problem of mining frequent sequences under the rigorous differential privacy model. We explore the possibility of designing a differentially private frequent sequence mining (FSM) algorithm that can achieve both high data utility and a high degree of privacy. We found that, in differentially private FSM, the amount of required noise is proportional to the number of candidate sequences; if we could effectively reduce the number of unpromising candidate sequences, the utility-privacy tradeoff could be significantly improved. To this end, by leveraging a sampling-based candidate pruning technique, we propose a novel differentially private FSM algorithm, referred to as PFS2. The core of our algorithm is to utilize sample databases to further prune the candidate sequences generated based on the downward closure property. In particular, we use the noisy local support of candidate sequences in the sample databases to estimate which sequences are potentially frequent. To improve the accuracy of such private estimations, a sequence shrinking method is proposed to enforce the length constraint on the sample databases. Moreover, to decrease the probability of misestimating frequent sequences as infrequent, a threshold relaxation method is proposed to relax the user-specified threshold for the sample databases. Through formal privacy analysis, we show that our PFS2 algorithm is ε-differentially private. Extensive experiments on real datasets illustrate that PFS2 can privately find frequent sequences with high accuracy. PMID:26973430
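The noisy-support primitive at the core of this approach can be sketched with the Laplace mechanism; the epsilon, sensitivity, candidate counts, and threshold below are illustrative, not the paper's parameters.

    # Perturb each candidate's support in a sample database with Laplace
    # noise, then prune candidates whose noisy support falls below threshold.
    import numpy as np

    rng = np.random.default_rng(42)

    def noisy_support(true_support: int, sensitivity: float,
                      epsilon: float) -> float:
        """Laplace mechanism: support + Lap(sensitivity / epsilon)."""
        return true_support + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

    candidates = {"ab": 120, "abc": 45, "abcd": 6}
    threshold = 30  # relaxed threshold for the sample database (see above)
    kept = [s for s, sup in candidates.items()
            if noisy_support(sup, 1.0, 0.5) >= threshold]
    print(kept)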
NASA Astrophysics Data System (ADS)
Amsallem, David; Tezaur, Radek; Farhat, Charbel
2016-12-01
A comprehensive approach for real-time computations using a database of parametric, linear, projection-based reduced-order models (ROMs) based on arbitrary underlying meshes is proposed. In the offline phase of this approach, the parameter space is sampled and linear ROMs defined by linear reduced operators are pre-computed at the sampled parameter points and stored. Then, these operators and associated ROMs are transformed into counterparts that satisfy a certain notion of consistency. In the online phase of this approach, a linear ROM is constructed in real-time at a queried but unsampled parameter point by interpolating the pre-computed linear reduced operators on matrix manifolds and therefore computing an interpolated linear ROM. The proposed overall model reduction framework is illustrated with two applications: a parametric inverse acoustic scattering problem associated with a mockup submarine, and a parametric flutter prediction problem associated with a wing-tank system. The second application is implemented on a mobile device, illustrating the capability of the proposed computational framework to operate in real-time.
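A deliberately simplified sketch of the online step: reduced operators pre-computed at sampled parameter points are combined at a queried point. The real method interpolates on matrix manifolds after a consistency transformation; the plain entrywise interpolation below is only a stand-in.

    # Interpolate pre-computed reduced operators at an unsampled parameter.
    import numpy as np

    params = np.array([0.0, 1.0])                  # sampled parameter points
    ops = [np.array([[2.0, 0.1], [0.0, 3.0]]),     # reduced operator at p = 0
           np.array([[2.6, 0.3], [0.1, 3.8]])]     # reduced operator at p = 1

    def interpolate_operator(p: float) -> np.ndarray:
        w = (p - params[0]) / (params[1] - params[0])
        return (1 - w) * ops[0] + w * ops[1]

    A_query = interpolate_operator(0.4)            # ROM operator at queried p
    print(A_query)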
Heterogeneous distributed query processing: The DAVID system
NASA Technical Reports Server (NTRS)
Jacobs, Barry E.
1985-01-01
The objective of the Distributed Access View Integrated Database (DAVID) project is the development of an easy to use computer system with which NASA scientists, engineers and administrators can uniformly access distributed heterogeneous databases. Basically, DAVID will be a database management system that sits alongside already existing database and file management systems. Its function is to enable users to access the data in other languages and file systems without having to learn the data manipulation languages. Given here is an outline of a talk on the DAVID project and several charts.
Huang, Ji-yan; Zhao, Hou-ming; Zhou, Hai-wen
2014-04-01
Objective: To construct a database and a tissue bank of oral mucosal precancerous lesions and to evaluate their application value. Methods: Patients in the Yangtze delta suffering from oral mucosal precancerous lesions were enrolled into this study. The patients' clinical data and samples of oral precancerous mucosa, saliva and blood were collected to create a tissue bank, based on which a database was constructed using Microsoft Access software, a Browser/Server structure and the ASP language. Results: The tissue bank and database of oral mucosal precancerous lesions were successfully built. The procedures to harvest, store and transport the samples have been standardized. The database has a good interactive interface and is convenient for data collection, querying and sharing over the Internet. Conclusion: We constructed the tissue bank and database of oral mucosal precancerous lesions for the first time, which not only helps preserve the biological resources of oral mucosal precancerous lesions, but also provides great convenience for clinical work, research and teaching. Supported by the Research Fund of the Science and Technology Committee of Shanghai Municipality (08ZR1416700).
A prototypic small molecule database for bronchoalveolar lavage-based metabolomics
NASA Astrophysics Data System (ADS)
Walmsley, Scott; Cruickshank-Quinn, Charmion; Quinn, Kevin; Zhang, Xing; Petrache, Irina; Bowler, Russell P.; Reisdorph, Richard; Reisdorph, Nichole
2018-04-01
The analysis of bronchoalveolar lavage fluid (BALF) using mass spectrometry-based metabolomics can provide insight into lung diseases, such as asthma. However, the important step of compound identification is hindered by the lack of a small molecule database that is specific for BALF. Here we describe prototypic small molecule databases derived from human BALF samples (n=117). Human BALF was extracted into lipid and aqueous fractions and analyzed using liquid chromatography mass spectrometry. Following filtering to reduce contaminants and artifacts, the resulting BALF databases (BALF-DBs) contain 11,736 lipid and 658 aqueous compounds. Over 10% of these were found in 100% of samples. Testing the BALF-DBs using nested test sets produced a 99% match rate for lipids and a 47% match rate for aqueous molecules. Searching an independent dataset resulted in 45% matching to the lipid BALF-DB, compared to <25% when general databases are searched. The BALF-DBs are available for download from MetaboLights. Overall, the BALF-DBs can reduce false positives and improve confidence in compound identification compared to general databases.
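The compound-matching step such databases enable can be sketched as a mass lookup within a ppm tolerance; the library masses and the tolerance below are illustrative values, not the BALF-DB contents.

    # Match an observed feature mass against library entries within a ppm
    # tolerance.
    def match_compound(observed_mz: float, library: dict,
                       tol_ppm: float = 10.0) -> list:
        return [name for name, mz in library.items()
                if abs(observed_mz - mz) / mz * 1e6 <= tol_ppm]

    LIPID_LIBRARY = {"PC(34:1)": 760.5851, "PC(36:2)": 786.6007,
                     "SM(d34:1)": 703.5748}
    print(match_compound(760.5860, LIPID_LIBRARY))  # -> ['PC(34:1)']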
The Center for Integrated Molecular Brain Imaging (Cimbi) database.
Knudsen, Gitte M; Jensen, Peter S; Erritzoe, David; Baaré, William F C; Ettrup, Anders; Fisher, Patrick M; Gillings, Nic; Hansen, Hanne D; Hansen, Lars Kai; Hasselbalch, Steen G; Henningsson, Susanne; Herth, Matthias M; Holst, Klaus K; Iversen, Pernille; Kessing, Lars V; Macoveanu, Julian; Madsen, Kathrine Skak; Mortensen, Erik L; Nielsen, Finn Årup; Paulson, Olaf B; Siebner, Hartwig R; Stenbæk, Dea S; Svarer, Claus; Jernigan, Terry L; Strother, Stephen C; Frokjaer, Vibe G
2016-01-01
We here describe a multimodality neuroimaging database containing data from healthy volunteers and patients, acquired within the Lundbeck Foundation Center for Integrated Molecular Brain Imaging (Cimbi) in Copenhagen, Denmark. The data are of particular relevance for neurobiological research questions related to the serotonergic transmitter system, with normative data on the serotonergic subtype receptors 5-HT1A, 5-HT1B, 5-HT2A, and 5-HT4 and the 5-HT transporter (5-HTT), but can easily serve other purposes. The Cimbi database and Cimbi biobank were formally established in 2008 with the purpose of storing the wealth of Cimbi-acquired data in a highly structured and standardized manner, in accordance with the regulations issued by the Danish Data Protection Agency, and of providing a quality-controlled resource for future hypothesis-generating and hypothesis-driven studies. The Cimbi database currently comprises a total of 1100 PET and 1000 structural and functional MRI scans, and it holds a multitude of additional data, such as genetic and biochemical data, and scores from 17 self-reported questionnaires and from 11 neuropsychological paper/computer tests. The associated Cimbi biobank currently contains blood and, in some instances, saliva samples from about 500 healthy volunteers and 300 patients with, e.g., major depression, dementia, substance abuse, obesity, and impulsive aggression. Data continue to be added to the Cimbi database and biobank. Copyright © 2015. Published by Elsevier Inc.
Prevalence of physical inactivity in Iran: a systematic review.
Fakhrzadeh, Hossein; Djalalinia, Shirin; Mirarefin, Mojdeh; Arefirad, Tahereh; Asayesh, Hamid; Safiri, Saeid; Samami, Elham; Mansourian, Morteza; Shamsizadeh, Morteza; Qorbani, Mostafa
2016-01-01
Introduction: Physical inactivity is one of the most important risk factors for chronic diseases, including cardiovascular disease, cancer, and stroke. We aimed to conduct a systematic review of the prevalence of physical inactivity in Iran. Methods: We searched the international databases ISI, PubMed/Medline, and Scopus, and the national databases Irandoc, the Barakat Knowledge Network System, and the Scientific Information Database (SID). We collected data on the prevalence of physical inactivity by sex, age, province, and year. Quality assessment and data extraction were conducted by two independent research experts. No time or language restrictions were applied. Results: We analyzed data on the prevalence of physical inactivity in the Iranian population. Our search strategy yielded 254 records, 185 from international databases and 69 from national databases; after refining the data, 34 articles that met the eligibility criteria remained for data extraction. Of these, 9, 20, 2, and 3 studies were at the national, provincial, regional, and local levels, respectively. Estimates of inactivity ranged from approximately 30% to almost 70%, with considerable variation between sexes and studied sub-groups. Conclusion: In Iran, most studies reported a high prevalence of physical inactivity. Our findings reveal heterogeneity in reported values, often arising from differences in study design, measurement tools and methods, target groups, and sub-population sampling. These data do not allow aggregation for a comprehensive inference.
Worsham, Maria J; Chen, Kang Mei; Stephen, Josena K; Havard, Shaleta; Benninger, Michael S
2010-07-01
Promoter hypermethylation is emerging as a promising molecular strategy for early detection of cancer. We examined promoter methylation status of 1143 cancer-associated genes to perform a global but unbiased inspection of methylated regions in head and neck squamous cell carcinoma (HNSCC). Laboratory-based study. Integrated health care system. Five samples, two frozen primary HNSCC biopsies and three HNSCC cell lines, were examined. Whole genomic DNA was interrogated using a combination of DNA immunoprecipitation (IP) and Affymetrix whole-genome tiling arrays. Of the 1143 unique cancer genes on the array, 265 were recorded across five samples. Of the 265 genes, 55 were present in all five samples, and 36 were common to four of five samples, 46 to three of five, 56 to two of five, and 72 to one of five samples. Hypermethylated genes in the five samples were cross-examined against those in PubMeth, a cancer methylation database combining text mining and expert annotation (http://www.pubmeth.org). Of the 441 genes in PubMeth, only 33 are referenced to HNSCC. We matched 34 genes in our samples to the 441 genes in the PubMeth database. Of the 34 genes, eight are reported in PubMeth as HNSCC associated. This pilot study examined the contribution of global DNA hypermethylation to the pathogenesis of HNSCC. The whole-genome methylation approach indicated 231 new genes with methylated promoter regions not yet reported in HNSCC. Examination of this comprehensive gene panel in a larger HNSCC cohort should advance selection of HNSCC-specific candidate genes for further validation as biomarkers in HNSCC. 2010 American Academy of Otolaryngology-Head and Neck Surgery Foundation. Published by Mosby, Inc. All rights reserved.
Hewitt, Robin; Gobbi, Alberto; Lee, Man-Ling
2005-01-01
Relational databases are the current standard for storing and retrieving data in the pharmaceutical and biotech industries. However, retrieving data from a relational database requires specialized knowledge of the database schema and of the SQL query language. At Anadys, we have developed an easy-to-use system for searching and reporting data in a relational database to support our drug discovery project teams. This system is fast and flexible and allows users to access all data without having to write SQL queries. This paper presents the hierarchical, graph-based metadata representation and SQL-construction methods that, together, are the basis of this system's capabilities.
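A hedged sketch of the idea: a metadata graph records join paths between tables so that a query over user-chosen fields can be assembled without hand-written SQL. The table names and join graph below are invented, not Anadys's actual schema.

    # Assemble a SELECT statement by walking a join path through a metadata
    # graph of table relationships.
    JOIN_GRAPH = {
        ("compound", "assay_result"): "compound.id = assay_result.compound_id",
        ("assay_result", "assay"): "assay_result.assay_id = assay.id",
    }

    def build_query(fields: list, path: list) -> str:
        joins = " JOIN ".join(
            [path[0]] + [f"{b} ON {JOIN_GRAPH[(a, b)]}"
                         for a, b in zip(path, path[1:])]
        )
        return f"SELECT {', '.join(fields)} FROM {joins};"

    print(build_query(["compound.name", "assay_result.ic50"],
                      ["compound", "assay_result", "assay"]))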
Computer aided lung cancer diagnosis with deep learning algorithms
NASA Astrophysics Data System (ADS)
Sun, Wenqing; Zheng, Bin; Qian, Wei
2016-03-01
Deep learning is considered a popular and powerful method in pattern recognition and classification. However, there are not many deep-structured applications in the medical imaging diagnosis area, because large datasets are not always available for medical images. In this study we tested the feasibility of using deep learning algorithms for lung cancer diagnosis with cases from the Lung Image Database Consortium (LIDC) database. The nodules on each computed tomography (CT) slice were segmented according to marks provided by the radiologists. After down-sampling and rotation we acquired 174,412 samples of 52 × 52 pixels each, with the corresponding truth files. Three deep learning algorithms were designed and implemented: Convolutional Neural Networks (CNN), Deep Belief Networks (DBNs), and Stacked Denoising Autoencoders (SDAE). To compare the performance of deep learning algorithms with a traditional computer-aided diagnosis (CADx) system, we designed a scheme with 28 image features and a support vector machine. The accuracies of CNN, DBNs, and SDAE are 0.7976, 0.8119, and 0.7929, respectively; the accuracy of our traditional CADx design is 0.7940, which is slightly lower than CNN and DBNs. We also noticed that the nodules mislabeled by DBNs are 4% larger than those mislabeled by the traditional CADx; this might result from the down-sampling process losing some size information about the nodules.
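A hedged sketch of a CNN sized for the 52 × 52 patches described above, written in PyTorch; the architecture is illustrative, not the paper's exact network.

    # A small CNN for single-channel 52x52 CT nodule patches, two classes.
    import torch
    import torch.nn as nn

    class NoduleCNN(nn.Module):
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2),  # 52->48->24
                nn.Conv2d(16, 32, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2), # 24->20->10
            )
            self.classifier = nn.Linear(32 * 10 * 10, 2)  # benign vs. malignant

        def forward(self, x):
            x = self.features(x)
            return self.classifier(x.flatten(1))

    model = NoduleCNN()
    patch = torch.randn(8, 1, 52, 52)  # a batch of CT nodule patches
    logits = model(patch)              # shape: (8, 2)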
Lundgren, Robert F.; Vining, Kevin C.
2013-01-01
The Turtle Mountain Indian Reservation relies on groundwater supplies to meet the demands of community and economic needs. The U.S. Geological Survey, in cooperation with the Turtle Mountain Band of Chippewa Indians, examined historical groundwater-level and groundwater-quality data for the Fox Hills, Hell Creek, Rolla, and Shell Valley aquifers. The two main sources of water-quality data for groundwater were the U.S. Geological Survey National Water Information System database and the North Dakota State Water Commission database. Data included major ions, trace elements, nutrients, field properties, and physical properties. The Fox Hills and Hell Creek aquifers had few water-quality data, and the lack of data limits any detailed assessments that can be made about these aquifers. Data for the Rolla aquifer exist from 1978 through 1980 only; the concentrations of some water-quality constituents exceeded the U.S. Environmental Protection Agency secondary maximum contaminant levels, and no samples were analyzed for pesticides and hydrocarbons. Numerous water-quality samples have been obtained from the Shell Valley aquifer; about one-half had concentrations of iron, manganese, sulfate, and dissolved solids that exceeded the U.S. Environmental Protection Agency secondary maximum contaminant levels. Overall, the data did not indicate obvious patterns in concentrations.
Establishment of Kawasaki disease database based on metadata standard.
Park, Yu Rang; Kim, Jae-Jung; Yoon, Young Jo; Yoon, Young-Kwang; Koo, Ha Yeong; Hong, Young Mi; Jang, Gi Young; Shin, Soo-Yong; Lee, Jong-Keuk
2016-07-01
Kawasaki disease (KD) is a rare disease that occurs predominantly in infants and young children. To identify KD susceptibility genes and to develop a diagnostic test, a specific therapy, or a prevention method, collecting KD patients' clinical and genomic data is one of the major issues. For this purpose, the Kawasaki Disease Database (KDD) was developed through the efforts of the Korean Kawasaki Disease Genetics Consortium (KKDGC). KDD is a collection of 1292 clinical records and genomic samples from 1283 patients at 13 KKDGC-participating hospitals. Each sample contains the relevant clinical data, genomic DNA and plasma samples isolated from patients' blood, omics data, and KD-associated genotype data. Clinical data were collected and saved using common data elements based on the ISO/IEC 11179 metadata standard. Genome-wide association study data from two studies (482 samples in total) and whole exome sequencing data from 12 samples were also collected. In addition, KDD includes rare cases of KD (16 cases with family history, 46 cases with recurrence, 119 cases with intravenous immunoglobulin non-responsiveness, and 52 cases with coronary artery aneurysm). As the first public database for KD, KDD can significantly facilitate KD studies. All data in KDD are searchable and downloadable. Database URL: http://www.kawasakidisease.kr. © The Author(s) 2016. Published by Oxford University Press.